Comic | These Damn Distributed Systems!

The company where Zhang Dapang works has developed quite well in recent years, with a surge in business and a rapid expansion of personnel.

Zhang Dapang worked hard as well, and eventually took the architect's "throne".


With the development of technology, the company's IT system has long shifted from a stand-alone machine to a distributed system, which has brought huge challenges.

On Monday morning, an endless stream of people came looking for Zhang Dapang.

One look at the diagram told Zhang Dapang what was going on. To support high concurrency, OrderService had been deployed as 4 instances, and each client kept a list of service providers, but this list was static (hard-coded in the configuration file)!

If the machine hosting url_3 goes down, or a new url_5 is added, the client has no idea, and may blindly keep calling instances that are already dead!

Zhang Dapang decided there should be a registry. First, give each service a name (for example, orderService); second, have every OrderService instance register itself there.

Perhaps subconsciously, Zhang Dapang designed the registry's data structure as a tree.

Of course, a Session must be established between the registry and each service instance, and each instance sends heartbeats periodically. If no heartbeat arrives within a certain time, the instance is considered down, its Session expires, and it is removed from the tree. Once it is removed, clients will no longer find it.
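The heartbeat-and-expiry idea can be sketched in a few lines of Python. This is only an in-memory toy: the class `ServiceRegistry`, its method names, the 10-second timeout and the injectable `clock` are assumptions made for illustration, not anything the story specifies.

```python
import time


class ServiceRegistry:
    """Toy registry: service name -> {instance url: time of last heartbeat}."""

    def __init__(self, session_timeout, clock=time.monotonic):
        self.session_timeout = session_timeout
        self.clock = clock          # injectable so the demo can control time
        self.tree = {}              # e.g. {"orderService": {"url_1": 8.0}}

    def register(self, service, url):
        # Registering opens the "session" and counts as the first heartbeat.
        self.tree.setdefault(service, {})[url] = self.clock()

    def heartbeat(self, service, url):
        # Only a live session can be refreshed; an expired one must re-register.
        instances = self.tree.get(service, {})
        if url in instances:
            instances[url] = self.clock()

    def expire_dead_sessions(self):
        now = self.clock()
        for instances in self.tree.values():
            dead = [u for u, seen in instances.items()
                    if now - seen > self.session_timeout]
            for url in dead:
                del instances[url]

    def lookup(self, service):
        # Clients ask the registry instead of reading a hard-coded list.
        self.expire_dead_sessions()
        return sorted(self.tree.get(service, {}))


# Demo with a fake clock (hypothetical timings).
now = [0.0]
registry = ServiceRegistry(session_timeout=10, clock=lambda: now[0])
for url in ("url_1", "url_2", "url_3", "url_4"):
    registry.register("orderService", url)

now[0] = 8.0
for url in ("url_1", "url_2", "url_4"):    # url_3 has gone silent
    registry.heartbeat("orderService", url)

now[0] = 12.0
live_instances = registry.lookup("orderService")
print(live_instances)                       # url_3 is no longer listed
```

A real registry would expire sessions on a timer rather than lazily at lookup time; the lazy check just keeps the sketch short.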

No sooner had Xiao Liang left than Xiao Wang hurried in.

For high availability, the Batch Job is deployed on three machines, but only one copy may run at any given time!

If one of them unfortunately goes down, the remaining two must hold an election, and the winning Batch Job must "inherit the legacy" and carry on the work.

Xiao Wang is very smart, and he immediately understood what was going on.

When the current Master machine goes down, /master must be deleted so that the remaining machines can compete to create it again.

Obviously, the registry also needs to communicate with each machine to see if they are alive. 

There is another tricky case here. Suppose machine 1 has not died, but simply cannot reach the registry for a long time.

In that case, machine 1 must stop its Batch Job as soon as it loses contact with the registry, otherwise it may conflict with the others.

Once it reconnects to the registry, it learns that it is no longer the Master and quietly waits for the next opportunity.
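Here is a rough Python sketch of that election, including the "disconnected but not dead" case. Everything in it is hypothetical: `Registry`, `BatchMachine` and their methods are stand-ins invented for this example, and the registry is just an object in memory.

```python
class Registry:
    """Toy registry that holds at most one /master node."""

    def __init__(self):
        self.master = None                  # owner of /master, None if absent

    def try_create_master(self, machine):
        # Creating /master only succeeds when the node does not exist yet.
        if self.master is None:
            self.master = machine
        return self.master == machine

    def session_expired(self, machine):
        # /master disappears together with its owner's session.
        if self.master == machine:
            self.master = None


class BatchMachine:
    def __init__(self, name, registry):
        self.name = name
        self.registry = registry
        self.running = False                # is the Batch Job running here?

    def campaign(self):
        # Whoever manages to create /master runs the job.
        self.running = self.registry.try_create_master(self.name)
        return self.running

    def on_disconnected(self):
        # Cannot reach the registry: stop at once, someone else may take over.
        self.running = False

    def on_reconnected(self):
        # Back online: only keep running if /master still belongs to us.
        self.running = self.registry.master == self.name


registry = Registry()
m1, m2, m3 = (BatchMachine(f"machine_{i}", registry) for i in (1, 2, 3))

for m in (m1, m2, m3):
    m.campaign()                            # machine_1 gets there first

m1.on_disconnected()                        # network trouble, machine_1 is alive
registry.session_expired("machine_1")       # the registry gives up on it
for m in (m2, m3):
    m.campaign()                            # the remaining two compete

m1.on_reconnected()                         # machine_1 finds it lost the title
print(registry.master, [m.running for m in (m1, m2, m3)])
```

The important design point is that machine 1 stops its job when it loses contact, without waiting to be told, so that at most one Batch Job runs at a time.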

As soon as Xiao Wang left, Xiao Cai came in immediately.

Zhang Dapang thought it over and realized he had underestimated this problem. A distributed lock is not the same as a Master election, because fairness must be considered too.

System 1, which owns node process_01, holds the lock and can operate on the shared resource. When it finishes, it deletes process_01 and creates a new node at the end of the queue (now numbered process_04).

And so the cycle continues... Isn't that a distributed lock?
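The queueing behavior can be sketched like this. It is a single-process toy, assuming that the lowest-numbered node holds the lock; `LockQueue` and its methods are names invented for the example.

```python
import itertools


class LockQueue:
    """Toy fair lock: numbered nodes, lowest number holds the lock."""

    def __init__(self, prefix="process_"):
        self.prefix = prefix
        self.counter = itertools.count(1)   # sequence numbers never repeat
        self.nodes = {}                     # node name -> owning system

    def enqueue(self, system):
        name = f"{self.prefix}{next(self.counter):02d}"
        self.nodes[name] = system
        return name

    def holder(self):
        # The node with the smallest sequence number owns the lock.
        if not self.nodes:
            return None
        first = min(self.nodes, key=lambda n: int(n[len(self.prefix):]))
        return first, self.nodes[first]

    def release(self, name):
        del self.nodes[name]


queue = LockQueue()
n1 = queue.enqueue("system_1")              # process_01
n2 = queue.enqueue("system_2")              # process_02
n3 = queue.enqueue("system_3")              # process_03
first_holder = queue.holder()               # system_1 holds the lock

queue.release(n1)                           # system_1 is done...
n4 = queue.enqueue("system_1")              # ...and queues up again
second_holder = queue.holder()              # now it is system_2's turn
print(first_holder, n4, second_holder)
```

Because a system that releases the lock goes to the back of the queue, nobody can grab the lock twice in a row while others are waiting, which is the fairness a plain Master election does not give.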

Zhang Dapang decided to report to Bill, the CTO, and assemble a team to build it.

Unexpectedly, Bill hit the nail on the head and pointed out a major flaw all at once.

The registry itself must also run on multiple machines to be highly available!

The original problems of Xiao Wang, Xiao Liang, and Xiao Cai had not even been solved yet, and the registry alone was already enough to kill them. With the company's own technical strength, building such a registry was simply impossible!

ZooKeeper has these core concepts:

1. Session: Represents a connection session between a client system (such as the Batch Job) and ZooKeeper. After the Batch Job connects to ZooKeeper, it periodically sends heartbeats. If ZooKeeper does not receive a heartbeat within a certain period of time, it considers the Batch Job dead and the Session ends.

2. znode: Each node in the tree structure is called a znode. By type, znodes are divided into persistent znodes (which exist until they are explicitly deleted), ephemeral znodes (which are deleted when the Session ends) and sequential znodes (such as process_01, process_02... in Xiao Cai's distributed lock).

3. Watch: A client system (such as the Batch Job) can watch a znode. When the znode changes (it is deleted, its data is modified, and so on), the Batch Job is notified so that it can react, for example by rushing to create the node itself.
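To make the three concepts concrete, here is a small in-memory imitation in Python. It is not the ZooKeeper client API: `ToyZooKeeper`, its method names and the session labels `"s1"`/`"s2"` are invented for this sketch, a single global counter stands in for sequence numbering, and each watch fires only once.

```python
from collections import defaultdict


class ToyZooKeeper:
    """In-memory imitation of sessions, znodes and watches (not the real API)."""

    def __init__(self):
        self.znodes = {}                    # path -> (data, owning session or None)
        self.watches = defaultdict(list)    # path -> callbacks, each fired once
        self.sequence = 0                   # simplification: one global counter

    def create(self, path, data=b"", session=None, sequential=False):
        # session=None means persistent; otherwise the znode is ephemeral.
        if sequential:
            self.sequence += 1
            path = f"{path}{self.sequence:010d}"
        if path in self.znodes:
            raise KeyError(f"{path} already exists")
        self.znodes[path] = (data, session)
        return path

    def watch(self, path, callback):
        self.watches[path].append(callback)

    def delete(self, path):
        del self.znodes[path]
        for callback in self.watches.pop(path, []):
            callback(path)                  # notify the watchers

    def close_session(self, session):
        # Ephemeral znodes vanish together with the session that owns them.
        owned = [p for p, (_, owner) in self.znodes.items() if owner == session]
        for path in owned:
            self.delete(path)


zk = ToyZooKeeper()
zk.create("/config", b"v1")                           # persistent
zk.create("/master", b"machine_1", session="s1")      # ephemeral
lock_node = zk.create("/lock/process_", session="s2", sequential=True)

events = []

def on_master_gone(path):
    # A watcher reacts by rushing to create the node itself.
    events.append(path)
    zk.create("/master", b"machine_2", session="s2")

zk.watch("/master", on_master_gone)
zk.close_session("s1")                                # machine_1's session ends
print(events, zk.znodes["/master"], lock_node)
```

The persistent `/config` znode survives the session closing, while `/master` disappears and is immediately re-created by the watcher on behalf of session `"s2"`.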



Origin blog.csdn.net/coderising/article/details/114826693