A Few Architecture Design Lessons After the AWS Cable Was Dug Up (Part 2)

Official account: large sub-trotters programmer

In the last article we summarized same-city active-active, geo-distributed multi-active, and the "two regions, three centers" deployment architecture. In this article I will share my understanding of "three regions, five centers".

The last article covered the "two regions, three centers" architecture, shown below:

image.png

This architecture provides disaster recovery. If the production data center loses power, for example, you can switch all traffic to the same-city disaster recovery center or to the remote disaster recovery center. The question is: on the day the power really does go out, would you dare to switch all traffic to a disaster recovery center?

As the last article said, the main job of a disaster recovery center is to act as a backup of the production data center, so unlike the production data center it is not continuously serving traffic. That is the problem. The disaster recovery center is like a rookie who has always shadowed a big brother. Now the big brother is injured and the rookie is supposed to step in, but he has never done the job, so isn't forcing him in likely to go wrong? In other words:

  • First, you cannot guarantee that the disaster recovery center is able to take over all online user traffic. It may be overwhelmed as soon as it takes over, or fail in other ways that nobody anticipated;
  • Second, if the production data center accepted a user's write request and lost power before that write was synchronized to the disaster recovery center, user traffic cannot simply be switched over to the disaster recovery center.
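The two concerns above can be written down as a small pre-failover check. This is only a minimal sketch: the `DrCenterStatus` fields and the `failover_blockers` function are hypothetical names made up for illustration, and a real check would cover far more than capacity and replication.

```python
from dataclasses import dataclass


@dataclass
class DrCenterStatus:
    # Hypothetical fields, for illustration only.
    capacity_qps: int         # load the DR center has been verified to sustain
    unreplicated_writes: int  # writes accepted by production but not yet copied over


def failover_blockers(status: DrCenterStatus, production_peak_qps: int) -> list[str]:
    """Return the reasons why traffic should not be switched yet."""
    blockers = []
    # Concern 1: the DR center may not be able to carry all online traffic.
    if status.capacity_qps < production_peak_qps:
        blockers.append("capacity")
    # Concern 2: some writes never reached the DR center before the outage.
    if status.unreplicated_writes > 0:
        blockers.append("replication")
    return blockers


if __name__ == "__main__":
    print(failover_blockers(DrCenterStatus(5000, 3), production_peak_qps=8000))
```

An empty list means neither of the two concerns applies. It does not mean the switch is safe in general.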

So, based on the analysis above, this is not to say that the disaster recovery center can never take over, only that a lot of checking and preparation may be needed first. How long that takes is uncertain, and it will certainly not be quick.

So what can we do?

To address the two points listed above, what if the disaster recovery center were no longer used only for disaster recovery, but also served traffic normally, just like the production data center? As shown below:

image.png

From the architecture diagram above you can see:

  • There is no longer a distinction between a production data center and a disaster recovery data center. There are only data centers, which back up each other's data so that each one holds the full data set.
  • Users can read and write against any data center.

Setting aside for now whether this architecture can actually be built, its benefits are obvious:

  1. Every data center is already serving external traffic (none of them is a rookie), so when one data center goes down, switching its user traffic directly to another data center is not a problem.
  2. Users can access the nearest data center, so the user experience is better and traffic is spread fairly evenly across the whole architecture.

The advantages are clear, and it would be great if we could achieve this. The most important problem in implementing this architecture is the following: users write to different data centers at the same time and the data centers synchronize with each other in both directions, so what do we do when conflicts arise?
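To see why this is a real problem, here is a toy sketch of two replicas that each accept a write to the same key and then naively replay each other's updates. Everything in it (the dictionaries, the `write` and `apply_updates` helpers, the "balance" key) is made up for illustration and is not how any real database replicates.

```python
# Two replicas start out with the same data.
beijing = {"balance": 100}
shanghai = {"balance": 100}


def write(replica, key, value, outbox):
    """Apply a local write and queue it for the other replica."""
    replica[key] = value
    outbox.append((key, value))


def apply_updates(replica, updates):
    """Naively replay the other replica's writes, last writer wins locally."""
    for key, value in updates:
        replica[key] = value


beijing_outbox, shanghai_outbox = [], []

# The same record is written in both data centers before any sync happens.
write(beijing, "balance", 80, beijing_outbox)
write(shanghai, "balance", 150, shanghai_outbox)

# Two-way synchronization: each side replays what the other side wrote.
apply_updates(beijing, shanghai_outbox)
apply_updates(shanghai, beijing_outbox)

print(beijing, shanghai)
```

After the exchange the two replicas have simply swapped values and still disagree, which is exactly the conflict the question above is asking about.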

The solution used by Alibaba and Ant Financial is this: group users according to some rule, and allow each group's writes to go only to a designated data center. This effectively binds users to data centers, so two-way synchronization between data centers does not produce conflicts: writes are partitioned by user group, and different users' data does not collide.

The idea is simple, although implementing it is certainly a lot of work. Still, it is feasible, as Alibaba has demonstrated. Let's first refine the architecture above a little:
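As a minimal sketch of such a grouping rule, the example below assigns each uid to exactly one data center for writes. The modulo rule and the data center names are assumptions chosen for illustration. The article only says "certain rules" and does not describe the rule Alibaba actually uses.

```python
# Hypothetical data center names, for illustration only.
DATA_CENTERS = ["beijing-1", "beijing-2", "shanghai-1"]


def home_data_center(uid: int) -> str:
    """Map a user to the single data center allowed to accept their writes.

    Assumption: a plain modulo rule. Any deterministic rule would do, as long
    as every data center evaluates it the same way.
    """
    return DATA_CENTERS[uid % len(DATA_CENTERS)]


def write_allowed(uid: int, data_center: str) -> bool:
    """A data center accepts a write only for users bound to it."""
    return home_data_center(uid) == data_center


if __name__ == "__main__":
    for uid in (6, 7, 8):
        print(uid, home_data_center(uid))
```

Because every uid has exactly one writable data center, two data centers never accept concurrent writes for the same user, so there is nothing for two-way synchronization to conflict on.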

image.png

When a user visits the site, the carrier network or CDN selects the load balancer deployed in the nearest room (the "data center" mentioned earlier), and that load balancer makes the final decision about which room the user belongs to. So the following can happen: a user registered in Beijing, and their uid was bound to a room in Beijing. When that user is later in Shanghai, the Shanghai load balancer follows the user grouping rule and forwards the request to the Beijing room the user is bound to. (Not all requests, of course. Read requests, for example, can be served directly from the Shanghai room. It depends on how the specific business is implemented and how the load balancer is configured. This is just the most straightforward way to describe the idea.)

So now we can handle the following cases:
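The routing decision described above can be sketched roughly as follows. The binding table, the room names, and the `route` function are all hypothetical, and the rule that every read is served locally is the simplified assumption from the parenthetical above, not a statement about any real load balancer.

```python
# Hypothetical uid -> room binding, recorded when the user registered.
USER_HOME_ROOM = {
    1001: "beijing-1",   # registered in Beijing
    1002: "shanghai-1",  # registered in Shanghai
}


def route(uid: int, is_write: bool, nearest_room: str) -> str:
    """Decide which room should handle a request arriving at `nearest_room`."""
    if not is_write:
        # Simplifying assumption: reads can be served by the nearest room.
        return nearest_room
    # Writes always go to the room the user is bound to.
    return USER_HOME_ROOM[uid]


if __name__ == "__main__":
    # A Beijing-registered user travelling in Shanghai.
    print(route(1001, is_write=True, nearest_room="shanghai-1"))
    print(route(1001, is_write=False, nearest_room="shanghai-1"))
```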

  • If the application or database machines in Beijing room 1 lose power, we can adjust the load balancer so that the traffic of users who belong to that room is moved to room 2. There is no need to worry here: user grouping exists to stop a user from writing to two databases at the same time and causing a conflict. Since room 1 can no longer accept writes, migrating its traffic to room 2 is not a problem.
  • If the whole of Beijing room 1 loses power, traffic can be migrated to Beijing room 2 through the carrier network or CDN.
  • If all of Beijing loses power, traffic can likewise be migrated to Shanghai.
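The three cases above all amount to re-pointing a user group at the next healthy room. Below is a minimal sketch with a made-up failover order and hypothetical room names. It deliberately ignores the question of data that had not finished synchronizing when the room went down.

```python
# Hypothetical failover order: try the same city first, then another city.
FAILOVER_ORDER = {
    "beijing-1": ["beijing-2", "shanghai-1"],
    "beijing-2": ["beijing-1", "shanghai-1"],
    "shanghai-1": ["beijing-1", "beijing-2"],
}


def serving_room(home_room: str, down_rooms: set[str]) -> str:
    """Return the room that should serve users bound to `home_room`."""
    if home_room not in down_rooms:
        return home_room
    # The home room cannot accept writes any more, so handing its whole user
    # group to one other room does not create concurrent writers.
    for candidate in FAILOVER_ORDER[home_room]:
        if candidate not in down_rooms:
            return candidate
    raise RuntimeError("no healthy room available")


if __name__ == "__main__":
    print(serving_room("beijing-1", {"beijing-1"}))               # room 1 is down
    print(serving_room("beijing-1", {"beijing-1", "beijing-2"}))  # all of Beijing is down
```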

The most important part of this architecture is user grouping. Applications, databases, load balancers, database tables and so on all need to be partitioned by user group. We want every operation of a request from the same user to stay within one room rather than cross rooms, because that is fastest. Such a self-contained group is a unit.

So the architecture above is really an advanced version of "two regions, three centers". This unit-based architecture can also be extended as far as you like (if you have enough money, because building a fully equipped data center is expensive). For example, add one more data center in Shanghai and one in Hangzhou, as in the following figure:

image.png

This is called "three regions, five centers".

There are not many articles out there about three regions, five centers, so I hope this one helps. I drew on other articles as well as my own understanding, so if anything here is wrong, corrections are welcome.

I believe nobody enjoys reading big chunks of code on a small phone screen, so my writing leans toward plain text. If you got something out of this article, please give it a like.


Origin juejin.im/post/5cf627e55188254c5726a725