[Excerpt] The evolution of large website architecture and the knowledge it requires

Architecture evolution, step one: physically separating the webserver and the database

In the beginning, someone has an idea and puts a website on the Internet; perhaps the hosting is even rented. Since this article focuses on the history of architectural evolution, let us assume the site already has a host of its own and a certain amount of bandwidth. Because the site has something distinctive about it, it attracts visitors, and gradually you find the load on the system climbing and responses getting slower and slower. At this stage the database and the application visibly affect each other: when the application has a problem, the database is prone to problems too, and when the database goes wrong, the application suffers as well. So you enter the first stage of evolution: physically separating the application and the database onto two machines. Technically this demands nothing new, but you find that it works; the system regains its former response speed, supports higher traffic, and the database and the application no longer drag each other down.

Diagram of the system after this step:

Knowledge this step involves: essentially none; this is the most basic step of architectural evolution and has no particular technical requirements.

Architecture evolution, step two: adding a page cache

The good times do not last long. As more and more people visit, you find response times starting to slow down again. Investigation shows that too many operations are accessing the database, so competition for database connections is fierce and responses are slow; yet you cannot simply open more connections, or the load on the database machine would climb too high. So you consider a caching mechanism to reduce the competition for database connection resources and the read pressure on the database. The first choice is usually squid or a similar mechanism, used to cache the relatively static pages of the system (for example, pages updated only once every day or two); alternatively, such pages can be rendered to static files by the program. Without modifying the application, this nicely reduces the load on the webserver and the competition for database connection resources. So you start using squid to cache the relatively static pages.

Diagram of the system after this step:

Knowledge this step involves:








Front-end page caching technology such as squid; to use it well, you need a deep understanding of squid's implementation, of cache-invalidation algorithms, and so on.
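The core idea of this step can be sketched in a few lines: pages that only change every day or two are served from a cache until their entry expires, so the backend (and database) is hit only on a miss. This is a minimal, hypothetical TTL cache sketch, not squid's actual mechanism or API:

```python
import time

class PageCache:
    """Minimal TTL page cache, illustrating the idea behind caching
    relatively static pages (a sketch, not how squid works internally)."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}  # url -> (rendered_page, expiry_timestamp)

    def get(self, url, render):
        """Return the cached page for `url`, re-rendering only when
        the entry is missing or expired."""
        entry = self._store.get(url)
        now = time.time()
        if entry is not None and entry[1] > now:
            return entry[0]          # cache hit: no backend work at all
        page = render(url)           # cache miss: hit the backend once
        self._store[url] = (page, now + self.ttl)
        return page

calls = []
def render(url):
    calls.append(url)                # stands in for an expensive DB query
    return f"<html>{url}</html>"

cache = PageCache(ttl_seconds=60)
cache.get("/about", render)
cache.get("/about", render)          # second request served from memory
print(len(calls))                    # -> 1
```

Real page caches add invalidation, memory limits, and HTTP-header awareness; the point here is only that repeated reads stop reaching the database.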
Architecture evolution, step three: adding page fragment caching

After the squid cache is in place, the overall speed of the system does improve and the pressure on the webserver begins to drop. But as traffic grows, the system starts to slow down somewhat again. Having tasted the benefits of caching dynamic content with squid, you begin to wonder whether the relatively static parts of dynamic pages could be cached as well, so you consider a page-fragment caching strategy such as ESI, and start using ESI to cache the relatively static fragments of dynamic pages.
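The idea of fragment caching can be illustrated with a toy assembler: static fragments come from a cache, and only the genuinely dynamic includes are rendered per request. This is a hypothetical sketch of the ESI pattern (the fragment paths and tag handling here are invented, not the full ESI language):

```python
import re

# Cached, relatively static fragments (e.g. header/footer updated rarely).
fragment_cache = {
    "/fragments/header": "<div>Site Header</div>",
    "/fragments/footer": "<div>Site Footer</div>",
}

def render_dynamic(src):
    """Stand-in for rendering a truly dynamic fragment per request."""
    return f"<div>dynamic:{src}</div>"

def assemble(template):
    """Replace each <esi:include src="..."/> with a cached fragment,
    falling back to a dynamic render on a cache miss."""
    def sub(match):
        src = match.group(1)
        return fragment_cache.get(src) or render_dynamic(src)
    return re.sub(r'<esi:include src="([^"]+)"/>', sub, template)

page = assemble('<esi:include src="/fragments/header"/>'
                '<esi:include src="/user/cart"/>'
                '<esi:include src="/fragments/footer"/>')
print(page)
```

Only the cart fragment is rendered per request; the header and footer are served from the fragment cache, which is exactly the saving this step is after.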
Diagram of the system after this step:

Knowledge this step involves: page-fragment caching technology such as ESI; to use it well, you also need to master how ESI and the like are implemented.

Architecture evolution, step four: data caching

After improving system performance once more with technologies like ESI, the pressure on the system is indeed further reduced. But again, as traffic grows, the system begins to slow down. Through investigation you may find that there are places in the system that fetch the same data repeatedly, such as user profile information, and you start wondering whether this data could be cached as well. So you cache it in local memory. Once the change is complete, response times fully meet expectations, the system recovers, and pressure on the database drops considerably again.

Diagram of the system after this step:

Knowledge this step involves: caching technology, including map-like data structures, caching algorithms, and the implementation mechanisms of the framework you select.

Architecture evolution, step five: adding a webserver










This does not last long either. As traffic grows once more, the load on the webserver machine climbs quite high at peak times. You start thinking about adding a webserver, which would also solve an availability problem: with a single webserver, the site is unusable whenever that machine goes down. After weighing these considerations you decide to add a webserver, and in doing so you run into some problems, typically:
1. How to distribute incoming requests across the two machines. What is usually considered at this point is Apache's own load-balancing scheme, or a software load-balancing solution such as LVS.
2. How to keep state such as user session information synchronized. The schemes considered here include writing it to the database, writing it to shared memory, cookies, or a session-synchronization mechanism.
3. How to keep cached data synchronized, such as the previously cached user data. The mechanisms usually considered are cache synchronization or a distributed cache.
4. How to keep features such as file upload working normally. The mechanism generally considered here is a shared file system or shared storage.
After solving these problems, the number of webservers is finally increased to two, and the system recovers its former speed.
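One common answer to problems 1 and 2 together is "sticky" distribution: hash something stable (here a session id, purely as an illustration) so the same user always lands on the same webserver, which reduces how much session state needs synchronizing. This is a minimal sketch, not any particular balancer's algorithm:

```python
import hashlib

servers = ["web1", "web2", "web3"]

def pick_server(session_id):
    """Hash the session id so the same session always maps to the same
    webserver. Plain round-robin would spread requests more evenly but
    then every server needs access to every session's state."""
    digest = hashlib.md5(session_id.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

# The mapping is deterministic: repeated requests from one session
# always reach the same machine.
assignments = {sid: pick_server(sid) for sid in ("s1", "s2", "s3", "s4")}
print(assignments)
```

The trade-off is failover: if a server dies, its sessions are lost unless they are also stored somewhere shared, which is why the text lists database, cookie, and synchronization schemes as alternatives.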
Diagram of the system after this step:

Knowledge this step involves: load-balancing technology (including but not limited to hardware load balancing, software load balancing, load-balancing algorithms, Linux forwarding protocols, and the implementation details of the chosen technology), failover techniques (including but not limited to ARP spoofing, Linux heartbeat, etc.), state and cache synchronization techniques (including but not limited to cookies, implementation details of the UDP protocol, state broadcasting, and the chosen cache-synchronization technology), file-sharing technology (including but not limited to NFS), and storage technology (including but not limited to storage devices).

Architecture evolution, step six: splitting the database




After enjoying a period of rapid traffic growth, you find the system beginning to slow down again. What is going on this time? After some searching, you discover that competition for the connections handling database writes and updates is very intense, which is slowing the whole system down. What to do? The options at this point are a database cluster or a database-splitting strategy. In some respects cluster support is not very good, so splitting the database becomes the more common strategy. Splitting the database means modifying the original program, but once the modification is made and the split achieved, the goal is reached: the system recovers, and is even faster than before.
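Splitting by business means each business area gets its own database, and a thin routing layer picks the right connection. A hypothetical sketch (the business names and connection strings are invented for illustration):

```python
# One database per business area ("vertical" split). In a real system
# these values would be connection pools, not strings.
DATABASES = {
    "user":  "db_user_connection",
    "order": "db_order_connection",
    "forum": "db_forum_connection",
}

def connection_for(business):
    """Return the database connection for a business area, failing
    loudly when a business has no configured database."""
    try:
        return DATABASES[business]
    except KeyError:
        raise ValueError(f"no database configured for business '{business}'")

print(connection_for("order"))  # -> db_order_connection
```

This is why the text says the step mostly requires a reasonable division of the business: the routing itself is trivial once the boundaries are drawn.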
Diagram of the system after this step:

Knowledge this step involves: this step mainly requires a reasonable division of the business in order to split the databases; there are no other specific technical requirements. At the same time, as data volumes grow and the databases multiply, database design, tuning, and maintenance must be done even better, so the demands on those skills remain high.

Architecture evolution, step seven: table splitting, the DAL, and distributed caching

As the system keeps running, data volumes begin to grow significantly, and you find that queries can still be somewhat slow even after the database split. So, following the database split, you begin the work of splitting tables as well, which of course inevitably requires some modification of the program. At this point you may find that making the application itself care about the sub-database and sub-table rules is getting a little complicated, which gives rise to the idea of a generic framework that implements the data access for the split databases and tables; in eBay's architecture the corresponding layer is the DAL. This part of the evolution takes a relatively long time, and it is also possible that the generic framework is only started once the table splitting is finished. At this stage you may also discover problems with the earlier cache-synchronization scheme: the volume of data is now so large that it is no longer feasible to keep the cache local and synchronize it, so a distributed caching scheme is needed. After a round of investigation and torment, you finally move the large volume of cached data onto a distributed cache.

Diagram of the system after this step:

Knowledge this step involves:









Table splitting, like database splitting, is mostly a matter of business division; technically it involves dynamic hashing algorithms and consistent hashing.
The DAL involves more complex techniques, such as database connection management (timeouts, exceptions), database operation control (timeouts, exceptions), and the encapsulation of the sub-database and sub-table rules.
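The consistent hashing mentioned above can be sketched as a hash ring with virtual nodes: each shard (table or cache server) is hashed onto the ring many times, and a key goes to the first shard clockwise from the key's own hash. This is a minimal illustrative implementation, with invented shard names:

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Minimal consistent-hash ring with virtual nodes. Adding or
    removing a shard only remaps the keys nearest to it on the ring,
    instead of reshuffling everything as `hash(key) % n` would."""

    def __init__(self, nodes, replicas=100):
        self.replicas = replicas
        self._ring = []            # sorted list of (hash, node)
        for node in nodes:
            self.add(node)

    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add(self, node):
        # Each physical node appears `replicas` times as virtual nodes.
        for i in range(self.replicas):
            self._ring.append((self._hash(f"{node}#{i}"), node))
        self._ring.sort()

    def node_for(self, key):
        """First node clockwise from the key's position on the ring."""
        h = self._hash(key)
        hashes = [point for point, _ in self._ring]
        idx = bisect.bisect(hashes, h) % len(self._ring)
        return self._ring[idx][1]

ring = ConsistentHashRing(["table_0", "table_1", "table_2"])
# Every row key maps deterministically to exactly one shard.
print(ring.node_for("user:1001"))
```

A DAL would wrap exactly this kind of routing behind the ordinary data-access API, so application code never sees the sub-table rules.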
Architecture evolution, step eight: adding more webservers

After the database and table splitting is done, the pressure on the database has dropped to a fairly low level, and you go back to watching the traffic surge and living a happy life every day. Then suddenly, one day, you find access to the system beginning to slow down again. This time you check the database first: its load is completely normal. Then you check the webservers and find that Apache is holding back a large number of queued requests, while the application server handles each request quite quickly. It seems the number of requests is simply too high, so requests have to wait and response times slow down. This is easy to handle; generally speaking, there is some money by this point, so you add some webservers. In the process of adding them, you may encounter a few challenges:
1. Software load balancers such as Apache or LVS can no longer schedule the enormous web traffic (connection counts, network traffic, and so on). If funding allows, the plan is to purchase hardware load balancers such as F5, Netscaler, or Athelon; if funding does not allow it, the plan is to classify the application logic and spread the classes across different software-load-balanced clusters.
2. Some of the original schemes, such as state synchronization and file sharing, may hit bottlenecks and need improvement; perhaps at this point you write a distributed file system tailored to the website's business needs.
Once this work is done, a seemingly endless era of perfection begins: whenever site traffic grows, the answer is simply to keep adding webservers.
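The "classify the application logic" fallback in challenge 1 can be pictured as routing requests to different load-balanced clusters by URL class. A hypothetical sketch (cluster names and path prefixes are invented):

```python
# Each class of request gets its own software-load-balanced cluster,
# so no single LVS/Apache balancer has to carry all of the traffic.
CLUSTERS = [
    ("/forum",  "forum-lvs-cluster"),
    ("/search", "search-lvs-cluster"),
    ("/",       "default-lvs-cluster"),   # catch-all, must come last
]

def cluster_for(path):
    """Return the cluster whose prefix first matches the request path."""
    for prefix, cluster in CLUSTERS:
        if path.startswith(prefix):
            return cluster
    return "default-lvs-cluster"

print(cluster_for("/forum/topic/42"))   # -> forum-lvs-cluster
print(cluster_for("/news/today"))       # -> default-lvs-cluster
```

In practice this classification is usually done in DNS or at an upstream proxy, but the principle is the same: split one overloaded balancer into several by request class.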
Diagram of the system after this step:

Knowledge this step involves:


At this step, as the number of machines keeps growing, as data volumes keep growing, and as the availability requirements on the system rise ever higher, you need a deeper understanding of the technologies in use, and you need to build more customized products based on the website's own needs.
Architecture evolution, step nine: read/write splitting and cheap storage solutions
Suddenly one day you find that the perfect era is over and the nightmare of the database looms again: so many webservers have been added that database connection resources are once more insufficient, even though the databases and tables are already split. You begin analyzing the load on the database and may find that its read/write ratio is very high. The scheme that usually comes to mind at this point is read/write splitting, which is of course not easy to implement. In addition, you may find that some data is wasting database storage, or occupying too many database resources. So the architecture at this stage may evolve toward read/write splitting, together with moving some of the data onto cheaper storage solutions, such as BigTable-like systems.
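The core routing rule of read/write splitting is simple even though the replication behind it is not: writes go to the primary, reads are spread across replicas. A minimal sketch with invented server names (real splitters must also handle replication lag and transactions):

```python
import random

class ReadWriteSplitter:
    """Illustrative read/write separation: writes go to the primary,
    reads are distributed across the replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = replicas

    def route(self, sql):
        if sql.lstrip().lower().startswith("select"):
            return random.choice(self.replicas)  # read: any replica
        return self.primary                      # write: primary only

db = ReadWriteSplitter("primary", ["replica1", "replica2"])
print(db.route("UPDATE users SET name='a'"))  # -> primary
print(db.route("SELECT * FROM users"))        # replica1 or replica2
```

This is why the text stresses mastering replication and standby strategies: the splitter is only correct if the replicas actually stay in sync with the primary.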
Diagram of the system after this step:

Knowledge this step involves: read/write splitting requires a deep grasp and understanding of database replication and standby strategies, and may require implementing parts of it yourself; a cheap storage solution requires a deep grasp and understanding of how the OS stores files, and a deep grasp of how the implementation language handles files.

Architecture evolution, step ten: the era of large-scale distributed applications and the dream era of cheap server farms





After the long and painful process above, you finally usher in the perfect era once again: adding webservers keeps supporting ever-higher traffic. For a large site, popularity is undoubtedly what matters, and as popularity grows, demand for all kinds of features begins to grow explosively. At this point you suddenly discover that the web application deployed on the webservers has become very large, and with multiple teams all making changes to it, things are really quite inconvenient: reusability is poor, and every team has done more or less duplicated work. Deployment and maintenance are also quite troublesome, because copying a huge package to N machines and starting it takes a lot of time, and when something goes wrong it is hard to investigate. Even worse, a bug in one part of the application can make the whole site unavailable. There are other factors too, such as the impossibility of targeted tuning (since every application is deployed on the same machines, nothing can be tuned in isolation). Based on this analysis, you make up your mind and begin splitting the system by responsibility, and a large-scale distributed application is born. In general this step takes quite a long time, because it runs into many challenges:
1. After the split into a distributed system, a high-performance, stable communication framework is needed, and it must support several different communication styles and remote invocation;
2. Splitting a huge application takes a long time and requires organizing the business and controlling the dependency relationships among the systems;
3. How to operate and maintain (dependency management, health management, error tracking, tuning, monitoring, and alerting) this huge distributed application well.
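The shape of the split can be pictured with a toy in-process service registry: each team owns a service registered under a name, and callers invoke it by name rather than by linking against its code. A real system would use an RPC framework over the network; everything here (names, handlers) is invented for illustration:

```python
class ServiceRegistry:
    """Toy stand-in for the communication framework of challenge 1:
    services are registered by name and invoked without the caller
    knowing anything about their implementation or deployment."""

    def __init__(self):
        self._services = {}

    def register(self, name, handler):
        self._services[name] = handler

    def invoke(self, name, *args):
        if name not in self._services:
            raise LookupError(f"service '{name}' not registered")
        return self._services[name](*args)

registry = ServiceRegistry()
# Each team registers and independently deploys its own service.
registry.register("user.get", lambda uid: {"id": uid, "name": "demo"})
registry.register("order.count", lambda uid: 3)

print(registry.invoke("user.get", "u7"))
```

The hard parts the text lists (timeouts, serialization, dependency and health management) live precisely in the gap between this toy and a real networked framework.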
After this step, the system architecture enters a relatively stable stage, and it can begin using large numbers of cheap machines to support the enormous traffic and data volumes. From here on, drawing on the experience gained through this evolution, the system can use a variety of other approaches to support ever-growing traffic.
Diagram of the system after this step:

Knowledge this step involves:


This step involves a great deal of knowledge. It requires a deep understanding and mastery of communication, remote invocation, messaging, and so on, both in theory and at the level of hardware, the operating system, and the implementation in the languages used.
It also involves a great deal of operations and maintenance knowledge; in most cases you need to master distributed parallel computing, reporting, monitoring technology, rule and policy systems, and so on.
To be honest, it is not very hard to tell this story: the evolution of a classic large website architecture is broadly similar to the above, although the solution chosen at each step, and the steps of the evolution themselves, may well differ. In addition, because every site's business is different, different specialized technology will be demanded. This post explains the evolution mostly from an architectural point of view, and many techniques are of course not mentioned here, such as database clustering, data mining, and search. In a real evolution, a site will also support greater traffic by upgrading hardware configurations, the network environment, and the operating system, and by means of CDN mirroring, so a real development process will differ in many ways. Furthermore, what a large site has to do goes far beyond the above: there is also security, operations and maintenance, business operations, customer service, storage, and so on. Building a large site is really not easy. This article is written more in the hope of prompting more introductions to the architectural evolution of large sites. :)

Reproduced from: https://www.cnblogs.com/iksharp/archive/2009/03/04/1402893.html


Origin blog.csdn.net/weixin_34198881/article/details/93730366