Core principles of large-scale website architecture patterns, with a Sina Weibo case analysis

What is a pattern? Each pattern describes the core of a solution to a problem that occurs again and again in our environment, so that the solution can be used over and over without having to reinvent it.

Simply copying Internet products may not lead to success; it is innovative products that create value for customers. But website architecture does have some common patterns that have been verified again and again by many large sites. By studying these patterns, we can grasp the general ideas and architectural solutions of large websites and use them to guide our own architecture design.

1. Website architecture patterns

To address the challenges that large websites face, such as high-concurrency access, massive data processing, and highly reliable operation, large Internet companies have developed a number of solutions in practice, aiming at technical architecture goals such as high performance, high availability, easy scaling, extensibility, and security. These solutions have been reused by many sites and have gradually formed the architecture patterns of large websites.

1.1 Layering

Layering is the most common architectural pattern in enterprise application systems: the system is sliced along the horizontal dimension into several parts, each responsible for a relatively single set of responsibilities, and the parts form a complete system through the upper layer depending on and calling the lower layer.
Layered structures are everywhere in the computer world: the seven-layer OSI network communication protocol is a layered structure, and computer hardware, operating systems, and application software can also be viewed as layers. Large website architecture also adopts layering, dividing the website software system into an application layer, a service layer, and a data layer, as shown below.

  • Application layer: responsible for the specific business and view presentation, such as the home page and the display of search input and results.
  • Service layer: provides supporting services for the application layer, such as the user management service and the shopping cart service.
  • Data layer: provides data storage and access services, such as databases, caches, files, and search engines.

Layering makes it easier to divide a large software system into different parts, which facilitates division of labor in development and maintenance. The layers also have a degree of independence: as long as the calling interfaces remain unchanged, each layer can evolve independently to solve its own problems without requiring corresponding changes in the other layers.
But a layered architecture also brings challenges. The boundaries and interfaces between layers must be planned reasonably, and during development the constraints of the layered architecture must be strictly observed: cross-layer calls (the application layer calling the data layer directly) and reverse calls (the data layer calling the service layer, or the service layer calling the application layer) are prohibited. In practice, a large layer can itself be layered further; for example, the application layer can be subdivided into a view layer (the responsibility of designers) and a business logic layer (the responsibility of engineers), and the service layer can be subdivided into a data interface layer (adapting the data formats of various inputs and outputs) and a logic layer.
Layering is logical; physically, the three layers can all be deployed on the same machine. But as the website's business grows, it inevitably becomes necessary to deploy the already-layered modules separately, that is, to deploy the three-tier architecture on different servers so that the site has more computing resources to cope with ever more user access. Although the original purpose of the layered architecture pattern was a clear, logical structure that eases development and maintenance, in the evolution of a website the layered structure is essential for supporting the move toward a high-concurrency, distributed architecture. Therefore, even while a website is still small, we should adopt a layered architecture so that we can cope better when the site grows large.
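To make the separation concrete, here is a minimal sketch of the three layers in Python. The class and method names are hypothetical; the only rule it illustrates is that each layer calls only the layer directly below it through a stable interface.

```python
class DataLayer:
    """Data layer: storage and access services (an in-memory stand-in for a database)."""
    def __init__(self):
        self._users = {}

    def save_user(self, user_id, profile):
        self._users[user_id] = profile

    def load_user(self, user_id):
        return self._users.get(user_id)


class UserService:
    """Service layer: business services built on top of the data layer."""
    def __init__(self, data_layer):
        self._data = data_layer

    def register(self, user_id, name):
        self._data.save_user(user_id, {"name": name})

    def get_profile(self, user_id):
        return self._data.load_user(user_id)


class ApplicationLayer:
    """Application layer: views and concrete business; calls only the service layer."""
    def __init__(self, user_service):
        self._users = user_service

    def show_home_page(self, user_id):
        profile = self._users.get_profile(user_id)
        return f"Welcome, {profile['name']}" if profile else "Welcome, guest"


app = ApplicationLayer(UserService(DataLayer()))
```

Because the application layer never touches `DataLayer` directly, the storage implementation can change without the view code being adjusted, which is exactly the independence the pattern aims for.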

1.2 Segmentation

If layering slices the software horizontally, then segmentation slices it vertically.
The larger the website, the more complex its functions and the more kinds of services and data it must handle. Separating these different functions and services and packaging them into high-cohesion, low-coupling modules helps software development and maintenance on the one hand, and on the other hand makes it easy to deploy different modules in a distributed way, improving the site's concurrent processing capacity and functional scalability.

The granularity of segmentation on a large site can be very fine. For example, at the application layer, different businesses are split apart: shopping, forums, search, and advertising become separate applications, handled by independent teams and deployed on different servers. Within a single application, if it is large and the business is complex, it can be divided further; a shopping service, for instance, can be split into ticketing, hotel, 3C, and general merchandise businesses at a finer granularity. Even at that granularity it can be split again into home page, search, detail page, and other modules, which can be independent both logically and in physical deployment. Likewise, the service layer can be divided into appropriately sized modules as needed.

1.3 Distributed

For a large website, a primary purpose of layering and segmentation is to make the divided modules easier to deploy in a distributed way, that is, to deploy different modules on different servers that cooperate through remote calls. Distribution means more computers can be used to perform the same function: the more computers, the more CPU, memory, and storage resources, the greater the volume of concurrent access and data that can be handled, and thus the more users that can be served.
But while solving the high-concurrency problem, distribution also brings other problems. First, distribution means services must be called over the network, which can seriously affect performance. Second, the more servers there are, the higher the probability that some server is down; a service made unavailable by a downed server can make many applications inaccessible and reduce the site's availability. In addition, keeping data consistent in a distributed environment is very difficult, and distributed transactions are hard to guarantee, which can seriously affect the correctness of the site's business and business processes. The complex dependencies of a distributed site also make development, management, and maintenance harder. So distribution should be designed according to the specific situation and capability, and should never be done merely for the sake of being distributed.
In web applications, the commonly used distributed approaches are the following.

  • Distributed applications and services: deploying the layered and segmented application and service modules in a distributed way not only improves the site's performance and concurrency, accelerates development and release, and reduces database connection resource consumption; it also lets different applications reuse common services, making it easier to extend business functions.

  • Distributed static resources: the site's static resources such as JS, CSS, and logo images are deployed independently and given separate domain names, which is commonly called dynamic-static separation. Distributed deployment of static resources reduces the load on application servers; separate domain names speed up concurrent loading in the browser; and it helps the division of labor in website development and maintenance, letting teams with different technical specialties take responsibility for the user experience.

  • Distributed data storage: large websites need to process massive amounts of data, measured in petabytes, and a single computer cannot provide that much storage space, so the data must be stored in a distributed way. Besides distributed deployment of traditional relational databases, the NoSQL products born for website applications are almost all distributed.

  • Distributed computing: strictly speaking, applications, services, and real-time data processing are all computing. Besides handling these online businesses, a website also has a large amount of background processing that users do not perceive directly, including building search engine indexes and data warehouse analysis and statistics. The scale of these computations is very large; websites currently use distributed computing frameworks such as Hadoop MapReduce for this kind of batch computing. Their characteristic is moving the computation rather than moving the data: the computing program is distributed to where the data lives, which speeds up the computation.

In addition, there are distributed configuration services that support online, real-time updates of server configuration, distributed locks that coordinate concurrency in a distributed environment, and distributed file systems that support cloud storage.
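As a small illustration of modules cooperating through remote calls, here is a minimal sketch using only Python's standard library. The service URL, response format, and error handling are assumptions for illustration, not part of the original text.

```python
import json
import urllib.request

# Hypothetical address of a separately deployed user service; in a real
# deployment this would come from service discovery or configuration.
USER_SERVICE_URL = "http://user-service.internal:8080/users/{user_id}"

def get_user_profile(user_id, timeout=2.0):
    """Call the remote user service over HTTP and decode its JSON response."""
    url = USER_SERVICE_URL.format(user_id=user_id)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.loads(resp.read())
    except OSError:
        # Network failures are a normal part of a distributed system; the caller
        # must decide whether to retry, degrade gracefully, or fail.
        return None
```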

1.4 Cluster

Even though layering and segmentation allow modules to be deployed independently in a distributed way, modules that users access in a concentrated fashion (such as the website home page) still need to be clustered: multiple servers deploy the same application to form a cluster, which provides service externally through a common load-balancing device.
Because a cluster has multiple servers providing the same service, it offers better concurrency; when more users visit, new machines simply need to be added to the cluster. And because an application is served by several servers, when one server fails, the load-balancing device or the failover mechanism forwards requests to other servers in the cluster, so the failure does not affect users. Therefore, in web applications, even applications and services with small access volumes should deploy at least two servers to form a small cluster, with the purpose of improving system availability.
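A minimal sketch of the round-robin load-balancing and failover idea, assuming a hypothetical `call_server` function that raises `ConnectionError` when a server is down:

```python
import itertools

class ClusterClient:
    """Rotate requests over the servers in a cluster, skipping failed ones."""
    def __init__(self, servers):
        self._servers = list(servers)
        self._cycle = itertools.cycle(self._servers)

    def request(self, payload, call_server):
        # Try each server at most once per request; a failed server is simply
        # skipped, so a single outage is invisible to the caller.
        for _ in range(len(self._servers)):
            server = next(self._cycle)
            try:
                return call_server(server, payload)
            except ConnectionError:
                continue
        raise RuntimeError("all servers in the cluster are unavailable")

# Usage: cluster = ClusterClient(["10.0.0.1:8080", "10.0.0.2:8080"])
```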

1.5 Cache

Caching means storing data at the location closest to where it is used, to speed up processing. Caching is the first means of improving software performance; an important reason modern CPUs are faster is the use of more cache, and in complex software design caching is almost everywhere. Large website architecture uses caching in many places.

  • CDN: a content delivery network is deployed at the network providers closest to end users, which is where user requests always arrive first. Caching some of the site's static resources (data that changes little) there lets them be returned to the user from the nearest location at the fastest speed; for example, video sites and portals cache the hot content that users access in large volumes on a CDN.

  • Reverse proxy: the reverse proxy belongs to the front-end part of the site architecture and is deployed at the front of the site. When a user request reaches the site's data center, the first server it hits is the reverse proxy; if the requested static resource is cached there, it can be returned to the user directly without forwarding the request on to the application servers.

  • Local cache: hot data is cached locally on the application server, so the application can read it directly from local memory without accessing the database.

  • Distributed cache: a large site has huge amounts of data, and even if only a small portion is cached, the memory required exceeds what a single machine can provide. So in addition to the local cache there is a distributed cache: data is cached in a dedicated distributed cache cluster, and applications access it via network communication.

There are two prerequisites for using a cache. First, data access hotspots are uneven: some data is accessed far more frequently, and that data should go in the cache. Second, the data must remain valid for a certain period rather than expiring quickly; otherwise the cached data becomes stale and produces dirty reads, affecting the correctness of results. In web applications, caching not only speeds up data access but also reduces the load on back-end applications and data storage, which is essential for the site's database; website databases are almost always designed on the premise that a cache carries most of the load.
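A minimal cache-aside sketch with an expiry time, assuming a hypothetical `load_from_db` function; a real site would use a local or distributed cache product rather than a dictionary, but the read path is the same.

```python
import time

class TTLCache:
    """Tiny local cache: entries expire after ttl_seconds (cache-aside pattern)."""
    def __init__(self, ttl_seconds=60):
        self._ttl = ttl_seconds
        self._store = {}            # key -> (expires_at, value)

    def get(self, key, load_from_db):
        entry = self._store.get(key)
        if entry and entry[0] > time.time():
            return entry[1]                      # cache hit
        value = load_from_db(key)                # cache miss: fall through to the database
        self._store[key] = (time.time() + self._ttl, value)
        return value

# Usage: cache = TTLCache(ttl_seconds=300)
#        profile = cache.get("user:42", load_from_db=fetch_user_from_mysql)
```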

1.6 Asynchronous processing

An important goal and driving force in the development of computer software is reducing coupling. The less direct the relationship between things, the less they affect each other and the more independently they can evolve. In large website architecture, besides the layering, segmentation, and distribution mentioned above, another important means of decoupling the system is asynchronous processing: message passing between services is no longer a synchronous call; instead, a business operation is split into multiple stages, and the stages collaborate asynchronously by sharing data.
Within a single server, asynchrony can be implemented with multiple threads and a shared in-memory queue: the thread handling the earlier part of the operation writes its output to the queue, and the thread handling the later part reads the data from the queue and processes it. In a distributed system, multiple server clusters achieve asynchrony through a distributed message queue, which can be seen as the distributed deployment of an in-memory queue.
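A minimal in-process producer-consumer sketch using Python's thread-safe `queue.Queue`; a distributed message queue plays the same role between servers.

```python
import queue
import threading

message_queue = queue.Queue()        # shared in-memory queue

def producer():
    # Earlier stage: accept requests, write them to the queue, return at once.
    for i in range(5):
        message_queue.put({"order_id": i})

def consumer():
    # Later stage: pull messages off the queue and process them asynchronously.
    while True:
        message = message_queue.get()
        if message is None:          # sentinel value used to stop the worker
            break
        print("processing", message["order_id"])
        message_queue.task_done()

worker = threading.Thread(target=consumer)
worker.start()
producer()
message_queue.put(None)              # tell the consumer to stop
worker.join()
```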
Asynchronous architecture is the typical producer-consumer pattern: the two sides never call each other directly, and as long as the data structure stays unchanged, either side's implementation can change without affecting the other, which makes it very convenient to extend the site with new features. In addition, using an asynchronous message queue has the following characteristics.

  • Improved system availability. When a consumer server fails, data accumulates in the message queue server while producer servers keep processing business requests, so the system as a whole appears to run without faults. Once the consumer server returns to normal, it continues processing the data in the message queue.

  • Faster website response. The producer server at the front of the business processing chain writes data to the message queue after handling a request and can return without waiting for the consumer server to finish, reducing response latency.

  • Smoothing out peaks in concurrent access. User visits to a website are random, with peaks and troughs; even if the site is planned and deployed for typical peak traffic, unexpected events, such as a promotion on a shopping site or a hot topic on Weibo, can cause a sudden surge in concurrent access, which may overload the whole site, delay responses, and in serious cases even cause outages. A message queue can hold the suddenly increased request data until consumer servers process it in order, so the whole site is never put under excessive load.

Note, however, that asynchronous processing may affect the user experience and business processes, and the website's product design needs to support it.

1.7 Redundancy

A website needs to run continuously 7×24 hours, but servers can fail at any time; especially at large scale, it is inevitable that some server goes down. To ensure that the site keeps serving, and loses no data, when a server goes down, a certain degree of server redundancy and data redundancy is required, so that when one server fails, the services and data it provides can be transferred to other machines.
Even services with little access and load must deploy at least two servers to form a cluster, in order to achieve high availability through redundancy. Besides regular backup and archiving of the database as a cold backup, online business requires master-slave separation of the database with real-time synchronization as a hot backup.
To survive force-majeure events such as earthquakes and tsunamis that could paralyze an entire data center, some large websites back up whole data centers, deploying disaster-recovery data centers around the world and synchronizing the site's programs and data to them in real time.

1.8 Automation

The ideal state is a website that functions properly unattended, with everything automated. At present, automation in large-site architecture design focuses mainly on release and on operations and maintenance.
Releases are a top priority; many site failures originate in releases, and engineers often work overtime because a release did not go smoothly. Automating the release process to reduce human intervention effectively reduces failures. The release process includes many steps. Automated code management covers version control and the automatic creation and merging of code branches: developers submit the code for the product they are developing, the system automatically creates a development branch, and later merges the branches automatically. With automated testing, once development is complete and the code is submitted for testing, the system automatically deploys it to the test environment, runs the automated test cases, sends the test report to the relevant people, and feeds the results back into the system. Automated security testing scans the code statically with security testing tools, deploys it to a sandboxed environment, and runs simulated attacks to assess its security. Finally, automated deployment pushes the project code automatically to the production environment.
In addition, a website can encounter all kinds of problems in operation: servers go down, program bugs appear, storage space runs out, or traffic suddenly spikes. The production environment needs automated monitoring, with server heartbeat detection and monitoring of key performance and application metrics. If a metric is abnormal or exceeds a preset threshold, an automated alarm is raised and sent to the relevant people, warning that a failure may occur. When a failure is detected, the system performs automated failover, isolating the failed server from the cluster so that it no longer handles application requests. After the fault is cleared, the system performs automated failure recovery, restarting the service and synchronizing data to ensure consistency. When a traffic peak exceeds the site's maximum processing capacity, automated degradation keeps the whole site safe and available: by rejecting some requests and shutting down some non-essential services, the system load is brought down to a safe level; when necessary, automated resource allocation assigns idle resources to important services and expands their deployment scale.
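As one small illustration of the threshold-based monitoring and alerting described above, here is a sketch with hypothetical metric names and a placeholder alert function:

```python
# Hypothetical thresholds for a few key metrics.
THRESHOLDS = {"cpu_percent": 85.0, "error_rate": 0.05, "response_ms": 500.0}

def check_metrics(metrics, alert):
    """Compare collected metrics against thresholds and raise an alert on violations."""
    for name, limit in THRESHOLDS.items():
        value = metrics.get(name)
        if value is not None and value > limit:
            alert(f"{name}={value} exceeds threshold {limit}")

# Usage (the alert function would normally page the on-call engineer):
check_metrics({"cpu_percent": 92.3, "error_rate": 0.01}, alert=print)
```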

1.9 Security

The open nature of the Internet has meant that it has faced enormous security challenges since its birth, and website security architecture has accumulated many patterns of its own: identity verification through passwords and mobile phone verification codes; encrypting network communication for operations such as login and transactions, and encrypting sensitive stored data such as user information on the servers; using CAPTCHAs to prevent bots from abusing network resources and attacking the site; applying appropriate encoding and conversion to defend against common attacks such as XSS and SQL injection; filtering spam and sensitive messages; and applying risk control to important operations such as transfers and other transactions, based on the transaction pattern and transaction information.
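As a small illustration of protecting stored sensitive data, here is a minimal sketch of salted password hashing using Python's standard library; the choice of PBKDF2 is an assumption for illustration, since the text does not name a specific algorithm.

```python
import hashlib
import hmac
import os

def hash_password(password):
    """Store only a salted hash of the password, never the plaintext."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt + digest

def verify_password(password, stored):
    """Recompute the hash with the stored salt and compare in constant time."""
    salt, digest = stored[:16], stored[16:]
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return hmac.compare_digest(candidate, digest)
```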

2. Application of the architecture patterns in Sina Weibo

In just a few years, the number of Sina Weibo users grew from zero to hundreds of millions, with celebrity users reaching tens of millions of followers, and an ecosystem integrating social networking, media, games, e-commerce and more has developed around Weibo. Like most websites, Sina Weibo grew up from a small site. A simple LAMP (Linux + Apache + MySQL + PHP) architecture supported the initial Sina Weibo: an application developed in PHP, with all data, including microblogs, users, and relationships, stored in a MySQL database.

Such a simple architecture could not support the needs of Sina Weibo's rapidly growing business; as user visits increased, the system became overwhelmed. Sina refactored the architecture several times in a short period, finally arriving at the current architecture, shown in the figure below.

http://img2.mukewang.com/5da835720001041204910315.jpg

The system is divided into three levels. The lowest level is the basic services layer, which provides databases, caching, storage, search, and other data services, along with other basic technical services; it supports Sina Weibo's massive data and high-concurrency access and is the technical foundation of the whole system.
The middle level is the platform services and application services layer. Sina Weibo's core services are the microblog, relationship, and user services; they are the backbone on which Sina Weibo's business is built. These services are divided into independent service modules and, through dependent calls and shared basic data, form the basis of the Weibo business.
The top level is the API and Weibo services layer: the various clients (including the Web site) and third-party applications integrate into the system by calling the Sina Weibo API, together forming an ecosystem.
The service modules and underlying technology modules are deployed in a distributed way after layering and segmentation: each module is deployed on an independent server cluster and accessed by its dependents through remote calls. In its early days, Sina Weibo also used a distributed cluster deployment scheme called MPSS (MultiPort Single Server): within a cluster, each of the servers deploys several services, each providing service on a different port. In this way a limited number of servers can host more service instances, improving load balancing and service availability. The now-common practice of virtualizing a physical machine into multiple virtual machines and deploying applications on the virtual machines is similar to Sina Weibo's MPSS scheme, but simpler, since applications on different virtual machines can all use the same port number.

In Weibo's early architecture, microblogs were delivered in synchronous push mode: after a user published a microblog, it was immediately inserted into the subscription lists of all followers in the database. When the number of users was large, and especially when a celebrity user published, this caused a huge number of database writes that exceeded the database's capacity, sharply degrading system performance and worsening user response delays. Later Sina Weibo adopted a combined asynchronous push-pull mode: after a user publishes a microblog, it is written to a message queue and the call returns immediately, so the user gets a fast response; a consumer task of the message queue then pushes the microblog to the subscription lists of all followers who are currently online, while offline users pull their microblog subscription list from the people they follow when they next log in.
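A rough sketch of the combined push-pull idea (not Sina Weibo's actual code); the helper functions passed in are hypothetical placeholders for real storage and presence services.

```python
import queue

publish_queue = queue.Queue()

def publish(user_id, text):
    # Write to the message queue and return immediately, so the user gets a fast response.
    publish_queue.put({"author": user_id, "text": text})

def fanout_worker(get_online_followers, append_to_feed):
    # Consumer task: push the new microblog only to followers who are currently online.
    while True:
        post = publish_queue.get()
        if post is None:
            break
        for follower in get_online_followers(post["author"]):
            append_to_feed(follower, post)

def load_feed_on_login(user_id, get_followees, recent_posts_of):
    # Offline users pull recent posts from the people they follow when they log in.
    feed = []
    for followee in get_followees(user_id):
        feed.extend(recent_posts_of(followee))
    return sorted(feed, key=lambda p: p.get("time", 0), reverse=True)
```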

Because users refresh Weibo frequently, Sina Weibo uses a multi-level caching strategy: hot microblogs and the microblogs of celebrity users are cached on all microblog servers, and the recent microblogs of online users are cached in a distributed cache cluster. For the most common operation, scrolling the feed, almost all accesses hit the cache, giving the system good performance.
To improve overall availability and performance, Sina Weibo runs multiple data centers. These data centers serve as access centers for users in their regions, so users can access the nearest data center, speeding up access and improving performance; they also act as redundant disaster-recovery copies of one another, with all user data and microblog messages synchronized between data centers by a remote synchronization system, improving availability.
At the same time, Sina Weibo has developed a series of automation tools, including automated monitoring, automated release, and automated failure recovery; these tools continue to evolve in order to improve operations and raise system availability.
Because of Weibo's openness, Sina Weibo has also faced a series of security challenges: spam, zombie followers, and attacks on the service have never stopped. Besides the security measures common to ordinary websites, Sina Weibo applies multi-level security auditing policies on its open platform to protect the system and its users.

3. Summary

In programming and architecture design, patterns are receiving more and more attention, and many people hope to solve their problems once and for all through patterns. Used correctly, patterns let us draw on the ideas and practice of the industry and our predecessors, develop better systems in less time, and raise designers to a higher level. But patterns are constrained by their applicable scenarios, and a system has many requirements and constraints of its own; using a pattern improperly produces only a poor imitation that fails to solve the old problem while bringing new, more difficult ones.
Good design is definitely not imitation, nor rote application of patterns, but creativity and innovation built on a deep understanding of the problem; even a "micro-innovation" on top of something familiar can be refreshing. The biggest difference between a knockoff and innovation is not whether it copies or imitates, but whether the problem and the requirements have truly been understood and grasped.

Reference material

[1] "large-scale Web Site Technology Framework Core Principles and Case Studies" Li Zhihui


Author: Xue Qin's blog
Link: http://www.imooc.com/article/293658
Source: imooc (Mu class network)
The article was originally published on imooc; when reprinting, please indicate the source. Thank you.
