Study notes: Large-Scale Website Technical Architecture, Core Principles and Case Studies

Chapter One: the evolution of large-site architecture

Large-scale Internet systems have the following characteristics:

  • High concurrency, high traffic
  • High availability: 24/7 uninterrupted service
  • Massive data: storing and managing huge volumes of data requires a large number of servers
  • Widely distributed users, complex network environments
  • Hostile security environment: XSS, SQL injection, CSRF, etc.
  • Rapidly changing requirements, frequent releases: traditional enterprise software releases roughly yearly; large sites generally release weekly, and small and mid-sized sites often daily
  • Progressive development: growing from small to large

Development path

Stage 1: LAMP on a single server

A small site at its beginning has few visitors and simple business logic; one server is usually enough.

All resources (the application, database, and files) are placed on a single server.

The usual pattern is LAMP: the Linux operating system, an application written in PHP and deployed on Apache, with MySQL as the database. With this collection of free open-source software and an inexpensive server, the site can begin its development journey.

Stage 2: three servers

As site traffic grows, one server can no longer keep up: users keep increasing and data keeps accumulating. At this point the application and the data need to be separated. After separation the site uses three servers: an application server, a database server, and a file server.

The three servers place different demands on hardware. The application server handles a large amount of business logic, so it demands a faster CPU; the database server runs queries and caches data, so it needs faster disks and more memory; the file server stores large numbers of files, so it needs larger hard drives.

Separating the application from the data, with different servers supporting different service roles, improves both the site's concurrency capacity and its data storage space.

Eventually the database comes under too much pressure, causing delays that hurt performance and the user experience; at this point further optimization is needed.

Stage 3: caching

Data access on a website follows a pattern: 80% of business accesses concentrate on 20% of the data. On Taobao, for example, page views concentrate on a minority of sellers, and purchases and reviews concentrate on the well-rated products. That 20% of the data can therefore be cached in memory for faster access.
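The hot-data caching idea above can be sketched with a minimal cache-aside pattern in Python (the keys, values, and simulated database below are illustrative, not from the book):

```python
# Minimal cache-aside sketch: hot data is served from an in-memory dict;
# on a miss, fall back to the (simulated) database and populate the cache.
cache = {}
database = {"item:1": "hot product page", "item:2": "rarely viewed page"}

def get(key):
    if key in cache:           # cache hit: fast in-memory read
        return cache[key]
    value = database[key]      # cache miss: slow database read
    cache[key] = value         # populate the cache for next time
    return value
```

A real deployment would also bound the cache size (for example with LRU eviction) and set expiry times so stale entries get refreshed.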

Websites use two kinds of cache: a local cache on the application server, and a remote distributed cache. With a local cache the server may run short of memory, and the cache may compete with the application for it; in that case consider a remote distributed cache: cache servers with large memory, deployed as a dedicated cluster.
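Routing keys across a cluster of dedicated cache servers can be sketched as follows. The server addresses are invented, and real clients (memcached's, for instance) typically use consistent hashing rather than the simple modulo shown here:

```python
# Pick which cache server in the cluster owns a given key by hashing
# the key and taking the hash modulo the number of servers.
import hashlib

cache_servers = ["cache-1:11211", "cache-2:11211", "cache-3:11211"]

def server_for(key):
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return cache_servers[int(digest, 16) % len(cache_servers)]
```

The hash makes the mapping deterministic, so every application server sends a given key to the same cache node.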

Caching solves the data-access problem, but a single server still cannot handle the flood of requests at peak traffic; the application server becomes the bottleneck.

Stage 4: the application server cluster

Clustering is a common way to address high concurrency and massive data. When one server cannot meet the access and storage demands, do not reach for a bigger server; even the biggest server eventually cannot keep up with the site's ever-growing business. Instead, increase the number of servers so that multiple servers share the access and storage pressure.

As long as the architecture can relieve load pressure by adding a server, it can keep adding servers in the same way to continually improve system performance. An application server cluster is the simplest and most mature way for a website to achieve a scalable cluster architecture.

A load-balancing server distributes requests from users' browsers to any application server in the cluster, solving the server load problem as pressure on the cluster grows.
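The distribution step can be sketched with a round-robin scheduler. The server names are made up, and a production balancer would also track server health and forward the actual request:

```python
# Round-robin load balancing: each incoming request goes to the next
# application server in the cluster, in rotation.
import itertools

class LoadBalancer:
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def route(self):
        # A real balancer would forward the request to this server.
        return next(self._cycle)

lb = LoadBalancer(["app-1", "app-2", "app-3"])
# Four consecutive requests land on app-1, app-2, app-3, app-1.
```

Round-robin is the simplest policy; weighted or least-connections scheduling follows the same shape.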

Stage 5: database read/write splitting

Even with the cache in place, some reads (cache misses, expired entries) and all write operations must still be handled by the database. Once the number of users reaches a certain scale, the database becomes the bottleneck.

In this case, split the database's reads and writes. Writes go to the primary database, which replicates updates to the replica; reads are served from the replica. This achieves read/write splitting.
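A minimal sketch of the routing decision, assuming connection handles are just strings and that any statement beginning with SELECT is a read:

```python
# Route SQL statements: writes to the primary, reads to a replica.
import random

PRIMARY = "db-primary"
REPLICAS = ["db-replica-1", "db-replica-2"]

def pick_database(sql):
    if sql.lstrip().lower().startswith("select"):
        return random.choice(REPLICAS)   # spread reads over replicas
    return PRIMARY                       # all writes go to the primary
```

Because primary-to-replica replication is typically asynchronous, a read issued immediately after a write may see stale data from a replica; frameworks often pin such reads to the primary.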

Stage 6: accelerating access

A CDN and a reverse proxy can speed up response times for users.

Both the CDN and the reverse proxy are based on caching. The CDN is deployed in the network providers' data centers, so when users request the site, data can be served from the provider facility nearest to them.
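The "nearest facility" idea can be sketched by picking the edge node closest to the user. The node list and the squared-distance metric are purely illustrative; real CDNs usually steer users via DNS resolution:

```python
# Choose the CDN edge node geographically nearest to the user.
edge_nodes = {
    "beijing":   (39.9, 116.4),
    "shanghai":  (31.2, 121.5),
    "guangzhou": (23.1, 113.3),
}

def nearest_edge(user_lat, user_lon):
    def squared_distance(name):
        lat, lon = edge_nodes[name]
        return (lat - user_lat) ** 2 + (lon - user_lon) ** 2
    return min(edge_nodes, key=squared_distance)
```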

The reverse proxy is deployed in the website's own data center: when a user's request arrives, the first server it reaches is the reverse proxy, and if the proxy has the requested resource cached, it returns it to the user directly.

Using a CDN and a reverse proxy speeds up responses to users and reduces the load pressure on the back-end servers.

Stage 7: distributed file systems and distributed databases

When primary/replica database replication can no longer meet the site's growing business needs, a distributed database and a distributed file system are required.

A distributed (split) database is a last resort, used only when the data volume of a single table is enormous. The more common approach is sub-databases by business: split the databases along business lines and deploy the databases of unrelated businesses on different physical servers.
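Sub-databases by business can be sketched as a simple mapping from each business line to its own database server (the hostnames and connection strings are invented for illustration):

```python
# Each business line gets its own database, deployable on its own
# physical server; applications look up the right one by business name.
business_databases = {
    "user":    "mysql://db-user-host/user_db",
    "product": "mysql://db-product-host/product_db",
    "order":   "mysql://db-order-host/order_db",
}

def database_for(business):
    return business_databases[business]
```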

Stage 8: non-relational databases and search engines

These are used to increase retrieval speed.
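One reason a search engine speeds up retrieval is the inverted index: each term maps straight to the documents containing it, so a query avoids scanning every record. A toy sketch with invented documents:

```python
# Build an inverted index: term -> set of document ids containing it.
from collections import defaultdict

documents = {1: "cheap phone case", 2: "phone charger", 3: "laptop case"}

index = defaultdict(set)
for doc_id, text in documents.items():
    for term in text.split():
        index[term].add(doc_id)

def search(term):
    # Lookup is a single dict access, independent of corpus size.
    return sorted(index.get(term, set()))
```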

Stage 9: business split

Stage 10: distributed services

As the business is split further, applications become smaller and smaller while storage systems grow larger and larger, raising the overall complexity of the system and making deployment and maintenance harder. If every application must connect to every database, database connection resources run short and the database may refuse service.

Every application performs many of the same business operations, such as user management and product management, so these common operations can be extracted and deployed independently. These reusable services connect to the databases and provide common business services; an application system then only needs, say, a user-management interface, and completes its specific business operations through distributed calls to the common services.
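The extraction idea can be sketched in-process; in production the calls would go through a distributed service framework, and all names here are illustrative:

```python
# A reusable user-management service: the only component that touches
# the user database (simulated by a dict). Applications call its
# interface instead of opening database connections themselves.
class UserService:
    def __init__(self):
        self._users = {}   # stand-in for the user database

    def register(self, user_id, name):
        self._users[user_id] = name

    def get_name(self, user_id):
        return self._users.get(user_id)
```

Both a storefront application and an order application would call this one deployed service, so only the service itself needs database connections.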

 

A rough summary

From the description above: a single server lacks storage space and processing capacity, so the site expands to three servers handling different roles; caching speeds up data access; an application server cluster solves the high-concurrency problem; read/write splitting, and later a database cluster, further reduces the pressure on the database; a CDN and reverse proxy speed up responses; and distributed architecture solves the management problems that come with scale.

Application: single server -> multiple servers -> server cluster -> distributed management

Database: cache -> read/write splitting -> database cluster -> distributed management

Cache: local cache -> distributed cache

A CDN and reverse proxy can be used early on to accelerate response times, and a distributed message queue can manage communication between modules.

Chapter 2 describes some common patterns for implementing large sites.

 


Source: www.cnblogs.com/bowenqianngzhibushiwo/p/11619674.html