Distributed Basics (2): Common architectural patterns of large websites

This article draws on "Large-Scale Website Technology Architecture: Core Principles and Case Studies" (chapter 2) and other online articles. If there are any omissions or mistakes, please forgive us and point them out. Thank you!


Good design is definitely not imitation, nor the mechanical application of some pattern; it is creation and innovation built on a deep understanding of the problem. Even "micro-innovation" can feel refreshingly new. The biggest difference between a knock-off and an innovation is not whether it copies or imitates, but whether it truly understands and grasps the problem and the need.
- "Large-Scale Website Technology Architecture: Core Principles and Case Studies"

I. The definition of a pattern

"Each pattern describes a solution to the core problem of the repeated occurrence around us and the problem. In this way, you can use the program again and again without having to do repetitive work."

The key point of a pattern is repeatability: because the problem and its scenario recur, the solution the pattern provides can be reused.

The 23 classic design patterns in back-end development, for example, are a distilled summary of how earlier engineers designed programs for scalability, reliability, and ease of use; many classic software frameworks draw on these design patterns and then innovate further on top of them.

Learning a technology works the same way. Knowing how to use the technology to solve a practical problem (the "how") is essential, but it is even more important to understand how the technology actually solves the problem, what idea and pattern lie behind it (the "why"), and then apply that to your own technology or business. That is how to innovate standing on the shoulders of giants.

II. Common architecture patterns

1. Layering

Layering is a common way of thinking in computing. A computer network, for example, is divided into seven layers by different technical specifications: the lower layers provide services to the upper layers, and each upper layer only needs to do its own job without caring how the lower layer provides its service. Agreeing on the interface between adjacent layers is enough, which reduces coupling.

Similarly, in Java Web development a project is usually divided into a control layer, a business layer, and a data access layer, each of which solves its own class of problems. This keeps the logic clear and the code loosely coupled.
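A minimal sketch of this three-layer split, assuming a plain Java project (all class and method names are illustrative, not from the book):

```java
// A minimal sketch of the controller / service / DAO layering described above.

// Data access layer: the only layer that talks to the data store.
class UserDao {
    String findNameById(long id) {
        // A real project would run SQL here, e.g. via JDBC or MyBatis.
        return "user-" + id;
    }
}

// Business layer: business rules only, knows nothing about HTTP.
class UserService {
    private final UserDao userDao = new UserDao();

    String getDisplayName(long id) {
        return userDao.findNameById(id).toUpperCase();
    }
}

// Control layer: handles the request/response and delegates to the service.
class UserController {
    private final UserService userService = new UserService();

    String handleGetUser(long id) {
        return "{\"name\": \"" + userService.getDisplayName(id) + "\"}";
    }
}

public class LayeringDemo {
    public static void main(String[] args) {
        System.out.println(new UserController().handleGetUser(42L));
    }
}
```

Each layer only depends on the layer directly below it through a small interface, so the controller never needs to know how the data is actually stored.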

2. Segmentation

If layering can be regarded as cutting a project horizontally, segmentation can be regarded as cutting it vertically.

Segmentation means dividing the project into different business lines and modules. For example, a shopping website can be split into a home-page service, a shopping-cart service, a product-management service, a user-management service, and so on, each handled by a different development team.
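A minimal sketch of such a vertical split, with one service interface per business line; every name here is illustrative, and each module would normally be owned by a different team:

```java
// A minimal sketch of vertical segmentation: each business line lives in its own
// module and exposes its own service interface. All names are illustrative.

// home module: renders the home page
interface HomePageService {
    String renderHomePage(long userId);
}

// cart module: shopping-cart operations
interface CartService {
    void addItem(long userId, long productId, int quantity);
}

// product module: product management
interface ProductService {
    String getProductDetail(long productId);
}

// user module: user management
interface UserAccountService {
    boolean register(String username, String password);
}
```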

3. Distributed

As applications grow larger and concurrent traffic keeps increasing, a single monolithic machine can no longer withstand so much pressure. Distributed services emerged for this situation: requests that one machine cannot handle are distributed to other machines for processing.

A distributed system is one in which multiple different hosts pool their resources to provide a service together.

Common distributed schemes include the following:

1. Distributed applications and services: split a large application into multiple smaller applications or services and deploy them separately (a minimal sketch follows this list).

2. Distributed static resources: deploy the site's static resources, such as JS, CSS, images, and other files, independently.

3. Distributed data and storage: separate the database from the application and deploy it on its own.

4. Distributed computing: deploy services that consume a lot of computing resources, such as big-data analysis and machine-learning training, on separate machines.
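A minimal sketch of scheme 1, assuming an order service and a user service have been split out and deployed on different hosts; the host name and path are hypothetical:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// After the split, the order service no longer calls the user logic in-process;
// it calls the separately deployed user service over the network.
public class OrderService {
    private final HttpClient httpClient = HttpClient.newHttpClient();

    String fetchUserName(long userId) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://user-service.internal:8080/users/" + userId))
                .GET()
                .build();
        HttpResponse<String> response =
                httpClient.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```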

What challenges or problems does a distributed system bring?

1. Distributed hosts need to communicate over the network, so the state of the network can significantly affect the distributed system. According to the CAP theorem, when a network partition (Partition) occurs, only one of consistency (Consistency) of the data across hosts and availability (Availability) of the service can be guaranteed: either consistency is given up, or availability is given up.

2. Services and resources are scattered across different machines, so we must ensure that data is not lost and services can be recovered when a machine goes down.

4. Cluster

A distributed system splits the different roles of a service onto different machines. A cluster does something similar: it also aims to improve availability, absorb access pressure, and keep the service reachable. The difference is that in a cluster, the different machines provide the same service.

For example, in a Redis cluster of two servers, each server provides the same service; which server handles a given client request is decided by a load-balancing algorithm. When one machine goes down, the other machine can continue to provide the service.
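A minimal sketch of the load-balancing idea, using a simple round-robin choice among identical nodes (the node addresses are illustrative):

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// A minimal round-robin load balancer: every node in the cluster offers the
// same service, so any of them may handle the next request.
public class RoundRobinBalancer {
    private final List<String> nodes;
    private final AtomicInteger counter = new AtomicInteger();

    public RoundRobinBalancer(List<String> nodes) {
        this.nodes = nodes;
    }

    public String nextNode() {
        int index = Math.floorMod(counter.getAndIncrement(), nodes.size());
        return nodes.get(index);
    }

    public static void main(String[] args) {
        // Hypothetical addresses of two identical servers in the cluster.
        RoundRobinBalancer balancer =
                new RoundRobinBalancer(List.of("10.0.0.1:6379", "10.0.0.2:6379"));
        for (int i = 0; i < 4; i++) {
            System.out.println("request " + i + " -> " + balancer.nextNode());
        }
    }
}
```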

5. Cache

A cache puts data in a place where it can be accessed more quickly, in order to speed up access.

Caches can be divided into the following categories:

1. CDN: a content delivery network forwards a request to the server closest to the user and can directly return data it already holds, speeding up access to resources.

2. Reverse proxy: a reverse proxy server (such as Nginx) usually caches the site's static resources and can speed up access.

3. Local cache: data cached on the local server or inside the application itself; it can be implemented with technologies such as Ehcache.

4. Distributed cache: for large volumes of requests and large amounts of cached data, a common practice is to use a distributed caching service, such as a Redis cluster.

Using a cache has two prerequisites:

1. The cached data should be hot data that is accessed frequently but does not change very often.

2. The cached data should remain valid for a certain period rather than expiring very quickly; otherwise stale ("dirty") reads may occur. A minimal sketch of a local cache with expiry follows.
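The sketch below assumes a simple map-based implementation, not a production cache such as Ehcache; every name is illustrative:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// A minimal local cache sketch: entries expire after a fixed time-to-live,
// so hot data is served from memory while stale data is eventually dropped.
public class SimpleTtlCache<K, V> {
    private record Entry<V>(V value, long expiresAtMillis) { }

    private final Map<K, Entry<V>> store = new ConcurrentHashMap<>();
    private final long ttlMillis;

    public SimpleTtlCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    public void put(K key, V value) {
        store.put(key, new Entry<>(value, System.currentTimeMillis() + ttlMillis));
    }

    public V get(K key) {
        Entry<V> entry = store.get(key);
        if (entry == null) {
            return null;                      // cache miss
        }
        if (System.currentTimeMillis() > entry.expiresAtMillis()) {
            store.remove(key);                // expired: drop and report a miss
            return null;
        }
        return entry.value();
    }

    public static void main(String[] args) {
        SimpleTtlCache<String, String> cache = new SimpleTtlCache<>(60_000);
        cache.put("product:1", "hot product detail");
        System.out.println(cache.get("product:1"));
    }
}
```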

6. Asynchronous processing

A large website architecture improves performance not only through layering, segmentation, decoupling, and distribution; it can also use asynchronous processing. Asynchronous processing is mainly implemented by building a message queue system on the same or a different machine (e.g. Kafka, ActiveMQ). The applications at the two ends of the message queue are called the producer and the consumer. Its main benefits are as follows (a minimal sketch follows the list):

1. Improved system availability: imagine a scenario where the system is processing a large number of orders and a server suddenly goes down; order data that has not yet been processed may be lost. If a message queue is used to hold the order data instead, then even if the system goes down, the queue's persistence mechanism means the messages are not lost, and after the machine restarts the consumer can continue processing them. This improves availability.

2. Faster responses: when the producer receives data, it puts it on the message queue and returns a response immediately instead of waiting for the whole business process to finish; the message is later taken by the consumer for the actual processing.

3. Smoothing out traffic peaks: sometimes access suddenly exceeds the system's usual capacity, which may cause blocking or even bring the system down, for example during flash sales. Requests can be queued in the message queue and processed one by one by the consumer, so the application is not overloaded.
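A minimal in-process sketch of the producer/consumer idea using a blocking queue; a real deployment would use a message-queue system such as Kafka or ActiveMQ, and the order data here is illustrative:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// A minimal producer/consumer sketch. The bounded queue decouples the two sides:
// the producer returns as soon as the order is enqueued, and the consumer drains
// the queue at its own pace, which smooths out traffic peaks.
public class OrderQueueDemo {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> orderQueue = new ArrayBlockingQueue<>(1000);

        Thread producer = new Thread(() -> {
            for (int i = 1; i <= 5; i++) {
                try {
                    orderQueue.put("order-" + i);     // enqueue and return immediately
                    System.out.println("accepted order-" + i);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });

        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    String order = orderQueue.take(); // block until an order arrives
                    System.out.println("processing " + order);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        consumer.setDaemon(true);
        producer.start();
        consumer.start();
        producer.join();
        Thread.sleep(500);                            // let the consumer drain the queue
    }
}
```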

7. Redundancy

A website needs to run 24 hours a day, but excessive traffic or other reasons can cause a machine to go down. If MySQL goes down, for example, it can no longer provide data services to the outside.

To ensure that the system can keep serving even when a machine goes down, the usual practice is data redundancy and backup, so that after one machine fails, other machines can quickly take over and continue providing the service.

For example, Microsoft has placed servers holding data deep under water to guard against system failures caused by earthquakes or other natural disasters, so that backup data is still available in any emergency.

8. Automation

For any application, the ideal is full automation: code deployment and updates are automated, testing is automated, failover after a machine goes down is automatic, code management is automated, interface errors trigger automatic alerts, and requests can even be degraded automatically under heavy load. This is the current direction of operations and maintenance.

9. Security

The larger a website, the more vulnerable it is to attacks such as XSS, DDoS, CSRF, and SQL injection. Scenarios that especially need protection, such as login and payment, should therefore introduce measures like captcha verification and request signature verification.
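A minimal sketch of one such defense, using a parameterized query to avoid SQL injection (the table and column names are illustrative):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// A minimal sketch of defending against SQL injection: user input is bound as a
// parameter instead of being concatenated into the SQL string, so input such as
// "' OR '1'='1" is treated as plain data rather than executable SQL.
public class UserLookupDao {
    public boolean userExists(Connection connection, String username) throws SQLException {
        String sql = "SELECT 1 FROM users WHERE username = ?";
        try (PreparedStatement statement = connection.prepareStatement(sql)) {
            statement.setString(1, username);          // input bound safely as a parameter
            try (ResultSet resultSet = statement.executeQuery()) {
                return resultSet.next();
            }
        }
    }
}
```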

