Architecture Design Series 4: How to Design a High-Performance Architecture

Architectural Design Series Articles

  1. Architectural Design Series 1: What is Architectural Design
  2. Architecture Design Series 2: Several Common Architecture Design Principles
  3. Architecture Design Series 3: How to Design a Scalable Architecture

In Architecture Design Series 1: What is Architecture Design, we talked about the main purpose of architecture design, which is to solve the problems caused by the complexity of software systems. Today we will talk about high performance, one of the sources of software system complexity.


1. What is high-performance architecture?

To understand what high-performance architecture is, we first need to understand what high performance is.

Definition of high performance
First, what is performance and how to understand it?
Simply put, performance refers to a system's ability to process transactions.

So, what is high performance?
High performance means processing a transaction faster while consuming fewer resources.

High-Performance Architecture
So, what is a high-performance architecture?
High-performance architecture refers to the use of appropriate technologies and strategies to achieve excellent system performance with limited resource investment.

For technical people, improving system performance with limited resource investment is both a challenge and an opportunity.
Imagine you are an architect: if the architecture you build delivers the same performance at a lower cost, isn't that your advantage?


2. Why high-performance architecture is important

After entering the Internet era, the development speed of business is far beyond your imagination. For example:

  • In 2016, Alipay's peak number of payments per second reached 120,000 during Double 11.
  • In 2017, the number of red envelopes sent and received per second on WeChat during the Spring Festival reached 760,000.

We have extracted the key information: 120,000 payments per second, and 760,000 red envelopes sent and received per second. These two numbers mean that tens of millions, or even hundreds of millions, of users are using the system at the same time.

With such a large number of users, you can imagine how much pressure the system is under, especially for complex services such as payments and red envelopes, which require high performance across the entire call chain. The call chain of a complex system is often very long, and making every node on the chain cooperate to deliver high performance is a complex and challenging task.

This is where high-performance architecture comes into play.

Through high-performance architecture design, we can maximize the system's processing speed, throughput, and efficiency, thereby providing stable and reliable services that meet large-scale, high-concurrency, and complex business needs.

High-performance systems generally have the following characteristics:

  • Quick response
  • High throughput
  • Low latency
  • High concurrency
  • Scalability

3. How to design a high-performance architecture

The complexity that high performance brings to software systems is mainly reflected in two aspects: on the one hand, the complexity of achieving high performance on a single computer; on the other hand, the complexity of achieving high performance across a cluster of computers.

Therefore, we can look at the high performance of software systems from two angles: stand-alone high performance and cluster high performance.

Next, let’s discuss what common technologies can improve stand-alone performance and cluster performance.

1. Directions for improving performance

1.1 Single machine high performance

The most critical aspect of stand-alone high performance is the operating system.

The development of computer performance is essentially driven by hardware development, especially the performance development of CPUs. The famous "Moore's Law" shows that the processing power of the CPU doubles every 18 months; and the key to fully utilizing the hardware performance is the operating system.

Therefore, the operating system itself actually develops with the development of hardware. The operating system is the running environment of the software system. The complexity of the operating system directly determines the complexity of the software system.

The aspects of an operating system most relevant to performance are processes and threads.

1. Multi-process

How does a computer run tasks in parallel?
In the early days, a computer could only perform one task at a time. If a task required reading a large amount of data from an I/O device (such as a tape), the CPU was idle during the I/O operation, and that idle time could have been used for other computations.

To improve performance, each task is given its own process. Each process has its own independent memory space, processes are unrelated to one another, and the operating system schedules them. To make multiple processes appear to run in parallel, a time-sharing approach is used: CPU time is divided into many slices, and each slice executes instructions from only one process.

Although execution is still serial from the perspective of the operating system and the CPU, the CPU switches between processes so quickly that, from the user's perspective, multiple processes appear to run in parallel.

How to communicate between processes?
Although multi-processing gives each task an independent memory space and keeps processes unrelated to one another, from the user's point of view, if two tasks can communicate while running, task design becomes more flexible and efficient.

Otherwise, if two tasks cannot communicate while running, task A can only write its results to storage, and task B must then read them from storage for processing. This is not only inefficient, but also makes task design more complex.

In order to solve this problem, various methods of inter-process communication have been designed, including pipes, message queues, semaphores, shared storage, etc.

2. Multi-threading

How to run tasks in parallel within a process?
Multi-processing allows multiple tasks to be handled in parallel, but it has its own shortcoming: a single process can only execute serially. In fact, many subtasks within a process do not need to run in strict chronological order and also need to be processed in parallel.

To solve this problem, people invented threads. Threads are subtasks within a process, and these subtasks all share the same process data. To ensure data correctness, the mutex (mutual exclusion lock) mechanism was invented; a minimal sketch follows.
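
As a minimal illustration of the mutex idea (the Counter class and its field names are hypothetical, not from any particular framework), the following Java sketch shows two threads safely updating shared data through a ReentrantLock:

```java
import java.util.concurrent.locks.ReentrantLock;

// Two threads share the same counter (process data); the lock keeps updates correct.
public class Counter {
    private final ReentrantLock lock = new ReentrantLock(); // the mutex
    private long value = 0;

    public void increment() {
        lock.lock();          // only one thread may enter at a time
        try {
            value++;          // critical section: read-modify-write on shared data
        } finally {
            lock.unlock();    // always release, even on exceptions
        }
    }

    public long get() {
        lock.lock();
        try {
            return value;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Counter counter = new Counter();
        Runnable task = () -> { for (int i = 0; i < 100_000; i++) counter.increment(); };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join();  t2.join();
        System.out.println(counter.get()); // 200000, thanks to the mutex
    }
}
```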

With multi-threading, a thread becomes the smallest unit of operating system scheduling, while a process remains the smallest unit of resource allocation by the operating system.

3. How to optimize stand-alone performance during coding?

To improve the performance of a single machine, one key point is the concurrency model adopted by the server, which involves multi-process and multi-thread models as well as synchronous non-blocking and asynchronous non-blocking IO models.

IO multiplexing
There are two key points in IO multiplexing technology (a minimal sketch follows the list below):

  1. When multiple connections share a blocking object, the process only needs to wait on one blocking object without polling all connections. Common implementation methods include select, epoll, kqueue, etc.
  2. When a certain connection has new data that can be processed, the operating system will notify the process, and the process returns from the blocking state and starts business processing.
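
Here is a minimal Java NIO sketch of these two points, assuming a simple echo service on port 8080 (the class name and port are illustrative); on Linux the JDK Selector is typically backed by epoll:

```java
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.*;
import java.util.Iterator;

// One thread waits on a single Selector instead of polling every connection.
public class MultiplexServer {
    public static void main(String[] args) throws Exception {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(8080));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        while (true) {
            selector.select();                       // block on ONE object for all connections
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                if (key.isAcceptable()) {            // a new connection is ready
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {       // the OS says this connection has data
                    SocketChannel client = (SocketChannel) key.channel();
                    ByteBuffer buf = ByteBuffer.allocate(1024);
                    int n = client.read(buf);
                    if (n == -1) { client.close(); continue; }
                    buf.flip();
                    client.write(buf);               // echo back as the "business processing"
                }
            }
        }
    }
}
```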

Reactor and Proactor architecture patterns
In back-end system design, to achieve high performance on a single machine, the entire network framework needs to combine IO multiplexing with pooling technology.

Therefore, the industry generally uses IO multiplexing + thread pool to improve performance. Correspondingly, the two single-machine high-performance architecture patterns commonly used in the industry are the Reactor and Proactor patterns. Reactor is a synchronous non-blocking network model, and Proactor is an asynchronous non-blocking network model.

Among open-source software in the industry, Redis adopts a single-Reactor, single-process approach; Memcached adopts a multi-Reactor, multi-thread approach; and Nginx adopts a multi-Reactor, multi-process approach.

4. Summary

If we want to complete a high-performance software system, we need to consider technical points such as multi-process, multi-thread, inter-process communication, and multi-thread concurrency.

Although multi-process and multi-threading greatly improve the performance of multi-task processing, they are still essentially time-sharing systems and cannot achieve true simultaneous parallelism. The way to solve this is to let multiple CPUs perform computing tasks at the same time, thereby achieving true multi-task parallelism.
Currently, the most common multi-core processor solution is SMP (Symmetric Multi-Processor), the symmetric multiprocessing architecture.


1.2 Cluster high performance

Although the performance of computer hardware has developed rapidly, it still pales in comparison to the development speed of business. Especially after entering the Internet era, the development speed of business far exceeds the development speed of hardware.

As mentioned earlier, for complex businesses such as payments and red envelopes, the performance of a single machine simply cannot cope; a cluster must be used to achieve high performance. For example, for business systems on the scale of Alipay and WeChat, the number of backend machines runs into the tens of thousands.

Improving performance with a large number of machines is not as simple as just adding machines; getting many machines to cooperate to achieve high performance is a complex task. Common approaches are:

1. Task allocation

Task allocation means that each machine can handle a complete business task, and different tasks are assigned to different machines for execution.

The complexity of high-performance cluster design is mainly reflected in the need to add a task allocator and select an appropriate task allocation algorithm.

For task allocators, the more popular name is load balancer. But this name subconsciously suggests that the purpose of task allocation is to keep the load of each computing unit balanced. In fact, task allocation is not limited to balancing the load of computing units. Different task allocation algorithms have different goals: some are based on load, some on performance (throughput, response time), and some on business considerations.
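
As a hedged illustration of one task allocation algorithm, the Java sketch below implements simple round-robin allocation; the class and the server addresses are hypothetical, and a real load balancer would also consider health checks, weights, or session affinity:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// A task allocator does not have to balance pure load; this one simply rotates requests.
public class RoundRobinAllocator {
    private final List<String> servers;          // backend addresses
    private final AtomicLong counter = new AtomicLong();

    public RoundRobinAllocator(List<String> servers) {
        this.servers = List.copyOf(servers);
    }

    // Pick the next server in a thread-safe, lock-free way.
    public String next() {
        long i = counter.getAndIncrement();
        return servers.get((int) Math.floorMod(i, (long) servers.size()));
    }

    public static void main(String[] args) {
        RoundRobinAllocator lb = new RoundRobinAllocator(
                List.of("10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"));
        for (int i = 0; i < 6; i++) {
            System.out.println("request " + i + " -> " + lb.next());
        }
    }
}
```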

Choosing a suitable task allocator is also a complex matter, and requires comprehensive consideration of various factors such as performance, cost, maintainability, and availability.

Common classifications of task allocators are:

DNS load balancing
This is the simplest and most common load balancing method and is often used to achieve geographical level balancing.

The essence of DNS load balancing is that when users from different geographical locations access the same domain name, DNS can return different IP addresses. For example, users in the north visit the computer room in Beijing, while users in the south visit the computer room in Shenzhen. Taking www.baidu.com as an example, the address obtained by users in the north is 61.135.165.224, and the address obtained by users in the south is 14.215.177.38.

  • Advantages: simple implementation, low cost, no need to develop or maintain yourself, nearby access, fast access speed
  • Disadvantages: Untimely updates, poor scalability, simple allocation strategy

Hardware load balancing
Hardware load balancing implements the load balancing function through a dedicated hardware device. This type of device is similar to routers and switches and can be understood as a basic network device for load balancing. There are currently two typical products: F5 and A10.

  • Advantages: strong performance (supports more than 1 million concurrent connections), powerful functions (supports load balancing at all layers and comprehensive balancing algorithms)
  • Disadvantages: expensive, poor scalability

Software load balancing
Software load balancing implements the load balancing function through load balancing software. The most common are LVS and Nginx: LVS is layer-4 load balancing in the Linux kernel, and Nginx is layer-7 software load balancing.

In addition to using open-source systems for load balancing, if the business is special, it is also possible to customize based on an open-source system (for example, Nginx plug-ins), or even build your own.

  • Advantages: simple, cheap, flexible, both layer 4 and layer 7 load balancing can be selected and expanded according to business needs
  • Disadvantages: Compared with hardware load balancing, the performance is average and the function is not that powerful.

The main difference between software and hardware load balancing lies in performance; hardware load balancing performs far better. For example, Nginx performance is on the order of tens of thousands of requests per second: a general Linux server with Nginx installed can reach about 50,000 per second. LVS performance is on the order of hundreds of thousands, reportedly reaching 800,000 per second. F5 performance is on the order of millions, ranging from 2 million per second up to 8 million per second.

Typical architecture of load balancing
Generally, we use a combination of three load balancing methods based on their respective advantages and disadvantages. The basic principles of the combination are: DNS load balancing realizes geographical-level load balancing; hardware load balancing realizes cluster-level load balancing; software load balancing realizes machine-level load balancing.

2. Task breakdown

Through task allocation, we can break through the processing bottleneck of a single machine and add more machines to meet the performance needs of the business. However, if the business itself becomes more and more complex, expanding performance through task allocation alone yields diminishing returns.

For example, when the business is simple, expanding from one machine to 10 machines can increase performance by about 8 times (the theoretical 10 times cannot be reached because some performance is lost to coordination among the machines). But if the business becomes more and more complex, expanding from one machine to 10 may improve performance by only 5 times.

The main reason for this phenomenon is that as the business becomes more and more complex, the processing performance of a single machine keeps dropping. To continue improving performance, we need to adopt a new approach: task decomposition.

The microservice architecture adopts this idea. Through task decomposition, the originally unified but complex business system can be split into small, simple business systems that cooperate with one another.

From a business point of view, task decomposition neither reduces functionality nor reduces the amount of code (in fact, the amount of code may increase, because in-process calls become calls over interfaces between servers), so why can task decomposition improve performance? The main factors are:

  1. It is easier for a simple system to achieve high performance.
    The simpler the function of the system, the fewer points that affect performance, and it is easier to carry out targeted optimization.

  2. Individual tasks can be scaled independently.
    When each logical task is decomposed into an independent subsystem, the performance bottleneck of the entire system is easier to locate. Once found, only the subsystem with the bottleneck needs to be optimized or scaled, without changing the entire system, so the risk is much smaller.

Since decomposing a unified system into multiple subsystems can improve performance, is it better to divide it into finer parts?

Actually, no: not only will performance not improve, it may even decrease. The main reason is that if the system is split too finely, completing a single business operation requires many more calls between systems, and these calls go over the network, whose performance is far lower than that of function calls within a single system.

Therefore, the performance benefit of task decomposition has its limits. It is not the case that the finer the decomposition, the better. For architecture design, how to grasp this granularity is critical.

For specific guidance on how to decompose tasks, refer to the chapter on splitting in Architecture Design Series 3: How to Design a Scalable Architecture.


2. Common methods to improve performance

Next, let’s briefly analyze several common methods to improve performance.

2.1 Service-oriented design

1. What: What is servitization?

Servitization refers to splitting a complex business system, through task decomposition, into multiple small and simple business systems that cooperate with one another.

The inevitable result of servitizing a large, complex system is that shared business capabilities become centralized. In addition, a monolithic architecture is not necessarily a bad architecture; it depends on the complexity of the application. For example, if a start-up company wants to do business on the Internet, its business scale is small, the business complexity is limited, and the amount of development is not large, so a monolithic architecture is the most suitable choice.

2. Why: Why servitization?

The purpose of servitization is to flexibly combine reusable services to quickly respond to changing business needs and support rapid business trial and error.

How to judge whether a system needs servitization? Usually we mainly need to consider the following factors:

  • Is it a large complex system?
  • Is there duplication of construction?
  • Is the business uncertain?
  • Does technology restrict enterprise development?
  • Are there performance bottlenecks in the system?

If a system has the above problems, then it is a better way to carry out overall technical upgrade and business reconstruction of the system through servitization.

3. How: How to implement servitization?

Adjust the organizational structure in place
To implement system servitization well, adjusting the organizational structure is a very critical step.

Because after the system is service-oriented, team issues such as division of labor and collaboration will follow. Only by adjusting the organizational structure properly can the benefits of servitization be maximized.

Service-oriented infrastructure
After the system is service-oriented, the core emphasis is on communication between different services, which gives rise to a series of complex problems that need to be solved, such as service registration, service publishing, service invocation, load balancing, and monitoring. This requires a complete service governance solution.

Therefore, a necessary condition for servitization is a service framework that can solve these complex problems, and the performance of this framework is particularly important.

There are currently two mainstream service frameworks: Spring Cloud and Dubbo.

Important means of servitization

  • Stateless design: Statelessness makes it easier for services to scale up and down quickly.
  • Split design: simplify the complex, reduce the difficulty, divide and conquer.
4. Summary

In one sentence: business decoupling, capability reuse, and efficient delivery .

2.2 Asynchronous design

1. What: What is asynchronous?

Asynchronous is a design concept, which is relative to synchronization.

Synchronous means that when a call is issued, the caller must wait for the call to return a result before continuing to execute.

Asynchronous means that when a call is issued, the caller does not get the result immediately and can continue with subsequent operations; once the callee finishes processing, the caller is notified through status, notification, or callback.
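
A minimal Java sketch of the difference, using CompletableFuture for the asynchronous case (the slowRemoteCall method and its 500 ms delay are made up for illustration):

```java
import java.util.concurrent.CompletableFuture;

// Synchronous: the caller blocks until the result comes back.
// Asynchronous: the caller registers a callback and keeps doing other work.
public class AsyncDemo {
    static String slowRemoteCall() {
        try { Thread.sleep(500); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return "result";
    }

    public static void main(String[] args) throws Exception {
        // Synchronous style: nothing below this line runs until slowRemoteCall() returns.
        String sync = slowRemoteCall();
        System.out.println("sync got: " + sync);

        // Asynchronous style: issue the call, then continue; the callback fires on completion.
        CompletableFuture<String> future = CompletableFuture
                .supplyAsync(AsyncDemo::slowRemoteCall)
                .thenApply(r -> "async got: " + r);      // notification via callback

        System.out.println("caller keeps working while the call is in flight...");
        System.out.println(future.get());                // only join at the very end
    }
}
```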

2. Why: Why asynchronous?

Through asynchronous, latency can be reduced, the overall performance of the system can be improved, and the user experience can be improved.

3. How: How to implement asynchronously?

1) Asynchrony at the IO level
Asynchronous calls at the IO level are what we usually call I/O models, which include blocking, non-blocking, synchronous, and asynchronous.

The Linux kernel has five built-in IO interaction modes: blocking IO, non-blocking IO, multiplexed IO, signal-driven IO, and asynchronous IO. For the network IO model under Linux, the most commonly used model with good performance is the synchronous non-blocking model.

Common techniques for asynchronous calls

  • Asynchronous communication: NIO, Netty

2) Asynchrony at the business logic level
Asynchrony at the business logic level means that the application executes its business logic asynchronously.

Usually, a complex business has many process steps. If all steps are synchronous, then when one of them gets stuck, the entire process gets stuck, and obviously the performance of such a process will not be high.

For this reason, in the industry, if we want to improve performance and concurrency, we will basically use asynchronous processes.

Common techniques for asynchronous processes

  • Message queue: asynchronous decoupling, traffic peak reduction
  • Asynchronous programming: multi-threading, thread pool
  • Event-driven: publish-subscribe pattern (observer pattern); a minimal sketch follows this list
  • Job driver: scheduled tasks, XXL-JOB
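
As a rough illustration of the event-driven (publish-subscribe) item above, the following Java sketch decouples a payment flow from its follow-up steps; the OrderEventBus class and the two subscribers are hypothetical examples, not from the article:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// The publisher does not wait for subscribers; handlers run on a worker pool.
public class OrderEventBus {
    public interface Subscriber { void onOrderPaid(String orderId); }

    private final List<Subscriber> subscribers = new CopyOnWriteArrayList<>();
    private final ExecutorService workers = Executors.newFixedThreadPool(4);

    public void subscribe(Subscriber s) { subscribers.add(s); }

    // Publish returns immediately; each subscriber is notified asynchronously.
    public void publishOrderPaid(String orderId) {
        for (Subscriber s : subscribers) {
            workers.execute(() -> s.onOrderPaid(orderId));
        }
    }

    public static void main(String[] args) {
        OrderEventBus bus = new OrderEventBus();
        bus.subscribe(id -> System.out.println("send receipt for " + id));
        bus.subscribe(id -> System.out.println("update loyalty points for " + id));
        bus.publishOrderPaid("order-42");             // the main flow is not blocked by either step
        System.out.println("payment flow finished");  // may print before the subscribers run
        bus.workers.shutdown();
    }
}
```
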
4. Summary

One sentence summary: asynchronous execution is efficient, but it also brings higher complexity and programming difficulty, so do not abuse it.

2.3 Pooling design

1. What: What is pooling technology?

Pooling is a common technique for improving performance. It keeps "expensive" and "time-consuming" resources in a dedicated "pool" to reduce the repeated creation and destruction of resources and to facilitate unified management and reuse, thereby improving system performance.

2. Why: Why is pooling technology needed?

Pooling technology is used to reduce system overhead caused by repeated creation and destruction and improve system performance.

3. How: How to implement pooling technology?

Thread pool

  • ForkJoinPool
  • ThreadPoolExecutor: the thread pool's core parameters need to be set according to the business scenario; for example, the number of threads should be chosen based on whether tasks are IO-intensive or CPU-intensive (a minimal sketch follows this list).
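
A minimal ThreadPoolExecutor sketch of that sizing idea follows; the pool sizes and queue capacities are illustrative starting points, not universal recommendations:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolConfig {
    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();

        // CPU-intensive work: roughly one thread per core to avoid context-switch overhead.
        ThreadPoolExecutor cpuPool = new ThreadPoolExecutor(
                cores, cores,
                60L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(1000),
                new ThreadPoolExecutor.AbortPolicy());      // fail fast when saturated

        // IO-intensive work: threads mostly wait, so a larger pool keeps the CPU busy.
        ThreadPoolExecutor ioPool = new ThreadPoolExecutor(
                cores * 2, cores * 4,
                60L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(5000),
                new ThreadPoolExecutor.CallerRunsPolicy()); // back-pressure instead of dropping

        ioPool.execute(() -> System.out.println("simulated IO task on " + Thread.currentThread().getName()));
        cpuPool.shutdown();
        ioPool.shutdown();
    }
}
```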

Connection pool

  • Database connection pool (a minimal sketch follows this list)
  • Redis connection pool
  • HttpClient connection pool
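
As one concrete example of a database connection pool, here is a minimal sketch using HikariCP (HikariCP is my choice for illustration and is not named in the original text; the JDBC URL and credentials are placeholders):

```java
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;

public class DbPoolDemo {
    public static void main(String[] args) throws Exception {
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl("jdbc:mysql://localhost:3306/demo"); // placeholder URL
        config.setUsername("demo");
        config.setPassword("demo");
        config.setMaximumPoolSize(20);        // connections are created once and reused
        config.setMinimumIdle(5);

        try (HikariDataSource pool = new HikariDataSource(config)) {
            // Borrow a connection from the pool instead of opening a new TCP/auth handshake.
            try (Connection conn = pool.getConnection();
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT 1")) {
                rs.next();
                System.out.println("pooled query result: " + rs.getInt(1));
            } // closing the connection returns it to the pool rather than destroying it
        }
    }
}
```
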
4. Summary

In one sentence: unified management and reuse of resources to improve performance and resource utilization .

2.4 Cache design

1. What: What is cache?

Caching is a technique for improving the speed of resource access. Its characteristic is: write once, read many times.

The essence of caching is trading space for time. It sacrifices data freshness by serving cached data from memory instead of reading the latest data from the origin (such as the DB), which reduces server pressure and network latency.
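
To make the space-for-time idea concrete, here is a minimal in-process LRU cache sketch built on Java's LinkedHashMap; it is a toy illustration, not a replacement for a real cache such as Redis:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Space-for-time in miniature: keep the most recently used entries in memory
// so repeated reads skip the slow backend entirely.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        super(16, 0.75f, true);      // accessOrder=true -> iteration order = recency of use
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;    // evict the least recently used entry when full
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(2);
        cache.put("user:1", "Alice");
        cache.put("user:2", "Bob");
        cache.get("user:1");          // touch user:1 so it becomes most recently used
        cache.put("user:3", "Carol"); // evicts user:2, the least recently used
        System.out.println(cache.keySet()); // [user:1, user:3]
    }
}
```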

2. Why: Why use caching?

The purpose of using cache is obviously to improve system performance (high performance, high concurrency).

What are the advantages and disadvantages of using cache?

Advantages

  • Optimize performance and shorten response time
  • Reduce stress and avoid server overload
  • Save bandwidth and alleviate network bottlenecks

Disadvantages

  • Consumes extra space
  • May introduce data consistency issues
3. How: How to use cache?

1) How to improve resource access speed?
Place resources closer to users or on systems that can be accessed more quickly.

2) Where can cache be used?

  • Client cache (browser cache): the cache point closest to the user; it stores network resources on the user's own device, which is the most cost-effective. Generally used to cache images, JS, CSS, etc., and can be controlled through the Expires and Cache-Control headers in the response.
  • Server cache (CDN cache): stores static resources such as HTML, CSS, and JS, and diverts traffic to reduce the load on the origin server.
  • Server cache (reverse proxy cache): separates dynamic and static resources; generally, static resources are cached while dynamic requests are forwarded to the application server for processing.
  • Server cache (local cache): a memory cache offers fast access and suits caching small amounts of data; a disk cache writes data to files, which is still faster than fetching it over the network and suits caching larger amounts of data.
  • Server cache (distributed cache): a necessary architectural element in large-scale website architecture; caches hotspot data to reduce database pressure.

3) Which types of cache are more costly to introduce, and why?
Local caches and distributed caches have a higher introduction cost, because the resources in these two types of cache are business-related and must be computed by business logic, so the requirements for consistency between the cache and the source data are higher.

4) 1 core indicator: cache hit rate.
The higher the cache hit rate, the better the performance. The calculation formula is: cache hit rate = number of hits/(number of hits + number of misses).

How to improve cache hit rate? Common strategies are as follows:

  • Cache duration: Under the same conditions, the longer the cache duration, the higher the cache hit rate.
  • Cache update: When data changes, directly updating the cache value has a higher hit rate than removing the cache.
  • Cache capacity: The larger the cache capacity, the more cached data and the higher the cache hit rate.
  • Cache granularity: The smaller the cache granularity, the smaller the data changes, and the higher the cache hit rate (reduces the risk of large keys)
  • Cache preheating: hotspot data is cached in advance to improve cache hit rate

Summary in one sentence: how do you improve the cache hit rate? Let the data reside in the cache for as long as possible.

5) 1 core issue: cache consistency.
Cache consistency refers to consistency between the cache and the source data. Once cache consistency must be guaranteed, things become complicated.

How to achieve cache consistency? The commonly used caching strategies are as follows.
In a Cache/DB architecture, the caching strategy defines how data is read from and written to the Cache and the DB.

1. Expiration cache mode: Cache Expiry Pattern

  • Features: The simplest way to achieve cache consistency, set the expiration time for the cache to achieve eventual consistency
  • Disadvantages: Need to tolerate data inconsistencies in the set expiration time

2. Cache Aside Pattern: Cache Aside Pattern

  • Read: Cache Hit, directly returns cached data, Cache Miss, loads data from DB to cache and returns
  • Write: write to DB first, then delete the corresponding data in Cache
  • Disadvantages: this mode may cause double-write inconsistency between the cache and the database; delayed double deletion can be used to minimize this inconsistency (a minimal sketch of the pattern follows this list)
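
A minimal sketch of the Cache Aside read and write paths follows; the UserService class uses two in-memory maps as stand-ins for the real cache and database, purely for illustration:

```java
import java.util.concurrent.ConcurrentHashMap;

// Cache Aside: read-through on miss; write the DB first, then invalidate the cache entry.
public class UserService {
    // Stand-ins for a real cache (e.g. Redis) and a real database; both are illustrative.
    private final ConcurrentHashMap<Long, String> cache = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<Long, String> database = new ConcurrentHashMap<>();

    public String readUser(long id) {
        String cached = cache.get(id);
        if (cached != null) {
            return cached;                       // Cache Hit: return directly
        }
        String fromDb = database.get(id);        // Cache Miss: load from DB
        if (fromDb != null) {
            cache.put(id, fromDb);               // populate the cache for later reads
        }
        return fromDb;
    }

    public void updateUser(long id, String newValue) {
        database.put(id, newValue);              // 1. write the DB first
        cache.remove(id);                        // 2. then delete (not update) the cache entry
    }

    public static void main(String[] args) {
        UserService svc = new UserService();
        svc.updateUser(1L, "Alice");
        System.out.println(svc.readUser(1L));    // miss -> DB -> cache
        System.out.println(svc.readUser(1L));    // hit
    }
}
```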

3. Asynchronous writing mode: Write Behind Pattern

  • Read: Cache Hit, directly returns cached data, Cache Miss, directly returns empty
  • Write: write to the DB first; the newly written data is delivered from the DB to an MQ, and an asynchronous process then consumes the MQ and finally writes the data into the Cache.

Summary in one sentence: how do you achieve cache consistency? Make sure every read operation gets the data from the latest write operation.

4. Summary

One sentence summary: Caching is the king of dealing with high concurrency (caching is king) .

2.5 Data storage design

1. What: What is data storage?

Data storage usually refers to data being recorded in some format on a computer internal or external storage medium.

Common storage media include tapes, disks, etc. How data is stored varies with the storage medium: data on tape can only be accessed sequentially, while data on disk can be accessed sequentially or directly, depending on usage requirements. The data storage method is closely related to how data files are organized; the key is to establish the correspondence between the logical and physical order of records and to determine storage addresses so as to improve data access speed.

Common data storage management systems include: databases (MySQL), search engines (Elasticsearch), cache systems (Redis), message queues (Kafka), etc. These are also the focus of our next discussion.

2. Why: Why is data storage design important?

In the Internet era, when system concurrency reaches a certain level, data storage often becomes the performance bottleneck. If you do not design it well at the beginning, you will run into difficulties later with horizontal scaling and with splitting databases and tables.

Why is the performance bottleneck usually the data storage rather than the application services?
Because application services are basically stateless and can easily be scaled horizontally, achieving high performance for application services is relatively simple. High performance for data storage is relatively more complicated, because data is stateful.

3. How: How to design data storage?

Common solutions for solving storage high performance include the following. Most of the industry is built around these, or makes related derivatives and extensions.

1) Separation of reading and writing

Internet systems tend to read more and write less, so the first step in performance optimization is to separate reading and writing.

Read and write separation is an optimization method that separates read operations from write operations. We can use this technology to solve the performance bottleneck problem of data storage.

Currently, the popular read-write separation solution in the industry is usually based on a master-slave architecture, with a data access proxy layer introduced to separate read and write operations. There are two specific approaches:

Read-write separation through an independent proxy
The advantage of introducing a data access proxy is that the application can achieve read-write separation without any changes. The disadvantages are that the extra middleware layer acting as a relay reduces performance, the data access proxy itself can easily become a performance bottleneck, and there is a certain maintenance cost. Typical products include MyCAT and the Alibaba Cloud RDS database proxy.


Read-write separation through an embedded SDK
The other approach moves the data access proxy layer into the application and integrates it through an SDK, which avoids the performance loss and high maintenance cost of a separate layer. However, this approach places requirements on the development language and therefore has applicability limitations. Typical products include ShardingSphere.
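
As a rough illustration of the embedded (SDK-style) approach, the Java sketch below routes statements to a primary or a replica by statement type; the class and addresses are hypothetical, and a real implementation would also have to deal with transactions and replication lag:

```java
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Embedded-SDK style routing: writes go to the primary, reads go to a random replica.
public class ReadWriteRouter {
    private final String primary;
    private final List<String> replicas;

    public ReadWriteRouter(String primary, List<String> replicas) {
        this.primary = primary;
        this.replicas = List.copyOf(replicas);
    }

    // Route by statement type: SELECT -> a replica, everything else -> the primary.
    public String route(String sql) {
        String head = sql.trim().toLowerCase();
        if (head.startsWith("select")) {
            return replicas.get(ThreadLocalRandom.current().nextInt(replicas.size()));
        }
        return primary;   // insert / update / delete / DDL all go to the master
    }

    public static void main(String[] args) {
        ReadWriteRouter router = new ReadWriteRouter(
                "master-db:3306", List.of("replica-1:3306", "replica-2:3306"));
        System.out.println(router.route("SELECT * FROM orders WHERE id = 1"));
        System.out.println(router.route("UPDATE orders SET status = 'PAID' WHERE id = 1"));
    }
}
```

An embedded SDK such as ShardingSphere performs this kind of routing inside the data access layer; note that statements inside a transaction, and reads that cannot tolerate replication lag, should still be sent to the primary.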

2) Data partition

"Partitioning" refers to the process of physically dividing data into separate pieces of data for storage. Split data into partitions that can be managed and accessed independently. Partitioning can improve scalability, reduce contention, and optimize performance. Additionally, it provides a mechanism to segment data by usage patterns.

Why partition the data?

  • Improve scalability . Scaling up a single database system will eventually reach the limits of the physical hardware. If data is split across multiple partitions, each partition is hosted on a separate server, allowing the system to scale out almost infinitely.
  • Improve performance . Data access operations on each partition are performed through smaller data volumes. When done correctly, partitioning can improve the efficiency of your system.
  • Provide operational flexibility . Using partitions can optimize operations, maximize management efficiency, and reduce costs in many ways.
  • Improve usability . Isolating data across multiple servers avoids single points of failure. If an instance fails, only the data in that partition is unavailable. Operations on other partitions can continue.

How to design partitions?

Three typical strategies for data partitioning:

  • Horizontal partitioning (i.e. sharding) . In this strategy, each partition is an independent data store, but all partitions have the same schema. Each partition, called a shard, holds a specific subset of data, such as all orders for a specific set of customers. The most important factor is the choice of shard key. Sharding spreads load across multiple machines, reducing contention and improving performance.
  • Vertical partitioning . In this strategy, each partition holds a subset of the item's fields in the data store. The fields have been segmented based on their usage patterns. For example, place frequently accessed fields in one vertical partition and less frequently accessed fields in another vertical partition. The most common use of vertical partitioning is to reduce the I/O and performance costs associated with fetching frequently accessed items.
  • Functional partition . In this strategy, data has been aggregated based on how it is used by each bounded context in the system. For example, an e-commerce system might store invoice data in one partition and product inventory data in another partition. Improve isolation and data access performance through functional partitioning.

3) Sub-database and sub-table

Everyone should be familiar with the concept of splitting databases and tables, which can be broken down into two methods, database splitting and table splitting:

  • Table splitting: refers to splitting the data in one table into multiple tables according to certain rules to reduce the size of table data and improve query efficiency.
  • Database splitting: refers to splitting the data in one database into multiple databases according to certain rules to reduce the pressure on a single server and improve read and write performance (such as: CPU, memory, disk, IO).

Two typical solutions for sharding databases and tables:

Vertical splitting

  • Vertical table splitting: split a large table into small tables by splitting different "fields" of one table into multiple tables. For example, a product library splits basic product information, product inventory, seller information, etc. into different tables.
  • Vertical database splitting: split different business domains of a system into multiple business databases, for example a product library, an order library, a user library, etc.

Horizontal splitting

  • Horizontal table splitting: split the data into multiple tables along some dimension. Since the multiple tables still belong to one database, this reduces lock granularity and improves query performance to a certain extent, but an IO performance bottleneck remains.
  • Horizontal database splitting: split the data into multiple databases along some dimension, reducing the pressure on a single machine and a single database (CPU, memory, disk, IO) and improving read and write performance.

Common horizontal splitting methods

  • Range-based sharding: split by ranges of the sharding key, for example splitting databases and tables by time range.
  • Hash-based sharding: apply a hash and modulo to the sharding key, for example sharding databases and tables by user ID (a minimal sketch follows this list).
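
A minimal sketch of hash-based sharding by user ID follows; the shard counts and the naming convention are illustrative assumptions:

```java
public class ShardRouter {
    private static final int DB_COUNT = 4;                    // physical databases
    private static final int TABLE_COUNT = 8;                 // tables per database
    private static final int SLOTS = DB_COUNT * TABLE_COUNT;  // 32 shards in total

    // Map the sharding key (user ID) to one of the 32 shards, then to a db/table pair.
    public static String route(long userId) {
        int slot = (int) Math.floorMod(userId, (long) SLOTS);
        int db = slot / TABLE_COUNT;
        int table = slot % TABLE_COUNT;
        return String.format("order_db_%02d.order_tab_%02d", db, table);
    }

    public static void main(String[] args) {
        System.out.println(route(10001L)); // the same user ID always routes to the same shard
        System.out.println(route(10002L));
        System.out.println(route(98765L));
    }
}
```

Note that with this simple modulo scheme, changing DB_COUNT or TABLE_COUNT later requires migrating data, which is why shard counts are usually chosen generously up front.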

For more information about splitting databases and tables, see the article Development and Design Practice: Sub-database and Sub-table Implementation Plan, which will not be repeated here.

4) Separation of hot and cold

Hot-cold separation means storing historical cold data and current hot data separately. The cold store keeps only data that has reached its final state, while the hot store also keeps data whose fields may still be modified. This reduces the storage volume of current hot data and improves performance.

How do we determine whether data is cold or hot? In other words, under what circumstances can hot-cold separation be used?

  • Time dimension : Users can accept new and old data to be queried separately. For example, for order data, we can use the data three months ago as cold data and the data within three months as hot data.
  • State dimension : After the data reaches the final state, there is only reading and no writing requirements. For example, for order data, we can use completed orders as cold data and others as hot data.
4. Summary

In one sentence: through splitting, the reading and writing pressure is dispersed, and the storage pressure is dispersed, thereby improving performance .

4. Finally

While pursuing high system performance, do not ignore the cost factor, because high performance often means high cost.

Therefore, when designing high-performance systems, special attention must be paid to minimizing costs and maximizing benefits.

Finally, as a technician, we should have a technical pursuit: learn to do more work with the same resources.

Origin blog.csdn.net/icansoicrazy/article/details/133623447