High Concurrency and High Availability Architecture Practice - Design Concept (1)


Original blog address: https://blog.csdn.net/taotoxht/article/details/46931045

1. Design Concept

1.      Trading space for time

1)      Multi-level caching and static content

Client-side page cache (HTTP headers such as Expires, Cache-Control, Last-Modified, and ETag; on a 304 the server returns no body, and the client keeps using its cached copy, reducing traffic)

Reverse proxy cache

Application-side cache (e.g., memcached)

In-memory database

Buffering and caching mechanisms (in databases, middleware, etc.)
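The ETag/304 flow described above can be sketched server-side. This is a minimal illustration, not a real framework handler; the function name and return shape are made up for the example:

```python
import hashlib

def handle_request(body, if_none_match):
    """Illustrative ETag handling: returns (status, headers, body).

    When the client's If-None-Match matches the current ETag, the server
    answers 304 with no body, so the client reuses its cache and saves traffic.
    """
    etag = '"%s"' % hashlib.sha1(body).hexdigest()
    headers = {"ETag": etag, "Cache-Control": "max-age=60"}
    if if_none_match == etag:
        return 304, headers, b""   # cache still valid: no body sent
    return 200, headers, body      # full response; client caches it
```

A first request returns 200 with the body; a revalidation carrying the returned ETag gets a 304 and an empty body.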

2)      Index

Hash, B-tree, inverted index, bitmap

A hash index combines the constant-time addressing of an array with the cheap insertion of a linked list, enabling fast data access.

B-tree indexes suit query-oriented scenarios: their high fan-out keeps the tree shallow, avoiding multiple disk IOs and improving query efficiency.

An inverted index maps terms to the documents that contain them; it is the most efficient structure for this mapping and is widely used in search.

A bitmap is a very simple and fast data structure that optimizes storage space and speed at the same time (no space-for-time trade-off needed) and suits computations over massive data sets.
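As a toy illustration of the inverted index described above, assuming whitespace tokenization and in-memory documents:

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search(index, *words):
    """Documents containing ALL the given words (AND query)."""
    sets = [index.get(w.lower(), set()) for w in words]
    return set.intersection(*sets) if sets else set()
```

A search engine walks the (usually sorted, compressed) posting lists instead of Python sets, but the term-to-documents mapping is the same idea.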

 

2.      Parallel and Distributed Computing

 

1)      Task splitting, divide and conquer (MR): based on data decomposition

 

Large-scale data exhibits locality; the principle of locality allows a massive computation to be divided and conquered.

 

The MR model is a shared-nothing architecture: the data set is distributed across the nodes. During processing, each node reads and processes its locally stored data (map); the intermediate results are then combined, shuffled, sorted, and distributed to the reduce nodes. This avoids transmitting large volumes of data and improves processing efficiency.
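A minimal in-process sketch of the map/shuffle/reduce phases, using word count as the classic example (a real MR job runs each phase on separate nodes; here the "chunks" stand in for locally stored data):

```python
from itertools import chain

def map_phase(chunk):
    # each node emits (key, value) pairs from its local chunk
    return [(word, 1) for word in chunk.split()]

def shuffle(pairs):
    # group pairs by key, as the shuffle/sort step does between map and reduce
    groups = {}
    for k, v in pairs:
        groups.setdefault(k, []).append(v)
    return groups

def reduce_phase(groups):
    # each reduce node aggregates the values for its keys
    return {k: sum(vs) for k, vs in groups.items()}

chunks = ["a b a", "b c"]                       # two nodes' local data
mapped = chain.from_iterable(map_phase(c) for c in chunks)
result = reduce_phase(shuffle(mapped))
```

Only the shuffled (key, value) pairs cross node boundaries, which is why the model avoids moving the raw data.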

 

2)      Multi-process and multi-thread parallel execution (MPP): based on problem decomposition

 

Parallel computing uses multiple computing resources to work on a problem simultaneously and is an effective way to increase a system's computing speed and processing capacity. The basic idea is to have multiple processors/processes/threads cooperate on the same problem: decompose it into several parts and compute each part in parallel on an independent processor.

The difference from MR is that this approach is based on problem decomposition, not data decomposition.
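Problem decomposition can be sketched by splitting one computation into independent parts, here a range sum, with a thread pool standing in for the independent processors of a real MPP system:

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum(lo, hi):
    # one worker solves one part of the decomposed problem
    return sum(range(lo, hi))

def parallel_sum(n, workers=4):
    """Decompose sum(0..n-1) into `workers` independent parts, then combine."""
    step = (n + workers - 1) // workers
    ranges = [(i, min(i + step, n)) for i in range(0, n, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda r: partial_sum(*r), ranges)
    return sum(parts)                # combine the partial results
```

Note that the same input data could sit on every node; it is the problem, not the data, that has been cut up.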

3.      Availability in multiple dimensions

1)      Load balancing, disaster recovery, backup

As platform concurrency grows, nodes must be added to form a cluster, with a load balancer distributing requests. Load-balancing equipment usually provides failure detection as well. To improve availability further, disaster-recovery backups are needed to guard against unavailability caused by node failures; backups can be online or offline, and different backup strategies can be chosen for different failure requirements.
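The distribute-plus-detect idea can be illustrated with a toy round-robin balancer. All names are illustrative, and a real balancer would mark nodes down via automatic health probes rather than an explicit call:

```python
from itertools import cycle

class RoundRobinBalancer:
    """Toy round-robin load balancer that skips nodes marked as failed."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.down = set()
        self._ring = cycle(self.nodes)

    def mark_down(self, node):
        self.down.add(node)      # in practice set by a health-check probe

    def mark_up(self, node):
        self.down.discard(node)

    def pick(self):
        # try each node at most once per call, skipping failed ones
        for _ in range(len(self.nodes)):
            node = next(self._ring)
            if node not in self.down:
                return node
        raise RuntimeError("no healthy nodes")
```

With node "b" marked down, requests keep flowing to "a" and "c" until "b" recovers.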

2)      Read/write separation

Read/write separation applies to the database. As system concurrency grows, separating writes from reads is an important way to improve the availability of data access. Of course, read/write separation raises data-consistency issues; in the CAP trade-off of distributed systems, availability is usually favored here.

3)      Dependencies

Modules in the platform should be as loosely coupled as possible, interacting through messaging components and made asynchronous wherever possible. The main and secondary flows of data should be clearly distinguished, with the secondary flows made asynchronous; for example, writing logs can be an asynchronous operation, which increases the availability of the whole system.

Of course, asynchronous processing often requires a confirmation mechanism (confirm/ack) to ensure that data has been received or processed.

However, in some scenarios a request has actually been processed but the confirmation is lost for other reasons (such as an unstable network). The request must then be retransmitted, so the request-handling design must account for retransmission and be idempotent.
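One common way to obtain idempotency under retransmission is to key each request with a unique id and record its outcome. A minimal sketch with made-up names and an in-memory store (production would use a shared store with expiry):

```python
_processed = {}  # request_id -> recorded result

def handle_payment(request_id, amount, balance):
    """Idempotent handler sketch: a retransmitted request replays the
    recorded outcome instead of applying the effect twice."""
    if request_id in _processed:
        return _processed[request_id]   # duplicate: replay old outcome
    balance["value"] -= amount          # apply the effect exactly once
    result = {"ok": True, "balance": balance["value"]}
    _processed[request_id] = result     # remember before acking
    return result
```

If the ack for `req-1` is lost and the client resends it, the balance is debited only once and the client still receives the same answer.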

4)      Monitoring

Monitoring is also an important means of improving the availability of the whole platform. The platform is monitored in multiple dimensions, and modules are made transparent at runtime, enabling white-box observation of the running system.

4.      Scalability

1)      Splitting

Splitting covers both splitting the business and splitting the database.

System resources are always limited. If a long business operation executes as one uninterrupted run, then under heavy concurrency this blocking style cannot release resources to other processes in time, and system throughput stays low.

The business therefore needs to be divided into logical stages and executed in an asynchronous, non-blocking way to raise system throughput.

As data volume and concurrency grow, read/write separation alone can no longer meet the system's concurrency requirements, and the data itself must be partitioned, both across databases and across tables. This sharding approach requires extra routing logic for locating the data.
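The routing logic for sharding can be as simple as a stable hash over the sharding key. A sketch with illustrative shard counts and naming:

```python
import zlib

def route(user_id, num_dbs=2, tables_per_db=4):
    """Return the shard ("db_X.user_Y") holding this user's row.

    crc32 is used because it is stable across processes and restarts,
    unlike Python's built-in hash() of strings.
    """
    h = zlib.crc32(str(user_id).encode())
    db = h % num_dbs                        # which database
    table = (h // num_dbs) % tables_per_db  # which table within it
    return "db_%d.user_%d" % (db, table)
```

Modulo routing like this is the simplest scheme; it makes resharding expensive, which is why consistent hashing or a lookup table is often preferred once shard counts change.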

2)      Statelessness

For system scalability, modules are best kept stateless; overall throughput can then be increased simply by adding nodes.

5.      Optimizing resource usage

1)      System capacity is limited

System capacity is limited, and so is the concurrency it can absorb. The architecture must therefore include flow control to keep unexpected attacks or sudden bursts of concurrent traffic from crashing the system. Flow-control measures can queue requests; traffic beyond the expected range can trigger alerts or be dropped.
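A token bucket is one common flow-control primitive matching this description: it absorbs a bounded burst and rejects (or queues, or alerts on) the excess. A minimal sketch:

```python
import time

class TokenBucket:
    """Minimal token-bucket rate limiter: allow() returns False for
    requests beyond the configured rate, which the caller may queue,
    alert on, or drop."""

    def __init__(self, rate, capacity):
        self.rate = rate              # tokens refilled per second
        self.capacity = capacity      # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

With `rate=1, capacity=2`, a burst of two requests passes and a third immediate request is rejected until a token refills.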

2)      Atomic operations and concurrency control

Access to shared resources must be subject to concurrency control to prevent conflicts, and some transactions need transactional guarantees of consistency, so the design of a trading system must take atomic operations and concurrency control into account.

Common high-performance techniques for concurrency control include optimistic locking, latches, mutexes, copy-on-write, and CAS. Multi-version concurrency control (MVCC) is usually an important means of guaranteeing consistency and is frequently used in database design.
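The optimistic-locking/CAS pattern can be sketched as read-version, compute, commit-only-if-unchanged, retrying on conflict. Here a small lock guards only the commit itself, standing in for a hardware compare-and-swap:

```python
import threading

class VersionedValue:
    """Optimistic concurrency sketch: writers read (value, version),
    compute, then commit only if the version is unchanged."""

    def __init__(self, value):
        self.value, self.version = value, 0
        self._lock = threading.Lock()   # guards only the commit step

    def read(self):
        return self.value, self.version

    def compare_and_set(self, expected_version, new_value):
        with self._lock:
            if self.version != expected_version:
                return False            # another writer won: caller retries
            self.value, self.version = new_value, self.version + 1
            return True

def optimistic_increment(cell):
    while True:                         # retry loop typical of optimistic locking
        value, version = cell.read()
        if cell.compare_and_set(version, value + 1):
            return
```

No writer ever blocks while computing; contention only costs retries, which is why the approach performs well when conflicts are rare.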

3)      Different strategies for different kinds of logic

Business logic on the platform comes in different types: some is compute-intensive, some IO-intensive; and even within one type, different pieces of business logic consume different amounts of resources. Different logic therefore calls for different strategies.

For IO-intensive logic, an event-driven, asynchronous non-blocking approach can be used. A single-threaded model avoids the overhead of thread switching; alternatively, with multiple threads, spinning can reduce thread switches (as in Oracle's latch design). For compute-intensive logic, make full use of multiple threads.
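The event-driven, single-threaded approach for IO-bound work can be illustrated with asyncio: one thread overlaps many IO waits with no thread switching. The `sleep` below stands in for a remote call:

```python
import asyncio

async def fetch(source, delay):
    # stand-in for an IO-bound call; the event loop runs other
    # tasks while this one is waiting
    await asyncio.sleep(delay)
    return source

async def gather_all():
    # three IO waits overlap on a single thread
    return await asyncio.gather(fetch("a", 0.05),
                                fetch("b", 0.05),
                                fetch("c", 0.05))

results = asyncio.run(gather_all())
```

The three 50 ms waits complete in roughly 50 ms total, not 150 ms, because the event loop interleaves them rather than switching threads.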

For calls of the same type, allocate resources appropriately to each business: set different numbers of compute nodes or threads, divide the traffic, and execute higher-priority business first.

4)      Fault tolerance and isolation

When certain business modules fail, to limit the impact on normal requests under concurrency, it is sometimes necessary to handle the failing requests through a separate channel, or even to temporarily and automatically disable the failing modules.

Some request failures may be occasional and transient (for example, an unstable network), so request retries need to be considered.
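Transient failures are usually handled with a bounded retry plus exponential backoff. A sketch, safe only when the callee is idempotent:

```python
import time

def retry(func, attempts=3, base_delay=0.01):
    """Call func, retrying transient failures with exponential backoff."""
    for i in range(attempts):
        try:
            return func()
        except Exception:
            if i == attempts - 1:
                raise                       # give up after the last attempt
            time.sleep(base_delay * (2 ** i))  # back off: 10ms, 20ms, ...
```

Catching bare `Exception` is for brevity; real code retries only errors known to be transient (timeouts, connection resets), never business-logic failures.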

5)      Resource release

System resources are limited. Whenever a resource is used, it must be released at the end, whether the request takes the normal path or an error path, so that resources are reclaimed promptly for other requests to use.

When designing a communication architecture, timeout control usually also needs to be considered.
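Releasing on every path, success or failure, is exactly what a try/finally or context manager guarantees. A sketch with a toy connection pool (timeouts would be configured on the resource itself, e.g. a socket timeout, before use):

```python
from contextlib import contextmanager

@contextmanager
def acquired(resource_pool):
    """Take a resource from the pool and return it no matter how
    the request ends."""
    res = resource_pool.pop()
    try:
        yield res
    finally:
        resource_pool.append(res)   # released on normal AND error paths
```

Even if the request body raises, the `finally` clause puts the connection back, so a failing request cannot leak the resource.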

 
