[Project] Flash-Sale Mall: How to Do Traffic Peak Shaving

If you have looked at the monitoring chart of a flash-sale system, you will notice that the curve at the opening second is almost a vertical line. That is because flash-sale requests are highly concentrated at one particular instant, which produces an extremely high traffic peak whose resource consumption is instantaneous.

In a flash-sale scenario, however, the number of people who can ultimately grab the product is fixed: whether 100 people or 10,000 people send requests, the outcome is the same. The higher the concurrency, the more invalid requests there are.

From the business side, a flash sale wants as many participants as possible: before it starts, the more people refreshing the page the better. But once the actual ordering begins, more requests are not better. We can therefore design rules that delay some of the concurrent requests, and even filter out some invalid requests.

Why peak shaving

Why do flash-sale systems usually need to shave the traffic peak? Put differently, what harm does the peak cause?

We know that a server's processing resources are fixed: its capacity is the same whether you use it or not. So when there is a peak, the server can barely keep up at the busy moment, yet sits idle with nothing to process the rest of the time. To guarantee quality of service, however, we have to provision resources according to the busy moment, which wastes resources. It is like staggering commute hours and restricting which cars may drive because of the morning and evening rush.

Peak shaving brings two benefits: first, it makes the server's processing smoother; second, it saves on server resource costs.

For the flash-sale scenario, peak shaving essentially means delaying user requests further, in order to reduce and filter out invalid ones, following the principle of "as few requests as possible."

This post describes several practical approaches to traffic peak shaving: queuing, question answering, and layered filtering. All of them are lossless (i.e., they do not drop users' requests).

Peak-shaving approach 1: queuing

The most obvious way to shave a traffic peak is to buffer it with a message queue, turning a direct synchronous call into indirect asynchronous pushing: an intermediate queue absorbs the instantaneous peak at one end, and messages are pushed out smoothly at the other end. Here the message queue is like a "reservoir": it holds back the flood from upstream and cuts the peak flow into the river downstream, thereby relieving the flood.

(Figure: buffering instantaneous traffic with a message queue.)
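A minimal Python sketch of this producer/consumer idea, with an in-process `queue.Queue` standing in for a real message queue; the capacity, drain rate, and `process_order` are all illustrative:

```python
import queue
import threading
import time

# A bounded queue is the "reservoir": it absorbs the instantaneous peak.
# maxsize models the storage limit discussed below.
buffer = queue.Queue(maxsize=10_000)    # illustrative capacity

def accept_request(req):
    """Front end: enqueue instead of calling the order service synchronously."""
    try:
        buffer.put_nowait(req)          # absorb the spike
        return "queued"
    except queue.Full:
        return "rejected"               # the reservoir itself is full

def process_order(req):
    """Placeholder for the real trading logic."""

def order_worker(rate_per_sec=100):     # illustrative drain rate
    """Downstream: drain the queue at a smooth, fixed rate."""
    while True:
        req = buffer.get()
        process_order(req)
        time.sleep(1 / rate_per_sec)    # push messages out evenly

threading.Thread(target=order_worker, daemon=True).start()
```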

However, if the peak lasts a while and reaches the limit of what the message queue can absorb, for example when the backlog of messages hits the machine's maximum storage, then the message queue itself will be crushed. That still protects the downstream system, but it is not much different from simply discarding requests. It is like a flood so large that even a reservoir is of no help.

Besides the message queue, there are many similar ways to queue requests, for example:

  1. Using a thread pool and waiting on a lock is also a common form of queuing
  2. Common in-memory queuing algorithms such as first-in-first-out and first-in-last-out
  3. Serializing requests into a file and then replaying them by reading the file sequentially (e.g., the MySQL binlog-based replication mechanism)

As you can see, these methods share a common feature: they turn a "one-step operation" into a "two-step operation," where the added step acts as a buffer.
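As a minimal illustration of the "one step into two" idea, the sketch below uses a thread pool as the waiting line; the pool size and handler are assumptions:

```python
from concurrent.futures import ThreadPoolExecutor

# Only max_workers requests are processed at once; the rest wait in the
# executor's internal queue. "Handle it now" (one step) becomes
# "submit, then run when a worker frees up" (two steps).
pool = ThreadPoolExecutor(max_workers=20)   # illustrative size

def handle(req):
    # Placeholder for the real processing logic.
    return f"processed {req}"

def accept(req):
    # Step 1: buffer the request in the pool's internal queue.
    future = pool.submit(handle, req)
    # Step 2: the work runs when a worker is free; the caller may wait or return.
    return future
```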

Peak-shaving approach 2: answering questions

The main purpose of adding a question-answering step is to increase the complexity of placing an order, which achieves the following two goals:

The first goal is to prevent some buyers from cheating with flash-sale bots while participating. Back in 2011, when flash sales took off, flash-sale bots were also rampant, defeating the goals of public participation and marketing, so the system added question answering to limit the bots. After the questions were added, ordering times were generally pushed past 2 s, and the share of orders placed by bots dropped significantly. (Figure: the question page for the item.)

The second goal is to genuinely delay requests, which shaves the traffic peak and lets the system better withstand the instantaneous spike. The key effect is to stretch the single ordering peak from about 1 s to 2–10 s, slicing the request peak over time. This time-slicing matters a great deal for the server's concurrent processing: it greatly reduces the pressure. Moreover, because requests now arrive in sequence, by the time later requests arrive there is naturally no inventory left, so they never reach the final ordering step, and the truly concurrent writes become very limited. This design idea is very common today; Alipay's "shake" (咻一咻) and WeChat's "Shake" (摇一摇) used a similar approach.
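A back-of-the-envelope simulation of this effect, assuming (for illustration only) uniform arrivals: stretching the same number of requests from a 1 s window to a 2–10 s window cuts the instantaneous peak by roughly an order of magnitude.

```python
import random
from collections import Counter

def peak_qps(arrival_times, bucket=0.1):
    """Peak rate: max count in any `bucket`-second slot, scaled to req/s."""
    counts = Counter(int(t / bucket) for t in arrival_times)
    return max(counts.values()) / bucket

N = 100_000
# Without answering: every request lands inside the opening 1 s.
spike = [random.uniform(0, 1) for _ in range(N)]
# With answering: the same requests are stretched over roughly 2-10 s.
spread = [random.uniform(2, 10) for _ in range(N)]

print(f"peak without answering: ~{peak_qps(spike):,.0f} req/s")   # ~100,000
print(f"peak with answering:    ~{peak_qps(spread):,.0f} req/s")  # ~12,500
```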

(Figure: the design of flash-sale question answering.)

As shown in the figure above, the question-answering logic of a flash sale is divided into three parts:

  1. The question-bank generation module, which produces the questions and answers. The questions and answers themselves need not be complex; what matters is that a machine cannot compute the result, i.e., that flash-sale bots are prevented from answering (a minimal sketch follows this list)
  2. The question-bank push module, which pushes the questions to the product-detail system and the trading system ahead of the Q&A. The push mainly guarantees that each user request gets a unique question, again to prevent cheating
  3. The question-image generation module, which renders each question as an image and adds noise to it. This, too, prevents machines from answering directly: only a human can understand the question. One more point: because the network is congested during answering, the question images should be pushed to the CDN in advance and pre-warmed; otherwise, when users actually request a question, the image may load slowly and hurt the answering experience
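A minimal sketch of the generation-and-push idea, with an in-memory dict standing in for the store the questions are pushed to; the arithmetic question, the hashing, and all names are illustrative (in the real system the defense comes from rendering the question as a noisy image and pre-warming it on the CDN):

```python
import hashlib
import random
import time

# In-memory stand-in for the store the questions are pushed to; the real push
# goes to the detail and trading systems, and the rendered images to a CDN.
question_store = {}

def generate_question(user_id):
    """Generate a simple, per-user question; per-request uniqueness matters
    more than difficulty."""
    a, b = random.randint(10, 99), random.randint(10, 99)
    question = f"{a} + {b} = ?"
    question_store[user_id] = {
        "answer_hash": hashlib.sha256(str(a + b).encode()).hexdigest(),
        "issued_at": time.time(),   # used later for the "more than 1 s" rule
    }
    return question
```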

The actual answering logic is simple and easy to understand: compare the answer the user submits against the answer recorded for the question; if they match, continue to the next step of the ordering logic; otherwise, fail.

(Figure: the answer-verification flow.)

Note that the verification logic here checks more than the answer itself; it also verifies the user's identity, for example whether the user is logged in, whether the user's Cookie is intact, and whether the user is resubmitting too frequently.

Besides correctness, we can also put limits on when the answer is submitted: for example, require more than 1 s between showing the question and accepting the answer, since a human is very unlikely to answer in under 1 s. This, too, helps block machine answering.
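Putting the pieces together, a minimal verification sketch that continues the generation sketch above (it reads the same `question_store`); the `user` fields and the resubmission cap are assumptions:

```python
import hashlib
import time

MIN_ANSWER_SECONDS = 1.0   # answers faster than this are almost certainly bots

def verify(user, submitted_answer):
    # Identity checks first: logged in, cookie intact, not spamming submits.
    if not user.is_logged_in or not user.cookie_valid:
        return False
    if user.recent_submissions > 5:           # illustrative cap
        return False

    entry = question_store.get(user.id)       # from the generation sketch
    if entry is None:
        return False

    # Time limit: reject answers submitted in under MIN_ANSWER_SECONDS.
    if time.time() - entry["issued_at"] < MIN_ANSWER_SECONDS:
        return False

    # Correctness: compare hashes; on success, proceed to the ordering logic.
    submitted = hashlib.sha256(str(submitted_answer).encode()).hexdigest()
    return submitted == entry["answer_hash"]
```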

Peak-shaving approach 3: layered filtering

The queuing and answering approaches above either cause fewer requests to be sent or buffer the requests once sent. For the flash-sale scenario there is one more method: filter the requests layer by layer, screening out the invalid ones. Layered filtering essentially handles requests with a "funnel"-style design. (Figure: the layered-filtering funnel.)

Suppose a request passes in turn through the CDN, the front-end read system (e.g., product details), the back-end system (e.g., the trading system), and the database. Then:

  • Most data and traffic are served from the user's browser or the CDN; this layer intercepts the bulk of the data reads;
  • At the second layer (the front-end system), data (including strongly consistent data) should be served from cache as much as possible, filtering out some invalid requests;
  • At the third layer, the back-end system, perform a second check on the data, and protect and rate-limit the system, so that the data volume and request count shrink further;
  • Finally, complete the strong-consistency validation at the data layer.

Like a funnel, this filters and reduces the data volume and request volume layer by layer.

The core idea of layered filtering is to filter out as many invalid requests as possible at each layer, so that only valid requests reach the narrow end of the "funnel." To achieve that, the data must be validated in layers.

The basic principles of layered validation are:

  1. Cache the read data of dynamic requests at the Web tier, filtering out invalid data reads;
  2. Do not enforce strong consistency on read data, avoiding the bottleneck that consistency checks create;
  3. Shard write data sensibly by time, filtering out requests that have already expired;
  4. Rate-limit write requests, filtering out requests beyond the system's capacity (see the funnel sketch after this list);
  5. Enforce strong consistency on write data, keeping only the last valid data.
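A minimal in-process sketch of the funnel, with plain dictionaries standing in for the cache and the rate limiter; every name, number, and threshold here is illustrative:

```python
import time

# Illustrative stand-ins; the real layers are the CDN, the front-end cache,
# the back-end rate limiter, and the database.
cached_stock = {"item-1": 100}            # possibly-stale read cache
rate_budget = {"remaining": 50}           # crude load-shedding counter
SALE_START, SALE_END = 0.0, float("inf")  # sale time window

def layered_filter(item_id, now=None):
    """Funnel: cheap, possibly-stale checks first; the strong check comes last."""
    now = time.time() if now is None else now
    # Layers 1-2 (browser/CDN/front-end cache): stale stock data is acceptable.
    if cached_stock.get(item_id, 0) <= 0:
        return "rejected: sold out (cached)"
    # Time-based slicing: drop requests outside the sale window.
    if not (SALE_START <= now <= SALE_END):
        return "rejected: outside sale window"
    # Layer 3 (back end): shed the load beyond what the system can carry.
    if rate_budget["remaining"] <= 0:
        return "rejected: rate limited"
    rate_budget["remaining"] -= 1
    # Layer 4 (database): only here is strong consistency enforced.
    return "forwarded to the atomic stock decrement"
```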

The goals of layered validation are:

In the read systems, minimize the bottleneck caused by consistency checks, but move the checks that do not hurt performance as early as possible: whether the user is eligible for the flash sale, whether the product's status is normal, whether the user answered the question correctly, whether the flash sale has already ended, whether the request is illegitimate, whether the marketing budget (e.g., coupons) is sufficient, and so on. In the write systems, mainly perform consistency checks on the data being written (such as inventory), and finally guarantee the ultimate correctness of the data at the database layer (e.g., inventory must never go negative).
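The classic way to enforce "inventory must never go negative" at the database layer is a conditional update that decrements only while stock remains. A minimal sketch, with SQLite standing in for the real database and illustrative table and column names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (item_id TEXT PRIMARY KEY, count INTEGER)")
conn.execute("INSERT INTO stock VALUES ('item-1', 10)")

def decrement_stock(item_id):
    """Conditional UPDATE: the database itself guarantees count stays >= 0."""
    cur = conn.execute(
        "UPDATE stock SET count = count - 1 WHERE item_id = ? AND count > 0",
        (item_id,),
    )
    conn.commit()
    return cur.rowcount == 1   # 0 rows updated means sold out

print(decrement_stock("item-1"))   # True while stock remains, then False
```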

Summary

This post described how to shave request peaks when a website faces a large traffic surge, and covered three main peak-shaving techniques:

The first is buffering requests with a queue, i.e., controlling when requests are issued;

The second is stretching the time over which requests are issued by means of question answering, exerting control at the point where the issued requests are received, and finally filtering out the requests that do not qualify;

The last is filtering requests layer by layer.

Of these, queue buffering is the most general. It suits scenarios where the call traffic between internal upstream and downstream systems is uneven; because the quality-of-service requirements of internal systems do not allow requests to be dropped arbitrarily, a message queue provides excellent peak shaving and buffering.

Question answering is better suited to flash sales, marketing campaigns, and similar scenarios, where the rate of issuing requests is controlled right at the source. Since invalid requests multiply further down the pipeline, combining it with the layered filtering described above reduces even further the system resources that invalid requests consume.

Layered filtering is very well suited to transactional write requests, such as decrementing inventory or carpooling, where reads need to know whether stock or seats remain. But since inventory and seats change constantly, must the read data be perfectly accurate? Not necessarily: you can let some requests through and enforce strong consistency only at the actual decrement. This both filters out some requests and removes the bottleneck of strongly consistent reads.

Finally, besides technical means, business means can also help shave the peak. For example, when a big promotion opens at midnight and the huge traffic jams the payment system, you can hand out coupons or launch a lottery to divert part of the traffic elsewhere, which likewise buffers the flow.
