Rate limiting implementation (1)

In real business, burst traffic is common. If the company's infrastructure is immature and services cannot scale out and in automatically, a sudden traffic spike can crush a service under the load. Even more frightening, a service crash works like dominoes: one failing service can drag down the business of every team in the company.

To limit the damage, the most common remedies are rate limiting and circuit breaking. This article walks through several widely used methods:

  • Random rejection
  • Counter method
  • Rate limiting based on a sliding time window
  • Leaky bucket algorithm
  • Token bucket algorithm
  • Hystrix

Hystrix is mainly used for circuit breaking, but viewed from another angle it can also be considered a form of rate limiting, so it is discussed alongside the others. All of these methods will be implemented in Go. Since my time is limited, I will spread them over three or four posts. This one implements random rejection, the counter method, and rate limiting based on a sliding time window.

The code uses Go's Gin framework and Swagger. If you are not familiar with them, take a look at these three articles of mine:

  1. Gin source code analysis: https://mp.weixin.qq.com/s/g_MQldFaMmvnhSsCIMJ7xg
  2. A simple Gin implementation: https://mp.weixin.qq.com/s/X9pyPZU63j5FF4SDT4sHew
  3. Integrating Swagger with the Gin framework

The full code is available at https://github.com/shidawuhen/asap/tree/master

The code includes a simple unit test and a benchmark. Run the following commands to see the results:

go test controller/limit/randomReject_test.go

go test -v -test.bench controller/limit/slidewindowsReject*
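If you want to write a similar benchmark yourself, here is an illustrative sketch (not the repo's actual test file) that drives the sliding-window handler directly through Gin's test helpers:

package limit

import (
	"net/http/httptest"
	"testing"

	"github.com/gin-gonic/gin"
)

// BenchmarkSlideWindowsReject is an illustrative benchmark: it calls the
// handler directly with a test context instead of going over the network.
func BenchmarkSlideWindowsReject(b *testing.B) {
	gin.SetMode(gin.ReleaseMode)
	req := httptest.NewRequest("GET", "/limit/slidewindowsreject", nil)
	for i := 0; i < b.N; i++ {
		w := httptest.NewRecorder()
		c, _ := gin.CreateTestContext(w)
		c.Request = req
		SlideWindowsReject(c)
	}
}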

Rate limiting comes in two flavors: distributed and single-machine. Distributed rate limiting treats the cluster as a whole and generally relies on shared distributed tooling such as Redis, while single-machine rate limiting treats each machine as an independent unit. Each has its pros and cons: the distributed approach is more complex and usually requires the machines' clocks to be consistent, whereas the single-machine approach is simple to implement but its thresholds must be re-tuned whenever the service scales out or in.

Random rejection and sliding-window rejection use the single-machine scheme, while the counter method uses the distributed scheme.

Here are the implementations of the three schemes.

Random rejection

Random rejection is the simplest approach, and often the most effective.

A typical service architecture is shown in the figure below: there is a client and a BFF layer that talks to it directly, and many of the company's internal services sit hidden behind the BFF.

[Figure: typical architecture — client, BFF layer, backend services]

Random rejection is usually applied between the client and the BFF layer. When the team owning the BFF sees traffic getting out of control, it can switch on proportional random rejection: part of the business takes a hit, but production keeps running and the backend services stay protected.

The code is implemented as follows:

package limit

import (
   "math/rand"
   "net/http"

   "github.com/gin-gonic/gin"
)

// @Tags limit
// @Summary Random rejection
// @Produce  json
// @Success 200 {string} string "returns ok on success"
// @Failure 502 "returns reject on failure"
// @Router /limit/randomreject [get]
func RandomReject(c *gin.Context) {
   // refuseRate is per mille: 200 means roughly 20% of requests are rejected.
   refuseRate := 200
   if refuseRate != 0 {
      temp := rand.Intn(1000)
      if temp <= refuseRate {
         c.String(http.StatusBadGateway, "reject")
         return
      }
   }
   c.String(http.StatusOK, "ok")
}

Note: refuseRate would normally be read from configuration; its value controls the rejection ratio. Production has seen plenty of traffic spikes, and flipping on the random-rejection switch has worked remarkably well.
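Since refuseRate normally comes from configuration, here is a minimal sketch of one way to read it at runtime; the REFUSE_RATE environment variable and the helper are assumptions for illustration, not part of the repo:

package limit

import (
	"os"
	"strconv"
)

// refuseRateFromEnv is a hypothetical helper: it reads the per-mille
// rejection rate from the REFUSE_RATE environment variable and falls back
// to a default when the variable is unset or out of range.
func refuseRateFromEnv(def int) int {
	v, err := strconv.Atoi(os.Getenv("REFUSE_RATE"))
	if err != nil || v < 0 || v > 1000 {
		return def
	}
	return v
}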

Counter method

The counter method uses the distributed solution, backed by Redis. To debug locally you need Redis installed; set it up as follows:

  1. In redis.conf, change #requirepass foobared to set an auth password (see the snippet below); the modified file must be loaded at startup
  2. Start Redis: redis-server /usr/local/Cellar/redis/6.0.1/.bottle/etc/redis.conf
  3. Log in: redis-cli -h 127.0.0.1 -a 111111
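For step 1, the edit to redis.conf looks like this; the password here is assumed to match the -a flag in the login command above:

# before: auth disabled
#requirepass foobared
# after: require the password used by redis-cli -a
requirepass 111111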

The code is implemented as follows:

package limit

import (
	"asap/aredis"
	"fmt"
	"net/http"
	"time"

	"github.com/gin-gonic/gin"
)

// @Tags limit
// @Summary Counter-based rejection; requests beyond the per-second quota are rejected
// @Produce  json
// @Success 200 {string} string "returns ok on success"
// @Failure 502 "returns reject on failure"
// @Router /limit/countreject [get]
func CountReject(c *gin.Context) {
	// Use the current second as the counter key, so each second gets its own quota.
	currentTime := time.Now().Unix()
	key := fmt.Sprintf("count:%d", currentTime)
	limitCount := 1
	fmt.Println(key)
	trafficCount, _ := aredis.GetRedis(aredis.BASEREDIS).Incr(key)
	if trafficCount == 1 {
		// First hit in this second: set a TTL so stale keys do not pile up forever.
		aredis.GetRedis(aredis.BASEREDIS).Expire(key, 86400)
	}
	if int(trafficCount) > limitCount {
		c.String(http.StatusBadGateway, "reject")
		return
	}
	c.String(http.StatusOK, "ok")
}

Notes:

  1. The counter algorithm needs Redis, which guarantees that the service as a whole handles no more than the specified number of requests per second
  2. The current timestamp (in seconds) is used as the key
  3. The scheme has a flaw: up to twice the threshold can be processed within a one-second span. For example, if the quota is exhausted in the second half of one second and again in the first half of the next, twice the threshold has been processed within 1s. In extreme cases the service can still crash.

That said, this is usually not a big problem, because most traffic is fairly uniform. One scenario does deserve attention, though: if some users are hammering the interface, this method will lock large numbers of normal users out of the system. In that case you can fall back to the random scheme, or better, identify the offending users quickly and block them, which naturally requires solid infrastructure support. Scoping the counter key to each user, as sketched below, is one way to contain the blast radius.
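Here is a minimal sketch of such a per-user key; the userID parameter and the helper itself are illustrative and assume some auth middleware has already identified the caller:

package limit

import (
	"fmt"
	"time"
)

// perUserKey is a hypothetical helper: it scopes the per-second counter to a
// single user, so an abusive caller can only exhaust their own quota.
func perUserKey(userID string) string {
	return fmt.Sprintf("count:%s:%d", userID, time.Now().Unix())
}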

Rate limiting based on a sliding time window

The counter method can let traffic within a one-second span exceed the threshold; a rate limiter based on a sliding time window solves this problem.

[Figure: a 1s window split into smaller buckets, with the window sliding forward in time]

The logic of the sliding time window is simple: divide the unit of time into smaller blocks, and reject requests once the count within any window of unit length exceeds the threshold. For example, split 1s into 10 blocks of 100ms each and set the threshold to 500/s. In the extreme case where the first nine blocks of the first second carry no traffic and the tenth block carries 500 requests, then as time moves on, the first block of the second second still shares a sliding window with that tenth block, so all of its traffic is rejected. Only when time reaches the tenth block of the second second can the system process requests again.

The code is implemented as follows:

package limit

import (
   "container/ring"
   "fmt"
   "net/http"
   "sync"
   "sync/atomic"
   "time"

   "github.com/gin-gonic/gin"
)

var (
   limitCount  int   = 5  // max requests per second
   limitBucket int   = 10 // number of sliding-window buckets
   curCount    int32 = 0  // total request count within the current window
   head        *ring.Ring // circular queue (linked list) of buckets
   mu          sync.Mutex // guards the ring, shared by handlers and the ticker
   printRes    = 0
)

func init() {
   // Initialize the sliding window.
   head = ring.New(limitBucket)
   for i := 0; i < limitBucket; i++ {
      head.Value = 0
      head = head.Next()
   }
   // Start the background ticker.
   go func() {
      // With limitBucket = 10, each second is split into 10 slices of 100ms.
      timer := time.NewTicker(time.Millisecond * time.Duration(1000/limitBucket))
      for range timer.C {
         // Refresh the window on every tick. Moving past head means head's
         // bucket expires, so its count is subtracted from the total and the
         // bucket is reset to 0.
         mu.Lock()
         subCount := int32(0 - head.Value.(int))
         newCount := atomic.AddInt32(&curCount, subCount)

         // Collect the per-bucket counts for debugging output.
         arr := make([]int, limitBucket)
         for i := 0; i < limitBucket; i++ {
            arr[i] = head.Value.(int)
            head = head.Next()
         }
         if printRes == 1 {
            fmt.Println("move subCount,newCount,arr", subCount, newCount, arr)
         }
         head.Value = 0
         head = head.Next()
         mu.Unlock()
      }
   }()
}

// @Tags limit
// @Summary Sliding-window rejection; requests beyond the per-second quota are rejected
// @Produce  json
// @Success 200 {string} string "returns ok on success"
// @Failure 502 "returns reject on failure"
// @Router /limit/slidewindowsreject [get]
func SlideWindowsReject(c *gin.Context) {
   n := atomic.AddInt32(&curCount, 1)
   if n > int32(limitCount) {
      // Over the limit: undo the increment and reject.
      atomic.AddInt32(&curCount, -1)
      c.String(http.StatusBadGateway, "reject")
   } else {
      // The package-level mutex keeps handlers and the ticker goroutine from
      // mutating the ring concurrently.
      mu.Lock()
      pos := head.Prev()
      val := pos.Value.(int)
      val++
      pos.Value = val
      mu.Unlock()
      c.String(http.StatusOK, "ok")
   }
}

Notes:

  1. The algorithm uses Go's container/ring
  2. Every time head advances, the current head bucket has expired: its count must be subtracted from the total and its value reset to 0
  3. Changes to the total count use atomic operations to keep the count accurate
  4. Changes to counts inside the ring are guarded by a lock to prevent inconsistency; benchmark results show the performance is acceptable
  5. The algorithm relies on a background ticker, which is a somewhat hacky choice; a ticker-free variant is sketched after this list
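For comparison, here is a minimal sketch of item 5's ticker-free alternative: each slot remembers which time slice it last counted, and stale slots are reset lazily on the request path. All names here are illustrative, not part of the repo:

package limit

import (
	"sync"
	"time"
)

// lazyWindow is a hypothetical ticker-free sliding window: slots are reset
// lazily when a request arrives, based on how old their time slice is.
type lazyWindow struct {
	mu     sync.Mutex
	counts []int         // requests counted per slot
	slices []int64       // which global time slice each slot last served
	limit  int           // max requests per full window
	size   time.Duration // length of one slice
}

func newLazyWindow(limit, buckets int, window time.Duration) *lazyWindow {
	return &lazyWindow{
		counts: make([]int, buckets),
		slices: make([]int64, buckets),
		limit:  limit,
		size:   window / time.Duration(buckets),
	}
}

// allow reports whether one more request fits into the sliding window.
func (w *lazyWindow) allow() bool {
	w.mu.Lock()
	defer w.mu.Unlock()
	n := int64(len(w.counts))
	cur := time.Now().UnixNano() / int64(w.size) // global index of the current slice
	slot := int(cur % n)
	if w.slices[slot] != cur { // slot last served an older slice: reset it
		w.counts[slot] = 0
		w.slices[slot] = cur
	}
	total := 0
	for i := range w.counts {
		if cur-w.slices[i] < n { // slice still inside the window
			total += w.counts[i]
		}
	}
	if total >= w.limit {
		return false
	}
	w.counts[slot]++
	return true
}

The trade-off is that every request scans all buckets under the lock, which is fine for a handful of buckets but worth benchmarking at larger bucket counts.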

References

  1. https://www.cnblogs.com/xiangxiaolin/p/12386775.html
  2. https://jingyan.baidu.com/article/d5a880ebdbed2113f047cc4e.html — installing the Redis service
  3. https://www.h3399.cn/201906/702263.html — setting up auth
  4. https://blog.csdn.net/Hedon954/article/details/107146301/ — installing Redis on a Mac and configuring a password
  5. https://blog.csdn.net/gx864102252/article/details/102213616 — sliding-window rate limiting in Go
  6. http://docscn.studygolang.com/pkg/container/ring/
  7. Comparison of rate-limiting schemes
  8. https://blog.csdn.net/micl200110041/article/details/82013032 — several ways to implement request rate limiting in Golang
  9. https://blog.csdn.net/u014691098/article/details/105601511 — sliding window for access-frequency limiting
  10. The sliding-window rate-limiting algorithm
  11. https://zhuanlan.zhihu.com/p/85166364

Finally

If you like my articles, you can follow my WeChat public account (Programmer Mala Tang).

Review of previous articles:

  1. Rate limiting implementation (1)
  2. Some thoughts on product managers
  3. Implementing distributed locks with Redis
  4. Tracking down a bug in the Golang source code
  5. How transaction atomicity, consistency, and durability are implemented
  6. How to exercise your memory
  7. The CDN request process explained in detail
  8. Thoughts on programmers' career development
  9. The story of my blog service getting crushed
  10. Common caching techniques
  11. How to integrate with third-party payment providers efficiently
  12. The Gin framework, a concise version
  13. Thoughts on code review
  14. A brief analysis of InnoDB locks and transactions
  15. Markdown editor recommendation: Typora
