Go in Practice | Implementing queued processing of HTTP requests

In high-concurrency scenarios, requests are often queued to reduce pressure on the system. This article describes how to implement such a queue in Go.

1. Sequential processing of HTTP requests

First, let's look at the normal request-processing logic. The client sends a request, the web server receives it, processes it, and finally responds to the client, all in sequence. As shown below: 01-normal request.png

The code is implemented as follows:

package main

import (
	"fmt"
	"net/http"
)

func main() {

	myHandler := MyHandler{}

	http.Handle("/", &myHandler)

	http.ListenAndServe(":8080", nil)
}

type MyHandler struct {

}

func (h *MyHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	w.Write([]byte("Hello Go"))
}

Open http://localhost:8080/ in a browser and the page displays "Hello Go".

This is how requests are usually handled when developing a web system. Next, let's see how to queue requests under high concurrency.

2. Asynchronous processing of HTTP requests: queued processing

Putting HTTP requests on a queue is also called asynchronous processing. The basic idea is to wrap the received request context (i.e., the request and response) together with the processing logic into a work unit and put it on a queue. The work unit then waits for a consumer worker goroutine to process it, and once processing is complete the response is returned to the client. The process is as follows: 02-Queue request processing.png

This implementation has three key elements: the work unit, the queue, and the consumer. Let's look at their responsibilities and implementations one by one.

Work unit

The work unit encapsulates the request's context (request and response), the processing logic for the request, and the state of whether the work unit has been executed.

The processing logic is simply the function that handled the request in the original sequential flow. In an MVC design, it is a specific action in a controller.

In Go, goroutines generally communicate through channels, so the work unit contains a channel. After the work unit has run its processing logic, a message is written to the channel to notify the goroutine handling the request that processing is complete and the response can be returned to the client.
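The completion signal described above can be sketched in isolation. This is a minimal illustration rather than the article's code: the `process` function and its `work` parameter are hypothetical names, and the sketch only shows one goroutine blocking on a channel until another goroutine reports that it has finished.

```go
package main

import "fmt"

// process runs work in a separate goroutine and blocks until that
// goroutine reports completion on the done channel.
func process(work func() string) string {
	done := make(chan struct{})
	var result string

	go func() {
		result = work()
		done <- struct{}{} // tell the waiting goroutine that work is finished
	}()

	<-done // block here until the message arrives
	return result
}

func main() {
	fmt.Println(process(func() string { return "Hello Go" }))
}
```

The send on `done` happens before the receive completes, so reading `result` after the receive is safe without any extra locking.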

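So, the processing logic for an HTTP request looks like this:

func (h *MyHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	// 1. Wrap w and r into a work unit (job).
	// 2. Push the job onto the queue.
	// 3. Wait for the job to finish executing.
	// 4. The request has now been handled.
}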

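Let's look at the concrete implementation of the work unit. Here we define it as a Job struct:


type Job struct {
	DoneChan  chan struct{}
	handleJob func(j *Job) error // the actual processing logic
}

The Job struct has a handleJob field whose type is a function: the logic that processes the request. The DoneChan channel lets the work unit block and wait, and carries the notification that is sent once handleJob has finished.

Now let's look at the behavior of Job: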

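// Execute runs the job's processing logic. The consumer calls it
// after taking the job off the queue.
func (job *Job) Execute() error {
	fmt.Println("job start to execute")
	return job.handleJob(job)
}

// Done is called after Execute to notify the goroutine waiting on this job.
// DoneChan is expected to be buffered so that the send does not block.
func (job *Job) Done() {
	job.DoneChan <- struct{}{}
	close(job.DoneChan)
}

// WaitDone blocks until the work unit has been consumed.
func (job *Job) WaitDone() {
	<-job.DoneChan
}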

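Queue

The queue stores work units and is the link between the goroutine handling the request and the consumer goroutine. Its key elements are a list, a capacity, and the current number of elements, as follows: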

type JobQueue struct {
    mu         sync.Mutex
    noticeChan chan struct{}
    queue      *list.List
    size       int
    capacity   int
}

Its main operations are enqueue, dequeue, and removal, defined as follows:

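// NewJobQueue initializes a queue that holds at most cap jobs (cap must be at least 1).
func NewJobQueue(cap int) *JobQueue {
	return &JobQueue{
		capacity: cap,
		queue:    list.New(),
		// One buffer slot per queued job, so PushJob never blocks on the
		// notification while it holds the mutex.
		noticeChan: make(chan struct{}, cap),
	}
}

// PushJob adds a work unit to the queue.
func (q *JobQueue) PushJob(job *Job) {
	q.mu.Lock()
	defer q.mu.Unlock()

	q.size++
	evicted := false
	if q.size > q.capacity {
		// The queue is full: drop the job that has waited longest.
		q.RemoveLeastJob()
		evicted = true
	}

	q.queue.PushBack(job)

	// An evicted job leaves its pending notification behind, and that
	// notification now stands for the new job. Otherwise notify the consumer.
	if !evicted {
		q.noticeChan <- struct{}{}
	}
}

// PopJob removes and returns the work unit at the head of the queue,
// or nil if the queue is empty.
func (q *JobQueue) PopJob() *Job {
	q.mu.Lock()
	defer q.mu.Unlock()

	if q.size == 0 {
		return nil
	}

	q.size--
	return q.queue.Remove(q.queue.Front()).(*Job)
}

// RemoveLeastJob removes the job that has waited longest. New jobs are pushed
// to the back of the list, so the oldest job is at the front. It is called when
// a new job arrives while the queue is full; the caller must hold q.mu.
func (q *JobQueue) RemoveLeastJob() {
	if front := q.queue.Front(); front != nil {
		abandonJob := q.queue.Remove(front).(*Job)
		abandonJob.Done() // release the request that was waiting on this job
		q.size--
	}
}

// waitJob returns the channel the consumer listens on to learn
// that a new job is ready to be consumed.
func (q *JobQueue) waitJob() <-chan struct{} {
	return q.noticeChan
}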

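Here we mainly explain how the enqueue operation works:

1 First, the queue's element count, size, is incremented.

2 Check whether size exceeds the maximum capacity.

3 If it does, remove the job that has waited longest from the queue. It is treated as having timed out.

4 Append the newly received work unit to the tail of the queue.

5 Write a message to the noticeChan channel to notify the consumer goroutine that there is a job to process.

As the above shows, noticeChan is the link between the queue and the consumer goroutine. Next, let's look at the consumer's implementation.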
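The capacity check in the steps above can be tried out on its own. The following is a simplified sketch under stated assumptions: `boundedQueue`, `push`, and `contents` are hypothetical names, the queue stores strings instead of jobs, it omits the mutex and the notification channel, and it assumes the oldest element sits at the front of the list because new elements are appended to the back.

```go
package main

import (
	"container/list"
	"fmt"
)

// boundedQueue is a single-goroutine illustration of a capped queue.
type boundedQueue struct {
	items    *list.List
	capacity int
}

func newBoundedQueue(capacity int) *boundedQueue {
	return &boundedQueue{items: list.New(), capacity: capacity}
}

// push appends v. If the queue is already full, it first drops the oldest
// element (the front of the list) and returns it with ok set to true.
func (q *boundedQueue) push(v string) (evicted string, ok bool) {
	if q.items.Len() >= q.capacity {
		if front := q.items.Front(); front != nil {
			evicted = q.items.Remove(front).(string)
			ok = true
		}
	}
	q.items.PushBack(v)
	return evicted, ok
}

// contents returns the queued elements from oldest to newest.
func (q *boundedQueue) contents() []string {
	out := make([]string, 0, q.items.Len())
	for e := q.items.Front(); e != nil; e = e.Next() {
		out = append(out, e.Value.(string))
	}
	return out
}

func main() {
	q := newBoundedQueue(2)
	for _, v := range []string{"a", "b", "c"} {
		if old, ok := q.push(v); ok {
			fmt.Println("evicted:", old) // prints "evicted: a" for the third push
		}
	}
	fmt.Println(q.contents()) // prints "[b c]"
}
```

With a capacity of 2, the third push evicts "a", the element that has waited longest, and the queue ends up holding "b" and "c".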

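Consumer goroutine

The consumer goroutine listens on the queue, takes work units from it, and runs each unit's processing logic. In a real application, several consumer goroutines can be started depending on how much load the system can carry. To keep the explanation simple, this article starts only one.

We define a WorkerManager struct that manages the consumer goroutines. WorkerManager has a job queue field, and every consumer goroutine it starts takes its work units from that queue. The code is as follows: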


type WorkerManager struct {
    jobQueue *JobQueue
}

func NewWorkerManager(jobQueue *JobQueue) *WorkerManager {
    return &WorkerManager{
	jobQueue: jobQueue,
    }
}

func (m *WorkerManager) createWorker() error {

	go func() {
		fmt.Println("start the worker success")
		var job *Job

		for {
			select {
			case <-m.jobQueue.waitJob():
				fmt.Println("get a job from job queue")
				job = m.jobQueue.PopJob()
				if job == nil {
					// The queue was empty; keep listening.
					continue
				}

				fmt.Println("start to execute job")
				job.Execute()

				fmt.Println("execute job done")
				job.Done()
			}
		}
	}()

	return nil
}

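As the code shows, createWorker is essentially a for loop that uses select to listen on the queue's noticeChan channel. When it obtains a work unit, it executes the unit's handleJob function. Once that finishes, it calls job.Done() to notify the goroutine that is still waiting on the job. This closes the loop for the whole flow.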

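Complete code

Now let's look at the overall processing flow, shown in the figure below: 03-Overall process.png

Next, we write a test demo. Here we define a FlowControl struct that manages the queue and the worker goroutine. The code is as follows: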

package main

import (
    "container/list"
    "fmt"
    "net/http"
    "sync"
)

func main() {
    flowControl := NewFlowControl()
    myHandler := MyHandler{
	flowControl: flowControl,
    }
    http.Handle("/", &myHandler)

    http.ListenAndServe(":8080", nil)
}

type MyHandler struct {
    flowControl *FlowControl
}

func (h *MyHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	fmt.Println("receive http request")
	job := &Job{
		DoneChan: make(chan struct{}, 1),
		handleJob: func(job *Job) error {
			// The body is plain text, so declare it as such.
			w.Header().Set("Content-Type", "text/plain; charset=utf-8")
			w.Write([]byte("Hello World"))
			return nil
		},
	}

	h.flowControl.CommitJob(job)
	fmt.Println("commit job to job queue success")
	job.WaitDone()
}

type FlowControl struct {
    jobQueue *JobQueue
    wm       *WorkerManager
}

func NewFlowControl() *FlowControl {
    jobQueue := NewJobQueue(10)
    fmt.Println("init job queue success")

    m := NewWorkerManager(jobQueue)
    m.createWorker()
    fmt.Println("init worker success")

    control := &FlowControl{
	jobQueue: jobQueue,
	wm:       m,
    }
    fmt.Println("init flowcontrol success")
    return control
}

func (c *FlowControl) CommitJob(job *Job) {
    c.jobQueue.PushJob(job)
    fmt.Println("commit job success")
}

The complete sample code is available via git: HTTP asynchronous processing

An earlier article covered the priority queue, a more advanced version of this queue that assigns requests to different queues according to their priority. Interested readers can refer to: Go in Practice | From a single queue to a priority queue, an article to help you understand the implementation

Summary

The approach encapsulates the request's context into a work unit, puts it on a queue, and then blocks on a message channel until the consumer has finished executing it. Capping the queue's capacity keeps an excessive number of requests from putting pressure on the system.

Origin juejin.im/post/7121252800469663774