Golang coroutine pool Ants: implementation principles, with detailed diagrams and code

1 Prerequisite knowledge points
1.1 sync.Locker
sync.Locker is the lock interface defined in the Go standard library package sync:

// A Locker represents an object that can be locked and unlocked.
type Locker interface {
    Lock()
    Unlock()
}

Any type that implements the Lock and Unlock methods can be used as a lock. The most common implementation is sync.Mutex from the Go standard library.
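For example, any code written against sync.Locker accepts a sync.Mutex and ants' spin lock interchangeably. A minimal sketch (lockAndRun is a hypothetical helper used only for illustration):

package main

import "sync"

// lockAndRun runs fn while holding any sync.Locker implementation.
func lockAndRun(l sync.Locker, fn func()) {
    l.Lock()
    defer l.Unlock()
    fn()
}

func main() {
    var mu sync.Mutex
    counter := 0
    lockAndRun(&mu, func() { counter++ })
}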

In ants, the author avoids relatively heavyweight locks such as Mutex and instead implements a lightweight spin lock:

The lock implementation principle:

• The lock state is identified by an integer status value: 0 - unlocked; 1 - locked;

• Locking changes 0 to 1, and unlocking changes 1 to 0; both updates go through the atomic package to ensure concurrency safety;

• Locking spins in a for loop with CAS operations, so there is no need for the runtime to park the goroutine;

• Contention back-off is controlled by the variable backoff: each time the lock attempt fails, the goroutine yields its time slice (runtime.Gosched) backoff times; backoff doubles with every failure and is capped at 16.

type spinLock uint32

const maxBackoff = 16

func (sl *spinLock) Lock() {
    backoff := 1
    for !atomic.CompareAndSwapUint32((*uint32)(sl), 0, 1) {
        for i := 0; i < backoff; i++ {
            runtime.Gosched()
        }
        if backoff < maxBackoff {
            backoff <<= 1
        }
    }
}

func (sl *spinLock) Unlock() {
    atomic.StoreUint32((*uint32)(sl), 0)
}
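The pool later constructs this lock through a constructor (the NewPool code in section 2.4.1 calls internal.NewSpinLock()). A minimal sketch of such a constructor, assuming the spinLock type above:

// NewSpinLock wraps the spin lock as a sync.Locker.
func NewSpinLock() sync.Locker {
    return new(spinLock)
}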

1.2 sync.Cond
sync.Cond is a concurrent coordinator provided by the golang standard library, which is used to implement blocking and waking up coroutines under specified conditions.

1.2.1 Data structure and constructor method

type Cond struct {
    noCopy noCopy

    // L is held while observing or changing the condition
    L Locker

    notify  notifyList
    checker copyChecker
}

// NewCond returns a new Cond with Locker l.
func NewCond(l Locker) *Cond {
    return &Cond{L: l}
}

• The member variables noCopy and checker work together to ensure that a Cond is not copied after first use;

• The core variable L, a lock, is used to implement blocking operations;

• The core variable notify, a blocking linked list, stores the number of times the Cond.Wait() method is called, the number of times the goroutine is awakened, a system runtime mutex, and the head and tail nodes of the linked list.

type notifyList struct {
    wait   uint32
    notify uint32
    lock   uintptr // key field of the mutex
    head   unsafe.Pointer
    tail   unsafe.Pointer
}

1.2.2 Cond.Wait

func (c *Cond) Wait() {
    c.checker.check()
    t := runtime_notifyListAdd(&c.notify)
    c.L.Unlock()
    runtime_notifyListWait(&c.notify, t)
    c.L.Lock()
}

• Check whether the Cond has been copied after first use; if so, panic;

• Increment the wait counter of the Cond's blocking list by 1;

• The current goroutine releases the lock, because the runtime is about to park it;

• Package the current goroutine into a node, add it to the Cond's blocking queue, and park the goroutine to suspend it;

• After the goroutine is woken up, it re-acquires the lock.

1.2.3 Cond.Signal

func (c *Cond) Signal() {
    c.checker.check()
    runtime_notifyListNotifyOne(&c.notify)
}

• Check whether the Cond has been copied after first use; if so, panic;

• Increment the notify counter of the Cond's blocking list by 1;

• Traverse the blocking list from the head and wake up the goroutine that has been waiting the longest.

1.2.4 Cond.BroadCast

func (c *Cond) Broadcast() {
    c.checker.check()
    runtime_notifyListNotifyAll(&c.notify)
}

• Check whether the Cond has been copied after first use; if so, panic;

• Set the notify counter equal to the wait counter;

• Wake up all nodes in the blocking list.
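Putting Wait, Signal and Broadcast together, here is a minimal sketch of the typical sync.Cond usage pattern. Note that Wait must be called with the lock held, inside a loop that re-checks the condition, because the condition may change again between the wake-up and the re-acquisition of L:

package main

import (
    "fmt"
    "sync"
)

func main() {
    var (
        mu    sync.Mutex
        cond  = sync.NewCond(&mu)
        ready bool
    )

    go func() {
        mu.Lock()
        ready = true
        mu.Unlock()
        cond.Signal() // wake up one waiting goroutine
    }()

    mu.Lock()
    for !ready {
        cond.Wait() // releases mu while parked, re-acquires it before returning
    }
    mu.Unlock()
    fmt.Println("condition met")
}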

1.3 sync.Pool
sync.Pool is a concurrency-safe object pool in the Go standard library. It is suitable for scenarios where a large number of objects are repeatedly constructed and discarded; it caches objects for reuse, improving performance and reducing GC pressure.
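A minimal sketch of typical sync.Pool usage, reusing bytes.Buffer objects instead of allocating a new one per request (the handle function is purely illustrative):

package main

import (
    "bytes"
    "fmt"
    "sync"
)

var bufPool = sync.Pool{
    New: func() any { return new(bytes.Buffer) },
}

func handle(name string) string {
    buf := bufPool.Get().(*bytes.Buffer)
    defer func() {
        buf.Reset() // reset state before returning the object to the pool
        bufPool.Put(buf)
    }()
    fmt.Fprintf(buf, "hello, %s", name)
    return buf.String()
}

func main() {
    fmt.Println(handle("ants"))
}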

1.3.1 Brief description of gmp principle

g: goroutine;

m: machine, corresponding to a kernel thread;

p: processor; usually the number of p equals the number of CPU cores.

• p is the hub: an m must be bound to a p in order to schedule and run g;

• Each p has a local g queue, and there is also a global g queue; the local queue can be accessed without locking, while the global queue requires a lock;

• Under preemptive scheduling, a g may go back to a run queue because it blocked or used up its time slice, so the same g may end up being executed by different m and p over time.

1.3.2 Data structure

type Pool struct {
    noCopy noCopy

    local     unsafe.Pointer // local fixed-size per-P pool, actual type is [P]poolLocal
    localSize uintptr        // size of the local array

    victim     unsafe.Pointer // local from previous cycle
    victimSize uintptr        // size of victims array

    // New optionally specifies a function to generate
    // a value when Get would otherwise return nil.
    // It may not be changed concurrently with calls to Get.
    New func() any
}

• noCopy anti-copy flag;

• local points to a [P]poolLocal array, where the capacity P is the number of processors (GOMAXPROCS);

• victim holds the previous round's local, kept for one GC cycle after a round of GC;

• New is a user-specified factory function; when Get finds no cached element, this function is called to construct a new one.

type poolLocal struct {
    poolLocalInternal
}

// Local per-P Pool appendix.
type poolLocalInternal struct {
    private any       // Can be used only by the respective P.
    shared  poolChain // Local P can pushHead/popHead; any P can popTail.
}

• poolLocal is the cache data corresponding to a certain P in the Pool;

• poolLocalInternal.private: Corresponds to a private element of a P, no locking is required during operation;

• poolLocalInternal.shared: a shared element list under a certain P; since other Ps may also access it, its operations must be concurrency-safe (the owning P pushes and pops at the head, while any P may pop from the tail).

1.3.3 Core methods

I Pool.pin

func (p *Pool) pin() (*poolLocal, int) {
    pid := runtime_procPin()
    s := runtime_LoadAcquintptr(&p.localSize) // load-acquire
    l := p.local                              // load-consume
    if uintptr(pid) < s {
        return indexLocal(l, pid), pid
    }
    return p.pinSlow()
}

• Inside the pin method, the index of the current P is retrieved through the native method runtime_procPin, and the current goroutine is bound to P, making it temporarily non-preemptible;

• If the local array has not yet been initialized for this P (for example, on the first call), execution falls through to the pinSlow method;

• In pinSlow, Pool.local is initialized and the current Pool is registered in the global allPools array so that it participates in GC-time cleanup;

II Pool.Get


func (p *Pool) Get() any {
    l, pid := p.pin()
    x := l.private
    l.private = nil
    if x == nil {
        x, _ = l.shared.popHead()
        if x == nil {
            x = p.getSlow(pid)
        }
    }
    runtime_procUnpin()
    if x == nil && p.New != nil {
        x = p.New()
    }
    return x
}

• Call the Pool.pin method, bind the current goroutine and P, and obtain the cache data corresponding to the P;

• Try to obtain the private element private of P cache data;

• If the previous step fails, try to get the head element of the shared element list in the P cache data;

• If the previous step fails, go to the Pool.getSlow method and try to get the tail element of the shared element list in other P cache data;

• Also in the Pool.getSlow method, if the previous step fails, try to get the element (victim) from the cache before the last round of gc;

• Call the native method to unbind the current goroutine and P

• If all steps (2)-(5) fail to obtain the value, call the user's factory method to construct the element and return.

III Put

// Put adds x to the pool.
func (p *Pool) Put(x any) {
    if x == nil {
        return
    }
    l, _ := p.pin()
    if l.private == nil {
        l.private = x
    } else {
        l.shared.pushHead(x)
    }
    runtime_procUnpin()
}

• If the element x to be stored is nil, return directly;

• Call Pool.pin to bind the current goroutine and P, and obtain the cache data of P;

• If the private element in P cache data is empty, set x to its private element;

• If branch (3) is not taken, push x onto the head of the shared list in the P's cache data;

• Unbind the current goroutine and P.

1.3.4 Recycling mechanism

Objects stored in the pool will be recycled by the go runtime from time to time. Therefore, the pool has no concept of capacity. Even if a large number of elements are stored, memory leaks will not occur.

The specific recycling timing is executed during gc:

func init() {
    runtime_registerPoolCleanup(poolCleanup)
}

func poolCleanup() {
    for _, p := range oldPools {
        p.victim = nil
        p.victimSize = 0
    }

    for _, p := range allPools {
        p.victim = p.local
        p.victimSize = p.localSize
        p.local = nil
        p.localSize = 0
    }

    oldPools, allPools = allPools, nil
}

• When a Pool first executes Get (or Put), it is registered in the global allPools array inside the first call to pinSlow;

• On every GC, the victim caches of the previous round's oldPools are cleared; the Pools currently in allPools have their local moved to victim and their local cleared;

• Finally, allPools is moved into oldPools, and allPools itself is emptied.

In summary, it can be seen that in at most two rounds of gc, all object resources in the pool will be recycled.
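A small demonstration of this behavior (a sketch; the exact output depends on the runtime, since Get can also miss for scheduling reasons):

package main

import (
    "fmt"
    "runtime"
    "sync"
)

func main() {
    p := sync.Pool{New: func() any { return "fresh" }}

    p.Put("cached")
    runtime.GC()         // first GC: local is moved to victim, still retrievable
    fmt.Println(p.Get()) // usually prints "cached"

    p.Put("cached")
    runtime.GC()
    runtime.GC()         // second GC: the victim cache is dropped as well
    fmt.Println(p.Get()) // usually prints "fresh"
}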


2 Ants

2.1 Basic information
ants source code: https://github.com/panjf2000/ants

2.2 Why use coroutine pool?
• Improve performance: mainly for one class of scenario, namely large batches of lightweight concurrent tasks, where the cost of executing a task is on the same order of magnitude as the cost of creating and destroying a goroutine;

• Concurrency resource control: developers can explicitly bound the global concurrency of the system and the concurrency ceiling of each module;

• Goroutine life-cycle control: observe the current number of running goroutines in real time, and release them all through a single emergency entry point (a basic usage sketch follows this list).
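A minimal usage sketch based on the public API of github.com/panjf2000/ants/v2 (details may vary slightly between versions):

package main

import (
    "fmt"
    "sync"

    "github.com/panjf2000/ants/v2"
)

func main() {
    // Create a pool holding at most 10 goWorkers and release it on exit.
    pool, err := ants.NewPool(10)
    if err != nil {
        panic(err)
    }
    defer pool.Release()

    var wg sync.WaitGroup
    for i := 0; i < 100; i++ {
        i := i
        wg.Add(1)
        // Submit hands the task to an idle goWorker; whether it blocks or
        // errors when the pool is full depends on the pool's options.
        if err := pool.Submit(func() {
            defer wg.Done()
            fmt.Println("task", i)
        }); err != nil {
            wg.Done()
        }
    }
    wg.Wait()
    fmt.Println("running goWorkers:", pool.Running())
}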

2.3 Core data structure

2.3.1 goWorker

type goWorker struct {
    pool        *Pool
    task        chan func()
    recycleTime time.Time
}

goWorker can be understood as a long-running goroutine that is reused rather than destroyed after each task; it repeatedly executes the asynchronous tasks submitted by users. Its core fields include:

• pool: the coroutine pool to which goWorker belongs;

• task: goWorker’s pipeline for receiving asynchronous task packages;

• recycleTime: the time at which the goWorker was returned to the pool, later used to decide whether it has expired.

2.3.2 Pool

type Pool struct {
    capacity      int32
    running       int32
    lock          sync.Locker
    workers       workerArray
    state         int32
    cond          *sync.Cond
    workerCache   sync.Pool
    waiting       int32
    heartbeatDone int32
    stopHeartbeat context.CancelFunc
    options       *Options
}

Pool is the so-called coroutine pool, and its member fields are as follows:

• capacity: the capacity of the pool;

• running: the number of currently running goWorkers;

• lock: self-made spin lock to ensure concurrency safety when fetching goWorker

• workers: goWorker list, the “real coroutine pool”

• state: pool status identifier, 0-open; 1-closed

• cond: concurrent coordinator, used in blocking mode to suspend and wake up coroutines waiting for resources.

• workerCache: an object pool that stores goWorker, used to cache released goworker resources for reuse. The object pool needs to be distinguished from the coroutine pool. The goWorker in the coroutine pool is still alive, and the goWorker that enters the object pool has been destroyed in a strict sense;

• waiting: identifies the number of coroutines in the waiting state;

• heartbeatDone: identifies whether the recycling coroutine is closed;

• stopHeartbeat: controller function used to close the recycling coroutine;

• options: Some customized configurations.

2.3.3 options

type Options struct {
    DisablePurge     bool
    ExpiryDuration   time.Duration
    MaxBlockingTasks int
    Nonblocking      bool
    PanicHandler     func(interface{})
}

A collection of customized parameters for the coroutine pool, including the following configuration items:

• DisablePurge: if true, idle goWorkers are never recycled;

• ExpiryDuration: the recycling interval for idle goWorkers; only effective when DisablePurge is false;

• Nonblocking: whether the pool runs in non-blocking mode; if so, Submit does not wait when no goWorker is available and returns an error directly;

• MaxBlockingTasks: In blocking mode, the maximum number of coroutines waiting for blocking; only valid when Nonblocking is false;

• PanicHandler: processing logic when a panic occurs in a submitted task;
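In practice these fields are set through ants' functional options when constructing the pool. A sketch using option names from the ants v2 API (ants.WithExpiryDuration, ants.WithNonblocking, ants.WithMaxBlockingTasks, ants.WithPanicHandler):

package main

import (
    "fmt"
    "time"

    "github.com/panjf2000/ants/v2"
)

func main() {
    pool, err := ants.NewPool(
        100,
        ants.WithExpiryDuration(30*time.Second), // recycle goWorkers idle for more than 30s
        ants.WithNonblocking(false),             // blocking mode
        ants.WithMaxBlockingTasks(1000),         // at most 1000 goroutines may block on Submit
        ants.WithPanicHandler(func(p interface{}) {
            fmt.Println("task panicked:", p) // called when a submitted task panics
        }),
    )
    if err != nil {
        panic(err)
    }
    defer pool.Release()
}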

2.3.4 workerArray

type workerArray interface {
    len() int
    isEmpty() bool
    insert(worker *goWorker) error
    detach() *goWorker
    retrieveExpiry(duration time.Duration) []*goWorker
    reset()
}

• workerArray is an interface whose implementation includes the stack version and the queue version;

• This interface mainly defines several common APIs as data collections, as well as APIs for recycling expired goWorkers.

Only the goWorker list implemented based on the stack data structure is shown here:

type workerStack struct {
    items  []*goWorker
    expiry []*goWorker
}

func newWorkerStack(size int) *workerStack {
    return &workerStack{
        items: make([]*goWorker, 0, size),
    }
}

• items: stored goWorker list;

• expiry: used to temporarily store the collection of expired goWorkers during a purge.

The following methods are some of the capabilities provided by workerStack as a stack data structure. The core methods are insert and detach, which are used to insert or remove a goWorker from the stack respectively.

func (wq *workerStack) len() int {
    return len(wq.items)
}

func (wq *workerStack) isEmpty() bool {
    return len(wq.items) == 0
}

func (wq *workerStack) insert(worker *goWorker) error {
    wq.items = append(wq.items, worker)
    return nil
}

func (wq *workerStack) detach() *goWorker {
    l := wq.len()
    if l == 0 {
        return nil
    }

    w := wq.items[l-1]
    wq.items[l-1] = nil // avoid memory leaks
    wq.items = wq.items[:l-1]
    return w
}

The following retrieveExpiry method obtains the collection of expired goWorkers from the workerStack. Because goWorkers are pushed onto the stack in increasing order of recycleTime, the binarySearch method can use binary search to quickly find the boundary of the expired set.

func (wq *workerStack) retrieveExpiry(duration time.Duration) []*goWorker {
    n := wq.len()
    if n == 0 {
        return nil
    }

    expiryTime := time.Now().Add(-duration)
    index := wq.binarySearch(0, n-1, expiryTime)

    wq.expiry = wq.expiry[:0]
    if index != -1 {
        wq.expiry = append(wq.expiry, wq.items[:index+1]...)
        m := copy(wq.items, wq.items[index+1:])
        for i := m; i < n; i++ {
            wq.items[i] = nil
        }
        wq.items = wq.items[:m]
    }
    return wq.expiry
}

func (wq *workerStack) binarySearch(l, r int, expiryTime time.Time) int {
    var mid int
    for l <= r {
        mid = (l + r) / 2
        if expiryTime.Before(wq.items[mid].recycleTime) {
            r = mid - 1
        } else {
            l = mid + 1
        }
    }
    return r
}
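A standalone sketch of what this binary search computes (using a hypothetical fakeWorker type rather than ants' internal goWorker): it returns the index of the last worker whose recycleTime is not later than expiryTime, or -1 if no worker has expired.

package main

import (
    "fmt"
    "time"
)

type fakeWorker struct{ recycleTime time.Time }

func binarySearch(items []fakeWorker, expiryTime time.Time) int {
    l, r := 0, len(items)-1
    for l <= r {
        mid := (l + r) / 2
        if expiryTime.Before(items[mid].recycleTime) {
            r = mid - 1
        } else {
            l = mid + 1
        }
    }
    return r
}

func main() {
    base := time.Now()
    items := []fakeWorker{
        {recycleTime: base.Add(-3 * time.Minute)},
        {recycleTime: base.Add(-2 * time.Minute)},
        {recycleTime: base.Add(-1 * time.Minute)},
    }
    // With a 90-second expiry window, only the first two workers have expired.
    fmt.Println(binarySearch(items, base.Add(-90*time.Second))) // prints 1
}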

2.4 Core API
2.4.1 pool constructor method

func NewPool(size int, options ...Option) (*Pool, error) {
    opts := loadOptions(options...)
    // Read the user's options, run pre-checks, fill in defaults, etc. ...

    p := &Pool{
        capacity: int32(size),
        lock:     internal.NewSpinLock(),
        options:  opts,
    }
    p.workerCache.New = func() interface{} {
        return &goWorker{
            pool: p,
            task: make(chan func(), workerChanCap),
        }
    }

    p.workers = newWorkerArray(stackType, 0)
    p.cond = sync.NewCond(p.lock)

    var ctx context.Context
    ctx, p.stopHeartbeat = context.WithCancel(context.Background())
    go p.purgePeriodically(ctx)
    return p, nil
}

• Read the configuration parameters passed by the user and do some pre-processing actions for verification and default assignment;

• Construct the Pool data structure;

• Construct the goWorker object pool workerCache and declare the factory function;

• Construct the goWorker list inside the Pool;

• Construct the concurrent coordinator cond of the Pool;

• Asynchronously start the goroutine (purgePeriodically) that periodically destroys expired goWorkers.

2.4.2 pool submission task

func (p *Pool) Submit(task func()) error {
    var w *goWorker
    if w = p.retrieveWorker(); w == nil {
        return ErrPoolOverload
    }
    w.task <- task
    return nil
}

• Remove an available goWorker from the Pool;

• Add user-submitted task packages to goWorker's channel.

func (p *Pool) retrieveWorker() (w *goWorker) {
    spawnWorker := func() {
        w = p.workerCache.Get().(*goWorker)
        w.run()
    }

    p.lock.Lock()

    w = p.workers.detach()
    if w != nil {
        p.lock.Unlock()
    } else if capacity := p.Cap(); capacity == -1 || capacity > p.Running() {
        p.lock.Unlock()
        spawnWorker()
    } else {
        if p.options.Nonblocking {
            p.lock.Unlock()
            return
        }
    retry:
        if p.options.MaxBlockingTasks != 0 && p.Waiting() >= p.options.MaxBlockingTasks {
            p.lock.Unlock()
            return
        }
        p.addWaiting(1)
        p.cond.Wait() // block and wait for an available worker
        p.addWaiting(-1)

        var nw int
        if nw = p.Running(); nw == 0 { // awakened by the scavenger
            p.lock.Unlock()
            spawnWorker()
            return
        }
        if w = p.workers.detach(); w == nil {
            if nw < p.Cap() {
                p.lock.Unlock()
                spawnWorker()
                return
            }
            goto retry
        }
        p.lock.Unlock()
    }
    return
}

• A closure spawnWorker is declared as a fallback for constructing a new goWorker; internally it actually takes a goWorker from the object pool workerCache and starts it;

• The core logic then takes the lock and tries to pop an available goWorker from the pool to execute the task;

• If the pool capacity exceeds the limit and the pool is in blocking mode, the coroutine will be suspended and blocked based on the concurrent coordinator cond;

• If the pool capacity exceeds the limit and the pool is in non-blocking mode, an error will be thrown directly;

• If the pool capacity does not exceed the limit and no goWorker is obtained, call spawnWorker to construct a new goWorker for task execution.

2.4.3 goWorker running

func (w *goWorker) run() {
    w.pool.addRunning(1)
    go func() {
        defer func() {
            w.pool.addRunning(-1)
            w.pool.workerCache.Put(w)
            if p := recover(); p != nil {
                // panic post-processing
            }
            w.pool.cond.Signal()
        }()

        for f := range w.task {
            if f == nil {
                return
            }
            f()
            if ok := w.pool.revertWorker(w); !ok {
                return
            }
        }
    }()
}

• Loop, blocking on the task channel until a user-submitted task package arrives, then execute it;

• After finishing a task, the goWorker returns itself to the coroutine pool;

• If returning to the pool fails, or a nil task is received, the goWorker exits; its deferred function then puts it back into the object pool workerCache and calls cond.Signal to wake up one blocked waiting goroutine.

2.4.4 Pool recycling coroutine

// revertWorker puts a worker back into free pool, recycling the goroutines.
func (p *Pool) revertWorker(worker *goWorker) bool {
    worker.recycleTime = time.Now()
    p.lock.Lock()
    err := p.workers.insert(worker)
    if err != nil {
        p.lock.Unlock()
        return false
    }

    p.cond.Signal()
    p.lock.Unlock()
    return true
}

The Pool.revertWorker method is used to recycle goWorker back to the coroutine pool:

• Update the goWorker's recycleTime when recycling it; this is later used to decide when to destroy the goWorker;

• After locking, add goWorker back to the coroutine pool;

• Wake up the next blocking waiting coroutine through the coordinator cond and unlock it.

2.4.5 Periodically recycle expired goWorker

func (p *Pool) purgePeriodically(ctx context.Context) {
    heartbeat := time.NewTicker(p.options.ExpiryDuration)
    defer func() {
        heartbeat.Stop()
        atomic.StoreInt32(&p.heartbeatDone, 1)
    }()

    for {
        select {
        case <-heartbeat.C:
        case <-ctx.Done():
            return
        }

        if p.IsClosed() {
            break
        }

        p.lock.Lock()
        expiredWorkers := p.workers.retrieveExpiry(p.options.ExpiryDuration)
        p.lock.Unlock()

        for i := range expiredWorkers {
            expiredWorkers[i].task <- nil
            expiredWorkers[i] = nil
        }

        if p.Running() == 0 || (p.Waiting() > 0 && p.Free() > 0) {
            p.cond.Broadcast()
        }
    }
}

• The purgePeriodically method opens a ticker to poll and destroy expired goWorkers according to the user's preset expiration interval;

• The destruction method is to send a nil value into the goWorker's task channel, which causes the goWorker to exit its loop and put itself back into the object pool workerCache of the coroutine pool;

• If no goWorker is currently running, or there are both idle goWorker slots and blocked waiters, all blocked goroutines are woken up.

Origin blog.csdn.net/u014374009/article/details/133232094