Golang basics: usage and underlying implementation of the low-level concurrency primitives Mutex, RWMutex, Cond, WaitGroup, and Once

The previous article, "Common usage scenarios of native concurrent goroutine channel and select", introduced the CSP-model-based approach to concurrency.

In addition to CSP, Go also provides some lower-level synchronization APIs through the sync package and the atomic package, which are generally used in scenarios with relatively high performance requirements.

According to the author's benchmark, the synchronization mechanism implemented with sync.Mutex can be more than three times faster than the same synchronization done with channels.

[Figure: benchmark results comparing sync.Mutex with channel-based synchronization]
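The figure above comes from the original post. As a rough way to reproduce this kind of comparison yourself, here is a minimal benchmark sketch (the package and benchmark names are illustrative, not from the original) that serializes a counter increment first with a sync.Mutex and then with a 1-buffered channel used as a lock; run it with go test -bench=.:

    package syncbench

    import (
        "sync"
        "testing"
    )

    // BenchmarkMutexCounter increments a counter guarded by a sync.Mutex.
    func BenchmarkMutexCounter(b *testing.B) {
        var mu sync.Mutex
        var n int
        b.RunParallel(func(pb *testing.PB) {
            for pb.Next() {
                mu.Lock()
                n++
                mu.Unlock()
            }
        })
        _ = n
    }

    // BenchmarkChannelCounter serializes the same increment through a
    // 1-buffered channel used as a lock.
    func BenchmarkChannelCounter(b *testing.B) {
        ch := make(chan struct{}, 1)
        var n int
        b.RunParallel(func(pb *testing.PB) {
            for pb.Next() {
                ch <- struct{}{} // acquire
                n++
                <-ch // release
            }
        })
        _ = n
    }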

In sync/mutex.go there is this comment:

// Package sync provides basic synchronization primitives such as mutual
// exclusion locks. Other than the Once and WaitGroup types, most are intended
// for use by low-level library routines. Higher-level synchronization is
// better done via channels and communication.
//
// Values containing the types defined in this package should not be copied.

The gist of the comment: the sync package provides basic low-level synchronization primitives, mostly intended for low-level library code; for higher-level business synchronization, channels are the better choice.

Another point: in day-to-day coding, do not use copies of the sync package's objects. First, a copy is a separate value from the original, so their states diverge; second, if the lock was already held when it was copied, the copy starts out locked too, which is rarely what you expect.

If a lock needs to be used in several places, share it through a global variable or pass a pointer to it (create with &, declare the parameter with *).

Only the goroutine that owns a data object (i.e. the one that received it from a channel) should change that object's state.

Mutex

Mutex and RWMutex work much like their counterparts in Java. Let's focus on the API and the basic implementation.

Usage:

    mu := sync.Mutex{}
    mu.Lock()       // acquire the mutex
    mu.Unlock()     // release it

Implementation:

// A Mutex is a mutual exclusion lock.
// The zero value for a Mutex is an unlocked mutex.
//
// A Mutex must not be copied after first use.
type Mutex struct {
    state int32
    sema  uint32
}

The struct is simple: a state word plus a semaphore used to queue blocked goroutines.
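The state field packs several pieces of information. The constants next to it in sync/mutex.go (abridged here) show the layout: the low bits are the locked/woken/starving flags, and the remaining bits count the goroutines waiting on the semaphore.

// (from sync/mutex.go)
const (
    mutexLocked = 1 << iota // mutex is locked
    mutexWoken
    mutexStarving
    mutexWaiterShift = iota
)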

Problems caused by copying a Mutex

Let's look at the problem of copying a sync.Mutex through code.

var num int = 1
func testCopyMutex() {
    mu := sync.Mutex{}
    waitGroup := sync.WaitGroup{}

    waitGroup.Add(1)

    go func(copyMu sync.Mutex) {
        copyMu.Lock()
        num = 100
        fmt.Println("update num from sub-goroutine: ", num)
        time.Sleep(5 * time.Second)
        fmt.Println("read num from sub-goroutine: ", num)
        copyMu.Unlock()
        waitGroup.Done()
    }(mu)

    time.Sleep(time.Second)

    mu.Lock()
    num = 1
    fmt.Println("read num from main: ", num)
    mu.Unlock()

    waitGroup.Wait()
}

In the code above, the child goroutine first takes the lock and modifies num, waits 5s, prints num, and then releases the lock; during that window, the main goroutine tries to acquire the lock and modify num as well.

Since we supposedly passed the same Mutex, num should still be the sub-goroutine's value until the sub-goroutine releases the lock, but the actual result is surprising:

update num from sub-goroutine:  100
read num from main:  1
read num from sub-goroutine:  1

As you can see, the main goroutine was able to modify num during the very window in which the child goroutine held the lock.

The root cause is that the Mutex was passed by value.

If you pass the pointer instead, the result is as expected:

    go func(copyMu *sync.Mutex) {   // parameter type is now a pointer
        copyMu.Lock()
        num = 100
        fmt.Println("update num from sub-goroutine: ", num)
        time.Sleep(5 * time.Second)
        fmt.Println("read num from sub-goroutine: ", num)
        copyMu.Unlock()
        waitGroup.Done()
    }(&mu)  // pass a pointer to the Mutex

Why is passing the value problematic?

The reason is that once the Mutex is copied, the two values have separate state. The child goroutine locks the copy and the main goroutine never notices, because they are not operating on the same data! You can see this in Lock, which works on m.state through a pointer receiver:

// Lock locks m.
// If the lock is already in use, the calling goroutine
// blocks until the mutex is available.
func (m *Mutex) Lock() {
    // Fast path: grab unlocked mutex.
    if atomic.CompareAndSwapInt32(&m.state, 0, mutexLocked) {
        if race.Enabled {
            race.Acquire(unsafe.Pointer(m))
        }
        return
    }
    // Slow path (outlined so that the fast path can be inlined)
    m.lockSlow()
}

Read-write lock RWMutex

Usage:

    rmu := sync.RWMutex{}
    rmu.RLock()     // read lock
    rmu.RUnlock()
    rmu.Lock()      // write lock
    rmu.Unlock()

    rmu.RLocker().Lock()        // implemented on top of RLock
    rmu.RLocker().Unlock()

Implementation:

// A RWMutex is a reader/writer mutual exclusion lock.
// The lock can be held by an arbitrary number of readers or a single writer.
// The zero value for a RWMutex is an unlocked mutex.
//
// A RWMutex must not be copied after first use.
//
// If a goroutine holds a RWMutex for reading and another goroutine might
// call Lock, no goroutine should expect to be able to acquire a read lock
// until the initial read lock is released. In particular, this prohibits
// recursive read locking. This is to ensure that the lock eventually becomes
// available; a blocked Lock call excludes new readers from acquiring the
// lock.
type RWMutex struct {
    w           Mutex  // held if there are pending writers
    writerSem   uint32 // semaphore for writers to wait for completing readers
    readerSem   uint32 // semaphore for readers to wait for completing writers
    readerCount int32  // number of pending readers
    readerWait  int32  // number of departing readers
}

As you can see, RWMutex's members include a mutex (acquired when writing), semaphores for writers and readers, and reader counters.

With RWMutex, reads do not block other reads, but they do block writes.
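A minimal sketch to observe this (not from the original post): three readers hold RLock concurrently, while the writer's Lock has to wait for all of them to finish:

    package main

    import (
        "fmt"
        "sync"
        "time"
    )

    func main() {
        var rmu sync.RWMutex
        var wg sync.WaitGroup

        for i := 0; i < 3; i++ {
            wg.Add(1)
            go func(id int) {
                defer wg.Done()
                rmu.RLock() // readers do not block each other
                fmt.Println("reader", id, "in")
                time.Sleep(time.Second)
                rmu.RUnlock()
            }(i)
        }

        time.Sleep(100 * time.Millisecond)
        rmu.Lock() // blocks until all three readers release
        fmt.Println("writer in")
        rmu.Unlock()
        wg.Wait()
    }

All three "reader … in" lines appear almost immediately; "writer in" appears only about a second later. Back in the source, here is the write-lock path: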

// Lock locks rw for writing.
// If the lock is already locked for reading or writing,
// Lock blocks until the lock is available.
func (rw *RWMutex) Lock() {
    if race.Enabled {
        _ = rw.w.state
        race.Disable()
    }
    // First, resolve competition with other writers.
    rw.w.Lock()
    // Announce to readers there is a pending writer.
    r := atomic.AddInt32(&rw.readerCount, -rwmutexMaxReaders) + rwmutexMaxReaders
    // Wait for active readers.
    if r != 0 && atomic.AddInt32(&rw.readerWait, r) != 0 {
        runtime_SemacquireMutex(&rw.writerSem, false, 0)
    }
    if race.Enabled {
        race.Enable()
        race.Acquire(unsafe.Pointer(&rw.readerSem))
        race.Acquire(unsafe.Pointer(&rw.writerSem))
    }
}

Taking the write lock means first acquiring the embedded mutex (to resolve competition with other writers) and then waiting for the active readers to drain.

One more note: read-write locks suit workloads with a fairly high level of concurrency where reads outnumber writes.

Notes on mutexes and read-write locks:

  • Keep the scope of the lock (the critical section) as small as possible
  • Always remember to unlock; write defer Unlock() right after locking so it cannot be forgotten (see the sketch below)
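As a minimal sketch of the defer pattern (counter is an illustrative type, not from the original post):

    type counter struct {
        mu sync.Mutex
        n  int
    }

    // Incr takes the lock and defers the unlock on the very next line,
    // so the mutex is released on every return path, even on panic.
    func (c *counter) Incr() {
        c.mu.Lock()
        defer c.mu.Unlock()
        c.n++
    }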

Condition variable: Cond

sync.Cond is used in scenarios that need to wait for some condition to become true; it is rarely needed in everyday code.

Multiple goroutines can wait on the same condition; when the condition holds, a broadcast wakes those goroutines up to continue.

Compared with polling a mutex-protected flag in a loop, condition variables consume fewer resources and make for simpler code.

Usage:

    cond := sync.NewCond(&sync.Mutex{})     // the parameter is of the sync.Locker interface type

    go func() {
        //    cond.L.Lock()
        //    for !condition() {
               cond.Wait()      // wait; usually inside a loop: check once, block (releasing the lock) if unsatisfied, and re-check the condition after being woken
        //    }
        //    ... make use of condition ...
        //    cond.L.Unlock()
    }()

    cond.L.Lock()   // acquire the lock passed to the constructor
    cond.Broadcast()    // notify all waiting goroutines: they return from Wait and re-acquire the lock
    cond.Signal()   // notify just one
    cond.L.Unlock()

sync.Cond.Wait is generally used in conjunction with a for loop to repeatedly check whether the condition is met.
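Putting that pattern together, a minimal runnable sketch (not from the original post): several goroutines wait for a ready flag, and the main goroutine flips the flag and broadcasts:

    package main

    import (
        "fmt"
        "sync"
        "time"
    )

    func main() {
        cond := sync.NewCond(&sync.Mutex{})
        ready := false

        var wg sync.WaitGroup
        for i := 0; i < 3; i++ {
            wg.Add(1)
            go func(id int) {
                defer wg.Done()
                cond.L.Lock()
                for !ready { // re-check the condition after every wakeup
                    cond.Wait() // releases the lock while blocked, re-acquires it on wakeup
                }
                cond.L.Unlock()
                fmt.Println("goroutine", id, "proceeds")
            }(i)
        }

        time.Sleep(100 * time.Millisecond)
        cond.L.Lock()
        ready = true
        cond.L.Unlock()
        cond.Broadcast() // wake all waiters; Signal() would wake just one
        wg.Wait()
    }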

Implementation:

// Cond implements a condition variable, a rendezvous point
// for goroutines waiting for or announcing the occurrence
// of an event.
//
// Each Cond has an associated Locker L (often a *Mutex or *RWMutex),
// which must be held when changing the condition and
// when calling the Wait method.
//
// A Cond must not be copied after first use.
type Cond struct {
    noCopy noCopy

    // L is held while observing or changing the condition
    L Locker

    notify  notifyList
    checker copyChecker
}

// NewCond returns a new Cond with Locker l.
func NewCond(l Locker) *Cond {
    return &Cond{L: l}
}

As you can see, a Cond mainly consists of a lock and a notify list of goroutines waiting to be woken.

func (c *Cond) Wait() {
    c.checker.check()
    t := runtime_notifyListAdd(&c.notify)
    c.L.Unlock()
    runtime_notifyListWait(&c.notify, t)
    c.L.Lock()
}

When waiting on a condition, Wait first adds the caller to the notify list and then releases the condition's lock; after being woken, it re-acquires the lock before returning.

func (c *Cond) Broadcast() {
    c.checker.check()
    runtime_notifyListNotifyAll(&c.notify)
}

On broadcast, all waiting goroutines are notified; they resume the logic in Wait and re-acquire the lock.

Wait group WaitGroup

In scenarios where you need to wait for multiple goroutines to finish their tasks, you can use sync.WaitGroup, which is similar to Java's CountDownLatch.

Usage:

    waitGroup := sync.WaitGroup{}
    waitGroup.Add(1)        // the number of events to wait for is 1
    go func() {
        waitGroup.Done()    // decrement the count to wait for
    }()

    waitGroup.Wait()    // continues only when the count reaches 0; checked in a loop
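A slightly fuller sketch (not from the original post) that waits for several workers:

    package main

    import (
        "fmt"
        "sync"
    )

    func main() {
        var wg sync.WaitGroup
        for i := 0; i < 5; i++ {
            wg.Add(1) // add before starting the goroutine, never inside it
            go func(id int) {
                defer wg.Done() // decrement even if the worker panics
                fmt.Println("worker", id, "done")
            }(i)
        }
        wg.Wait() // blocks until the counter drops back to zero
        fmt.Println("all workers finished")
    }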

Implementation:

type WaitGroup struct {
    noCopy noCopy

    // 64-bit value: high 32 bits are counter, low 32 bits are waiter count.
    // 64-bit atomic operations require 64-bit alignment, but 32-bit
    // compilers do not ensure it. So we allocate 12 bytes and then use
    // the aligned 8 bytes in them as state, and the other 4 as storage
    // for the sema.
    state1 [3]uint32
}

As you can see, the core of WaitGroup is a packed counter: the high 32 bits hold the task counter and the low 32 bits hold the number of waiters.
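In the Go version quoted here, a small state() helper picks whichever 8 bytes inside state1 are 64-bit aligned and uses the leftover 4 bytes as the semaphore (newer Go versions have since restructured this):

// state returns pointers to the state and sema fields stored within wg.state1.
func (wg *WaitGroup) state() (statep *uint64, semap *uint32) {
    if uintptr(unsafe.Pointer(&wg.state1))%8 == 0 {
        return (*uint64)(unsafe.Pointer(&wg.state1)), &wg.state1[2]
    } else {
        return (*uint64)(unsafe.Pointer(&wg.state1[1])), &wg.state1[0]
    }
}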

// Wait blocks until the WaitGroup counter is zero.
func (wg *WaitGroup) Wait() {
    statep, semap := wg.state()
    if race.Enabled {
        _ = *statep // trigger nil deref early
        race.Disable()
    }
    for {
        state := atomic.LoadUint64(statep)
        v := int32(state >> 32)
        w := uint32(state)
        if v == 0 {
            // Counter is 0, no need to wait.
            if race.Enabled {
                race.Enable()
                race.Acquire(unsafe.Pointer(wg))
            }
            return
        }
        // Increment waiters count.
        //...
    }
}

When waitGroup.Wait() is called, it checks the state in a loop and returns only when the high 32 bits are 0 (i.e. no tasks are still running); otherwise it increments the waiter count in the low bits and blocks on the semaphore.

As you might guess, Done simply subtracts 1 from the counter in the high bits; in the source it is just a call to Add:
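// Done decrements the WaitGroup counter by one.
func (wg *WaitGroup) Done() {
    wg.Add(-1)
}

Add, in turn, shifts the delta into the high 32 bits with atomic.AddUint64(statep, uint64(delta)<<32) and wakes the waiters once the counter reaches zero.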

Execute only once: Once

Literally, sync.Once is used to ensure that the incoming function is executed only once.

In some high-concurrency scenarios there is a requirement like this: multiple goroutines start task A at the same time, whichever gets there first performs task B, and the slower ones must not repeat it.

Usage:

    once := sync.Once{}
    once.Do(func() {
        fmt.Println("do the work that only need exec once")
    })
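A quick sketch (not from the original post) showing the "whoever wins runs it once" behavior:

    package main

    import (
        "fmt"
        "sync"
    )

    func main() {
        var once sync.Once
        var wg sync.WaitGroup
        for i := 0; i < 5; i++ {
            wg.Add(1)
            go func(id int) {
                defer wg.Done()
                once.Do(func() {
                    // printed exactly once, by whichever goroutine gets here first
                    fmt.Println("initialized by goroutine", id)
                })
            }(i)
        }
        wg.Wait()
    }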

The implementation is also very simple:

type Once struct {
    // done indicates whether the action has been performed.
    // It is first in the struct because it is used in the hot path.
    // The hot path is inlined at every call site.
    // Placing done first allows more compact instructions on some architectures (amd64/386),
    // and fewer instructions (to calculate offset) on other architectures.
    done uint32
    m    Mutex
}

The implementation of Once is a state value and a mutex.

func (o *Once) Do(f func()) {
    // Note: Here is an incorrect implementation of Do:
    //
    //  if atomic.CompareAndSwapUint32(&o.done, 0, 1) {
    //      f()
    //  }
    //
    // Do guarantees that when it returns, f has finished.
    // This implementation would not implement that guarantee:
    // given two simultaneous calls, the winner of the cas would
    // call f, and the second would return immediately, without
    // waiting for the first's call to f to complete.
    // This is why the slow path falls back to a mutex, and why
    // the atomic.StoreUint32 must be delayed until after f returns.

    if atomic.LoadUint32(&o.done) == 0 {
        // Outlined slow-path to allow inlining of the fast-path.
        o.doSlow(f)
    }
}
func (o *Once) doSlow(f func()) {
    o.m.Lock()
    defer o.m.Unlock()
    if o.done == 0 {
        defer atomic.StoreUint32(&o.done, 1)
        f()
    }
}

On the first call, f is executed and done is then set to 1 with an atomic store (the slow path takes the mutex, so concurrent callers wait for f to finish). Later calls to Do see done == 1 and return without executing f.

Atomic operations: atomic

Looking at the implementations in the sync package, you will find that they are all built on atomic to some degree, for example WaitGroup.Wait:

        state := atomic.LoadUint64(statep)
        v := int32(state >> 32)
        w := uint32(state)

The atomic package synchronizes individual variables (integers, pointers, and, via atomic.Value, values of custom types). It is best suited to scenarios that are very performance-sensitive, highly concurrent, and read-mostly.

Atomic operations are supported directly by the underlying hardware: they are instruction-level "transactions" implemented by the CPU. A characteristic of atomic operations is that as concurrency grows, the read/write performance of shared variables protected with atomic stays more stable, especially atomic reads; compared with the read-write lock primitive in the sync package, atomic shows better scalability and higher performance.

Whether applied to integer variables or custom types, atomic operations essentially target a word-sized unit of memory, which is 8 bytes on a 64-bit CPU: the CPU moves data across the data bus one word at a time, so a word is also the upper limit of what a single atomic operation can cover.
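A minimal sketch (not from the original post) of the two most common patterns, a lock-free counter and a compare-and-swap:

    package main

    import (
        "fmt"
        "sync"
        "sync/atomic"
    )

    func main() {
        var n int64
        var wg sync.WaitGroup
        for i := 0; i < 100; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                atomic.AddInt64(&n, 1) // lock-free increment
            }()
        }
        wg.Wait()
        fmt.Println(atomic.LoadInt64(&n)) // always 100, never a lost update

        // Compare-and-swap: succeeds only if the current value equals the old one.
        swapped := atomic.CompareAndSwapInt64(&n, 100, 0)
        fmt.Println(swapped, atomic.LoadInt64(&n)) // true 0
    }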

Other

Although they all live in the sync package, sync.WaitGroup, sync.Map, and sync.Pool sit at a higher level, built on the three basic primitives Mutex, RWMutex, and Cond.

The Go team considers recursive (reentrant) locks a bad idiom, so Go does not support them.

Source: blog.csdn.net/u011240877/article/details/124310933