Concurrent map read and write problems and solutions in Go

1. Map concurrent read and write problems

If a map is read and written by multiple goroutines at the same time, the runtime aborts with "fatal error: concurrent map read and map write".

The following code is prone to this concurrent read/write problem:

package main

import (
	"fmt"
	"time"
)

func main() {
	c := make(map[string]int)
	go func() { // one goroutine writes the map
		for j := 0; j < 1000000; j++ {
			c[fmt.Sprintf("%d", j)] = j
		}
	}()
	go func() { // another goroutine reads the map
		for j := 0; j < 1000000; j++ {
			fmt.Println(c[fmt.Sprintf("%d", j)])
		}
	}()
	time.Sleep(time.Second * 20)
}

Multiple goroutines writing at the same time will likewise trigger "fatal error: concurrent map writes".

The following code is prone to the concurrent map write problem:

package main

import (
	"fmt"
	"time"
)

func main() {
	c := make(map[string]int)
	for i := 0; i < 100; i++ {
		go func() { // start 100 goroutines that write the map concurrently
			for j := 0; j < 1000000; j++ {
				c[fmt.Sprintf("%d", j)] = j
			}
		}()
	}
	time.Sleep(time.Second * 20) // keep the main goroutine alive for 20s; otherwise the program exits before the goroutines above run
}

2. The cause of the problem

A map is a reference type: even when it is passed by value, the copied parameter still points to the same underlying map, so multiple goroutines end up writing to one shared map concurrently. As anyone who has written multi-threaded programs knows, unsynchronized concurrent reads and writes to a shared resource race with each other, and the shared resource gets corrupted.
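The reference semantics described above can be demonstrated with a minimal sketch: a function that receives the map by value still mutates the caller's map.

```go
package main

import "fmt"

// mutate receives the map by value, yet the copied map header still
// points at the same underlying hash table, so the caller's map changes.
func mutate(m map[string]int) {
	m["x"] = 42
}

func main() {
	m := make(map[string]int)
	mutate(m)
	fmt.Println(m["x"]) // prints 42
}
```

This is exactly why every goroutine that receives "its own copy" of a map is in fact sharing one hash table with all the others.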

3. Solutions

1. Lock

(1) Mutex (sync.Mutex)

type Demo struct {
	Data map[string]string
	Lock sync.Mutex
}

// Pointer receivers are required here: a value receiver would copy the
// struct, mutex included, so the lock would protect nothing.
func (d *Demo) Get(k string) string {
	d.Lock.Lock()
	defer d.Lock.Unlock()

	return d.Data[k]
}

func (d *Demo) Set(k, v string) {
	d.Lock.Lock()
	defer d.Lock.Unlock()

	d.Data[k] = v
}
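A minimal self-contained usage sketch of this mutex-guarded wrapper: 100 goroutines write through Set concurrently without tripping the runtime's concurrent-map fatal error.

```go
package main

import (
	"fmt"
	"sync"
)

// Demo wraps a map with the mutex that guards it. Pointer receivers
// ensure the mutex itself is never copied.
type Demo struct {
	Data map[string]string
	Lock sync.Mutex
}

func (d *Demo) Get(k string) string {
	d.Lock.Lock()
	defer d.Lock.Unlock()
	return d.Data[k]
}

func (d *Demo) Set(k, v string) {
	d.Lock.Lock()
	defer d.Lock.Unlock()
	d.Data[k] = v
}

func main() {
	d := &Demo{Data: make(map[string]string)}
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ { // 100 concurrent writers
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			d.Set(fmt.Sprintf("%d", n), "v")
		}(i)
	}
	wg.Wait()
	fmt.Println(d.Get("42")) // prints v
}
```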

(2) Read-write lock (sync.RWMutex)

type Demo struct {
	Data map[string]string
	Lock sync.RWMutex
}

// RLock allows many readers to run concurrently; Lock gives each
// writer exclusive access.
func (d *Demo) Get(k string) string {
	d.Lock.RLock()
	defer d.Lock.RUnlock()

	return d.Data[k]
}

func (d *Demo) Set(k, v string) {
	d.Lock.Lock()
	defer d.Lock.Unlock()

	d.Data[k] = v
}
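A short sketch of the read-write lock variant in use: 50 readers call Get in parallel under RLock, which a plain Mutex would force to run one at a time.

```go
package main

import (
	"fmt"
	"sync"
)

// Demo guarded by an RWMutex: RLock admits many concurrent readers,
// Lock is exclusive for writers.
type Demo struct {
	Data map[string]string
	Lock sync.RWMutex
}

func (d *Demo) Get(k string) string {
	d.Lock.RLock()
	defer d.Lock.RUnlock()
	return d.Data[k]
}

func (d *Demo) Set(k, v string) {
	d.Lock.Lock()
	defer d.Lock.Unlock()
	d.Data[k] = v
}

func main() {
	d := &Demo{Data: make(map[string]string)}
	d.Set("hello", "world")

	var wg sync.WaitGroup
	for i := 0; i < 50; i++ { // 50 readers may hold RLock simultaneously
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = d.Get("hello")
		}()
	}
	wg.Wait()
	fmt.Println(d.Get("hello")) // prints world
}
```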

2. Serialize access through a channel

 

[Go] Rant: how can a data collection that supports concurrent access be implemented more elegantly?

The Go language advocates channel communication over explicit synchronization mechanisms. But I have found that sometimes the channel approach does not work out so well (setting efficiency aside for now).

Suppose there is a collection of accounts, and some operations need to be implemented on this collection, such as search and modification. Operations on this collection must support concurrency.

If you use the lock approach (option 1), it looks roughly like this:

import "sync"

type Info struct {
	age int
}
type AccountMap struct {
	accounts map[string]*Info
	mutex    sync.Mutex
}

func NewAccountMap() *AccountMap {
	return &AccountMap{
		accounts: make(map[string]*Info),
	}
}
func (p *AccountMap) add(name string, age int) {
	p.mutex.Lock()
	defer p.mutex.Unlock()
	p.accounts[name] = &Info{age}
}
func (p *AccountMap) del(name string) {
	p.mutex.Lock()
	defer p.mutex.Unlock()
	delete(p.accounts, name)
}
func (p *AccountMap) find(name string) *Info {
	p.mutex.Lock()
	defer p.mutex.Unlock()
	res, ok := p.accounts[name]
	if !ok {
		return nil
	}
	inf := *res
	return &inf
}
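A self-contained usage sketch of this lock-based AccountMap (the del method is omitted for brevity): 100 goroutines add and look up accounts concurrently.

```go
package main

import (
	"fmt"
	"sync"
)

type Info struct {
	age int
}

// AccountMap from option 1: a mutex serializes every map operation.
type AccountMap struct {
	accounts map[string]*Info
	mutex    sync.Mutex
}

func NewAccountMap() *AccountMap {
	return &AccountMap{accounts: make(map[string]*Info)}
}

func (p *AccountMap) add(name string, age int) {
	p.mutex.Lock()
	defer p.mutex.Unlock()
	p.accounts[name] = &Info{age}
}

func (p *AccountMap) find(name string) *Info {
	p.mutex.Lock()
	defer p.mutex.Unlock()
	res, ok := p.accounts[name]
	if !ok {
		return nil
	}
	inf := *res // return a copy so the caller cannot race on the stored value
	return &inf
}

func main() {
	m := NewAccountMap()
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			m.add(fmt.Sprintf("user%d", n), n)
			_ = m.find(fmt.Sprintf("user%d", n)) // concurrent read
		}(i)
	}
	wg.Wait()
	fmt.Println(m.find("user42").age) // prints 42
}
```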

 

Try it with channels (option 2):

type Info struct {
	age int
}
type AccountMap struct {
	accounts map[string]*Info
	ch chan func()
}
func NewAccountMap() *AccountMap {
	p := &AccountMap{
		accounts: make(map[string]*Info),
		ch: make(chan func()),
	}
	go func() {
		for {(<-p.ch)()}
	}()
	return p
}
func (p *AccountMap) add(name string, age int) {
	p.ch <- func() {
		p.accounts[name] = &Info{age}
	}
}
func (p *AccountMap) del(name string) {
	p.ch <- func() {
		delete(p.accounts, name)
	}
}
func (p *AccountMap) find(name string) *Info {
	// a reply channel is created for every query
	c := make(chan *Info)
	p.ch <- func() {
		res, ok := p.accounts[name]
		if !ok {
			c <- nil
		} else {
			inf := *res
			c <- &inf
		}
	}
	return <-c
}
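The core of option 2 is the channel-of-closures pattern: a single owner goroutine executes every request, so no two operations ever touch the map concurrently. A minimal self-contained sketch of just that pattern:

```go
package main

import "fmt"

func main() {
	ops := make(chan func()) // requests arrive as closures
	done := make(chan struct{})
	m := make(map[string]int) // owned exclusively by the goroutine below

	go func() { // the owner goroutine runs closures one at a time
		for op := range ops {
			op()
		}
		close(done)
	}()

	for i := 0; i < 10; i++ {
		i := i
		ops <- func() { m[fmt.Sprintf("%d", i)] = i } // serialized write
	}

	// a read also goes through the same channel, replying on its own channel
	res := make(chan int)
	ops <- func() { res <- m["7"] }
	fmt.Println(<-res) // prints 7

	close(ops)
	<-done
}
```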
There is a problem here: every call to find creates a new channel.
Next, try passing the channel in as a parameter (option 3); only the implementation of find changes:
// taking the channel object as a parameter exposes the implementation mechanism
func (p *AccountMap) find(name string, c chan *Info) *Info {
	p.ch <- func() {
		res, ok := p.accounts[name]
		if !ok {
			c <- nil
		} else {
			inf := *res
			c <- &inf
		}
	}
	return <-c
}

 

To sum up, none of the three options is satisfactory:
Option 1 uses a lock mechanism, which is not the idiomatic Go way of solving the problem.
Option 2 must create a channel for every query that returns a result, which wastes resources.
Option 3 requires the channel in the function signature, which exposes the implementation mechanism.
So is there a better solution?
 
2012.12.14:
There is also an improved version of option 2: pre-allocated, recyclable channels that improve resource utilization. This technique is useful when multiple goroutines wait for an active object to return data to each of them. For example, in an online game server, each player connection in the login server is handled by its own goroutine, while another active object holds the connection to the account server that verifies account legitimacy. The player goroutines send the account name and password they received to the active object and block waiting for the verification result. Because multiple players initiate verification requests at the same time, the active object must dispatch each result back to the right requester, so a request borrows a channel when it is sent and waits on that channel.
The code is as follows:

type Info struct {
	age int
}
type AccountMap struct {
	accounts map[string]*Info
	ch chan func()
	tokens chan chan *Info
}
func NewAccountMap() *AccountMap {
	p := &AccountMap{
		accounts: make(map[string]*Info),
		ch: make(chan func()),
		tokens: make(chan chan *Info, 128),
	}
	for i := 0; i < cap(p.tokens); i++ {
		p.tokens <- make(chan *Info)
	}
	go func() {
		for {(<-p.ch)()}
	}()
	return p
}
func (p *AccountMap) add(name string, age int) {
	p.ch <- func() {
		p.accounts[name] = &Info{age}
	}
}
func (p *AccountMap) del(name string) {
	p.ch <- func() {
		delete(p.accounts, name)
	}
}
func (p *AccountMap) find(name string) *Info {
	// every query borrows a channel from the pool
	c := <-p.tokens
	p.ch <- func() {
		res, ok := p.accounts[name]
		if !ok {
			c <- nil
		} else {
			inf := *res
			c <- &inf
		}
	}
	inf := <-c
	// recycle the channel
	p.tokens <- c
	return inf
}
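The novelty of this version is the token pool itself; stripped down to that one idea, a sketch looks like this: a query borrows a pre-made reply channel instead of allocating one, and returns it when done.

```go
package main

import "fmt"

func main() {
	// pool of 4 pre-allocated, reusable reply channels
	tokens := make(chan chan int, 4)
	for i := 0; i < cap(tokens); i++ {
		tokens <- make(chan int)
	}

	ops := make(chan func())
	go func() { // owner goroutine serializes all map access
		for op := range ops {
			op()
		}
	}()

	m := map[string]int{"alice": 30}

	// a query: borrow a reply channel, ask the owner, then recycle it
	c := <-tokens
	ops <- func() { c <- m["alice"] }
	age := <-c
	tokens <- c // return the channel to the pool for the next query

	fmt.Println(age) // prints 30
	close(ops)
}
```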

 

Some additions from the comments on golang-china:
xushiwei
Your approach actually uses the channel to serialize all requests.
Also, in terms of cost, a channel is much heavier than a lock, because the channel itself is implemented with a lock plus a signal/wake-up mechanism.
 
steve wang
Can it be summed up like this?
1. Use locks to control concurrent access to data objects shared between goroutines.
2. For communication between goroutines, use channels.
 
longshanksmo
 
Drawing that conclusion on performance alone is a bit hasty. Concurrency and performance issues are complex, and different scenarios can lead to completely opposite conclusions.
There are many other factors to consider:
First, lock granularity differs between use cases. In your case it is a map operation, so the critical section is small. But if the guarded operation is heavyweight, or can block, the critical section becomes very large, and a lock is no longer worth it.
Second, the lock granularity inside a chan is small, essentially fixed, and predictable. In real business systems predictable performance matters a great deal, because it determines how resources are sized and allocated at deployment time.
Most importantly, if all goroutines in the process run on a single thread, the lock inside chan is not needed at all; only then are the advantages of coroutines fully exploited. The current Go compiler does not seem to make this optimization, and I wonder whether it will evolve in that direction.
In short, there is no settled answer on the concurrency front.
