The essence of Go language interviews - what is the traversal process of map?

In principle, traversing a map is simple: walk every bucket together with the overflow buckets chained behind it, then visit the cells of each bucket one by one. Each bucket contains 8 cells. For every cell that holds a key, take out the key and the value, and the job is done.

However, reality is not that simple. Remember the expansion process mentioned earlier? Expansion is not an atomic operation. It moves at most 2 buckets at a time. Therefore, once an expansion is triggered, the map stays in an intermediate state for a long time: some buckets have already been moved to their new homes, while others are still in the old place.

Therefore, if traversal happens while an expansion is in progress, it has to deal with both the old and the new buckets, and that is where the difficulty lies.

Let me first write a simple code sample, pretending not to know which function is actually called during traversal:

package main

import "fmt"

func main() {
	ageMp := make(map[string]int)
	ageMp["qcrao"] = 18

	for name, age := range ageMp {
		fmt.Println(name, age)
	}
}

Run the command:

go tool compile -S main.go

This produces the assembly output. I won't explain it line by line here; the previous articles cover it in detail.

The key lines of assembly code are as follows:

// ......
0x0124 00292 (test16.go:9)      CALL    runtime.mapiterinit(SB)

// ......
0x01fb 00507 (test16.go:9)      CALL    runtime.mapiternext(SB)
0x0200 00512 (test16.go:9)      MOVQ    ""..autotmp_4+160(SP), AX
0x0208 00520 (test16.go:9)      TESTQ   AX, AX
0x020b 00523 (test16.go:9)      JNE     302

// ......

With this, the underlying call relationship for map iteration is clear at a glance: first the mapiterinit function is called to initialize the iterator, and then the mapiternext function is called in a loop to iterate over the map.

[Figure: call flow of map iteration, mapiterinit followed by repeated mapiternext calls]
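The call pattern seen in the assembly can be mimicked with a toy iterator. This is only a sketch: toyIter, toyIterInit and toyIterNext are hypothetical stand-ins for the runtime's hiter, mapiterinit and mapiternext, and they walk a pair of slices instead of buckets. What it illustrates is the protocol: initialize once, then keep calling next until the key pointer comes back nil, which corresponds to the TESTQ/JNE pair above.

```go
package main

import "fmt"

// toyIter is a hypothetical stand-in for the runtime's hiter.
// A nil key signals that iteration is finished.
type toyIter struct {
	names []string
	ages  []int
	pos   int
	key   *string
	value *int
}

// toyIterInit plays the role of mapiterinit: it sets up the iterator
// and positions it on the first element.
func toyIterInit(names []string, ages []int, it *toyIter) {
	it.names, it.ages, it.pos = names, ages, 0
	toyIterNext(it)
}

// toyIterNext plays the role of mapiternext: it advances to the next
// element, or sets key to nil when nothing is left.
func toyIterNext(it *toyIter) {
	if it.pos >= len(it.names) {
		it.key, it.value = nil, nil
		return
	}
	it.key, it.value = &it.names[it.pos], &it.ages[it.pos]
	it.pos++
}

// collect drives the iterator the way the compiled range loop does.
func collect(names []string, ages []int) []string {
	var out []string
	var it toyIter
	for toyIterInit(names, ages, &it); it.key != nil; toyIterNext(&it) {
		out = append(out, fmt.Sprintf("%s %d", *it.key, *it.value))
	}
	return out
}

func main() {
	for _, line := range collect([]string{"qcrao"}, []int{18}) {
		fmt.Println(line) // prints "qcrao 18"
	}
}
```

Unlike this toy, the real iterator does not walk elements in a fixed order, which is what the rest of the article is about.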

Iterator structure definition:

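type hiter struct {
	// pointer to the key
	key         unsafe.Pointer
	// pointer to the value
	value       unsafe.Pointer
	// map type; holds information such as the key size
	t           *maptype
	// map header
	h           *hmap
	// the buckets pointer captured at initialization
	buckets     unsafe.Pointer
	// the bmap currently being traversed
	bptr        *bmap
	// keeps overflow buckets alive
	overflow    [2]*[]*bmap
	// number of the bucket where traversal started
	startBucket uintptr
	// cell number where traversal starts in each bucket (a bucket has 8 cells)
	offset      uint8
	// whether traversal has wrapped around to the beginning
	wrapped     bool
	// value of B
	B           uint8
	// index of the current cell
	i           uint8
	// number of the current bucket
	bucket      uintptr
	// bucket that has to be checked because of expansion
	checkBucket uintptr
}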

The mapiterinit function initializes the fields of the hiter structure and assigns their values.

As mentioned before, even when a hard-coded map is traversed, the order of the results differs from run to run. Below we take a closer look at how this is implemented.

// generate a random number r
r := uintptr(fastrand())
if h.B > 31-bucketCntBits {
	r += uintptr(fastrand()) << 31
}

// which bucket to start traversing from
it.startBucket = r & (uintptr(1)<<h.B - 1)
// which cell of the bucket to start traversing from
it.offset = uint8(r >> h.B & (bucketCnt - 1))

For example, if B = 2, then uintptr(1)<<h.B - 1 evaluates to 3, whose lower 8 bits are 0000 0011. ANDing r with it gives a bucket number in the range 0~3. bucketCnt - 1 equals 7, whose lower 8 bits are 0000 0111. Shifting r right by 2 bits and ANDing the result with 7 gives a cell number in the range 0~7.

Therefore, in the mapiternext function, traversal starts from cell number it.offset of bucket number it.startBucket, and keys and values are taken out until traversal comes back to the starting bucket, which completes the process.

The source code is relatively easy to understand, especially after the commented code in the earlier sections, so reading it should cause no trouble. Next, I will explain the entire traversal process graphically, hoping to make it clear and easy to follow.
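The arithmetic can be checked with a few lines of Go. This is a minimal sketch: the function name startPosition and the sample value of r are made up for illustration, while the two masking expressions are the ones from the snippet above.

```go
package main

import "fmt"

const (
	bucketCntBits = 3
	bucketCnt     = 1 << bucketCntBits // 8 cells per bucket
)

// startPosition applies the two masking steps to a given random value r
// and a given B. The function name is hypothetical.
func startPosition(r uintptr, B uint8) (startBucket uintptr, offset uint8) {
	startBucket = r & (uintptr(1)<<B - 1)
	offset = uint8(r >> B & (bucketCnt - 1))
	return startBucket, offset
}

func main() {
	// r = 0b1011: the low two bits 11 select bucket 3,
	// the next three bits 010 select cell 2.
	b, off := startPosition(0b1011, 2)
	fmt.Println(b, off) // prints "3 2"
}
```

Because r is random, both the starting bucket and the starting cell change between traversals, which is why the order is not stable.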

Suppose we have a map as shown in the figure below. Initially, B = 1 and there are two buckets. Later, an expansion is triggered (never mind the exact trigger condition here, it is just the setup), and B becomes 2. The contents of old bucket No. 1 have already been moved to the new buckets: old No. 1 has split into new No. 1 and new No. 3. Old bucket No. 0 has not been moved yet. The old buckets hang on the *oldbuckets pointer, and the new buckets hang on the *buckets pointer.

[Figure: the example map during expansion; old bucket 1 already moved to new buckets 1 and 3, old bucket 0 not yet moved]

Now we traverse this map. Assume that after initialization, startBucket = 3 and offset = 2. The starting point of the traversal is therefore cell No. 2 of bucket No. 3. The following picture shows the state when the traversal starts:

[Figure: state at the start of traversal, with cell 2 of bucket 3 marked in red]

The one marked in red indicates the starting position, and the bucket traversal order is: 3 -> 0 -> 1 -> 2.
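The visiting order follows from a simple wrap-around walk over the new buckets. A rough sketch, assuming startBucket is smaller than the bucket count; bucketOrder is a hypothetical helper, not a runtime function:

```go
package main

import "fmt"

// bucketOrder lists the bucket numbers in visiting order for a map with
// 2^B buckets. It starts at startBucket and wraps around to 0 after the
// last bucket. It assumes startBucket < 2^B.
func bucketOrder(startBucket uintptr, B uint8) []uintptr {
	n := uintptr(1) << B
	order := make([]uintptr, 0, n)
	bucket := startBucket
	for {
		order = append(order, bucket)
		bucket++
		if bucket == n {
			bucket = 0 // wrap around to the first bucket
		}
		if bucket == startBucket {
			break // back at the start: every bucket was visited
		}
	}
	return order
}

func main() {
	fmt.Println(bucketOrder(3, 2)) // prints "[3 0 1 2]"
}
```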

Because new bucket No. 3 corresponds to old bucket No. 1, we first check whether old bucket No. 1 has been relocated. The check is:

func evacuated(b *bmap) bool {
	h := b.tophash[0]
	return h > empty && h < minTopHash
}

If the value of b.tophash[0] falls within the range of flag values, that is, in the open interval (0, 4), the bucket has been relocated.

empty = 0
evacuatedEmpty = 1
evacuatedX = 2
evacuatedY = 3
minTopHash = 4

In this example, old bucket No. 1 has already been moved, so its tophash[0] value lies in the interval (0, 4), and only new bucket No. 3 needs to be traversed.

Traverse the cells of new bucket No. 3 in sequence, and the first non-empty key found is element e. At this point, the mapiternext function returns, and the traversal result contains only one element:
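Both steps, finding the old bucket behind a new bucket and testing the relocation flag, can be tried in isolation. This is a sketch under assumptions: evacuatedFlag takes tophash[0] directly so that it runs without the runtime's bmap type, and oldBucketOf is a hypothetical helper that assumes the map doubled in size and that B is at least 1.

```go
package main

import "fmt"

const (
	empty          = 0
	evacuatedEmpty = 1
	evacuatedX     = 2
	evacuatedY     = 3
	minTopHash     = 4
)

// evacuatedFlag mirrors the check in evacuated, applied to a
// tophash[0] value instead of a *bmap.
func evacuatedFlag(tophash0 uint8) bool {
	return tophash0 > empty && tophash0 < minTopHash
}

// oldBucketOf returns the old bucket that a new bucket is split from,
// where B is the value after doubling. Hypothetical helper.
func oldBucketOf(bucket uintptr, B uint8) uintptr {
	oldCount := uintptr(1) << (B - 1)
	return bucket & (oldCount - 1)
}

func main() {
	fmt.Println(oldBucketOf(3, 2))         // prints "1"
	fmt.Println(oldBucketOf(2, 2))         // prints "0"
	fmt.Println(evacuatedFlag(evacuatedX)) // prints "true"
	fmt.Println(evacuatedFlag(37))         // prints "false"
}
```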

[Figure: result set containing element e]

Since the returned key is not nil, the mapiternext function is called again.

Traversal continues from where it left off and finds element f and element g in the overflow bucket of new bucket No. 3.

The result set grows accordingly:

[Figure: result set containing elements e, f and g]

After new bucket No. 3 has been traversed, traversal wraps around to new bucket No. 0. New bucket No. 0 corresponds to old bucket No. 0. The check shows that old bucket No. 0 has not been relocated, so instead of new bucket No. 0 we traverse old bucket No. 0. Does that mean taking out all the keys in old bucket No. 0?

It's not that simple. Recall that old bucket No. 0 will split into two buckets after relocation: new No. 0 and new No. 2. What we are traversing at this moment is only new bucket No. 0 (note that traversal always follows the *buckets pointer, that is, the so-called new buckets). Therefore, we only take out those keys in old bucket No. 0 that will be assigned to new bucket No. 0 after the split.

Therefore, the keys with lowbits == 00 are added to the result set:

[Figure: result set after visiting new bucket 0]

As before, traversal continues with new bucket No. 1 and finds that old bucket No. 1 has already been moved, so only the elements present in new bucket No. 1 need to be traversed. The result set becomes:

[Figure: result set after visiting new bucket 1]

Traversal continues with new bucket No. 2, which comes from old bucket No. 0, so we need the keys in old bucket No. 0 that will split into new bucket No. 2, that is, the keys with lowbits == 10.

The result set then becomes:

[Figure: result set after visiting new bucket 2]
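The filtering of an unmoved old bucket can be sketched as follows. The entry type, the keysFor helper and the sample keys and hash values are all made up for illustration; the point is only that a key is reported for a new bucket when the low B bits of its hash equal that bucket's number.

```go
package main

import "fmt"

// entry is a hypothetical cell: a key together with its full hash value.
type entry struct {
	key  string
	hash uintptr
}

// keysFor returns the keys of a not-yet-moved old bucket that will land
// in new bucket checkBucket once the map has 2^B buckets.
func keysFor(oldBucket []entry, checkBucket uintptr, B uint8) []string {
	mask := uintptr(1)<<B - 1
	var keys []string
	for _, e := range oldBucket {
		if e.hash&mask == checkBucket {
			keys = append(keys, e.key)
		}
	}
	return keys
}

func main() {
	// Made-up contents of old bucket 0: every hash has lowest bit 0.
	old0 := []entry{
		{"k1", 0b0100}, // low bits 00 -> new bucket 0
		{"k2", 0b0110}, // low bits 10 -> new bucket 2
		{"k3", 0b1000}, // low bits 00 -> new bucket 0
	}
	fmt.Println(keysFor(old0, 0, 2)) // prints "[k1 k3]"
	fmt.Println(keysFor(old0, 2, 2)) // prints "[k2]"
}
```

Every key of the old bucket is reported exactly once: either while visiting new bucket No. 0 or while visiting new bucket No. 2.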

Finally, when traversal reaches new bucket No. 3 again, it finds that all buckets have been traversed, and the entire iteration is complete.

By the way, if you encounter a key such as math.NaN(), the handling is similar. The core question is still which bucket it falls into after the split. Just look at the lowest bit of its top hash: if that bit is 0, the key is assigned to the X part; if it is 1, it is assigned to the Y part. Based on this, decide whether to take the key out and put it in the traversal result set.

The core of map traversal is to understand that when the capacity is doubled, each old bucket splits into two new buckets. Traversal proceeds in the order of the new bucket numbers. When an old bucket has not been moved yet, the keys that will later move to the current new bucket must be looked up in the old bucket.
