A closer look at the range keyword in Go

First, let's look at two pieces of code.

  1. Can the following program end normally?
func main() {
	v := []int{1, 2, 3}
	for i := range v {
		v = append(v, i)
	}
}
  2. What does the following program output?
package main

import "fmt"

func IndexArray() {
	a := [...]int{1, 2, 3, 4, 5, 6, 7, 8}

	for i := range a {
		a[3] = 100
		if i == 3 {
			fmt.Println("IndexArray", i, a[i])
		}
	}
}

func IndexValueArray() {
	a := [...]int{1, 2, 3, 4, 5, 6, 7, 8}

	for i, v := range a {
		a[3] = 100
		if i == 3 {
			fmt.Println("IndexValueArray", i, v)
		}
	}
}

func IndexValueArrayPtr() {
	a := [...]int{1, 2, 3, 4, 5, 6, 7, 8}

	for i, v := range &a {
		a[3] = 100
		if i == 3 {
			fmt.Println("IndexValueArrayPtr", i, v)
		}
	}
}

func main() {
	IndexArray()
	IndexValueArray()
	IndexValueArrayPtr()
}

Step one: read the official documentation

range variable

The loop variables on the left side of range can be set in the following ways:

  • Plain assignment ( = )
  • Short variable declaration ( := )
  • Nothing at all, which ignores the iteration values entirely

If a short variable declaration (:=) is used, the variables are scoped to the loop. In Go versions before 1.22, the same variables are reused on every iteration; since Go 1.22, each iteration gets its own fresh variables. If plain assignment (=) is used, the operands on the left must be addressable or map index expressions. If the range expression is a channel, at most one iteration variable is allowed; otherwise there can be up to two.
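As a minimal sketch of the three forms, the following program ranges over the same slice with :=, with =, and with no variables at all. The function names are made up for this illustration.

```go
package main

import "fmt"

// sumWithDeclare uses := so the value variable is declared by the range
// clause and exists only inside the loop.
func sumWithDeclare(xs []int) int {
	total := 0
	for _, v := range xs {
		total += v
	}
	return total
}

// lastWithAssign uses = to assign to variables declared outside the loop,
// so the last index and value are still visible after the loop ends.
func lastWithAssign(xs []int) (int, int) {
	var i, v int
	for i, v = range xs {
	}
	return i, v
}

// countWithNothing ranges without any iteration variable.
func countWithNothing(xs []int) int {
	n := 0
	for range xs {
		n++
	}
	return n
}

func main() {
	xs := []int{10, 20, 30}
	fmt.Println(sumWithDeclare(xs))   // 60
	fmt.Println(lastWithAssign(xs))   // 2 30
	fmt.Println(countWithNothing(xs)) // 3
}
```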

range expression

The result of the expression on the right side of range can be of the following data types:

  • array
  • pointer to an array
  • slice
  • string
  • map
  • channel permitting receive operations, for example chan int or <-chan int

The range expression is evaluated once before the loop begins, with one exception:

If you range over an array or a pointer to an array and use at most one iteration variable (only the index), only the length of the range expression is evaluated. When that length is a constant, the range expression itself is not evaluated at all.
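One way to observe this exception, as a small sketch: range over a nil pointer to an array using only the index. Because the length of a pointer to an array is a constant, the pointer is never dereferenced and the loop runs without panicking. The function name countIndexes is made up for this illustration.

```go
package main

import "fmt"

// countIndexes ranges over a nil pointer to an array using only the index.
// len(p) is the constant 3, so p itself is never dereferenced.
func countIndexes() int {
	var p *[3]int // nil pointer; reading an element through it would panic
	n := 0
	for i := range p {
		_ = i
		n++
	}
	return n
}

func main() {
	fmt.Println(countIndexes()) // 3
}
```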

What exactly does "evaluated" mean here? Unfortunately, I could not find an explanation in the documentation. My guess is that it means executing the expression completely, until it cannot be reduced any further. In any case, the key point is that the range expression is fully evaluated once, before the iteration starts. And how do you make sure an expression is evaluated only once? You store its result in a variable! Does range do the same with its expression?

Interestingly, the spec also describes what happens when entries are added to or removed from a map (but not a slice) during iteration.

If a map entry that has not yet been reached is removed, it will not appear in later iterations. If an entry is added during the iteration, it may either appear in a later iteration or be skipped.
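The removal rule can be checked with a small sketch: during the first iteration, delete every other key. Whichever key the map yields first, the remaining entries have not been reached yet, so the loop body runs exactly once. The function name and the keys are made up for this illustration.

```go
package main

import "fmt"

// visitAfterDeletingOthers deletes every key except the current one during
// iteration and returns how many times the loop body ran.
func visitAfterDeletingOthers() int {
	m := map[string]int{"a": 1, "b": 2, "c": 3}
	keys := []string{"a", "b", "c"}
	visits := 0
	for k := range m {
		visits++
		for _, other := range keys {
			if other != k {
				delete(m, other) // removed before it has been reached
			}
		}
	}
	return visits
}

func main() {
	fmt.Println(visitAfterDeletingOthers()) // 1
}
```

The case of adding entries is deliberately left out, because the spec leaves it open whether a newly added entry is visited.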

Step two: what does range copy?

If we assume that the range expression is copied into a variable before the loop starts, what do we need to care about? The answer is the data type of the expression's result, so let's take a closer look at the data types that range supports.

Before we get started, remember: in Go, whatever we assign is copied. If we assign a pointer, we copy the pointer. If we assign a struct, we copy the struct. The same is true when passing arguments to functions. Alright, let's get started:

Range expression                          1st value          2nd value

array or slice  a  [n]E, *[n]E, or []E    index    i  int    a[i]       E
string          s  string type            index    i  int    see below  rune
map             m  map[K]V                key      k  K      m[k]       V
channel         c  chan E, <-chan E       element  e  E

However, this table alone doesn't really answer our question. Let's look at a piece of code first:

package main

import "fmt"

func main() {
	// Copies the entire array
	var a [10]int
	acopy := a
	a[0] = 10
	fmt.Println("a", a)
	fmt.Println("acopy", acopy)
	// Copies only the slice struct, not the array its pointer member points to
	s := make([]int, 10)
	scopy := s
	s[0] = 10
	fmt.Println("s", s)
	fmt.Println("scopy", scopy)
	// Copies only the map pointer
	m := make(map[string]int)
	mcopy := m
	m["0"] = 10
	fmt.Println("m", m)
	fmt.Println("mcopy", mcopy)
}

Try to guess what this program prints. Without further suspense, here is the answer:

a [10 0 0 0 0 0 0 0 0 0]
acopy [0 0 0 0 0 0 0 0 0 0]
s [10 0 0 0 0 0 0 0 0 0]
scopy [10 0 0 0 0 0 0 0 0 0]
m map[0:10]
mcopy map[0:10]

So if an array expression is assigned to a variable before the range loop starts (to ensure that the expression is evaluated only once), the entire array is copied.
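The same copy rule applies to function arguments, as mentioned earlier. The following sketch passes an array, a pointer to an array, and a slice to three small functions; only the array version works on a full copy. The function names are made up for this illustration.

```go
package main

import "fmt"

// setFirstArray receives a copy of the whole array, so the caller's array
// is left untouched.
func setFirstArray(a [3]int) { a[0] = 100 }

// setFirstPtr receives a copy of the pointer, which still points at the
// caller's array.
func setFirstPtr(a *[3]int) { a[0] = 100 }

// setFirstSlice receives a copy of the slice struct, which shares the
// underlying array with the caller's slice.
func setFirstSlice(s []int) { s[0] = 100 }

func main() {
	a := [3]int{1, 2, 3}
	setFirstArray(a)
	fmt.Println(a) // [1 2 3]

	setFirstPtr(&a)
	fmt.Println(a) // [100 2 3]

	s := []int{1, 2, 3}
	setFirstSlice(s)
	fmt.Println(s) // [100 2 3]
}
```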

Step three: the truth in the source code

Looking at the GCC source code (its Go front end), we find that the range-related part we care about is in statements.cc. Here is a comment from that file:

  // Arrange to do a loop appropriate for the type.  We will produce
  //   for INIT ; COND ; POST {
  //           ITER_INIT
  //           INDEX = INDEX_TEMP
  //           VALUE = VALUE_TEMP // If there is a value
  //           original statements
  //   }

Now things are finally starting to make sense. Internally, a range loop is just syntactic sugar for a C-style loop, which is both surprising and perfectly reasonable. The compiler desugars range in a specific way for each supported type. For example:

array:

  // The loop we generate:
  //   len_temp := len(range)
  //   range_temp := range
  //   for index_temp = 0; index_temp < len_temp; index_temp++ {
  //           value_temp = range_temp[index_temp]
  //           index = index_temp
  //           value = value_temp
  //           original body
  //   }

slice:

  // The loop we generate:
  //   for_temp := range
  //   len_temp := len(for_temp)
  //   for index_temp = 0; index_temp < len_temp; index_temp++ {
  //           value_temp = for_temp[index_temp]
  //           index = index_temp
  //           value = value_temp
  //           original body
  //   }
  //
  // Using for_temp means that we don't need to check bounds when
  // fetching range_temp[index_temp].
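To make the slice case concrete, here is a hand-written sketch of that desugared loop next to the ordinary range loop. It is only an approximation of what the comment describes, not the compiler's actual output, and the names forTemp, lenTemp, indexTemp and valueTemp simply mirror the comment.

```go
package main

import "fmt"

// rangeSum uses the built-in range loop.
func rangeSum(s []int) int {
	sum := 0
	for i, v := range s {
		sum += i * v
	}
	return sum
}

// desugaredSum spells out the same loop roughly as the comment describes it.
func desugaredSum(s []int) int {
	sum := 0
	forTemp := s            // the range expression is evaluated once
	lenTemp := len(forTemp) // the length is fixed before the loop starts
	for indexTemp := 0; indexTemp < lenTemp; indexTemp++ {
		valueTemp := forTemp[indexTemp]
		i, v := indexTemp, valueTemp
		sum += i * v // original body
	}
	return sum
}

func main() {
	s := []int{1, 2, 3}
	fmt.Println(rangeSum(s), desugaredSum(s)) // 8 8
}
```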

What they have in common:

  • Every kind of range loop is essentially a C-style loop
  • The value being traversed is assigned to a temporary variable

To summarize:

  • Loop variables are assigned anew on each iteration; in Go versions before 1.22, the same variables are also reused across iterations.
  • Entries can be removed from or added to a map during iteration. Added entries are not guaranteed to be visited in later iterations.

Now let's go back to the opening examples.

1. The answer is yes, the program ends normally. It can be roughly translated into something like the following:

for_temp := v
len_temp := len(for_temp)
for index_temp = 0; index_temp < len_temp; index_temp++ {
        value_temp = for_temp[index_temp]
        index = index_temp
        value = value_temp
        v = append(v, index)
}
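As a quick check, the opening program can be wrapped in a function that returns the final slice. The loop runs exactly three times, because the length was fixed before the first append. The function name appendWhileRanging is made up for this illustration.

```go
package main

import "fmt"

// appendWhileRanging appends to v inside a range loop over v and returns
// the result. The loop only sees the three elements that existed at the start.
func appendWhileRanging() []int {
	v := []int{1, 2, 3}
	for i := range v {
		v = append(v, i)
	}
	return v
}

func main() {
	fmt.Println(appendWhileRanging()) // [1 2 3 0 1 2]
}
```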

2. First, look at the output:

IndexArray 3 100
IndexValueArray 3 4
IndexValueArrayPtr 3 100

For the first example: a slice is essentially a struct with a pointer member that points to an array. A copy of this struct is assigned to for_temp before the loop starts, and the loop then iterates over for_temp. Any change to the variable v itself (as opposed to the array it points to) has no effect on the copy for_temp, which is why appending to v does not lengthen the loop. As long as v and for_temp still share the underlying array through that pointer, however, a statement like v[i] = 1 is visible to the loop.

The second example works the same way: the range expression is assigned to a temporary variable before the loop starts. In IndexArray only the index is used, so only the length of the array is evaluated and a[i] reads the original array, which gives 100. In IndexValueArray the temporary variable holds a copy of the entire array, so changes to the original array are not reflected in the copy and v is still 4. In IndexValueArrayPtr the temporary variable holds a copy of the pointer, so the loop operates on the same memory and v is 100.
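The difference between a copied slice struct and a copied array can be seen in a small sketch: each function changes the last element during the first iteration and returns the last value the loop produced. The function names are made up for this illustration.

```go
package main

import "fmt"

// lastValueSlice modifies s[2] during the first iteration. The loop's copy
// of the slice struct shares the underlying array, so the change is seen.
func lastValueSlice() int {
	s := []int{1, 2, 3}
	last := 0
	for i, v := range s {
		if i == 0 {
			s[2] = 100
		}
		last = v
	}
	return last
}

// lastValueArray does the same with an array. The loop iterates over a copy
// of the whole array, so the change is not seen.
func lastValueArray() int {
	a := [3]int{1, 2, 3}
	last := 0
	for i, v := range a {
		if i == 0 {
			a[2] = 100
		}
		last = v
	}
	return last
}

func main() {
	fmt.Println(lastValueSlice()) // 100
	fmt.Println(lastValueArray()) // 3
}
```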

Appendix: a deeper understanding

Here is one more example to check whether you have really understood.

package main

import "fmt"

type Foo struct {
    bar string
}

func main() {
    list := []Foo{
        {"A"},
        {"B"},
        {"C"},
    }
    list2 := make([]*Foo, len(list))
    for i, value := range list {
        list2[i] = &value
    }
    fmt.Println(list[0], list[1], list[2])
    fmt.Println(list2[0], list2[1], list2[2])
}

In this example, we do the following:

  • Define a struct called Foo with a single field called bar, and create a slice of Foo values named list
  • Create a slice of pointers to Foo named list2
  • In a for loop, traverse each element of list, take its address, and store it at the corresponding index in list2
  • Finally, print each element of list and of list2

Reading the code, we would naturally expect this result:

{A} {B} {C}
&{A} &{B} &{C}

But the actual result is surprising. Built with a Go version before 1.22, the program prints:

{A} {B} {C}
&{C} &{C} &{C}

In a for...range loop, Go always hands you a copy of the element being traversed, not the element itself. So when we use & expecting the address of an element, we actually get the address of the loop variable value, not the address of an element in list.

In Go versions before 1.22, value is a single variable that is reused for the whole loop, so list2 is filled with three identical addresses, all of them the address of value. In the last iteration, value is assigned {C}, so printing list2 shows &{C} three times.

Since Go 1.22 (for modules that declare go 1.22 or later), each iteration gets its own value variable, so the same program prints &{A} &{B} &{C}. Those pointers still refer to copies, though, not to the elements of list.

In other words, before Go 1.22 the for...range example behaves exactly like the following code:

var value Foo
for i := 0; i < len(list); i++ {
    value = list[i]
    list2[i] = &value
}

So what is the correct way to write this? Use the index to access the real element inside the for...range loop and take its address:

for i := range list {
    list2[i] = &list[i]
}

Now printing the elements of list2 gives the result we want (&{A} &{B} &{C}).
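The following sketch contrasts taking the address of the real element with taking the address of a per-iteration copy. Both give three distinct pointers on any Go version, but only the first lets you modify list through the pointer. The helper names pointersToElements and pointersToCopies are made up for this illustration.

```go
package main

import "fmt"

type Foo struct {
	bar string
}

// pointersToElements stores the address of each real element of list.
func pointersToElements(list []Foo) []*Foo {
	out := make([]*Foo, len(list))
	for i := range list {
		out[i] = &list[i]
	}
	return out
}

// pointersToCopies stores the address of a fresh copy made in each iteration.
func pointersToCopies(list []Foo) []*Foo {
	out := make([]*Foo, len(list))
	for i, value := range list {
		v := value // new variable per iteration, so each address is distinct
		out[i] = &v
	}
	return out
}

func main() {
	list := []Foo{{"A"}, {"B"}, {"C"}}

	elems := pointersToElements(list)
	elems[0].bar = "X" // writes through to list[0]

	copies := pointersToCopies(list)
	copies[1].bar = "Y" // changes only the copy

	fmt.Println(list[0], list[1], list[2]) // {X} {B} {C}
	fmt.Println(*copies[0], *copies[1], *copies[2]) // {X} {Y} {C}
}
```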

