Go arrays and slices: easy to mix up

array

Go developers use slices constantly in their daily work. Before introducing slices, let's first look at arrays, which most readers will already know. An array is a simple data structure whose elements are contiguous in memory. Take an array of 10 numbers as an example:

a:=[10]int{0,1,2,3,4,5,6,7,8,9}

It looks like this in memory:

image.png

Thanks to this contiguous layout, arrays have the following characteristics:

  • The size is fixed;
  • Access is fast, with O(1) complexity;
  • Inserting and deleting elements is slower than reading, because the following elements have to be moved.

When we try to access an out-of-bounds element with a constant index, Go does not even compile the code:

a := [10]int{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
fmt.Println(a[10])
// invalid array index 10 (out of bounds for 10-element array)
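
The compile-time check applies because the index is a constant. As a minimal sketch (the helper name indexAt is hypothetical, not part of the original article), the following shows that an index known only at run time compiles fine and panics instead; recover is used here only to turn the panic into an error for illustration.

```go
package main

import "fmt"

// indexAt is a hypothetical helper: it returns a[i] and converts the
// run-time panic raised by an out-of-range index into an error.
func indexAt(a [10]int, i int) (v int, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return a[i], nil
}

func main() {
	a := [10]int{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
	fmt.Println(indexAt(a, 9))  // index in range
	fmt.Println(indexAt(a, 10)) // out of range: detected at run time, not at compile time
}
```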

slice

Compared with arrays, Go's slices are more flexible. The big difference is that a slice's length is not fixed, so there is no need to specify it at creation. In Go, a slice is a purpose-built data structure:

type slice struct {
   array unsafe.Pointer // pointer to the underlying array
   len   int // length
   cap   int // capacity
}

A slice is backed by an array. The pointer points to the underlying array, len is the length of the slice, and cap is its capacity. When elements are appended and the capacity is insufficient, the slice is grown according to a growth policy.

image.png

Creation of slices

direct statement

var s []int

A slice declared this way is a nil slice: its length and capacity are 0, and it does not point to any underlying array. A nil slice and an empty slice are different, as discussed later.

new method initialization

s:=*new([]int) 

Using new differs little from a direct declaration; the result is also a nil slice.
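
A nil slice is still fully usable. The sketch below (describe is a hypothetical helper) checks both declarations and shows that ranging over and appending to a nil slice are valid operations.

```go
package main

import "fmt"

// describe is a hypothetical helper that summarizes a slice.
func describe(s []int) string {
	return fmt.Sprintf("nil=%t len=%d cap=%d", s == nil, len(s), cap(s))
}

func main() {
	var s1 []int
	s2 := *new([]int)
	fmt.Println(describe(s1)) // nil=true len=0 cap=0
	fmt.Println(describe(s2)) // nil=true len=0 cap=0

	// Ranging over a nil slice simply runs zero iterations.
	for range s1 {
		panic("never reached: a nil slice has no elements")
	}

	// append allocates an underlying array on demand.
	s1 = append(s1, 1)
	fmt.Println(s1 == nil, s1) // false [1]
}
```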

literal

s1 := []int{0, 1, 2}
s2 := []int{0, 1, 2, 4: 4}
s3 := []int{0, 1, 2, 4: 4, 5, 6, 9: 9}
fmt.Println(s1, len(s1), cap(s1)) //[0 1 2] 3 3
fmt.Println(s2, len(s2), cap(s2)) //[0 1 2 0 4] 5 5
fmt.Println(s3, len(s3), cap(s3)) //[0 1 2 0 4 5 6 0 0 9] 10 10

By default, the length and capacity of a slice created from a literal are equal. Note that if we give a value for a specific index, any elements before that index that are not listed explicitly take the zero value of the slice's element type.

make method

s := make([]int, 5, 6)
fmt.Println(s, len(s), cap(s)) //[0 0 0 0 0] 5 6

The length and capacity of the slice can be specified by make.

slicing an existing array or slice

Slices can be obtained from arrays or from other slices. The new slice then shares an underlying array with the original array or slice, so a modification made through either one affects that shared array. If the new slice later grows beyond its capacity, however, it gets its own underlying array and the two no longer share storage.
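
A minimal sketch of this sharing, starting from an array. The function name shareDemo and the values written are illustrative only.

```go
package main

import "fmt"

// shareDemo slices an array, writes through the slice, then appends
// until the slice has to move to a new underlying array.
func shareDemo() (arr [5]int, s []int) {
	arr = [5]int{0, 1, 2, 3, 4}
	s = arr[1:4] // [1 2 3], len 3, cap 4

	s[0] = 100        // visible through arr: arr[1] is now 100
	s = append(s, 50) // fits in cap: overwrites arr[4]
	s = append(s, 60) // exceeds cap: s moves to a new array
	s[0] = -1         // no longer visible through arr
	return arr, s
}

func main() {
	arr, s := shareDemo()
	fmt.Println(arr) // [0 100 2 3 50]
	fmt.Println(s)   // [-1 2 3 50 60]
}
```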

s[:]

a := []int{0, 1, 2, 3, 4}
b := a[:]
fmt.Println(b, len(b), cap(b)) //[0 1 2 3 4] 5 5

The slice obtained with a[:] is equivalent to a reference to the entire slice, that is, indices [0, len(a)-1].

s[i:]

a := []int{0, 1, 2, 3, 4}
b := a[1:]
fmt.Println(b, len(b), cap(b)) //[1 2 3 4] 4 4

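Here we take a slice by giving its start position. The interval is closed on the left, so the element at index i is included, and the capacity is cap(b)=cap(a)-i. Watch the bounds: a[5:] lands exactly at the end of the array, where no elements remain, so the result is an empty slice. a[6:], however, is an error because it goes past the end of the array.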

a := []int{0, 1, 2, 3, 4}
b := a[5:] //[]
c := a[6:] //runtime error: slice bounds out of range [6:5]

Although c fails, it fails only at run time; the code still compiles.

s[:j]

a := []int{0, 1, 2, 3, 4}
b := a[:4]
fmt.Println(b, len(b), cap(b)) //[0 1 2 3] 4 5

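This gets the data in [0, j). The right side is an open bound, so j is excluded. The capacity has nothing to do with j: cap(b) = cap(a) always. Again, take care not to go out of bounds.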

s[i:j]

a := []int{0, 1, 2, 3, 4}
b := a[2:4]
fmt.Println(b, len(b), cap(b)) //[2 3] 2 3

This gets the data in [i, j). The right side is an open bound, so j is excluded, and cap(b) = cap(a)-i.

s[i:j:x]

a := []int{0, 1, 2, 3, 4}
b := a[1:2:3]
fmt.Println(b, len(b), cap(b)) //[1] 1 2

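From the examples above, we can see that the cap of slice b has nothing to do with j but does depend on i: whatever j is, cap(b)=cap(a)-i. The third index x changes b's capacity: once x is set, cap(b) = x-i instead of cap(a)-i.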

Let's look at an example:

s0 := []int{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
s1 := s0[3:6] //[3 4 5] 3 7

s1 is a slice of s0, so they look roughly like this:

image.png

s2 := s1[1:3:4]

Now we define s2 as a slice of s1, with len=2 and cap=3, so it looks roughly like this:

image.png

s1[1] = 40
fmt.Println(s0, s1, s2)// [0 1 2 3 40 5 6 7 8 9] [3 40 5] [40 5]

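Now we set s1[1] to 40. No growth is involved, and the overlapping parts of s0, s1, and s2 all refer to the same underlying array, so the corresponding positions in s0 and s2 become 40 as well.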

image.png

s2 = append(s2, 10)
fmt.Println(s2, len(s2), cap(s2)) //[40 5 10] 3 3

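Next we append an element to s2. s2 still has one free slot, so no growth occurs; the new value is written into the shared underlying array (at s0[6]).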

image.png

s2 = append(s2, 11)
fmt.Println(s2, len(s2), cap(s2)) //[40 5 10 11] 4 6

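We append one more element to s2. This time s2 has no room left, so growth is triggered. Afterwards s2 points to a new underlying array and is decoupled from the original one.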

image.png

From this point on, no change to s2 affects s0 or s1.
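
In the walkthrough, an append that fits within capacity writes into the array that other slices still see. One way to avoid that, sketched here under the assumption that the sub-slice should never touch the parent's remaining elements, is to cap the sub-slice with the third index so that the first append has to copy. The function appendVia is a hypothetical name.

```go
package main

import "fmt"

// appendVia appends v to a sub-slice of base, taken with or without a
// capacity limit, and returns base afterwards to show whether it changed.
func appendVia(limitCap bool, v int) []int {
	base := []int{0, 1, 2, 3, 4}
	var sub []int
	if limitCap {
		sub = base[1:3:3] // len 2, cap 2: no spare room inside base
	} else {
		sub = base[1:3] // len 2, cap 4: spare room overlaps base[3:]
	}
	_ = append(sub, v)
	return base
}

func main() {
	fmt.Println(appendVia(false, 99)) // [0 1 2 99 4]: base[3] was overwritten
	fmt.Println(appendVia(true, 99))  // [0 1 2 3 4]: append copied to a new array first
}
```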

Growing a slice

Slice growth is handled mainly by the growslice function:

func growslice(et *_type, old slice, cap int) slice {
    ....
    newcap := old.cap
    doublecap := newcap + newcap
    if cap > doublecap {
            newcap = cap
    } else {
        if old.len < 1024 {
              newcap = doublecap
        } else {
            // Check 0 < newcap to detect overflow
            // and prevent an infinite loop.
            for 0 < newcap && newcap < cap {
                  newcap += newcap / 4
            }
            // Set newcap to the requested cap when
            // the newcap calculation overflowed.
            if newcap <= 0 {
                 newcap = cap
            }
        }
    }
    ....
    return slice{p, old.len, newcap}
}

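The parameters are:

  1. et is the element type of the slice.
  2. old is the old slice.
  3. cap is the minimum capacity required after growth. For example, if the capacity was 4 and append adds one element, cap is 5.

So the code above can be read as:

  1. If the minimum required capacity is more than twice the old slice's capacity, the new capacity is the minimum required capacity.
  2. Otherwise, if the old slice's length is less than 1024, the new capacity is twice the old capacity.
  3. Otherwise, if the old slice's length is 1024 or more, the new capacity is the old capacity multiplied by 1.25 repeatedly until it is at least the minimum required capacity.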
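
The three rules can be restated as a small standalone function. This is a simplified sketch that mirrors only the excerpt shown in this article: nextCap is a hypothetical name, overflow handling is omitted, and the result is the capacity before any rounding by the allocator. Newer Go releases use a different threshold and formula, so it should not be read as the current runtime behavior.

```go
package main

import "fmt"

// nextCap restates the three rules for the growslice excerpt shown in
// this article. It is a sketch, not the runtime's actual code: it skips
// overflow checks and the later rounding to an allocator size class.
func nextCap(oldLen, oldCap, needed int) int {
	if needed > 2*oldCap {
		return needed // rule 1
	}
	if oldLen < 1024 {
		return 2 * oldCap // rule 2
	}
	newCap := oldCap
	for newCap < needed { // rule 3: grow by a quarter each step
		newCap += newCap / 4
	}
	return newCap
}

func main() {
	fmt.Println(nextCap(2, 2, 5))          // 5
	fmt.Println(nextCap(3, 3, 4))          // 6
	fmt.Println(nextCap(1280, 1280, 1281)) // 1600
}
```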

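A note here: many articles online say that a slice doubles in capacity below 1024 and grows by 1.25x each time above 1024, which is based on exactly this code. That is not the whole story, though. Consider an example: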

a := []int{1, 2}
fmt.Println(len(a), cap(a)) //2 2
a = append(a, 2, 3, 4)
fmt.Println(len(a), cap(a)) // 5 6

By rule 1, cap should be 5 here, but the result is 6.

a := make([]int, 1280, 1280)
fmt.Println(len(a), cap(a)) //1280 1280
a = append(a, 1)
fmt.Println(len(a), cap(a), 1280*1.25) //1281 1696 1600

By rule 3, cap should be 1.25 times the original, that is 1600, but the result is 1696.

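Memory alignment

The capacities computed by the rules above are not the final result. Go also applies a memory alignment optimization, rounding the requested size up, which improves the efficiency of memory reads.

// memory alignment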
capmem, overflow = math.MulUintptr(et.size, uintptr(newcap))
capmem = roundupsize(capmem)
newcap = int(capmem / et.size)
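
The effect of these three lines on the two earlier examples can be sketched as follows. The size-class lists are assumed, hand-picked excerpts rather than the runtime's real table, alignedCap is a hypothetical name, and an 8-byte int (64-bit platform) is assumed.

```go
package main

import "fmt"

// alignedCap mimics the three lines above: compute the bytes needed,
// round up to the next size class, and convert back to an element count.
// classes must be sorted in ascending order.
func alignedCap(newCap, elemSize int, classes []int) int {
	capMem := newCap * elemSize
	for _, c := range classes {
		if capMem <= c {
			return c / elemSize
		}
	}
	return newCap // not covered by the excerpt: leave unchanged
}

func main() {
	const intSize = 8 // assumes a 64-bit platform

	// Assumed excerpts of the allocator's size classes, in bytes.
	small := []int{8, 16, 24, 32, 48, 64}
	large := []int{12288, 13568}

	fmt.Println(alignedCap(5, intSize, small))    // 6    (40 bytes -> 48)
	fmt.Println(alignedCap(1600, intSize, large)) // 1696 (12800 bytes -> 13568)
}
```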

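Empty slices and nil slices

Empty slice: the slice's pointer is not nil; len and cap are both 0
nil slice: the slice's pointer points to no address, that is array=0; len and cap are both 0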

nil slice          empty slice
var a []int        a:=make([]int,0)
a:=*new([]int)     a:=[]int{}

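Although an empty slice's pointer is not nil, that address does not represent any real underlying array either. When an empty slice is initialized, it points to an address called zerobase: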

var zerobase uintptr
if size == 0 {
      return unsafe.Pointer(&zerobase)
}

All empty slices have the same address.

var a1 []int
a2:=*new([]int)
a3:=make([]int,0)
a4:=[]int{}

fmt.Println(*(*[3]int)(unsafe.Pointer(&a1))) //[0 0 0]
fmt.Println(*(*[3]int)(unsafe.Pointer(&a2))) //[0 0 0]
fmt.Println(*(*[3]int)(unsafe.Pointer(&a3))) //[824634101440 0 0]
fmt.Println(*(*[3]int)(unsafe.Pointer(&a4))) //[824634101440 0 0]
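
The difference can also be observed without unsafe. As a small sketch (encode is a hypothetical helper), comparing with nil tells the two apart, and one place where the distinction becomes visible in practice is encoding/json, which encodes a nil slice and an empty slice differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// encode marshals a slice to JSON and returns the text.
func encode(s []int) string {
	b, err := json.Marshal(s)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	var nilSlice []int
	emptySlice := []int{}

	fmt.Println(nilSlice == nil, emptySlice == nil)   // true false
	fmt.Println(len(nilSlice), len(emptySlice))       // 0 0
	fmt.Println(encode(nilSlice), encode(emptySlice)) // null []
}
```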

Are arrays passed by value and slices passed by reference?

func main() {
   array := [10]int{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
   slice := []int{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
   changeArray(array)
   fmt.Println(array) //[0 1 2 3 4 5 6 7 8 9]
   changeSlice(slice)
   fmt.Println(slice) //[1 1 2 3 4 5 6 7 8 9]
}

func changeArray(a [10]int) {
   a[0] = 1
}

func changeSlice(a []int) {
   a[0] = 1
}
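
  • Define an array and a slice.
  • Use changeArray to change the array element at index 0.
  • Use changeSlice to change the slice element at index 0.
  • The original array is unchanged; the original slice has been modified.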

This makes it look as if the slice were passed by pointer. But what if we do this:


func main() {
   slice := []int{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
   changeSlice(slice)
   fmt.Println(slice) //[0 1 2 3 4 5 6 7 8 9]
}
func changeSlice(a []int) {
   a = append(a, 99)
}

The original slice is unchanged. append found the capacity insufficient, so it copied the data into a new array, and the local variable a now refers to that new array. In fact, Go passes function arguments only by value, never by reference: what changeSlice receives is a copy of the slice header (pointer, len, cap). As long as that copy still points to the same underlying array, writes made through it are visible to the caller. Once the slice grows, the copy points to a new array, so later modifications no longer affect the original one.
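
If the caller does need to see the appended element, two common options are to return the new slice or to pass a pointer to the slice. The function names below are hypothetical; this is a sketch of the pattern rather than anything prescribed by the original article.

```go
package main

import "fmt"

// appendAndReturn hands the grown slice back to the caller.
func appendAndReturn(a []int, v int) []int {
	return append(a, v)
}

// appendInPlace takes a pointer to the caller's slice header and
// replaces it with the grown one.
func appendInPlace(a *[]int, v int) {
	*a = append(*a, v)
}

func main() {
	s := []int{0, 1, 2}
	s = appendAndReturn(s, 3)
	appendInPlace(&s, 4)
	fmt.Println(s) // [0 1 2 3 4]
}
```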

Can arrays and slices be compared?

Only arrays of the same length and type can be compared

a:=[2]int{1,2}
b:=[2]int{1,2}
fmt.Println(a==b) //true

a:=[2]int{1,2}
b:=[3]int{1,2,3}
fmt.Println(a==b) //invalid operation: a == b (mismatched types [2]int and [3]int)

a:=[2]int{1,2}
b:=[2]int8{1,2}
fmt.Println(a==b) //invalid operation: a == b (mismatched types [2]int and [2]int8)

A slice can only be compared with nil; no other comparison is allowed

a:=[]int{1,2}
b:=[]int{1,2}
fmt.Println(a==b)//invalid operation: a == b (slice can only be compared to nil)

Note that even two slices that are both nil cannot be compared with each other. A slice can only be compared with the literal nil.

var a []int
var b []int
fmt.Println(a == b) //invalid operation: a == b (slice can only be compared to nil)
fmt.Println(a == nil) //true
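
Since == is not available, comparing the contents of two slices requires an explicit element-by-element check. The helper below is a minimal sketch with a hypothetical name; note that it deliberately treats a nil slice and an empty slice as equal. On Go 1.21 and later, the standard library's slices.Equal serves the same purpose.

```go
package main

import "fmt"

// equalInts reports whether a and b have the same length and elements.
// It treats a nil slice and an empty slice as equal.
func equalInts(a, b []int) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(equalInts([]int{1, 2}, []int{1, 2})) // true
	fmt.Println(equalInts([]int{1, 2}, []int{1, 3})) // false
	fmt.Println(equalInts(nil, []int{}))             // true
}
```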

Origin juejin.im/post/7121628307040403487