Golang Interface Implementation Principle Analysis

Interface Analysis


This article is based on the Go 1.12.12 source code. The examples were run and debugged on an amd64 machine.

1. Duck Typing

1.1 What is duck typing

[Figure: a big yellow duck. Source: Baidu Encyclopedia]

Is the big yellow duck in the picture a duck? From a traditional point of view it is not, because it can neither quack nor run, and it is not even alive.

First, look at the definition of duck typing, taken from Wikipedia:

If it walks like a duck and it quacks like a duck, then it must be a duck

So, from the point of view of duck typing, the big yellow duck in the picture is a duck.

Duck typing is a style of typing in programming that is concerned with the external behavior of a thing rather than its internal structure.

1.2 Duck typing in Go language

Go implements duck typing through interfaces. Unlike dynamic languages, where a type mismatch can only be detected at runtime, and unlike most static languages, where a type must explicitly declare which interfaces it implements, Go interfaces are distinctive in that they are implemented implicitly.
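As a minimal sketch of implicit implementation (the names Quacker, Duck, RubberDuck, and makeNoise are hypothetical and not part of the original article), neither type below declares that it implements Quacker, yet both can be used wherever a Quacker is expected, and the compiler verifies this at build time:

```go
package main

import "fmt"

// Quacker is a hypothetical interface used only for illustration.
type Quacker interface {
	Quack() string
}

// Duck and RubberDuck never mention Quacker; having a Quack method is enough.
type Duck struct{}

func (Duck) Quack() string { return "Quack!" }

type RubberDuck struct{}

func (RubberDuck) Quack() string { return "Squeak!" }

// Compile-time checks: the build fails if either type stops satisfying Quacker.
var (
	_ Quacker = Duck{}
	_ Quacker = RubberDuck{}
)

// makeNoise accepts anything that behaves like a Quacker.
func makeNoise(q Quacker) string {
	return q.Quack()
}

func main() {
	fmt.Println(makeNoise(Duck{}))
	fmt.Println(makeNoise(RubberDuck{}))
}
```

Removing the Quack method from either type turns the blank-identifier assignments into build errors, which is the compile-time check that dynamic languages lack.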

2. Overview

2.1 Interface type

An interface is an abstract type. It does not expose the layout or internal structure of the data it holds, nor any basic operations on that data; it only provides a set of methods. When you hold a variable of an interface type, you cannot know what it is, only what it can do, or more precisely, which methods it provides.

2.2 Interface definition

Go provides the interface keyword. An interface can only declare the methods that must be implemented; it cannot contain any variables:

type InterfaceTypeName interface {
    MethodName1(parameterList1) returnValueList1
    MethodName2(parameterList2) returnValueList2
}

For example, io.Writer is an interface type:

type Writer interface {
    Write(p []byte) (n int, err error)
}

Interfaces can be embedded in one another to form new interfaces, such as io.ReadWriter:

type Reader interface {
    Read(p []byte) (n int, err error)
}

type ReadWriter interface {
    Reader
    Writer
}
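A short sketch of what embedding buys you, assuming only the standard library: *bytes.Buffer has both Read and Write, so it satisfies io.ReadWriter, and a ReadWriter value can in turn be assigned to either embedded interface. The helpers roundTrip and bufferRoundTrip are hypothetical names introduced for this example.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// roundTrip writes msg through the Writer half of rw and reads it back
// through the Reader half. Any type with both methods can be passed in.
func roundTrip(rw io.ReadWriter, msg string) (string, error) {
	if _, err := rw.Write([]byte(msg)); err != nil {
		return "", err
	}
	buf := make([]byte, len(msg))
	n, err := io.ReadFull(rw, buf)
	if err != nil {
		return "", err
	}
	return string(buf[:n]), nil
}

// bufferRoundTrip runs roundTrip against an in-memory buffer.
func bufferRoundTrip(msg string) (string, error) {
	return roundTrip(new(bytes.Buffer), msg)
}

func main() {
	var rw io.ReadWriter = new(bytes.Buffer)

	// A ReadWriter can be assigned to either of the interfaces it embeds.
	var w io.Writer = rw
	var r io.Reader = rw
	_, _ = w, r

	got, err := bufferRoundTrip("hello")
	fmt.Println(got, err)
}
```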

An interface that does not contain any methods is called the empty interface type:

interface{}

2.3 Implement the interface

A concrete type implements an interface if it implements all the methods required by the interface. An expression can be assigned to an interface only if its type implements that interface.

The following example defines a Runner interface that contains a single Run() method. The Person struct implements Run(), so it implements the Runner interface:

package main

import "fmt"

type Runner interface {
    Run()
}

type Person struct {
    Name string
}

func (p Person) Run() {
    fmt.Printf("%s is running\n", p.Name)
}

func main() {
    var r Runner
    r = Person{Name: "song_chh"}
    r.Run()
}

In addition, because the empty interface type is an interface that does not define any methods, all types implement the empty interface, which means that any value can be assigned to the empty interface type
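To illustrate, here is a small sketch in which values of unrelated types are all stored in interface{} variables. The helper typeName is a hypothetical name; it relies on the %T verb of fmt to report the dynamic type.

```go
package main

import "fmt"

// typeName reports the dynamic type of whatever value is stored in v.
func typeName(v interface{}) string {
	return fmt.Sprintf("%T", v)
}

func main() {
	// Every one of these values can be assigned to the empty interface.
	values := []interface{}{42, "hello", 3.14, []int{1, 2}, struct{}{}, nil}
	for _, v := range values {
		fmt.Printf("%v has dynamic type %s\n", v, typeName(v))
	}
}
```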

2.4 Interfaces and pointers

An interface declares a set of methods but places no restriction on the receiver of the implementation, so a method can be implemented in two ways: with a pointer receiver or with a value receiver.

For a given type, the same method cannot be implemented with both kinds of receiver at once.

Add a Say() method to the Runner interface, and have the Person struct implement it with a pointer receiver:

type Runner interface {
    Run()
    Say()
}

type Person struct {
    Name string
}

func (p Person) Run() {
    fmt.Printf("%s is running\n", p.Name)
}

func (p *Person) Say() {
    fmt.Printf("hello, %s", p.Name)
}

An interface variable can be initialized with either a struct value or a struct pointer:

var r Runner
r = &Person{Name: "sch_chh"} // pointer initialization
r = Person{Name: "sch_chh"}  // value initialization

The receiver type of the implementation and the type used to initialize the interface variable are two independent dimensions, which gives four combinations:

|                        | value receiver | pointer receiver |
|------------------------|----------------|------------------|
| value initialization   | √              | ×                |
| pointer initialization | √              | √                |

× indicates that compilation fails.

Two of the combinations compile for obvious reasons:

  • Both the method receiver and the initialization type are struct values
  • Both the method receiver and the initialization type are struct pointers

First, consider the other combination that compiles: the method receiver is a struct value, and the variable is initialized with a pointer.

package main

import "fmt"

type Runner interface {
    Run()
    Say()
}

type Person struct {
    Name string
}

func (p Person) Run() {
    fmt.Printf("%s is running\n", p.Name)
}

func (p *Person) Say() {
    fmt.Printf("hello, %s", p.Name)
}

func main() {
    var r Runner
    r = &Person{Name: "sch_chh"}
    r.Run()
    r.Say()
}

In the code above, both Run and Say can be called directly through the Person pointer. Given a struct pointer, the underlying struct can be obtained implicitly by dereferencing, and the value-receiver method is then called on that struct.

If the address-of operator is removed, the variable is initialized with the struct value instead:

r = Person{Name: "sch_chh"}

Compilation then fails:

./pointer.go:24:4: cannot use Person literal (type Person) as type Runner in assignment:
        Person does not implement Runner (Say method has pointer receiver)

So why does compilation fail? First of all, Go passes arguments by value.

When the variable in the code is &Person{}, the argument is copied during the method call, which creates a new Person pointer. That pointer still points to one definite struct, so the compiler can implicitly dereference the variable, obtain the struct the pointer refers to, and complete the method call.

When the variable in the code is Person{}, the argument is also copied during the method call, so Run() and Say() would each receive a brand-new Person{} value. If the method receiver is *Person, the compiler cannot derive a unique pointer from that struct value, so it reports an error.

Note: for an addressable variable of concrete type T, calling a *T method directly is also legal, because the compiler implicitly takes the address for you. This is only syntactic sugar; it does not make T implement an interface that requires the pointer-receiver method.
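The note above can be sketched as follows. The Rename method and the renameViaValue helper are hypothetical additions for this example; a mutating method is used so that the effect of the implicit address-of is observable.

```go
package main

import "fmt"

type Person struct {
	Name string
}

// Rename has a pointer receiver, so it belongs to the method set of *Person only.
func (p *Person) Rename(name string) {
	p.Name = name
}

// renameViaValue calls a pointer-receiver method on an addressable variable.
func renameViaValue() string {
	p := Person{Name: "old"}
	p.Rename("new") // shorthand for (&p).Rename("new")
	return p.Name
}

func main() {
	fmt.Println(renameViaValue())

	// Person{Name: "x"}.Rename("y") would not compile: a composite literal
	// is not addressable, so there is no address for the compiler to take.
}
```

The same limitation explains the compile error above: a value stored in an interface is not addressable either, so the compiler cannot apply this shorthand there.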

2.5 nil and non-nil

Consider another example, again with the Runner interface and the Person struct. Pay attention to the body of main(): it first declares an interface variable r and prints whether it is nil, then declares p of type *Person and prints whether p is nil, and finally assigns p to r and prints whether r is nil at that point.

package main

import "fmt"

type Runner interface {
    Run()
}

type Person struct {
    Name string
}

func (p Person) Run() {
    fmt.Printf("%s is running\n", p.Name)
}

func main() {
    var r Runner
    fmt.Println("r:", r == nil)

    var p *Person
    fmt.Println("p:", p == nil)

    r = p
    fmt.Println("r:", r == nil)
}

What is the output?

r: true or false
p: true or false
r: true or false

The actual output is:

r: true
p: true
r: false

It is easy to understand why the first two lines report that r and p are nil: the zero value of both interface types and pointer types is nil. But why is r no longer nil after p is assigned to it? The answer lies in the concept of an interface value.

2.6 Interface Values

Conceptually, a value of an interface type (an interface value for short) has two parts: a concrete type and a value of that type. These are called the dynamic type and the dynamic value of the interface. An interface value is nil if and only if both its dynamic type and its dynamic value are nil.

Going back to the example in 2.5, after p is assigned to the interface r, the actual structure of r is as shown in the figure:

[Figure: the interface value r, holding the dynamic type *Person and a nil dynamic value]

To verify whether this is really the case, add a line of code at the end of the main function body

fmt.Printf("r type: %T, data: %v\n", r, r)

Output:

r type: *main.Person, data: <nil>

You can see that the dynamic value is indeed nil
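Because r == nil only checks the interface value as a whole, code that needs to know whether the dynamic value is a nil pointer has to look inside. The following is one possible sketch using the reflect package; isNilInterface, holdsNilPointer, and typedNil are hypothetical helper names, not something the original article defines.

```go
package main

import (
	"fmt"
	"reflect"
)

type Runner interface {
	Run()
}

type Person struct {
	Name string
}

func (p Person) Run() {}

// isNilInterface reports whether the interface value itself is nil,
// that is, it has neither a dynamic type nor a dynamic value.
func isNilInterface(r Runner) bool {
	return r == nil
}

// holdsNilPointer reports whether r is nil or wraps a nil pointer.
// It uses reflection to inspect the dynamic value.
func holdsNilPointer(r Runner) bool {
	if r == nil {
		return true
	}
	v := reflect.ValueOf(r)
	return v.Kind() == reflect.Ptr && v.IsNil()
}

// typedNil returns an interface whose dynamic type is *Person
// and whose dynamic value is nil.
func typedNil() Runner {
	var p *Person
	return p
}

func main() {
	r := typedNil()
	fmt.Println("interface is nil:", isNilInterface(r))
	fmt.Println("holds nil pointer:", holdsNilPointer(r))
}
```

Note that calling r.Run() on this value would panic, because the value-receiver method needs to dereference the nil pointer.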

Now that we know the concept of an interface value, how are interfaces actually implemented under the hood?

3. Implementation principle

Go interface types have two different implementations, depending on whether the interface contains a set of methods: iface is the structure for interfaces that contain methods, and eface is the structure for interfaces without any methods.

3.1 iface

Under the hood, iface is a struct defined as follows:

//runtime/runtime2.go
type iface struct {
        tab  *itab
        data unsafe.Pointer
}

There are two pointers inside iface, one is the itab structure pointer, and the other is the pointer to the data

The unsafe.Pointer type is a special type of pointer that can store the address of any variable (similar to void* in C)
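The two-pointer layout can be observed from user code with the unsafe package. This is only an illustrative sketch: ifaceWords, words, and inspect are hypothetical names, and the code assumes the two-word layout of runtime.iface shown above, which is an internal detail of the gc toolchain rather than a supported API.

```go
package main

import (
	"fmt"
	"unsafe"
)

type Runner interface {
	Run()
}

type Person struct {
	Name string
}

func (p Person) Run() {}

// ifaceWords mirrors the two-word layout of runtime.iface.
// Assumption: this matches the gc runtime; it is not a public API.
type ifaceWords struct {
	tab  unsafe.Pointer
	data unsafe.Pointer
}

// words reinterprets an interface value as its two underlying pointers.
func words(r *Runner) ifaceWords {
	return *(*ifaceWords)(unsafe.Pointer(r))
}

// inspect reports whether the tab and data words of r are nil.
func inspect(r Runner) (tabNil, dataNil bool) {
	w := words(&r)
	return w.tab == nil, w.data == nil
}

func main() {
	var r Runner
	fmt.Println(inspect(r)) // zero interface: no itab and no data

	var p *Person
	r = p
	fmt.Println(inspect(r)) // itab is set, data is the nil pointer

	r = &Person{Name: "song_chh"}
	fmt.Println(inspect(r)) // both words are set
}
```

This also gives a concrete view of the nil example in 2.5: after r = p the tab word is non-nil, so r == nil is false even though the data word is nil.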

//runtime/runtime2.go
type itab struct {
        inter *interfacetype
        _type *_type
        hash  uint32 // copy of _type.hash. Used for type switches.
        _     [4]byte
        fun   [1]uintptr // variable sized. fun[0]==0 means _type does not implement inter.
}

itab describes the relationship between a concrete type and an interface type. inter is the definition of the interface type, _type is the concrete type information, and hash is a copy of _type.hash, used during type conversion to quickly check whether the target type matches the type stored in the interface. fun is the list of method addresses. Although fun is declared as an array with a fixed length of 1, it is really a flexible array whose number of elements is not fixed; when there are multiple methods they are sorted in lexicographic order.

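//runtime/type.go
type interfacetype struct {
        typ     _type
        pkgpath name
        mhdr    []imethod
}

interfacetype describes the definition of an interface. typ is the type information of the interface, pkgpath is the name of the package that defines the interface, and mhdr is the table of methods declared in the interface, sorted in lexicographic order.

> Suppose the interface has ni methods and the struct implementing it has nt methods. Generating the itab function table then takes O(ni*nt) time in the naive case; if both the interface method list and the struct method list are sorted, it takes O(ni+nt).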
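The O(ni+nt) bound comes from walking both sorted lists once. The following is a simplified model of that idea in which methods are represented by plain name strings; matchMethods is a hypothetical helper written for this article and is not the runtime's actual code.

```go
package main

import "fmt"

// matchMethods assumes both slices hold method names sorted in
// lexicographic order. For each interface method it returns the index of
// the matching method in the type's list, or ok=false if one is missing.
func matchMethods(ifaceMethods, typeMethods []string) (idx []int, ok bool) {
	idx = make([]int, 0, len(ifaceMethods))
	j := 0
	for _, name := range ifaceMethods {
		// j only ever moves forward, so the total work is O(ni + nt).
		for j < len(typeMethods) && typeMethods[j] < name {
			j++
		}
		if j == len(typeMethods) || typeMethods[j] != name {
			return nil, false // the type does not implement the interface
		}
		idx = append(idx, j)
		j++
	}
	return idx, true
}

func main() {
	fmt.Println(matchMethods([]string{"Run", "Say"}, []string{"Eat", "Run", "Say", "Walk"}))
	fmt.Println(matchMethods([]string{"Run", "Say"}, []string{"Eat", "Run"}))
}
```

Without the sort order, each interface method would have to be searched for across the whole type method list, which is where the O(ni*nt) figure comes from.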

//runtime/type.go
type _type struct {
        size       uintptr
        ptrdata    uintptr // size of memory prefix holding all pointers
        hash       uint32
        tflag      tflag
        align      uint8
        fieldalign uint8
        kind       uint8
        alg        *typeAlg
        // gcdata stores the GC type data for the garbage collector.
        // If the KindGCProg bit is set in kind, gcdata is a GC program.
        // Otherwise it is a ptrmask bitmap. See mbitmap.go for details.
        gcdata    *byte
        str       nameOff
        ptrToThis typeOff
}

_type is the common description shared by all types. size is the size of the type, hash is the hash value of the type, tflag holds type flags related to reflection, align and fieldalign relate to memory alignment, and kind is the type kind number, whose values are defined in runtime/typekind.go. gcdata holds GC-related information.

The structure diagram of the whole iface is as follows:

[Figure: structure diagram of iface]

3.2 eface

Compared with iface, the eface structure is simpler:

//runtime/runtime2.go
type eface struct {
        _type *_type
        data  unsafe.Pointer
}

eface also holds two pointers: one to the _type structure with the concrete type information, and one to the data.

[Figure: structure diagram of eface]

3.3 Concrete type conversion to interface type

So far we have covered what an interface is and what its underlying structures look like. How is the conversion performed when a value of a concrete type is assigned to an interface type? Let's return to the example from section 2.3:

  1 package main
  2 
  3 import "fmt"
  4 
  5 type Runner interface {
  6     Run()
  7 }
  8 
  9 type Person struct {
 10     Name string
 11 }
 12 
 13 func (p Person) Run() {
 14     fmt.Printf("%s is running\n", p.Name)
 15 }
 16 
 17 func main() {
 18     var r Runner
 19     r = Person{Name: "song_chh"}
 20     r.Run()
 21 }

Generate the assembly code with the tool provided by Go:

go tool compile -S interface.go

Only the code related to line 19 is excerpted here:

0x001d 00029 (interface.go:19)  PCDATA  $2, $0
0x001d 00029 (interface.go:19)  PCDATA  $0, $1
0x001d 00029 (interface.go:19)  XORPS   X0, X0
0x0020 00032 (interface.go:19)  MOVUPS  X0, ""..autotmp_1+32(SP)
0x0025 00037 (interface.go:19)  PCDATA  $2, $1
0x0025 00037 (interface.go:19)  LEAQ    go.string."song_chh"(SB), AX
0x002c 00044 (interface.go:19)  PCDATA  $2, $0
0x002c 00044 (interface.go:19)  MOVQ    AX, ""..autotmp_1+32(SP)
0x0031 00049 (interface.go:19)  MOVQ    $8, ""..autotmp_1+40(SP)
0x003a 00058 (interface.go:19)  PCDATA  $2, $1
0x003a 00058 (interface.go:19)  LEAQ    go.itab."".Person,"".Runner(SB), AX
0x0041 00065 (interface.go:19)  PCDATA  $2, $0
0x0041 00065 (interface.go:19)  MOVQ    AX, (SP)
0x0045 00069 (interface.go:19)  PCDATA  $2, $1
0x0045 00069 (interface.go:19)  PCDATA  $0, $0
0x0045 00069 (interface.go:19)  LEAQ    ""..autotmp_1+32(SP), AX
0x004a 00074 (interface.go:19)  PCDATA  $2, $0
0x004a 00074 (interface.go:19)  MOVQ    AX, 8(SP)
0x004f 00079 (interface.go:19)  CALL    runtime.convT2I(SB)
0x0054 00084 (interface.go:19)  MOVQ    16(SP), AX
0x0059 00089 (interface.go:19)  PCDATA  $2, $2
0x0059 00089 (interface.go:19)  MOVQ    24(SP), CX

As shown, after constructing the itab the compiler calls the conversion function runtime.convT2I. Its implementation is:

//runtime/iface.go
func convT2I(tab *itab, elem unsafe.Pointer) (i iface) {
        t := tab._type
        if raceenabled {
                raceReadObjectPC(t, elem, getcallerpc(), funcPC(convT2I))
        }
        if msanenabled {
                msanread(elem, t.size)
        }
        x := mallocgc(t.size, t, true)
        typedmemmove(t, x, elem)
        i.tab = tab
        i.data = x
        return
}

First, mallocgc is called to allocate a block of memory sized for the type. The content that elem points to is then copied into the new space, tab is assigned to the tab field of the iface, and the new memory pointer is assigned to the data field. The iface is now complete.
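One observable consequence of this copy is that the interface holds its own snapshot of the struct. The sketch below is a variation on the article's example: Run returns a string instead of printing so that the result can be compared, and copyOnAssign is a hypothetical helper name.

```go
package main

import "fmt"

type Runner interface {
	Run() string
}

type Person struct {
	Name string
}

func (p Person) Run() string { return p.Name + " is running" }

// copyOnAssign stores a struct value in an interface and then mutates
// the original variable.
func copyOnAssign() (fromInterface, fromVariable string) {
	p := Person{Name: "song_chh"}
	var r Runner = p   // the struct is copied into memory owned by the interface
	p.Name = "changed" // only the original variable changes
	return r.Run(), p.Run()
}

func main() {
	fromInterface, fromVariable := copyOnAssign()
	fmt.Println(fromInterface)
	fmt.Println(fromVariable)
}
```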

Change the sample code slightly so that a struct pointer is assigned to the interface variable:

 19     r = &Person{Name: "song_chh"}

Generate the assembly code again:

go tool compile -S interface.go

The relevant assembly is:

0x001d 00029 (interface.go:19)  PCDATA  $2, $1
0x001d 00029 (interface.go:19)  PCDATA  $0, $0
0x001d 00029 (interface.go:19)  LEAQ    type."".Person(SB), AX
0x0024 00036 (interface.go:19)  PCDATA  $2, $0
0x0024 00036 (interface.go:19)  MOVQ    AX, (SP)
0x0028 00040 (interface.go:19)  CALL    runtime.newobject(SB)
0x002d 00045 (interface.go:19)  PCDATA  $2, $2
0x002d 00045 (interface.go:19)  MOVQ    8(SP), DI
0x0032 00050 (interface.go:19)  MOVQ    $8, 8(DI)
0x003a 00058 (interface.go:19)  PCDATA  $2, $-2
0x003a 00058 (interface.go:19)  PCDATA  $0, $-2
0x003a 00058 (interface.go:19)  CMPL    runtime.writeBarrier(SB), $0
0x0041 00065 (interface.go:19)  JNE     105
0x0043 00067 (interface.go:19)  LEAQ    go.string."song_chh"(SB), AX
0x004a 00074 (interface.go:19)  MOVQ    AX, (DI)

First, the compiler obtains a pointer to the type descriptor of the Person struct and passes it as the argument to runtime.newobject(). The function is defined in the source as follows:

// runtime/malloc.go

// implementation of new builtin
// compiler (both frontend and SSA backend) knows the signature
// of this function
func newobject(typ *_type) unsafe.Pointer {
        return mallocgc(typ.size, typ, true)
}

newobject takes the type descriptor of Person as its input parameter, allocates a new Person struct, and returns a pointer to it. The fields of the struct are then set, after which the compiler builds the iface.
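In the pointer case no struct copy is made: the interface's data word is the pointer itself, so the interface and the original pointer refer to the same Person. A minimal sketch, again using a string-returning Run as a variation on the article's example (shareOnAssign is a hypothetical helper name):

```go
package main

import "fmt"

type Runner interface {
	Run() string
}

type Person struct {
	Name string
}

func (p Person) Run() string { return p.Name + " is running" }

// shareOnAssign stores a struct pointer in an interface and then mutates
// the struct through the original pointer.
func shareOnAssign() string {
	p := &Person{Name: "song_chh"}
	var r Runner = p   // only the pointer is stored in the interface
	p.Name = "changed" // visible through r, since both refer to one struct
	return r.Run()
}

func main() {
	fmt.Println(shareOnAssign())
}
```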

Besides convT2I, many other conversion functions are declared in the compiler's builtin runtime.go file (cmd/compile/internal/gc/builtin/runtime.go):

// Non-empty-interface to non-empty-interface conversion.
func convI2I(typ *byte, elem any) (ret any)

// Specialized type-to-interface conversion.
// These return only a data pointer.
func convT16(val any) unsafe.Pointer     // val must be uint16-like (same size and alignment as a uint16)
func convT32(val any) unsafe.Pointer     // val must be uint32-like (same size and alignment as a uint32)
func convT64(val any) unsafe.Pointer     // val must be uint64-like (same size and alignment as a uint64 and contains no pointers)
func convTstring(val any) unsafe.Pointer // val must be a string
func convTslice(val any) unsafe.Pointer  // val must be a slice

// Type to empty-interface conversion.
func convT2E(typ *byte, elem *any) (ret any)
func convT2Enoptr(typ *byte, elem *any) (ret any)

// Type to non-empty-interface conversion.   
func convT2I(tab *byte, elem *any) (ret any)        //for the general case
func convT2Inoptr(tab *byte, elem *any) (ret any)   //for structs that do not contain pointers

convT2Inoptr is used for converting structs that contain no pointers; noptr can be read as "no pointer". The conversion process is similar to that of convT2I.

convT16, convT32, convT64, convTstring, and convTslice are specialized for particular types. Take convT64 as an example:

//runtime/iface.go
func convT64(val uint64) (x unsafe.Pointer) {
        if val == 0 {
                x = unsafe.Pointer(&zeroVal[0])
        } else {
                x = mallocgc(8, uint64Type, false)
                *(*uint64)(x) = val
        }
        return
}

Compared with the convT2* family of functions, it skips the calls to typedmemmove and memmove, which reduces memory copying. In addition, if the value is the zero value of the type, it does not call mallocgc to allocate new memory and directly returns a pointer to zeroVal[0].

Now look at the empty interface conversion function convT2E:

func convT2E(t *_type, elem unsafe.Pointer) (e eface) {
        if raceenabled {
                raceReadObjectPC(t, elem, getcallerpc(), funcPC(convT2E))
        }
        if msanenabled {
                msanread(elem, t.size)
        }
        x := mallocgc(t.size, t, true)
        // TODO: We allocate a zeroed object only to overwrite it with actual data.
        // Figure out how to avoid zeroing. Also below in convT2Eslice, convT2I, convT2Islice.
        typedmemmove(t, x, elem)
        e._type = t
        e.data = x
        return
}

convT2E is similar to convT2I. When converting to an eface, the *_type is likewise generated by the compiler and passed to convT2E as an input parameter.

3.4 Assertions

The previous section described how a concrete type is converted into an interface type. So how is an interface type converted back into a concrete type? Go provides two mechanisms: type assertions and type switches.

Type assertion

There are two ways to write a type assertion:

v := x.(T)
v, ok := x.(T)

  • x: an expression of interface type
  • T: a known type

Note that with the first form, a failed type assertion triggers a panic.
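The difference between the two forms can be sketched as follows. asString and mustString are hypothetical helper names; mustString uses recover only to make the panic of the single-value form visible as an error.

```go
package main

import "fmt"

// asString uses the two-value form, which never panics: on failure it
// returns the zero value of the asserted type and ok=false.
func asString(x interface{}) (string, bool) {
	s, ok := x.(string)
	return s, ok
}

// mustString uses the single-value form and converts the resulting
// panic into an error so that the caller can observe it.
func mustString(x interface{}) (s string, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("type assertion failed: %v", r)
		}
	}()
	return x.(string), nil
}

func main() {
	fmt.Println(asString("hi"))
	fmt.Println(asString(42))
	fmt.Println(mustString(42))
}
```

In most code the two-value form is the safer default, and the single-value form is reserved for cases where a mismatch is a programming error.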

Type switch

switch x := x.(type) {
    /* ... */
}

Example:

switch i.(type) {
case string:
    fmt.Println("i'm a string")
case int:
    fmt.Println("i'm a int")
default:
    fmt.Println("unknown")
}
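When the switch binds a variable, that variable has the concrete type of the matching case inside each branch. A small sketch, with classify as a hypothetical helper name:

```go
package main

import (
	"fmt"
	"strconv"
)

// classify describes the dynamic type of i. Inside each case, v has the
// type named by that case, so no further assertion is needed.
func classify(i interface{}) string {
	switch v := i.(type) {
	case string:
		return "string of length " + strconv.Itoa(len(v)) // v is a string
	case int:
		return "int " + strconv.Itoa(v) // v is an int
	case nil:
		return "nil" // i has no dynamic type
	default:
		return fmt.Sprintf("unknown type %T", v) // v keeps the type of i
	}
}

func main() {
	for _, v := range []interface{}{"go", 7, nil, 1.5} {
		fmt.Println(classify(v))
	}
}
```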

Origin blog.csdn.net/zhw21w/article/details/129488201