Interface Analysis
This article is based on an analysis of the Go 1.12.12 source code; all code was run and debugged on an amd64 machine
1. Duck Typing
1.1 What is duck typing
(Source: Baidu Encyclopedia)
Is the big yellow duck in the picture a duck? From a traditional point of view it is not, because it can neither quack nor walk, and it is not even alive
First, look at the definition of duck typing, taken from Wikipedia:
If it walks like a duck and it quacks like a duck, then it must be a duck
So, from the point of view of duck typing, the big yellow duck in the picture is a duck
Duck typing is a style of typing in programming that judges a thing by its external behavior rather than its internal structure
1.2 Duck typing in Go language
Go implements duck typing through interfaces. Unlike dynamic languages, where type mismatches can only be detected at runtime, and unlike most static languages, where a type must explicitly declare which interfaces it implements, Go interfaces are implemented implicitly
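As a minimal sketch of implicit implementation (Quacker, Duck, Robot, and makeNoise are hypothetical names invented for this illustration), neither type below declares that it implements the interface, yet both can be passed wherever the interface is expected, and a mismatch would be caught at compile time:

```go
package main

import "fmt"

// Quacker is a hypothetical interface used only for this illustration.
type Quacker interface {
	Quack() string
}

// Duck and Robot never mention Quacker, yet both satisfy it
// because each one has a Quack() string method.
type Duck struct{}

func (Duck) Quack() string { return "Quack!" }

type Robot struct{}

func (Robot) Quack() string { return "Beep-quack." }

// makeNoise accepts anything that behaves like a Quacker.
func makeNoise(q Quacker) string {
	return q.Quack()
}

func main() {
	fmt.Println(makeNoise(Duck{}))  // Quack!
	fmt.Println(makeNoise(Robot{})) // Beep-quack.
}
```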
2. Overview
2.1 Interface type
An interface is an abstract type. It does not expose the layout or internal structure of the data it contains, nor any basic operations on that data; it only provides some methods. When you hold a variable of an interface type, you have no way of knowing what it is, but you do know what it can do, or more precisely, which methods it provides
2.2 Interface definition
Go provides the interface keyword. An interface can only declare the methods that need to be implemented and cannot contain any variables
type InterfaceName interface {
    MethodName1(parameterList1) returnValueList1
    MethodName2(parameterList2) returnValueList2
    ...
}
For example, io.Writer is an interface type:
type Writer interface {
    Write(p []byte) (n int, err error)
}
Interfaces can be embedded in one another to form new interfaces, such as io.ReadWriter:
type Reader interface {
    Read(p []byte) (n int, err error)
}
type ReadWriter interface {
    Reader
    Writer
}
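To see an embedded interface in use, here is a small sketch. The helpers roundTrip and demo are hypothetical names; it relies on the standard library fact that *bytes.Buffer has both Read and Write methods and therefore satisfies io.ReadWriter:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// roundTrip writes msg through the Writer half of rw and
// reads it back through the Reader half.
func roundTrip(rw io.ReadWriter, msg string) (string, error) {
	if _, err := rw.Write([]byte(msg)); err != nil {
		return "", err
	}
	buf := make([]byte, len(msg))
	n, err := io.ReadFull(rw, buf)
	if err != nil {
		return "", err
	}
	return string(buf[:n]), nil
}

// demo runs roundTrip against an in-memory buffer.
func demo() (string, error) {
	var buf bytes.Buffer
	return roundTrip(&buf, "hello")
}

func main() {
	got, err := demo()
	fmt.Println(got, err) // hello <nil>
}
```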
An interface that contains no methods is called the empty interface type:
interface{}
2.3 Implement the interface
A concrete type implements an interface if it implements all the methods the interface requires. A value can be assigned to an interface variable only if its type implements that interface
In the following example, the Runner interface contains a single Run() method. The Person struct implements Run(), so Person implements the Runner interface
package main
import "fmt"
type Runner interface {
    Run()
}
type Person struct {
    Name string
}
// Run has a value receiver, so a Person value implements Runner.
func (p Person) Run() {
    fmt.Printf("%s is running\n", p.Name)
}
func main() {
    var r Runner
    r = Person{Name: "song_chh"}
    r.Run()
}
In addition, because the empty interface defines no methods, every type implements it, which means any value can be assigned to a variable of the empty interface type
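A short sketch of this property (describe is a hypothetical helper name): the function below accepts values of unrelated types through a single interface{} parameter and reports the dynamic type of each one.

```go
package main

import "fmt"

// describe accepts any value because every type implements interface{}.
// It returns the name of the value's dynamic type.
func describe(v interface{}) string {
	return fmt.Sprintf("%T", v)
}

func main() {
	fmt.Println(describe(42))       // int
	fmt.Println(describe("go"))     // string
	fmt.Println(describe([]int{1})) // []int
	fmt.Println(describe(nil))      // <nil>
}
```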
2.4 Interfaces and pointers
When an interface defines a set of methods, it does not restrict the receiver type of the implementation, so a method can be implemented in two ways: with a pointer receiver or with a value receiver
A type cannot implement the same method with both kinds of receiver at the same time
Add a Say() method to the Runner interface, and let the Person struct implement Say() with a pointer receiver
type Runner interface {
    Run()
    Say()
}
type Person struct {
    Name string
}
func (p Person) Run() {
    fmt.Printf("%s is running\n", p.Name)
}
func (p *Person) Say() {
    fmt.Printf("hello, %s", p.Name)
}
When initializing an interface variable, you can use either a struct value or a struct pointer
var r Runner
r = &Person{Name: "sch_chh"}
r = Person{Name: "sch_chh"}
Since the receiver type of the implementation and the type used to initialize the interface each have two options, there are four possible combinations
| - | value receiver | pointer receiver |
|---|---|---|
| value initialization | √ | × |
| pointer initialization | √ | √ |

× indicates that compilation fails
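One way to make the table concrete is to count the methods that belong to each type. The sketch below (methodCounts is a hypothetical helper) uses reflect.Type.NumMethod, which for a non-interface type reports the exported methods in its method set. A Person value carries only the value-receiver method, while *Person carries both:

```go
package main

import (
	"fmt"
	"reflect"
)

type Person struct {
	Name string
}

func (p Person) Run()  {} // value receiver
func (p *Person) Say() {} // pointer receiver

// methodCounts returns the number of exported methods in the
// method sets of Person and *Person.
func methodCounts() (value, pointer int) {
	return reflect.TypeOf(Person{}).NumMethod(),
		reflect.TypeOf(&Person{}).NumMethod()
}

func main() {
	fmt.Println(methodCounts()) // 1 2
}
```

Because Say is missing from the method set of Person, a Person value cannot satisfy an interface that requires Say, which matches the × cell.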
It is easy to understand why the following two cases compile:
- The method receiver and the initialization type are both struct values
- The method receiver and the initialization type are both struct pointers
First, let's look at another case that compiles: the method receiver is a struct value and the variable is initialized with a pointer
type Runner interface {
    Run()
    Say()
}
type Person struct {
    Name string
}
func (p Person) Run() {
    fmt.Printf("%s is running\n", p.Name)
}
func (p *Person) Say() {
    fmt.Printf("hello, %s", p.Name)
}
func main() {
    var r Runner
    r = &Person{Name: "sch_chh"}
    r.Run()
    r.Say()
}
In the code above, the Person pointer can call both Run and Say directly, because a struct pointer can be implicitly dereferenced to obtain the underlying struct, and the corresponding method is then called on that struct
If the address-of operator & is removed, so that the variable is initialized with a struct value
r = Person{Name: "sch_chh"}
compilation fails with the following error
./pointer.go:24:4: cannot use Person literal (type Person) as type Runner in assignment:
Person does not implement Runner (Say method has pointer receiver)
So why does the compilation fail? First of all, in Go, arguments are passed by value
When the variable in the code is &Person{}, the pointer is copied during the method call. The copy is a new Person pointer, but it still points to the same struct, so the compiler can implicitly dereference it to obtain the struct and complete the method call
When the variable in the code is Person{}, the struct itself is copied during the method call, that is, Run() and Say() would receive a new Person{} value. If the method receiver is *Person, the compiler cannot derive a unique pointer from that copy (the value stored in an interface is not addressable), so it reports an error
Note: for an addressable variable of concrete type T, calling a *T method on it directly is also legal, because the compiler implicitly takes its address for you. This is only syntactic sugar and does not apply to values stored in an interface
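The note above can be illustrated with a small sketch. Rename and renameVariable are hypothetical names introduced only for this example; the point is that p.Rename(...) on an addressable variable is shorthand for (&p).Rename(...):

```go
package main

import "fmt"

type Person struct {
	Name string
}

// Rename has a pointer receiver, so it can modify the caller's struct.
func (p *Person) Rename(name string) {
	p.Name = name
}

// renameVariable calls a *Person method on a plain Person variable.
func renameVariable() string {
	p := Person{Name: "old"} // p is an addressable variable
	p.Rename("new")          // shorthand for (&p).Rename("new")
	return p.Name
}

func main() {
	fmt.Println(renameVariable()) // new
	// Person{Name: "x"}.Rename("y") would not compile:
	// a composite literal is not addressable.
}
```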
2.5 nil and non-nil
Look at another example, still using the Runner interface and the Person struct. Pay attention to the body of main(): it first declares an interface variable r and prints whether it is nil, then declares a variable p of type *Person and prints whether p is nil, and finally assigns p to r and prints whether r is nil
type Runner interface {
    Run()
}
type Person struct {
    Name string
}
func (p Person) Run() {
    fmt.Printf("%s is running\n", p.Name)
}
func main() {
    var r Runner
    fmt.Println("r:", r == nil)
    var p *Person
    fmt.Println("p:", p == nil)
    r = p
    fmt.Println("r:", r == nil)
}
What is the output?
r: true or false
p: true or false
r: true or false
The actual output is:
r: true
p: true
r: false
The first two outputs are easy to understand: r is nil and p is nil because the zero value of both interface types and pointer types is nil. But why is r no longer nil after p is assigned to it? The answer lies in the concept of an interface value
2.6 Interface Values
Conceptually, a value of an interface type (an interface value for short) has two parts: a concrete type and a value of that type. These are called the dynamic type and the dynamic value of the interface. An interface value is nil if and only if both its dynamic type and its dynamic value are nil
Going back to the example in 2.5, when p is assigned to the r interface, the actual structure of r is shown in the figure
To verify whether this is really the case, add a line of code at the end of the main function body
fmt.Printf("r type: %T, data: %v\n", r, r)
Output:
r type: *main.Person, data: <nil>
You can see that the dynamic value is indeed nil, while the dynamic type is *main.Person, which is why r == nil is false
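The same behavior often shows up with error values. The following sketch uses a hypothetical error type MyErr and two hypothetical functions, buggy and fixed: returning a nil pointer through an interface return type produces a non-nil interface value, while returning a literal nil does not.

```go
package main

import "fmt"

// MyErr is a hypothetical error type for this illustration.
type MyErr struct{}

func (*MyErr) Error() string { return "my error" }

// buggy returns a nil *MyErr wrapped in the error interface,
// so the result has dynamic type *MyErr and is not nil.
func buggy() error {
	var e *MyErr
	return e
}

// fixed returns a plain nil, so the interface has no dynamic type.
func fixed() error {
	return nil
}

func main() {
	fmt.Println(buggy() == nil) // false
	fmt.Println(fixed() == nil) // true
}
```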
Now that we know the concept of an interface value, how are interfaces actually implemented under the hood?
3. Implementation principle
Depending on whether the interface contains a set of methods, interface types in Go have two different implementations: iface, the struct used for interfaces that contain methods, and eface, the struct used for interfaces without any methods
3.1 iface
Under the hood, iface is a struct defined as follows:
//runtime/runtime2.go
type iface struct {
    tab  *itab
    data unsafe.Pointer
}
There are two pointers inside iface, one is the itab structure pointer, and the other is the pointer to the data
The unsafe.Pointer type is a special type of pointer that can store the address of any variable (similar to void* in C)
//runtime/runtime2.go
type itab struct {
    inter *interfacetype
    _type *_type
    hash  uint32 // copy of _type.hash. Used for type switches.
    _     [4]byte
    fun   [1]uintptr // variable sized. fun[0]==0 means _type does not implement inter.
}
itab describes the relationship between a concrete type and an interface type. inter is the interface type definition; _type is the concrete type information; hash is a copy of _type.hash, used during type conversion to quickly check whether the target type matches the type stored in the interface; fun is the list of method addresses. Although fun is declared as an array of fixed length 1, it is actually a flexible array that holds a variable number of elements, and the methods are sorted in lexicographic order
//runtime/type.go
type interfacetype struct {
    typ     _type
    pkgpath name
    mhdr    []imethod
}
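interfacetype describes the definition of an interface: typ is the type information of the interface, pkgpath is the name of the package that defines the interface, and mhdr is the table of methods declared in the interface, sorted in lexicographic order
> Suppose the interface has ni methods and the struct implementing it has nt methods. Generating the itab method table then takes O(ni*nt) time in general; if both the interface method list and the struct method list are sorted, the method table can be generated in O(ni+nt) time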
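The O(ni+nt) bound comes from walking both sorted lists once. The following is a simplified model of that idea, not the runtime's actual code: matchSorted is a hypothetical function that works on method names as plain strings and returns, for each interface method, the index of the matching method on the type.

```go
package main

import "fmt"

// matchSorted is a simplified, hypothetical model of filling a method table.
// Both slices must be sorted. It returns, for each interface method, the
// index of the matching method in typeMethods, or false if one is missing.
func matchSorted(ifaceMethods, typeMethods []string) ([]int, bool) {
	table := make([]int, 0, len(ifaceMethods))
	j := 0
	for _, want := range ifaceMethods {
		// j never moves backwards, so the total work is O(ni+nt).
		for j < len(typeMethods) && typeMethods[j] < want {
			j++
		}
		if j == len(typeMethods) || typeMethods[j] != want {
			return nil, false // the type does not implement the interface
		}
		table = append(table, j)
		j++
	}
	return table, true
}

func main() {
	fmt.Println(matchSorted([]string{"Run", "Say"}, []string{"Eat", "Run", "Say", "Walk"})) // [1 2] true
	fmt.Println(matchSorted([]string{"Run", "Say"}, []string{"Eat", "Run"}))                // [] false
}
```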
//runtime/type.go
type _type struct {
    size       uintptr
    ptrdata    uintptr // size of memory prefix holding all pointers
    hash       uint32
    tflag      tflag
    align      uint8
    fieldalign uint8
    kind       uint8
    alg        *typeAlg
    // gcdata stores the GC type data for the garbage collector.
    // If the KindGCProg bit is set in kind, gcdata is a GC program.
    // Otherwise it is a ptrmask bitmap. See mbitmap.go for details.
    gcdata    *byte
    str       nameOff
    ptrToThis typeOff
}
_type is the common description shared by all types. size is the size of the type; hash is the hash value of the type; tflag holds the type's flags, which are related to reflection; align and fieldalign are related to memory alignment; kind is the type kind number, whose definitions are located in runtime/typekind.go; gcdata holds GC-related information
The structure diagram of the whole iface is as follows:
3.2 eface
Compared with iface, eface structure is relatively simple
//runtime/runtime2.go
type eface struct {
    _type *_type
    data  unsafe.Pointer
}
There are also two pointers inside eface, a pointer to the specific type information _type structure, and a pointer to data
3.3 Concrete type conversion to interface type
So far we have covered what an interface is and what its underlying structure looks like. How is the conversion performed when a value of a concrete type is assigned to an interface type? Let's look again at the example from section 2.3
 1 package main
 2
 3 import "fmt"
 4
 5 type Runner interface {
 6     Run()
 7 }
 8
 9 type Person struct {
10     Name string
11 }
12
13 func (p Person) Run() {
14     fmt.Printf("%s is running\n", p.Name)
15 }
16
17 func main() {
18     var r Runner
19     r = Person{Name: "song_chh"}
20     r.Run()
21 }
Generate assembly code through the tools provided by Go
go tool compile -S interface.go
Only the code related to line 19 is excerpted here
0x001d 00029 (interface.go:19) PCDATA $2, $0
0x001d 00029 (interface.go:19) PCDATA $0, $1
0x001d 00029 (interface.go:19) XORPS X0, X0
0x0020 00032 (interface.go:19) MOVUPS X0, ""..autotmp_1+32(SP)
0x0025 00037 (interface.go:19) PCDATA $2, $1
0x0025 00037 (interface.go:19) LEAQ go.string."song_chh"(SB), AX
0x002c 00044 (interface.go:19) PCDATA $2, $0
0x002c 00044 (interface.go:19) MOVQ AX, ""..autotmp_1+32(SP)
0x0031 00049 (interface.go:19) MOVQ $8, ""..autotmp_1+40(SP)
0x003a 00058 (interface.go:19) PCDATA $2, $1
0x003a 00058 (interface.go:19) LEAQ go.itab."".Person,"".Runner(SB), AX
0x0041 00065 (interface.go:19) PCDATA $2, $0
0x0041 00065 (interface.go:19) MOVQ AX, (SP)
0x0045 00069 (interface.go:19) PCDATA $2, $1
0x0045 00069 (interface.go:19) PCDATA $0, $0
0x0045 00069 (interface.go:19) LEAQ ""..autotmp_1+32(SP), AX
0x004a 00074 (interface.go:19) PCDATA $2, $0
0x004a 00074 (interface.go:19) MOVQ AX, 8(SP)
0x004f 00079 (interface.go:19) CALL runtime.convT2I(SB)
0x0054 00084 (interface.go:19) MOVQ 16(SP), AX
0x0059 00089 (interface.go:19) PCDATA $2, $2
0x0059 00089 (interface.go:19) MOVQ 24(SP), CX
It can be seen that, after constructing the itab, the compiler calls the conversion function runtime.convT2I. Its implementation is as follows
//runtime/iface.go
func convT2I(tab *itab, elem unsafe.Pointer) (i iface) {
    t := tab._type
    if raceenabled {
        raceReadObjectPC(t, elem, getcallerpc(), funcPC(convT2I))
    }
    if msanenabled {
        msanread(elem, t.size)
    }
    x := mallocgc(t.size, t, true)
    typedmemmove(t, x, elem)
    i.tab = tab
    i.data = x
    return
}
First, mallocgc is called to allocate a block of memory according to the size of the type. Then the content that elem points to is copied into the new space. Finally, tab is assigned to the tab field of the iface and the pointer to the new memory is assigned to its data field, and the iface is complete
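One observable consequence of this copy is that the interface keeps its own snapshot of the struct. In the sketch below, Run is changed to return a string so the result can be compared (a hypothetical variation of the article's example, as is the snapshot helper); modifying the original variable after the assignment does not affect what the interface sees:

```go
package main

import "fmt"

type Runner interface {
	Run() string
}

type Person struct {
	Name string
}

func (p Person) Run() string {
	return p.Name + " is running"
}

// snapshot stores p in an interface, then changes p, and returns
// what the interface sees and what the variable sees.
func snapshot() (viaIface, viaVar string) {
	p := Person{Name: "song_chh"}
	var r Runner = p // the interface receives its own copy of p
	p.Name = "changed"
	return r.Run(), p.Run()
}

func main() {
	a, b := snapshot()
	fmt.Println(a) // song_chh is running
	fmt.Println(b) // changed is running
}
```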
Slightly change the sample code to assign a variable of structure pointer type to an interface variable
19     r = &Person{Name: "song_chh"}
Generate assembly code again through the tool
go tool compile -S interface.go
View the following assembly code
0x001d 00029 (interface.go:19) PCDATA $2, $1
0x001d 00029 (interface.go:19) PCDATA $0, $0
0x001d 00029 (interface.go:19) LEAQ type."".Person(SB), AX
0x0024 00036 (interface.go:19) PCDATA $2, $0
0x0024 00036 (interface.go:19) MOVQ AX, (SP)
0x0028 00040 (interface.go:19) CALL runtime.newobject(SB)
0x002d 00045 (interface.go:19) PCDATA $2, $2
0x002d 00045 (interface.go:19) MOVQ 8(SP), DI
0x0032 00050 (interface.go:19) MOVQ $8, 8(DI)
0x003a 00058 (interface.go:19) PCDATA $2, $-2
0x003a 00058 (interface.go:19) PCDATA $0, $-2
0x003a 00058 (interface.go:19) CMPL runtime.writeBarrier(SB), $0
0x0041 00065 (interface.go:19) JNE 105
0x0043 00067 (interface.go:19) LEAQ go.string."song_chh"(SB), AX
0x004a 00074 (interface.go:19) MOVQ AX, (DI)
First, the compiler obtains a pointer to the type information of the Person struct and passes it as the argument to runtime.newobject(). The definition of this function in the source code is
// runtime/malloc.go
// implementation of new builtin
// compiler (both frontend and SSA backend) knows the signature
// of this function
func newobject(typ *_type) unsafe.Pointer {
    return mallocgc(typ.size, typ, true)
}
newobject takes the type information of Person as its argument, allocates a new Person struct, and returns a pointer to it. The generated code then sets the struct's fields, and the compiler builds the iface
In addition to convT2I, there are many other conversion functions. Their declarations, as the compiler sees them, are in cmd/compile/internal/gc/builtin/runtime.go
// Non-empty-interface to non-empty-interface conversion.
func convI2I(typ *byte, elem any) (ret any)
// Specialized type-to-interface conversion.
// These return only a data pointer.
func convT16(val any) unsafe.Pointer // val must be uint16-like (same size and alignment as a uint16)
func convT32(val any) unsafe.Pointer // val must be uint32-like (same size and alignment as a uint32)
func convT64(val any) unsafe.Pointer // val must be uint64-like (same size and alignment as a uint64 and contains no pointers)
func convTstring(val any) unsafe.Pointer // val must be a string
func convTslice(val any) unsafe.Pointer // val must be a slice
// Type to empty-interface conversion.
func convT2E(typ *byte, elem *any) (ret any)
func convT2Enoptr(typ *byte, elem *any) (ret any)
// Type to non-empty-interface conversion.
func convT2I(tab *byte, elem *any) (ret any) //for the general case
func convT2Inoptr(tab *byte, elem *any) (ret any) //for structs that do not contain pointers
convT2Inoptr is used when the struct contains no pointers; noptr can be read as "no pointer". Its conversion process is similar to that of convT2I
convT16, convT32, convT64, convTstring, and convTslice are conversion functions specialized for particular types. Take convT64 as an example
//runtime/iface.go
func convT64(val uint64) (x unsafe.Pointer) {
    if val == 0 {
        x = unsafe.Pointer(&zeroVal[0])
    } else {
        x = mallocgc(8, uint64Type, false)
        *(*uint64)(x) = val
    }
    return
}
Compared with the convT2 family of functions, it omits the calls to typedmemmove and memmove, which reduces memory copying. In addition, if the value is the zero value of its type, it does not call mallocgc to allocate new memory and directly returns a pointer to zeroVal[0]
Let's look at the empty interface conversion function convT2E
func convT2E(t *_type, elem unsafe.Pointer) (e eface) {
    if raceenabled {
        raceReadObjectPC(t, elem, getcallerpc(), funcPC(convT2E))
    }
    if msanenabled {
        msanread(elem, t.size)
    }
    x := mallocgc(t.size, t, true)
    // TODO: We allocate a zeroed object only to overwrite it with actual data.
    // Figure out how to avoid zeroing. Also below in convT2Eslice, convT2I, convT2Islice.
    typedmemmove(t, x, elem)
    e._type = t
    e.data = x
    return
}
convT2E is similar to convT2I: when a value is converted to an eface, the *_type is also generated by the compiler and passed to convT2E as an argument
3.4 Assertions
The previous section mainly introduced how a concrete type is converted into an interface type. So how is an interface type converted back into a concrete type? Go provides two ways: type assertions and type switches
type assertion
There are two ways to write type assertions
v := x.(T)
v, ok := x.(T)
- x: is an expression of interface type
- T: is a known type
Note that with the first form, a failed type assertion triggers a panic. With the second form, a failed assertion sets ok to false and v to the zero value of T
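The difference between the two forms can be sketched as follows. asString and mustString are hypothetical helper names; the second one recovers from the panic only so that the example can report it instead of crashing:

```go
package main

import "fmt"

// asString uses the two-value form, so a mismatch is reported
// through ok instead of a panic.
func asString(x interface{}) (string, bool) {
	s, ok := x.(string)
	return s, ok
}

// mustString uses the single-value form and recovers from the
// panic that a failed assertion triggers.
func mustString(x interface{}) (s string, panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	return x.(string), false
}

func main() {
	s, ok := asString("go")
	fmt.Printf("%q %v\n", s, ok) // "go" true
	s, ok = asString(42)
	fmt.Printf("%q %v\n", s, ok) // "" false
	s, panicked := mustString(42)
	fmt.Printf("%q %v\n", s, panicked) // "" true
}
```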
type switch
switch x := x.(type) {
    /* ... */
}
Example:
switch i.(type) {
case string:
    fmt.Println("i'm a string")
case int:
    fmt.Println("i'm an int")
default:
    fmt.Println("unknown")
}
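A type switch can also bind the converted value to a variable, and its cases may be interface types as well as concrete types. The following sketch (classify is a hypothetical function) shows both features:

```go
package main

import (
	"fmt"
	"strconv"
)

// classify describes i using a type switch. Inside each case,
// v has the type named by that case.
func classify(i interface{}) string {
	switch v := i.(type) {
	case nil:
		return "nil"
	case string:
		return "string of length " + strconv.Itoa(len(v))
	case int:
		return "int " + strconv.Itoa(v)
	case fmt.Stringer: // an interface type is also allowed as a case
		return "Stringer " + v.String()
	default:
		return fmt.Sprintf("unknown %T", v)
	}
}

func main() {
	fmt.Println(classify("go")) // string of length 2
	fmt.Println(classify(7))    // int 7
	fmt.Println(classify(nil))  // nil
	fmt.Println(classify(3.5))  // unknown float64
}
```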