First, what is a memory leak?
Concurrency in Go is built from goroutines (independently executing functions) and channels (used for communication between them). When working with goroutines, programmers must take care to avoid leaks. If a goroutine ends up blocked forever on I/O (for example, channel communication) or stuck in an endless loop, a goroutine leak occurs. Even a blocked goroutine consumes resources, so the program may use more memory than it actually needs, or eventually run out of memory and crash. A goroutine is given only a 2 KB stack when it is created (for background on memory issues, see http://ifeve.com/memory-barriers-or-fences/), but if a large number of goroutines are blocked, the wasted memory still adds up.
Go's garbage collector works by mark and sweep:
starting from the root variables, it traverses and marks every referenced object; after marking, the sweep phase reclaims the unmarked objects.
A blocked goroutine is in a state where it is waiting to be woken by the Go scheduler, so it cannot be reclaimed by the GC.
We know that a goroutine sending on a channel stays blocked until another goroutine receives from that channel, and vice versa: a goroutine receiving from a channel stays blocked until another goroutine sends on it.
When the channel is full (an unbuffered channel has no room at all), the sender blocks until a consumer receives:

ch := make(chan int)
go func() {
    ch <- 1 // blocks forever: nobody receives
    fmt.Println(111)
}()

When the receiver finds the channel empty, it blocks until a sender sends:

ch := make(chan int, 1)
go func() {
    <-ch // blocks forever: nobody sends
    fmt.Println(111)
}()

These two states are the simplest cases of a blocked goroutine. The leaks we run into in practice come down to a goroutine blocked in one of these ways.
How do memory leaks arise?
1. Sending on a channel with no receiver

package main

import (
    "fmt"
    "math/rand"
    "runtime"
    "time"
)

// query simulates a request that takes up to 100ms.
func query() int {
    n := rand.Intn(100)
    time.Sleep(time.Duration(n) * time.Millisecond)
    return n
}

// queryAll starts three queries but only receives the first result.
func queryAll() int {
    ch := make(chan int)
    go func() { ch <- query() }()
    go func() { ch <- query() }()
    go func() { ch <- query() }()
    return <-ch
}

func main() {
    for i := 0; i < 4; i++ {
        queryAll()
        fmt.Printf("#goroutines: %d\n", runtime.NumGoroutine())
    }
}

Output:

#goroutines: 3
#goroutines: 5
#goroutines: 7
#goroutines: 9

The number of goroutines grows after every call to queryAll. The problem is that once the first response has been received, the two "slower" goroutines send on a channel that no longer has a receiver on the other end, so they block forever.
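One common way to avoid this particular leak is to give the channel enough buffer space for every sender. The sketch below is a variation on the example above, not the original author's fix; the names queryAllBuffered and leakedAfter are hypothetical helpers introduced only for illustration.

```go
package main

import (
	"fmt"
	"math/rand"
	"runtime"
	"time"
)

// query simulates a request that takes up to 100ms.
func query() int {
	n := rand.Intn(100)
	time.Sleep(time.Duration(n) * time.Millisecond)
	return n
}

// queryAllBuffered returns the first result. The channel has room for all
// three results, so the two slower senders can complete their send and exit
// even though nobody receives their values.
func queryAllBuffered() int {
	const workers = 3
	ch := make(chan int, workers)
	for i := 0; i < workers; i++ {
		go func() { ch <- query() }()
	}
	return <-ch
}

// leakedAfter (hypothetical helper) calls queryAllBuffered the given number
// of times and reports how many more goroutines exist afterwards.
func leakedAfter(calls int) int {
	before := runtime.NumGoroutine()
	for i := 0; i < calls; i++ {
		queryAllBuffered()
	}
	// query sleeps for less than 100ms, so 200ms lets every sender finish.
	time.Sleep(200 * time.Millisecond)
	return runtime.NumGoroutine() - before
}

func main() {
	fmt.Printf("goroutines left behind: %d\n", leakedAfter(4))
}
```

With the buffered channel the goroutine count should settle back to where it started, at the cost of computing two results that are thrown away.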
2. nil channel
Writing to a nil channel blocks forever:

package main

func main() {
    var ch chan struct{}
    ch <- struct{}{}
}

This leads to a deadlock:

fatal error: all goroutines are asleep - deadlock!

goroutine 1 [chan send (nil chan)]:
main.main()
    ...

The same thing happens when reading from a nil channel:

var ch chan struct{}
<-ch

This can also happen when a channel is passed to a goroutine without having been initialized:

package main

import (
    "fmt"
    "runtime"
    "time"
)

func main() {
    var ch chan int
    if false {
        // Never executed, so ch stays nil.
        ch = make(chan int, 1)
        ch <- 1
    }
    go func(ch chan int) {
        <-ch // blocks forever on the nil channel
    }(ch)

    c := time.Tick(1 * time.Second)
    for range c {
        fmt.Printf("#goroutines: %d\n", runtime.NumGoroutine())
    }
}
3. Channel communication timeout
If, for some reason, the sending goroutine never delivers its message to the consumer, the consuming goroutine stays blocked for a long time waiting to be woken up. Adding a timeout to the read avoids this:

// Read from a channel and handle a timeout.
func testTimeout() {
    g := make(chan int)
    quit := make(chan bool)
    go func() {
        for {
            select {
            case v := <-g:
                fmt.Println(v)
            case <-time.After(time.Second * time.Duration(3)):
                quit <- true
                fmt.Println("timed out, notifying the main goroutine to exit")
                return
            }
        }
    }()
    for i := 0; i < 3; i++ {
        g <- i
    }
    <-quit
    fmt.Println("exit notification received, main goroutine exits")
}
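The same idea can be packaged as a small reusable function. This is only a sketch: recvWithTimeout is a hypothetical helper name, and it uses time.NewTimer instead of time.After so that the timer can be stopped as soon as a value arrives.

```go
package main

import (
	"fmt"
	"time"
)

// recvWithTimeout (hypothetical helper) waits up to d for a value on ch.
// ok is false if the timeout fired before a value arrived.
func recvWithTimeout(ch <-chan int, d time.Duration) (v int, ok bool) {
	timer := time.NewTimer(d)
	defer timer.Stop() // release the timer if a value arrives first
	select {
	case v = <-ch:
		return v, true
	case <-timer.C:
		return 0, false
	}
}

func main() {
	ready := make(chan int, 1)
	ready <- 42 // a value is already waiting in the buffer
	v, ok := recvWithTimeout(ready, time.Second)
	fmt.Println(v, ok)

	silent := make(chan int) // nobody ever sends on this channel
	v, ok = recvWithTimeout(silent, 50*time.Millisecond)
	fmt.Println(v, ok)
}
```

The first call returns the buffered value immediately; the second gives up after 50ms instead of blocking forever.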
Second, how to troubleshoot memory leaks
We can troubleshoot with pprof.
First, let's go over the basics of pprof.
What is pprof
pprof is Go's performance analysis tool. While a program is running, it can record runtime information such as CPU usage, memory usage, and goroutine activity. This information is very important when you need to locate a bug or tune performance.
Basic use
There are several ways to use pprof. Go ships a ready-made package, net/http/pprof: with a few lines of code you can enable pprof, record runtime information, and serve it over HTTP, so that the data can be fetched either from a browser or from the command line.

package main

import (
    "fmt"
    "net/http"
    _ "net/http/pprof" // registers the /debug/pprof/ handlers
)

func main() {
    // Enable pprof and listen for requests.
    ip := "127.0.0.1:6060"
    if err := http.ListenAndServe(ip, nil); err != nil {
        fmt.Printf("start pprof failed on %s\n", ip)
    }
}
Open ip:port/debug/pprof/ in a browser to reach the pprof index page.
For example, my address is:
http://127.0.0.1:6060/debug/pprof/
The page lists the available profiles.
Here is what each entry means:
allocs: a sampling of all past memory allocations.
block: stack traces that led to blocking on synchronization primitives.
cmdline: the command line invocation of the current program.
goroutine: stack traces of all current goroutines.
heap: a sampling of memory allocations of live objects (heap memory information). You can specify the gc GET parameter to run GC before taking the heap sample.
mutex: stack traces of holders of contended mutexes (lock information).
profile: CPU profile. You can specify the duration in the seconds GET parameter. After you get the profile file, use the go tool pprof command to investigate the profile.
threadcreate: stack traces that led to the creation of new OS threads (thread information).
trace: a trace of execution of the current program. You can specify the duration in the seconds GET parameter. After you get the trace file, use the go tool trace command to investigate the trace.
Command line
When you are connected through a terminal and no browser is available, Go's command-line tooling can fetch the same profile data, and it is often the more convenient way.
The command go tool pprof url fetches the specified profile: it issues an HTTP request, downloads the data to the local machine, and then enters interactive mode, where, much like gdb, you use commands to inspect the data. The five kinds of request are:

# Download a CPU profile. By default it collects 30s of CPU usage starting now, so you wait 30s.
go tool pprof http://localhost:6060/debug/pprof/profile
# Collect for 120s instead.
go tool pprof "http://localhost:6060/debug/pprof/profile?seconds=120"
# Download a heap profile.
go tool pprof http://localhost:6060/debug/pprof/heap
# Download a goroutine profile.
go tool pprof http://localhost:6060/debug/pprof/goroutine
# Download a block profile (goroutine blocking).
go tool pprof http://localhost:6060/debug/pprof/block
# Download a mutex profile.
go tool pprof http://localhost:6060/debug/pprof/mutex
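One detail worth keeping in mind for the last two requests: the block and mutex profiles stay empty unless sampling has been switched on in the program, which is done with runtime.SetBlockProfileRate and runtime.SetMutexProfileFraction. The sketch below shows the calls; enableContentionProfiles and currentMutexFraction are hypothetical helper names, and the rates used are illustrative rather than a recommendation.

```go
package main

import (
	"fmt"
	"runtime"
)

// enableContentionProfiles (hypothetical helper) turns on sampling for the
// block and mutex profiles. A block rate of 1 records every blocking event;
// a mutex fraction of n reports on average 1/n of contention events.
func enableContentionProfiles(blockRate, mutexFraction int) {
	runtime.SetBlockProfileRate(blockRate)
	runtime.SetMutexProfileFraction(mutexFraction)
}

// currentMutexFraction reads the current setting without changing it:
// a negative argument only returns the existing value.
func currentMutexFraction() int {
	return runtime.SetMutexProfileFraction(-1)
}

func main() {
	fmt.Println("mutex fraction before:", currentMutexFraction())
	enableContentionProfiles(1, 1)
	fmt.Println("mutex fraction after: ", currentMutexFraction())
}
```

In the pprof server shown earlier, a call like this would go before http.ListenAndServe. Recording every event adds overhead, so a larger rate may be the better choice on a busy service.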
Discovering a memory leak
If you deploy Go programs on a cloud platform, the platform usually provides memory monitoring that shows the memory usage of the OS and of individual processes. On Alibaba Cloud, for example, we deploy only one Go service per cloud host, so the memory footprint of the OS essentially reflects the memory usage of the process. When OS memory usage keeps climbing as time goes on, that is the most obvious symptom of a memory leak.
Using heap to discover memory problems
The pprof heap profile captures the memory information of a running program. While the program is running steadily, fetch a heap profile at intervals, then use the -base option to compare two profile files. Like the diff command, it shows what increased and what decreased. The simple demo below illustrates the use of heap and -base.
// Demonstrates memory growth with pprof; this is not a real leak.
package main

import (
    "fmt"
    "net/http"
    _ "net/http/pprof"
    "os"
    "time"
)

// After running for a while: fatal error: runtime: out of memory
func main() {
    // Enable pprof.
    go func() {
        ip := "0.0.0.0:6060"
        if err := http.ListenAndServe(ip, nil); err != nil {
            fmt.Printf("start pprof failed on %s\n", ip)
            os.Exit(1)
        }
    }()

    // Append 1 MB to buf every 10ms.
    tick := time.Tick(time.Second / 100)
    var buf []byte
    for range tick {
        buf = append(buf, make([]byte, 1024*1024)...)
    }
}
With the code above running, execute the following command to fetch a profile, then fetch another one a minute later.

go tool pprof http://localhost:6060/debug/pprof/heap

I ended up with two profile files:
Administrator@SC-201807230940 MINGW64 ~/pprof
$ ls
pprof.alloc_objects.alloc_space.inuse_objects.inuse_space.001.pb.gz
pprof.alloc_objects.alloc_space.inuse_objects.inuse_space.002.pb.gz

Use 001 as the base file and compare 002 against it. Run top first to see the comparison, then run list main to compare the memory attributed to the main function line by line:

Administrator@SC-201807230940 MINGW64 ~/pprof
$ go tool pprof -base pprof.alloc_objects.alloc_space.inuse_objects.inuse_space.001.pb.gz pprof.alloc_objects.alloc_space.inuse_objects.inuse_space.002.pb.gz

Result

(pprof) top
Showing nodes accounting for 1.04GB, 50.58% of 2.06GB total
      flat  flat%   sum%        cum   cum%
    1.04GB 50.58% 50.58%     1.04GB 50.58%  main.main
         0     0% 50.58%     1.04GB 50.58%  runtime.main
(pprof)
Use traces
Type: inuse_space
Time: Jul 29, 2019 at 8:48am (CST)
-----------+-------------------------------------------------------
     bytes:  1.55GB
    1.55GB   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  1.24GB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  1016.83MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  813.46MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  902.59kB
         0   compress/flate.NewWriter
             compress/gzip.(*Writer).Write
             runtime/pprof.(*profileBuilder).build
             runtime/pprof.writeHeapProto
             runtime/pprof.writeHeap
             runtime/pprof.(*Profile).WriteTo
             net/http/pprof.handler.ServeHTTP
             net/http/pprof.Index
             net/http.HandlerFunc.ServeHTTP
             net/http.(*ServeMux).ServeHTTP
             net/http.serverHandler.ServeHTTP
             net/http.(*conn).serve
-----------+-------------------------------------------------------
     bytes:  650.77MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  520.61MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  416.48MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  333.19MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  266.55MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  213.23MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  170.59MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  136.47MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  109.17MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  87.34MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  69.87MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  55.89MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  44.71MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  35.77MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  28.61MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  22.88MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  18.30MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  14.64MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  11.71MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  9.37MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  7.49MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  5.99MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  4.79MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  3.07MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  2.46MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  1.16MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  1MB
    1.16MB   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  520.61MB
 -520.61MB   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  416.48MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  333.19MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  266.55MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  213.23MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  170.59MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  136.47MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  109.17MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  87.34MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  69.87MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  55.89MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  44.71MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  35.77MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  28.61MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  22.88MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  18.30MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  14.64MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  11.71MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  9.37MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  7.49MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  5.99MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  4.79MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  3.07MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  2.46MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  1.16MB
         0   main.main
             runtime.main
-----------+-------------------------------------------------------
     bytes:  1MB
   -1.16MB   main.main
             runtime.main
-----------+-------------------------------------------------------
(pprof)
Use list

(pprof) list main.main
Total: 2.06GB
ROUTINE ======================== main.main in D:\gowork\src\study\main\main.go
    1.04GB     1.04GB (flat, cum) 50.58% of Total
         .          .     57:    }()
         .          .     58:
         .          .     59:    tick := time.Tick(time.Second / 100)
         .          .     60:    var buf []byte
         .          .     61:    for range tick {
    1.04GB     1.04GB     62:        buf = append(buf, make([]byte, 1024*1024)...)
         .          .     63:    }
         .          .     64:}
         .          .     65:
         .          .     66:
         .          .     67:
(pprof)
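When the HTTP endpoint is not reachable, two heap snapshots for a -base comparison can also be written from inside the program with runtime/pprof.WriteHeapProfile. The following is a minimal sketch under that assumption; writeHeapSnapshot and the file names heap.001.pb.gz and heap.002.pb.gz are made up for illustration.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
	"runtime/pprof"
)

// writeHeapSnapshot (hypothetical helper) writes the current heap profile
// to path in the format that go tool pprof reads.
func writeHeapSnapshot(path string) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()
	// The heap profile reflects the most recently completed collection,
	// so run one first to get up-to-date numbers.
	runtime.GC()
	return pprof.WriteHeapProfile(f)
}

func main() {
	var buf []byte
	for i, name := range []string{"heap.001.pb.gz", "heap.002.pb.gz"} {
		// Grow buf between snapshots so the second profile differs.
		buf = append(buf, make([]byte, 8*1024*1024)...)
		if err := writeHeapSnapshot(name); err != nil {
			fmt.Println("snapshot failed:", err)
			os.Exit(1)
		}
		fmt.Printf("snapshot %d written to %s (buf holds %d bytes)\n", i+1, name, len(buf))
	}
	fmt.Println("compare with: go tool pprof -base heap.001.pb.gz heap.002.pb.gz")
}
```

The two files can then be compared with -base in the same way as the downloaded profiles above.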
heap "cannot" locate memory leaks
The heap profile shows memory allocation and how much memory each line of code is responsible for, so we can easily find the place that uses the most memory. If the value at that place keeps growing, we can conclude that this is where the memory is leaking.
Having found the position of the leak, we might expect to walk up the call stack and always arrive at the cause. The approach looks sound, but in practice it often fails to find the cause and costs a lot of effort for little result.
The reason is that a Go program contains a large number of goroutines whose call relationships can be complicated, and the leak may even sit in a third-party package. As an example, picture a call graph in which each oval is a goroutine, the number in it is its ID, and the arrows are call relationships. Suppose the heap profile shows that the code of goroutine g111 (the bottom node, marked red) is leaking. Any call path from g101 to g111 could be the cause of the leak in g111, and there are two possibilities:
1. The goroutine is invoked only a few times but consumes a lot of memory. Each invocation is expensive, so the cause of the leak is basically inside the goroutine itself.
2. The goroutine is invoked a great many times. Each invocation consumes little memory, but the huge number of goroutines created along the call path adds up to a lot, and because these goroutines for some reason cannot exit, the memory they hold is never released. The cause is a piece of code somewhere on the call path to g111 that creates a large number of g111 goroutines.
Case 2 is a goroutine leak. It cannot be found through the heap profile, so heap is of little help in locating this kind of memory leak.
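Since case 2 shows up as a goroutine count that keeps rising, a cheap first check is to sample runtime.NumGoroutine over time before reaching for a profile. The sketch below illustrates the idea; sampleGoroutines and steadilyGrowing are hypothetical helpers, and the intervals are arbitrary.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// sampleGoroutines (hypothetical helper) records runtime.NumGoroutine
// n times, waiting interval between samples.
func sampleGoroutines(n int, interval time.Duration) []int {
	samples := make([]int, 0, n)
	for i := 0; i < n; i++ {
		samples = append(samples, runtime.NumGoroutine())
		time.Sleep(interval)
	}
	return samples
}

// steadilyGrowing reports whether the samples never decrease and end
// higher than they started, which is the typical shape of a goroutine leak.
func steadilyGrowing(samples []int) bool {
	if len(samples) < 2 {
		return false
	}
	for i := 1; i < len(samples); i++ {
		if samples[i] < samples[i-1] {
			return false
		}
	}
	return samples[len(samples)-1] > samples[0]
}

func main() {
	outCh := make(chan int) // never read, so every sender below leaks
	go func() {
		for i := 0; i < 50; i++ {
			go func() { outCh <- 1 }()
			time.Sleep(10 * time.Millisecond)
		}
	}()

	samples := sampleGoroutines(5, 30*time.Millisecond)
	fmt.Println("samples:", samples, "steadily growing:", steadilyGrowing(samples))
}
```

A count that only ever rises is a hint rather than proof; the goroutine profile described next shows where the goroutines are actually stuck.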
Memory leak investigation
Viewing in the browser
The web view suits servers whose ports are reachable from a browser. It is convenient to use, and there are two ways:
View, for each call path, the number of goroutines currently blocked at that piece of code.
View the runtime stacks (call paths) of all goroutines, which also shows how long each has been blocked.

package main

import (
    "fmt"
    "net/http"
    _ "net/http/pprof"
    "os"
    "time"
)

func main() {
    // Enable pprof.
    go func() {
        ip := "0.0.0.0:6060"
        if err := http.ListenAndServe(ip, nil); err != nil {
            fmt.Printf("start pprof failed on %s\n", ip)
            os.Exit(1)
        }
    }()

    outCh := make(chan int)
    for i := 1; i <= 5; i++ {
        // Each of these goroutines blocks on the send: nobody receives.
        go func() { outCh <- 1 }()
        time.Sleep(time.Second)
    }

    //value := <-outCh
    //fmt.Println("value : ", value)

    time.Sleep(100 * time.Second)
}
First way
Set debug=1 in the request URL:
http://127.0.0.1:6060/debug/pprof/goroutine?debug=1
We can clearly see that five goroutines are blocked.
In fact, the main goroutine is blocked at this point as well, in time.Sleep.
The five blocked goroutines are grouped under a single stack that points to line 23 of the code, so we can locate the problem quickly.
Second way
Set debug=2 in the request URL:
http://127.0.0.1:6060/debug/pprof/goroutine?debug=2
We can see how long each goroutine has been blocked, as well as the code where it is blocked.
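The same text dumps can be produced without the HTTP handler through runtime/pprof: pprof.Lookup("goroutine").WriteTo takes the same debug value as the URL parameter. The sketch below reproduces the five blocked senders and counts them in a debug=2 dump; goroutineDump and blockedSenders are hypothetical helper names, and counting the "[chan send" state text is just a quick illustration, not a robust parser.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
	"strings"
	"time"
)

// goroutineDump (hypothetical helper) returns the goroutine profile as text.
// debug=1 groups identical stacks with a count; debug=2 prints one stack
// per goroutine together with its state.
func goroutineDump(debug int) (string, error) {
	var buf bytes.Buffer
	if err := pprof.Lookup("goroutine").WriteTo(&buf, debug); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// blockedSenders starts n goroutines that block on a send nobody receives,
// then counts the goroutines the dump reports in the "chan send" state.
func blockedSenders(n int) int {
	outCh := make(chan int)
	for i := 0; i < n; i++ {
		go func() { outCh <- 1 }()
	}
	time.Sleep(100 * time.Millisecond) // give the senders time to block

	dump, err := goroutineDump(2)
	if err != nil {
		return -1
	}
	return strings.Count(dump, "[chan send")
}

func main() {
	fmt.Println("goroutines blocked on a channel send:", blockedSenders(5))
}
```

Printing the dump itself instead of counting gives the same stacks the browser shows, which is handy in a log line or a test failure message.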
Interactive command line
top lists five statistics:
flat: the amount of memory used by the function itself.
flat%: that amount as a percentage of the total.
sum%: the running total of flat% for this row and all rows above it. For example, a sum% of 100% in row 2 is 100% + 0%.
cum: the cumulative amount. If main calls a function f, the memory used by f is also counted in main's cum.
cum%: the cumulative amount as a percentage of the total.
list
Shows the source code of a function together with the metric for each line. If the function name is not exact, it is matched loosely, so list main lists both main.main and runtime.main.
traces
Prints every call stack together with the metric for that stack.
The following is a concrete troubleshooting walkthrough.
1. Using top

$ go tool pprof http://0.0.0.0:6060/debug/pprof/goroutine
Fetching profile over HTTP from http://0.0.0.0:6060/debug/pprof/goroutine
Saved profile in C:\Users\Administrator\pprof\pprof.goroutine.006.pb.gz
Type: goroutine
Time: Jul 29, 2019 at 8:06am (CST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Unrecognized command: "\x1b[A\x1b[Btop"
(pprof) top
Showing nodes accounting for 9, 100% of 9 total
Showing top 10 nodes out of 32
      flat  flat%   sum%        cum   cum%
         7 77.78% 77.78%          7 77.78%  runtime.gopark
         1 11.11% 88.89%          1 11.11%  net/http.(*connReader).backgroundRead
         1 11.11%   100%          1 11.11%  runtime/pprof.writeRuntimeProfile
         0     0%   100%          1 11.11%  internal/poll.(*FD).Accept
         0     0%   100%          1 11.11%  internal/poll.(*FD).acceptOne
         0     0%   100%          1 11.11%  internal/poll.(*ioSrv).ExecIO
         0     0%   100%          1 11.11%  internal/poll.(*pollDesc).wait
         0     0%   100%          1 11.11%  internal/poll.runtime_pollWait
         0     0%   100%          1 11.11%  main.main
         0     0%   100%          1 11.11%  main.main.func1
(pprof)

We can see that seven goroutines are parked in runtime.gopark.
traces prints the detailed call chains:

(pprof) traces
Type: goroutine
Time: Jul 29, 2019 at 8:06am (CST)
-----------+-------------------------------------------------------
         5   runtime.gopark
             runtime.goparkunlock
             runtime.chansend
             runtime.chansend1
             main.main.func2
-----------+-------------------------------------------------------
         1   runtime.gopark
             runtime.netpollblock
             internal/poll.runtime_pollWait
             internal/poll.(*pollDesc).wait
             internal/poll.(*ioSrv).ExecIO
             internal/poll.(*FD).acceptOne
             internal/poll.(*FD).Accept
             net.(*netFD).accept
             net.(*TCPListener).accept
             net.(*TCPListener).AcceptTCP
             net/http.tcpKeepAliveListener.Accept
             net/http.(*Server).Serve
             net/http.(*Server).ListenAndServe
             net/http.ListenAndServe
             main.main.func1
-----------+-------------------------------------------------------
         1   runtime.gopark
             runtime.goparkunlock
             time.Sleep
             main.main
             runtime.main
-----------+-------------------------------------------------------
         1   net/http.(*connReader).backgroundRead
-----------+-------------------------------------------------------
         1   runtime/pprof.writeRuntimeProfile
             runtime/pprof.writeGoroutine
             runtime/pprof.(*Profile).WriteTo
             net/http/pprof.handler.ServeHTTP
             net/http/pprof.Index
             net/http.HandlerFunc.ServeHTTP
             net/http.(*ServeMux).ServeHTTP
             net/http.serverHandler.ServeHTTP
             net/http.(*conn).serve
-----------+-------------------------------------------------------
(pprof)

We can see that five goroutines are blocked in main.main.func2.
Then we can use list to view the exact code where they are blocked:

(pprof) list main.main.func2
Total: 9
ROUTINE ======================== main.main.func2 in D:\gowork\src\study\main\main.go
         0          5 (flat, cum) 55.56% of Total
         .          .     29:        }
         .          .     30:    }()
         .          .     31:    outCh := make(chan int)
         .          .     32:    for i := 1; i <= 5; i++ {
         .          .     33:        go func() {
         .          5     34:            outCh <- 1
         .          .     35:        }()
         .          .     36:        time.Sleep(time.Second)
         .          .     37:    }
         .          .     38:
         .          .     39:    ///value := <-outCh
(pprof)

The output shows exactly the line of code where the goroutines are blocked, which makes the investigation straightforward.