Reprinted from: https://my.oschina.net/solate/blog/3034188
This is a personal backup; please see the original post for the full version.
Benchmarks
A benchmark evaluates the performance of the code under test by measuring its CPU time and memory use, which lets you compare implementations and choose a faster one.
Write benchmarks
```go
func BenchmarkSprintf(b *testing.B) {
	num := 10
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		fmt.Sprintf("%d", num)
	}
}
```
- The file containing the benchmark must end with _test.go
- The benchmark function must start with Benchmark and must be exported
- The benchmark function must take a single parameter of type *testing.B
- Benchmark functions cannot have return values
- b.ResetTimer resets the timer, so any initialization before the for loop does not skew the measurement
- The for loop is essential: the code under test must run inside it
- b.N is supplied by the benchmark framework; it is the iteration count, since the code must be called repeatedly to measure its performance
```
➜ go test -bench=. -run=none
BenchmarkSprintf-8    20000000    117 ns/op
PASS
ok    flysnow.org/hello    2.474s
```
Benchmarks are run with the go test command plus the -bench flag, which takes a regular expression selecting the benchmarks to run; -bench=. runs all of them.

Because go test runs unit tests by default, their output would get mixed into the benchmark results. To filter them out we can pass -run= with a pattern that matches no unit test. Here we use -run=none, because we will almost certainly never write a unit test named "none".

-run=^$ works just as well: no test name matches that regular expression, so only the benchmarks run.

```
go test -bench=. -run=^$
```
Sometimes we need to do some preparatory work before the benchmark proper, and we don't want that work included in the timing. b.ResetTimer() zeroes the elapsed time, so timing effectively restarts at the point of the call.
Do you see the -8 after the function name? It is the value of GOMAXPROCS while the benchmark ran.

The next number, 20000000, is how many times the for loop ran, i.e. how many times the code under test was called.

The final 117 ns/op means each call took 117 nanoseconds on average.

By default a benchmark runs for roughly 1 second; within that second the code was called 20 million times at 117 nanoseconds per call.
If you want the benchmark to run longer, specify the duration with -benchtime, for example 3 seconds:
```
➜ hello go test -bench=. -benchtime=3s -run=none
// benchmark name - GOMAXPROCS    iterations    average time per op
BenchmarkSprintf-8    50000000    109 ns/op
PASS
// directory where go test ran    total elapsed time
ok    flysnow.org/hello    5.628s
```
With the longer test time the iteration count went up, but the final result, the time per operation, barely changed. There is usually little point in going beyond about 3 seconds.
Performance comparison
The benchmark above converts an int to a string. The standard library offers several ways to do this; let's see which performs best.
```go
func BenchmarkSprintf(b *testing.B) {
	num := 10
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		fmt.Sprintf("%d", num)
	}
}

func BenchmarkFormat(b *testing.B) {
	num := int64(10)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		strconv.FormatInt(num, 10)
	}
}

func BenchmarkItoa(b *testing.B) {
	num := 10
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		strconv.Itoa(num)
	}
}
```
```
➜ hello go test -bench=. -run=none
BenchmarkSprintf-8    20000000    117 ns/op
BenchmarkFormat-8     50000000    33.3 ns/op
BenchmarkItoa-8       50000000    34.9 ns/op
PASS
ok    flysnow.org/hello    5.951s
```
The results show that strconv.FormatInt is fastest, strconv.Itoa a close second, and fmt.Sprintf slowest: the first two are more than three times faster than the third. Why is the last one so slow? We can use -benchmem to find the root cause.
```
➜ hello go test -bench=. -benchmem -run=none
BenchmarkSprintf-8    20000000    110 ns/op    16 B/op    2 allocs/op
BenchmarkFormat-8     50000000    31.0 ns/op    2 B/op    1 allocs/op
BenchmarkItoa-8       50000000    33.1 ns/op    2 B/op    1 allocs/op
PASS
ok    flysnow.org/hello    5.610s
```
The -benchmem flag adds two columns: the number of heap allocations per operation (allocs/op) and the number of bytes allocated per operation (B/op). The results show that each of the two fast functions allocates once and only 2 bytes per operation, while fmt.Sprintf allocates twice and 16 bytes per operation. That explains why it is slow: it simply allocates far more memory.
When a piece of code is performance-sensitive, writing a benchmark for it is well worth the effort and helps us write faster code. But there is always a trade-off between performance, usability, and reusability; do not over-optimize in pursuit of performance alone.
Combined with pprof
pprof performance monitoring
```go
package bench

import "testing"

func Fib(n int) int {
	if n < 2 {
		return n
	}
	return Fib(n-1) + Fib(n-2)
}

func BenchmarkFib10(b *testing.B) {
	// run the Fib function b.N times
	for n := 0; n < b.N; n++ {
		Fib(10)
	}
}
```
```
go test -bench=. -benchmem -cpuprofile profile.out
```
You can also capture a memory profile at the same time:
```
go test -bench=. -benchmem -memprofile memprofile.out -cpuprofile profile.out
```
Then feed the output file to pprof:
```
go tool pprof profile.out
File: bench.test
Type: cpu
Time: Apr 5, 2018 at 4:27pm (EDT)
Duration: 2s, Total samples = 1.85s (92.40%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 1.85s, 100% of 1.85s total
      flat  flat%   sum%        cum   cum%
     1.85s   100%   100%      1.85s   100%  bench.Fib
         0     0%   100%      1.85s   100%  bench.BenchmarkFib10
         0     0%   100%      1.85s   100%  testing.(*B).launch
         0     0%   100%      1.85s   100%  testing.(*B).runN
```
This example uses the CPU profile; the memory profile works the same way. The list command shows where time is spent inside a function:
```
(pprof) list Fib
     1.84s      2.75s (flat, cum) 148.65% of Total
         .          .      1:package bench
         .          .      2:
         .          .      3:import "testing"
         .          .      4:
     530ms      530ms      5:func Fib(n int) int {
     260ms      260ms      6:	if n < 2 {
     130ms      130ms      7:		return n
         .          .      8:	}
     920ms      1.83s      9:	return Fib(n-1) + Fib(n-2)
         .          .     10:}
```
You can also generate an image with the web command (or export to png, pdf, ...).
If the web command fails with the error `Failed to execute dot. Is Graphviz installed? Error: exec: "dot": executable file not found in %PATH%`, Graphviz is not installed on your machine.

On Windows, download a stable build from the Graphviz site (you may need a proxy to reach it): https://graphviz.gitlab.io/_pages/Download/Download_windows.html

On a Mac, install it with Homebrew; after that the web command can render images:

```
brew install graphviz
```
Flame graph
The flame graph is a performance visualization created by Brendan Gregg, so named because it looks like a flame.
A flame graph is an SVG file that opens in a browser, and its biggest advantage over a static call graph is that it is interactive: click any box to zoom in and inspect it.
Calls are stacked from bottom to top: each box is a function, the layer above a box shows the functions it calls, and a box's width is proportional to the CPU time spent in it. The colors have no special meaning; the default red-and-yellow palette is just meant to look like fire.
Profiling a project with runtime/pprof writes the profile file into the current directory; you then analyze that file with the flame graph tooling (you point the tool at a file, not at a host name).
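A minimal sketch of exporting a CPU profile with runtime/pprof (the file name cpu.prof and the work function are placeholders of ours, not from the original post):

```go
package main

import (
	"os"
	"runtime/pprof"
)

// work is a stand-in CPU-bound workload; it returns the sum 0+1+...+999.
func work() int {
	total := 0
	for i := 0; i < 1000; i++ {
		total += i
	}
	return total
}

func main() {
	// The profile is written to cpu.prof in the current directory;
	// analyze it afterwards with: go tool pprof cpu.prof
	f, err := os.Create("cpu.prof")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	if err := pprof.StartCPUProfile(f); err != nil {
		panic(err)
	}
	defer pprof.StopCPUProfile()

	_ = work()
}
```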
go-torch
Most introductions online use go-torch, an open-source tool from Uber that reads Go profiling data directly and produces a flame graph as an SVG file.
go-torch is very simple to use: with no arguments it tries to fetch profiling data from http://localhost:8080/debug/pprof/profile . Three commonly used flags can be adjusted:
- -u / --url: the URL to query (host and port only)
- -s / --suffix: the path of the pprof profile; defaults to /debug/pprof/profile
- --seconds: how long to profile; defaults to 30s
Native support
Starting from Go 1.11, the flame graph has been integrated into the official Go pprof library.
```
# This will listen on :8081 and open a browser.
# Change :8081 to a port of your choice.
$ go tool pprof -http=":8081" [binary] [profile]
```
If your Go version is older than 1.11, install pprof from its repository:
```
# Get the pprof tool directly
$ go get -u github.com/google/pprof
$ pprof -http=":8081" [binary] [profile]
```
A small web example
```go
package main

import (
	"fmt"
	"log"
	"net/http"
	_ "net/http/pprof"
	"time"
)

func sayHelloHandler(w http.ResponseWriter, r *http.Request) {
	helloWorld(10000)
	fmt.Println("path", r.URL.Path)
	fmt.Println("scheme", r.URL.Scheme)
	fmt.Fprintf(w, "Hello world!\n") // whatever is written to w is sent to the client
}

func main() {
	http.HandleFunc("/", sayHelloHandler) // register the route handler
	log.Fatal(http.ListenAndServe(":8080", nil))
}

func helloWorld(times int) {
	time.Sleep(time.Second)
	var counter int
	for i := 0; i < times; i++ {
		for j := 0; j < times; j++ {
			counter++
		}
	}
}
```
Start collecting a profile with the command below, then visit localhost:8080 a few times:

```
go tool pprof -http=":8081" http://localhost:8080/debug/pprof/profile
```

After the sampling window ends a web UI opens; choose VIEW -> Flame Graph to see the flame graph at http://localhost:8081/ui/flamegraph
Testing flags
Which flags can be passed to go test?
Commonly used flags
- -bench regexp: run benchmarks; the regular expression filters which benchmark functions run, and -bench . runs all of them
- -benchmem: report memory allocation statistics for benchmarks
- -count n: run each test and benchmark n times; the default is once
- -run regexp: run only the matching test functions, e.g. -run ABC runs only tests whose names contain ABC
- -timeout t: panic and fail if the tests run longer than t; the default is 10 minutes
- -v: verbose output, including what the Log and Logf methods print
Go 1.7 introduced the concept of subtests (and sub-benchmarks).
Testing notes and tuning
Advice for Go performance testing and tuning:
- Avoid calling the timer methods (ResetTimer, StartTimer, StopTimer) frequently
- Avoid overly large test data