Reprinted from: https://my.oschina.net/solate/blog/3034188
This is a personal backup; please see the original post for the full version.
Benchmarks
A benchmark evaluates the performance of the code under test by measuring its CPU time and memory use, which lets you compare implementations and choose a faster one.
Write benchmarks
```go
func BenchmarkSprintf(b *testing.B) {
	num := 10
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		fmt.Sprintf("%d", num)
	}
}
```
- The file containing the benchmark must end with _test.go
- The benchmark function must start with Benchmark and must be exported
- The benchmark function must take a single parameter of type *testing.B
- Benchmark functions cannot have return values
- b.ResetTimer resets the timer, so any initialization before the for loop does not skew the measurement
- The for loop is essential: the code under test must run inside it
- b.N is supplied by the benchmark framework; it is the iteration count, since the code must be called repeatedly to measure its performance
```
➜ go test -bench=. -run=none
BenchmarkSprintf-8    20000000    117 ns/op
PASS
ok    flysnow.org/hello    2.474s
```
Benchmarks are run with the go test command plus the -bench flag, which takes a regular expression selecting the benchmarks to run; -bench=. runs all of them.

Because go test runs unit tests by default, their output would get mixed into the benchmark results. To filter them out we can pass -run= with a pattern that matches no unit test. Here we use -run=none, because we will almost certainly never write a unit test named "none".

-run=^$ works just as well: no test name matches that regular expression, so only the benchmarks run.

```
go test -bench=. -run=^$
```
Sometimes we need to do some preparatory work before the benchmark proper, and we don't want that work included in the timing. b.ResetTimer() zeroes the elapsed time, so timing effectively restarts at the point of the call.
Do you see the -8 after the function name? It is the value of GOMAXPROCS while the benchmark ran.

The next number, 20000000, is how many times the for loop ran, i.e. how many times the code under test was called.

The final 117 ns/op means each call took 117 nanoseconds on average.

By default a benchmark runs for roughly 1 second; within that second the code was called 20 million times at 117 nanoseconds per call.
If you want the benchmark to run longer, specify the duration with -benchtime, for example 3 seconds:
```
➜ hello go test -bench=. -benchtime=3s -run=none
// benchmark name - GOMAXPROCS    iterations    average time per op
BenchmarkSprintf-8    50000000    109 ns/op
PASS
// directory where go test ran    total elapsed time
ok    flysnow.org/hello    5.628s
```
With the longer test time the iteration count went up, but the final result, the time per operation, barely changed. There is usually little point in going beyond about 3 seconds.
Performance comparison
The benchmark above converts an int to a string. The standard library offers several ways to do this; let's see which performs best.
```go
func BenchmarkSprintf(b *testing.B) {
	num := 10
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		fmt.Sprintf("%d", num)
	}
}

func BenchmarkFormat(b *testing.B) {
	num := int64(10)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		strconv.FormatInt(num, 10)
	}
}

func BenchmarkItoa(b *testing.B) {
	num := 10
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		strconv.Itoa(num)
	}
}
```
```
➜ hello go test -bench=. -run=none
BenchmarkSprintf-8    20000000    117 ns/op
BenchmarkFormat-8     50000000    33.3 ns/op
BenchmarkItoa-8       50000000    34.9 ns/op
PASS
ok    flysnow.org/hello    5.951s
```
The results show that strconv.FormatInt is fastest, strconv.Itoa a close second, and fmt.Sprintf slowest: the first two are more than three times faster than the third. Why is the last one so slow? We can use -benchmem to find the root cause.
```
➜ hello go test -bench=. -benchmem -run=none
BenchmarkSprintf-8    20000000    110 ns/op    16 B/op    2 allocs/op
BenchmarkFormat-8     50000000    31.0 ns/op    2 B/op    1 allocs/op
BenchmarkItoa-8       50000000    33.1 ns/op    2 B/op    1 allocs/op
PASS
ok    flysnow.org/hello    5.610s
```
The -benchmem flag adds two columns: the number of heap allocations per operation (allocs/op) and the number of bytes allocated per operation (B/op). The results show that each of the two fast functions allocates once and only 2 bytes per operation, while fmt.Sprintf allocates twice and 16 bytes per operation. That explains why it is slow: it simply allocates far more memory.
When a piece of code is performance-sensitive, writing a benchmark for it is well worth the effort and helps us write faster code. But there is always a trade-off between performance, usability, and reusability; do not over-optimize in pursuit of performance alone.
Combined with pprof
pprof performance monitoring
```go
package bench

import "testing"

func Fib(n int) int {
	if n < 2 {
		return n
	}
	return Fib(n-1) + Fib(n-2)
}

func BenchmarkFib10(b *testing.B) {
	// run the Fib function b.N times
	for n := 0; n < b.N; n++ {
		Fib(10)
	}
}
```
```
go test -bench=. -benchmem -cpuprofile profile.out
```
You can also capture a memory profile at the same time:
```
go test -bench=. -benchmem -memprofile memprofile.out -cpuprofile profile.out
```
Then feed the output file to pprof:
```
go tool pprof profile.out
File: bench.test
Type: cpu
Time: Apr 5, 2018 at 4:27pm (EDT)
Duration: 2s, Total samples = 1.85s (92.40%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 1.85s, 100% of 1.85s total
      flat  flat%   sum%        cum   cum%
     1.85s   100%   100%      1.85s   100%  bench.Fib
         0     0%   100%      1.85s   100%  bench.BenchmarkFib10
         0     0%   100%      1.85s   100%  testing.(*B).launch
         0     0%   100%      1.85s   100%  testing.(*B).runN
```
This example uses the CPU profile; the memory profile works the same way. The list command shows where time is spent inside a function:
```
(pprof) list Fib
     1.84s      2.75s (flat, cum) 148.65% of Total
         .          .      1:package bench
         .          .      2:
         .          .      3:import "testing"
         .          .      4:
     530ms      530ms      5:func Fib(n int) int {
     260ms      260ms      6:	if n < 2 {
     130ms      130ms      7:		return n
         .          .      8:	}
     920ms      1.83s      9:	return Fib(n-1) + Fib(n-2)
         .          .     10:}
```
You can also generate an image with the web command (or export to png, pdf, ...).
If the web command fails with the error `Failed to execute dot. Is Graphviz installed? Error: exec: "dot": executable file not found in %PATH%`, Graphviz is not installed on your machine.

On Windows, download a stable build from the Graphviz site (you may need a proxy to reach it): https://graphviz.gitlab.io/_pages/Download/Download_windows.html

On a Mac, install it with Homebrew; after that the web command can render images:

```
brew install graphviz
```
Flame graph
The flame graph is a performance visualization created by Brendan Gregg, so named because it looks like a flame.
A flame graph is an SVG file that opens in a browser, and its biggest advantage over a static call graph is that it is interactive: click any box to zoom in and inspect it.
Calls are stacked from bottom to top: each box is a function, the layer above a box shows the functions it calls, and a box's width is proportional to the CPU time spent in it. The colors have no special meaning; the default red-and-yellow palette is just meant to look like fire.
Profiling a project with runtime/pprof writes the profile file into the current directory; you then analyze that file with the flame graph tooling (you point the tool at a file, not at a host name).
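A minimal sketch of exporting a CPU profile with runtime/pprof (the file name cpu.prof and the work function are placeholders of ours, not from the original post):

```go
package main

import (
	"os"
	"runtime/pprof"
)

// work is a stand-in CPU-bound workload; it returns the sum 0+1+...+999.
func work() int {
	total := 0
	for i := 0; i < 1000; i++ {
		total += i
	}
	return total
}

func main() {
	// The profile is written to cpu.prof in the current directory;
	// analyze it afterwards with: go tool pprof cpu.prof
	f, err := os.Create("cpu.prof")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	if err := pprof.StartCPUProfile(f); err != nil {
		panic(err)
	}
	defer pprof.StopCPUProfile()

	_ = work()
}
```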
go-torch
Most introductions online use go-torch, an open-source tool from Uber that reads Go profiling data directly and produces a flame graph as an SVG file.
go-torch is very simple to use: with no arguments it tries to fetch profiling data from http://localhost:8080/debug/pprof/profile . Three commonly used flags can be adjusted:
- -u / --url: the URL to query (host and port only)
- -s / --suffix: the path of the pprof profile; defaults to /debug/pprof/profile
- --seconds: how long to profile; defaults to 30s
Native support
Starting from Go 1.11, the flame graph has been integrated into the official Go pprof library.
```
# This will listen on :8081 and open a browser.
# Change :8081 to a port of your choice.
$ go tool pprof -http=":8081" [binary] [profile]
```
If your Go version is older than 1.11, install pprof from its repository:
```
# Get the pprof tool directly
$ go get -u github.com/google/pprof
$ pprof -http=":8081" [binary] [profile]
```
A small web example
```go
package main

import (
	"fmt"
	"log"
	"net/http"
	_ "net/http/pprof"
	"time"
)

func sayHelloHandler(w http.ResponseWriter, r *http.Request) {
	helloWorld(10000)
	fmt.Println("path", r.URL.Path)
	fmt.Println("scheme", r.URL.Scheme)
	fmt.Fprintf(w, "Hello world!\n") // whatever is written to w is sent to the client
}

func main() {
	http.HandleFunc("/", sayHelloHandler) // register the route handler
	log.Fatal(http.ListenAndServe(":8080", nil))
}

func helloWorld(times int) {
	time.Sleep(time.Second)
	var counter int
	for i := 0; i < times; i++ {
		for j := 0; j < times; j++ {
			counter++
		}
	}
}
```
Start collecting a profile with the command below, then visit localhost:8080 a few times:

```
go tool pprof -http=":8081" http://localhost:8080/debug/pprof/profile
```

After the sampling window ends a web UI opens; choose VIEW -> Flame Graph to see the flame graph at http://localhost:8081/ui/flamegraph
Testing flags
Which flags can be passed to go test?
Commonly used flags
- -bench regexp: run benchmarks; the regular expression filters which benchmark functions run, and -bench . runs all of them
- -benchmem: report memory allocation statistics for benchmarks
- -count n: run each test and benchmark n times; the default is once
- -run regexp: run only the matching test functions, e.g. -run ABC runs only tests whose names contain ABC
- -timeout t: panic and fail if the tests run longer than t; the default is 10 minutes
- -v: verbose output, including what the Log and Logf methods print
Go 1.7 introduced the concept of subtests (and sub-benchmarks).
Testing notes and tuning
Advice for Go performance testing and tuning:
- Avoid calling the timer methods (ResetTimer, StartTimer, StopTimer) frequently
- Avoid overly large test data