Original: The Top 10 Most Common Mistakes I’ve Seen in Go Projects
Author: Teiva Harsanyi
Translator: Simon Ma
These are the ten most common mistakes I have come across in Go projects. The order does not matter.
Unknown enumeration value
Let's look at a simple example:
type Status uint32
const (
	StatusOpen Status = iota
	StatusClosed
	StatusUnknown
)
Here, we use iota to create an enumeration, which results in the following values:
StatusOpen = 0
StatusClosed = 1
StatusUnknown = 2
Now, let us assume that this Status type is part of a JSON request and will be marshalled/unmarshalled.
We design the following structure:
type Request struct {
	ID        int    `json:"Id"`
	Timestamp int    `json:"Timestamp"`
	Status    Status `json:"Status"`
}
Then we receive a request like this:
{
	"Id": 1234,
	"Timestamp": 1563362390,
	"Status": 0
}
Nothing special here: the status will be unmarshalled to StatusOpen.
However, let us take another request in which the status value is not set:
{
	"Id": 1235,
	"Timestamp": 1563362390
}
In this case, the Status field of the Request structure is initialized to its zero value (0 for a uint32 type), so the result is StatusOpen instead of StatusUnknown.
The best practice is therefore to set the unknown value of an enumeration to 0:
type Status uint32
const (
	StatusUnknown Status = iota
	StatusOpen
	StatusClosed
)
If the status is not part of the JSON request, it will be initialized to StatusUnknown, which is in line with our expectations.
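The effect of the zero value can be checked with a small program. The following is a minimal sketch that reuses the Status and Request types from above and decodes a payload that has no Status field. The decode helper is a name introduced here purely for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Status uint32

const (
	StatusUnknown Status = iota // zero value: what an absent field decodes to
	StatusOpen
	StatusClosed
)

type Request struct {
	ID        int    `json:"Id"`
	Timestamp int    `json:"Timestamp"`
	Status    Status `json:"Status"`
}

// decode is a hypothetical helper that unmarshals a JSON payload into a Request.
func decode(payload string) (Request, error) {
	var r Request
	err := json.Unmarshal([]byte(payload), &r)
	return r, err
}

func main() {
	// The payload deliberately omits the Status field.
	r, err := decode(`{"Id": 1235, "Timestamp": 1563362390}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(r.Status == StatusUnknown)
}
```

With StatusUnknown as the first constant, a missing field and an explicit unknown status become indistinguishable, which is usually the desired behavior.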
Compiler optimizations in benchmarks
Benchmarking correctly is hard: there are many factors to consider in order to get accurate results.
A common mistake is to have the benchmarked code silently optimized away by the compiler.
Here is an example from the teivah/bitvector library:
func clear(n uint64, i, j uint8) uint64 {
	return (math.MaxUint64<<j | ((1 << i) - 1)) & n
}
This function clears the bits within a given range. To benchmark it, we might write the following:
func BenchmarkWrong(b *testing.B) {
	for i := 0; i < b.N; i++ {
		clear(1221892080809121, 10, 63)
	}
}
In this benchmark, clear is a leaf function (it does not call any other function), so the compiler will inline it. Once it is inlined, the compiler can also see that the call has no side effects, so the call may simply be removed, leading to inaccurate results.
One solution is to assign the result to a global variable, as follows:
var result uint64
func BenchmarkCorrect(b *testing.B) {
	var r uint64
	for i := 0; i < b.N; i++ {
		r = clear(1221892080809121, 10, 63)
	}
	result = r
}
This way, the compiler cannot tell whether the call produces a side effect, so it will not remove the call and the benchmark will be accurate.
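The sink pattern can also be tried outside of "go test". The sketch below assumes the clear function shown above and uses testing.Benchmark from the standard library to run the benchmark from main; the names benchmarkClear and result are illustrative.

```go
package main

import (
	"fmt"
	"math"
	"testing"
)

// clear zeroes the bits of n in the range [i, j), as in the example above.
func clear(n uint64, i, j uint8) uint64 {
	return (math.MaxUint64<<j | ((1 << i) - 1)) & n
}

// result is a package-level sink. Storing into it keeps the compiler from
// treating the benchmarked call as dead code.
var result uint64

func benchmarkClear(b *testing.B) {
	var r uint64
	for i := 0; i < b.N; i++ {
		r = clear(1221892080809121, 10, 63)
	}
	result = r
}

func main() {
	// testing.Benchmark runs a benchmark function without the go test tool.
	res := testing.Benchmark(benchmarkClear)
	fmt.Println(res.String())
	fmt.Println("sink holds:", result)
}
```

As a quick sanity check of clear itself: with n = 255, i = 2 and j = 5, bits 2 to 4 are cleared and the result is 227.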
Passing pointers everywhere
In a function call, passing a variable by value creates a copy of the variable, whereas passing a pointer only copies its memory address.
So, is passing a pointer always faster than passing by value? Take a look at this example.
I simulated a 0.3 KB data structure on my local machine and then measured the speed of passing it by pointer and by value.
The results show that passing by value is more than 4 times faster than passing by pointer, which is counter-intuitive.
The explanation for this result is related to how Go manages memory. I cannot explain it as well as William Kennedy does, but let me try to summarize.
Translator's note (start)
The author did not explain the basics of how Go stores data in memory, so the translator adds some background.
Here is the description from the book The Go Programming Language:
A goroutine begins its life with a small stack, typically 2KB.
A goroutine's stack, like the stack of an operating system thread, holds the local variables of active and suspended function calls. Unlike an OS thread, however, a goroutine's stack is not fixed in size: it grows and shrinks as needed.
The size limit for a goroutine stack may be as much as 1GB, far larger than a typical fixed-size thread stack, although most goroutines never need that much.
The translator's own understanding:
Stack: each goroutine starts with its own separate stack for storing data. (Goroutines are divided into the main goroutine and other goroutines; the difference is the initial stack size.)
Heap: data that needs to be shared by multiple goroutines is stored on the heap.
Translator's note (end)
As we all know, a variable can be allocated on the heap or on the stack.
- The stack holds the variables in use by the current goroutine (translator's note: these can be understood as local variables). Once the function returns, the variables are popped from the stack.
- The heap holds shared variables (global variables, etc.).
Let's look at a simple example that returns a value:
func getFooValue() foo {
	var result foo
	// Do something
	return result
}
When the function is called, the result variable is created on the current goroutine's stack. When the function returns, a copy of the value is passed to the caller, and the result variable itself is popped from the current goroutine's stack.
Although it still exists in memory, it can no longer be accessed, and it may be overwritten by other data.
Now let's see an example that returns a pointer:
func getFooPointer() *foo {
	var result foo
	// Do something
	return &result
}
When the function is called, the result variable is again created on the current goroutine's stack. When the function returns, a pointer (a copy of the variable's address) is passed to the caller. If the result variable were popped from the current goroutine's stack, the caller would no longer be able to access it. (Translator's note: this situation is called "memory escape".)
In this scenario, the Go compiler makes the result variable escape to a place where variables can be shared: the heap.
Passing pointers, however, is a different scenario. For example:
func main() {
	p := &foo{}
	f(p)
}
Because we call f within the same goroutine, the variable p does not need to escape. It is simply pushed onto the stack, and the sub-function can access it. (Translator's note: a variable that is not shared with other goroutines can stay on the stack.)
This is why, for example, the Read method of io.Reader takes a slice as a parameter, fills it with the data read, and returns the number of bytes read, instead of returning the slice it read. (Translator's note: if it returned a slice, the slice would escape to the heap.)
type Reader interface {
	Read(p []byte) (n int, err error)
}
Why is the stack so fast? There are two main reasons:
- The stack does not need a garbage collector. As we said, a variable is pushed onto the stack when it is created and popped when the function returns. No complicated process is needed to reclaim unused variables.
- A stack belongs to a single goroutine, so storing a variable on the stack does not require the synchronization that storing it on the heap does.
In short, when we create a function, our default behavior should be to use values rather than pointers. A pointer should be used only if we want to share a variable.
If we are experiencing performance problems, we can use the go build -gcflags "-m -m" command to display the compiler's decisions about which variables escape to the heap.
Again, for most everyday use cases, passing values is the most appropriate choice.
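To see the two cases side by side, here is a minimal sketch. The foo struct and the function names are hypothetical and exist only for this illustration. Building the file with go build -gcflags "-m" should print the compiler's escape decisions; the exact wording of those messages depends on the compiler version, so none is quoted here.

```go
package main

import "fmt"

// foo is a hypothetical small struct used only for this illustration.
type foo struct {
	id   int
	name string
}

// newFooValue returns a copy of result, so result can usually stay
// on the goroutine's stack.
func newFooValue() foo {
	result := foo{id: 1, name: "value"}
	return result
}

// newFooPointer returns the address of a local variable. The variable has
// to outlive the call, so the compiler typically moves it to the heap.
func newFooPointer() *foo {
	result := foo{id: 2, name: "pointer"}
	return &result
}

func main() {
	v := newFooValue()
	p := newFooPointer()
	fmt.Println(v.name, p.name)
}
```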
Unexpected break
What happens in the following example if f returns true?
for {
	switch f() {
	case true:
		break
	case false:
		// Do something
	}
}
We will reach the break statement. However, it will break out of the switch statement rather than the for loop.
The same problem occurs here:
for {
	select {
	case <-ch:
		// Do something
	case <-ctx.Done():
		break
	}
}
The break is related to the select statement, not to the for loop.
One solution for breaking out of a for/switch or a for/select is to use a labeled break, as follows:
loop:
	for {
		select {
		case <-ch:
			// Do something
		case <-ctx.Done():
			break loop
		}
	}
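The difference is easy to observe in a for/switch as well. In this sketch, both functions (names are illustrative) walk a list and try to stop at the item "stop", but only the labeled version actually leaves the loop.

```go
package main

import "fmt"

// visitedPlain uses an unlabeled break, which only leaves the switch,
// so every item is still visited.
func visitedPlain(items []string) int {
	visited := 0
	for _, item := range items {
		visited++
		switch item {
		case "stop":
			break // exits the switch, not the for loop
		}
	}
	return visited
}

// visitedLabeled uses a labeled break to leave the enclosing for loop.
func visitedLabeled(items []string) int {
	visited := 0
loop:
	for _, item := range items {
		visited++
		switch item {
		case "stop":
			break loop // exits the for loop
		}
	}
	return visited
}

func main() {
	items := []string{"a", "stop", "b", "c"}
	fmt.Println(visitedPlain(items), visitedLabeled(items)) // 4 2
}
```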
Errors without context
Error handling in Go still needs improvement, which is why it is currently one of the most anticipated features of Go 2.
The current standard library (before Go 1.13) only offers functions to construct errors, so the resulting errors naturally lack additional information.
Let's look at the idea behind error handling in the pkg/errors library:
An error should be handled only once. Logging an error is handling an error. So an error should either be logged or propagated.
With the current standard library it is difficult to follow this principle, because we want to add context to an error, giving it some form of hierarchy.
For example, consider a REST call that ends in a database issue. We would like to see:
unable to serve HTTP POST request for customer 1234
 |_ unable to insert customer contract abcd
     |_ unable to commit transaction
If we use pkg/errors, we can do this:
func postHandler(customer Customer) Status {
	err := insert(customer.Contract)
	if err != nil {
		log.WithError(err).Errorf("unable to serve HTTP POST request for customer %s", customer.ID)
		return Status{ok: false}
	}
	return Status{ok: true}
}
func insert(contract Contract) error {
	err := dbQuery(contract)
	if err != nil {
		return errors.Wrapf(err, "unable to insert customer contract %s", contract.ID)
	}
	return nil
}
func dbQuery(contract Contract) error {
	// Do something then fail
	return errors.New("unable to commit transaction")
}
The initial error (if it is not returned by an external library) can be created with errors.New. The middle layer, insert, wraps this error by adding more context to it. Finally, the parent handles the error by logging it. Each level either returns the error or handles it.
We may also want to check the cause of an error to decide whether to retry. Suppose we have a db package from an external library that handles database access. This library may return a transient (temporary) error called db.DBError. To determine whether we need to retry, we have to check the cause of the error.
This can be done with errors.Cause, provided by pkg/errors:
func postHandler(customer Customer) Status {
	err := insert(customer.Contract)
	if err != nil {
		switch errors.Cause(err).(type) {
		default:
			log.WithError(err).Errorf("unable to serve HTTP POST request for customer %s", customer.ID)
			return Status{ok: false}
		case *db.DBError:
			return retry(customer)
		}
	}
	return Status{ok: true}
}
func insert(contract Contract) error {
	err := db.dbQuery(contract)
	if err != nil {
		return errors.Wrapf(err, "unable to insert customer contract %s", contract.ID)
	}
	return nil
}
A common mistake I have seen is using pkg/errors only partially. For example, checking an error like this:
switch err.(type) {
default:
	log.WithError(err).Errorf("unable to serve HTTP POST request for customer %s", customer.ID)
	return Status{ok: false}
case *db.DBError:
	return retry(customer)
}
In this example, if the db.DBError is wrapped, the retry will never be triggered.
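The same pitfall exists with the wrapping support added to the standard library in Go 1.13. The sketch below swaps pkg/errors for fmt.Errorf with the %w verb and errors.As, which are standard-library features; the DBError type and the helper functions are hypothetical stand-ins for the external db package discussed above.

```go
package main

import (
	"errors"
	"fmt"
)

// DBError is a hypothetical temporary database error.
type DBError struct{ msg string }

func (e *DBError) Error() string { return e.msg }

// dbQuery always fails, to keep the example short.
func dbQuery() error { return &DBError{msg: "unable to commit transaction"} }

// insert wraps the underlying error with context using the %w verb.
func insert(contractID string) error {
	if err := dbQuery(); err != nil {
		return fmt.Errorf("unable to insert customer contract %s: %w", contractID, err)
	}
	return nil
}

// isRetryable reports whether a *DBError is anywhere in the error chain.
func isRetryable(err error) bool {
	var dbErr *DBError
	return errors.As(err, &dbErr)
}

func main() {
	err := insert("abcd")
	// A plain type assertion only inspects the outermost (wrapping) error.
	_, direct := err.(*DBError)
	fmt.Println("type assertion matches:", direct)
	fmt.Println("errors.As matches:", isRetryable(err))
	fmt.Println(err)
}
```

The type assertion fails because err is the wrapper, not the *DBError itself, while errors.As walks the chain, much as errors.Cause does for pkg/errors.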
Further reading
Don’t just check errors, handle them gracefully
Slice initialization
Sometimes we know what the final length of a slice will be. Suppose we want to convert a slice of Foo into a slice of Bar, which means the two slices will have the same length.
I often see slices initialized in one of these ways:
var bars []Bar
bars := make([]Bar, 0)
A slice is not a magic data structure. Under the hood, it implements a growth strategy when there is no more space available: a new backing array with a larger capacity (typically double for small slices) is created automatically and all the elements are copied over.
Now imagine how many times this growth has to happen if we need to hold thousands of elements. Although the amortized time complexity of an insert remains O(1), in practice it has a performance impact.
So, if we know the final length, we can either:
Initialize it with a predefined length:
func convert(foos []Foo) []Bar {
	bars := make([]Bar, len(foos))
	for i, foo := range foos {
		bars[i] = fooToBar(foo)
	}
	return bars
}
Or initialize it with a length of 0 and a predefined capacity:
func convert(foos []Foo) []Bar {
	bars := make([]Bar, 0, len(foos))
	for _, foo := range foos {
		bars = append(bars, fooToBar(foo))
	}
	return bars
}
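To get a feel for the cost, one can count how often append has to allocate a new backing array. This is a rough sketch: countGrowths is an illustrative helper, and the exact number of growths without a capacity hint depends on the Go version, so only the contrast matters.

```go
package main

import "fmt"

// countGrowths appends n ints to s and counts how often the capacity
// changes, i.e. how often a new backing array had to be allocated.
func countGrowths(s []int, n int) int {
	growths := 0
	for i := 0; i < n; i++ {
		before := cap(s)
		s = append(s, i)
		if cap(s) != before {
			growths++
		}
	}
	return growths
}

func main() {
	const n = 10000
	fmt.Println("no capacity hint:", countGrowths(nil, n))
	fmt.Println("preallocated:    ", countGrowths(make([]int, 0, n), n))
}
```

With the capacity set up front, no growth happens at all, whereas the slice without a hint is reallocated and copied repeatedly.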
Context management
context.Context is often misunderstood. According to the official documentation:
A Context carries a deadline, a cancelation signal, and other values across API boundaries.
This description is generic enough to leave some people puzzled about how it should be used.
Let us try to describe it in detail. A context can carry:
- A deadline. It means either a duration (e.g. 250 ms) or a specific date-time; once it is reached, we must stop the ongoing activity (an I/O request, waiting for a channel input, etc.).
- A cancellation signal. Once we receive the signal, we must stop the ongoing activity. For example, suppose we receive two requests: one to insert some data and another to cancel the first. This can be implemented by using a cancelable context in the first call, which is then canceled once we get the second request.
- A list of key/value pairs (both based on the interface{} type).
It is worth mentioning that contexts can be composed. For example, we can have a context that carries both a deadline and a key/value list. In addition, multiple goroutines can share the same context, so a single cancellation signal can stop multiple activities.
Coming back to our topic, here is a concrete mistake I have seen.
A Go application was built on urfave/cli (if you do not know it, it is a nice library for creating command-line applications in Go). Once started, the application inherits from a parent context. This means that when the application is stopped, the library uses this context to send a cancellation signal.
What I saw was that this very context was passed directly when calling a gRPC endpoint, which is not what we want. Instead, we want the gRPC call to be canceled either when the application is being stopped or after 100 ms, for example.
To achieve this, we can simply create a composed context. If parent is the name of the application context (created by urfave/cli), then we can do this:
ctx, cancel := context.WithTimeout(parent, 100*time.Millisecond)
defer cancel() // release the timer's resources once the call returns
response, err := grpcClient.Send(ctx, request)
Contexts are not that complex to understand, and in my opinion they are one of the best features of the language.
Not using the -race option
A mistake I often see is testing a Go application without the -race option.
As this report describes, although Go was "designed to make concurrent programming easier and less error-prone", we still suffer a lot from concurrency problems.
Obviously, the Go race detector will not catch every concurrency problem. Nevertheless, it is a valuable tool, and we should always enable it when testing our applications.
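As an illustration of the kind of bug the detector is meant to catch, here is a small sketch with a racy counter and a fixed version; both function names are illustrative. Running it with go run -race (or a test with go test -race) should flag the unsynchronized increment in racyCount, although the detector only reports races that actually occur during the run.

```go
package main

import (
	"fmt"
	"sync"
)

// racyCount increments a shared counter from n goroutines without any
// synchronization: this is a data race.
func racyCount(n int) int {
	var wg sync.WaitGroup
	counter := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			counter++ // unsynchronized read-modify-write
		}()
	}
	wg.Wait()
	return counter
}

// safeCount protects the counter with a mutex.
func safeCount(n int) int {
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		counter int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			counter++
			mu.Unlock()
		}()
	}
	wg.Wait()
	return counter
}

func main() {
	fmt.Println("racy:", racyCount(1000)) // may be lower than 1000
	fmt.Println("safe:", safeCount(1000))
}
```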
Further reading
Does the Go race detector catch all data race bugs?
Using a filename as an input
Another common mistake is to pass a filename to a function.
Suppose we have to implement a function that counts the number of empty lines in a file. The most natural implementation would be something like this:
func count(filename string) (int, error) {
	file, err := os.Open(filename)
	if err != nil {
		return 0, errors.Wrapf(err, "unable to open %s", filename)
	}
	defer file.Close()
	scanner := bufio.NewScanner(file)
	count := 0
	for scanner.Scan() {
		if scanner.Text() == "" {
			count++
		}
	}
	return count, nil
}
The filename is given as a parameter, we open the file, and then we implement the logic to count the empty lines. So far, so good.
Now suppose we want to write unit tests on top of this function, covering a normal file, an empty file, files with different encodings, and so on. This could easily become very hard to manage.
Moreover, if we wanted to apply the same logic to an HTTP body, we would have to create another function for it.
Go comes with two great abstractions: io.Reader and io.Writer. (Translator's note: they cover common I/O such as the command line, files, and the network.)
So instead of passing a filename, we can simply pass an io.Reader that abstracts the data source.
Is it a file? An HTTP body? A byte buffer?
It does not matter: whatever type of data the Reader reads, we will use the same Read method.
In our case, we can even buffer the input and read it line by line (using bufio.Reader and its ReadLine method):
func count(reader *bufio.Reader) (int, error) {
	count := 0
	for {
		line, _, err := reader.ReadLine()
		if err != nil {
			switch err {
			default:
				return 0, errors.Wrapf(err, "unable to read")
			case io.EOF:
				return count, nil
			}
		}
		if len(line) == 0 {
			count++
		}
	}
}
The responsibility of opening the file is now delegated to the caller of count:
file, err := os.Open(filename)
if err != nil {
	return errors.Wrapf(err, "unable to open %s", filename)
}
defer file.Close()
count, err := count(bufio.NewReader(file))
Whatever the data source, we can call count. It also makes unit testing easier, because we can simply create a bufio.Reader from a string:
count, err := count(bufio.NewReader(strings.NewReader("input")))
Goroutines and loop variables
The last common mistake I have seen is using goroutines with loop variables.
What will the following example output?
ints := []int{1, 2, 3}
for _, i := range ints {
	go func() {
		fmt.Printf("%v\n", i)
	}()
}
1 2 3 in some order? Wrong.
In this example, every goroutine shares the same variable instance, so the most likely output is 3 3 3. (This applies to Go versions before 1.22; since Go 1.22, each iteration of a for loop gets its own variable.)
There are two ways to solve this problem.
The first is to pass the value of i to the closure (the inner function) as an argument:
ints := []int{1, 2, 3}
for _, i := range ints {
	go func(i int) {
		fmt.Printf("%v\n", i)
	}(i)
}
The second is to create another variable within the scope of the for loop:
ints := []int{1, 2, 3}
for _, i := range ints {
	i := i
	go func() {
		fmt.Printf("%v\n", i)
	}()
}
Writing i := i may look strange, but it is perfectly valid.
Being inside the loop body means being in another scope, so i := i creates another variable instance that is also called i.
Of course, we may prefer to use a different variable name for readability.
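The snippets above do not wait for the goroutines to finish, so a real program could exit before anything is printed. Here is a runnable sketch of the first solution that waits with a sync.WaitGroup and collects the values; the collect function is an illustrative name.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// collect starts one goroutine per element and passes the element as an
// argument, so each goroutine works on its own copy of the value.
func collect(ints []int) []int {
	var (
		wg  sync.WaitGroup
		mu  sync.Mutex
		out []int
	)
	for _, i := range ints {
		wg.Add(1)
		go func(v int) {
			defer wg.Done()
			mu.Lock()
			out = append(out, v)
			mu.Unlock()
		}(i)
	}
	wg.Wait()
	sort.Ints(out) // goroutine scheduling order is not deterministic
	return out
}

func main() {
	fmt.Println(collect([]int{1, 2, 3})) // [1 2 3]
}
```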
Further reading
Using goroutines on loop iterator variables
Do you know of other common mistakes? Feel free to share them to continue the discussion ;)