bufio.Scanner
bufio.Scanner is a type in Go's bufio package for reading input one token at a time. It wraps any io.Reader (for example os.Stdin, an open file, or a strings.Reader), buffers the data it reads, and returns the buffered contents piece by piece. It does not need to be combined with bufio.Reader, because it does its own buffering.
By default, bufio.Scanner breaks the input into lines and returns them one at a time, with the line terminator stripped. How the input is divided is decided by a split function (SplitFunc), which can be replaced. The structure definition and corresponding methods are as follows:
type Scanner struct {
	r            io.Reader // The reader provided by the client.
	split        SplitFunc // The function to split the tokens.
	maxTokenSize int       // Maximum size of a token; modified by tests.
	token        []byte    // Last token returned by split.
	buf          []byte    // Buffer used as argument to split.
	start        int       // First non-processed byte in buf.
	end          int       // End of data in buf.
	err          error     // Sticky error.
	empties      int       // Count of successive empty tokens.
	scanCalled   bool      // Scan has been called; buffer is in use.
	done         bool      // Scan has finished.
}
Here are the main methods provided by bufio.Scanner:
- func (s *Scanner) Scan() bool advances the scanner to the next token, which is then available through Text() or Bytes(). It returns true if a token was read, and false when the end of the input is reached or an error occurs.
- func (s *Scanner) Text() string returns the most recent token produced by Scan() as a newly allocated string.
- func (s *Scanner) Bytes() []byte returns the most recent token produced by Scan() as a byte slice. It does not allocate, and the underlying array may be overwritten by the next call to Scan().
- func (s *Scanner) Err() error returns the first error encountered while scanning. It returns nil if no error occurred; reaching the end of the input (io.EOF) is not reported as an error.
- func (s *Scanner) Buffer(buf []byte, max int) sets the initial buffer to use while scanning (buf) and the maximum size the buffer may grow to (max), which also limits the size of a single token. By default the limit is bufio.MaxScanTokenSize (64 KB). Buffer must be called before the first call to Scan(); otherwise it panics.
- func (s *Scanner) Split(split SplitFunc) sets the function used to split the input into tokens. A SplitFunc has the signature func(data []byte, atEOF bool) (advance int, token []byte, err error): it receives the unprocessed data and reports how many bytes to advance and which token to return. The default is bufio.ScanLines. Split must be called before the first call to Scan(); otherwise it panics.
Example of use
A simple usage example is as follows:
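package main

import (
	"bufio"
	"fmt"
	"strings"
)

func main() {
	input := "路多辛的所思所想\n很值得一看哦!\n"
	scanner := bufio.NewScanner(strings.NewReader(input))
	// Iterate over the input line by line (the default split function).
	for scanner.Scan() {
		fmt.Println(scanner.Text())
	}
	// Error handling: Err returns nil on a normal end of input.
	if err := scanner.Err(); err != nil {
		fmt.Println("Error:", err)
	}

	// Custom delimiter: a new scanner with its own split function.
	scanner = bufio.NewScanner(strings.NewReader("路多辛,的,所思所想"))
	scanner.Split(func(data []byte, atEOF bool) (advance int, token []byte, err error) {
		// The delimiter is a comma: return the bytes before it as the token
		// and advance past the comma itself.
		for i, d := range data {
			if d == ',' {
				return i + 1, data[:i], nil
			}
		}
		// At the end of the input, return whatever is left as the last token.
		if atEOF && len(data) > 0 {
			return len(data), data, nil
		}
		// No complete token yet: ask the scanner for more data.
		return 0, nil, nil
	})
	for scanner.Scan() {
		fmt.Println(scanner.Text())
	}
}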
Run it to see the output:
$ go run main.go
路多辛的所思所想
很值得一看哦!
路多辛
的
所思所想
The first loop uses the default split function, bufio.ScanLines, which reads the input line by line. The second loop installs a custom split function that separates the input string at commas.
Summary
bufio.Scanner keeps the data it reads in an internal buffer. Each call to scanner.Scan() advances to the next token, reading more data from the underlying reader only when the buffer does not yet hold a complete token. To consume all of the input, keep calling scanner.Scan() until it returns false, and then check scanner.Err() to distinguish a read error from a normal end of input.
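Because the internal buffer is reused, the slice returned by Bytes() may be overwritten by a later call to Scan(). If tokens need to outlive the loop, one option is to copy each one, as in the sketch below. The helper name collectTokens is an assumption for this illustration; Text() is the simpler alternative when a string is acceptable, since it already returns a fresh copy.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// collectTokens is a hypothetical helper: it scans input line by line and
// keeps a private copy of every token returned by Bytes.
func collectTokens(input string) ([][]byte, error) {
	scanner := bufio.NewScanner(strings.NewReader(input))
	var out [][]byte
	for scanner.Scan() {
		// Appending to a nil slice allocates new memory, so the stored
		// token no longer shares the scanner's internal buffer.
		tok := append([]byte(nil), scanner.Bytes()...)
		out = append(out, tok)
	}
	// Scan returned false: report whether that was due to an error.
	return out, scanner.Err()
}

func main() {
	toks, err := collectTokens("alpha\nbeta\ngamma\n")
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	for _, t := range toks {
		fmt.Printf("%s\n", t)
	}
}
```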