Detailed explanation of bufio package in Golang (4): bufio.Scanner

bufio.Scanner

bufio.Scanner is the type in Go's bufio package for reading input one token at a time, most commonly line by line. It wraps any io.Reader (a file, a network connection, a strings.Reader, and so on) and manages its own buffer, so it does not need to be combined with bufio.Reader.

By default, bufio.Scanner breaks the input into lines and returns them one at a time with the line terminator removed. How the input is divided is decided by a split function (SplitFunc), which can be replaced. The structure definition and corresponding methods are as follows:

type Scanner struct {
	r            io.Reader // The reader provided by the client.
	split        SplitFunc // The function to split the tokens.
	maxTokenSize int       // Maximum size of a token; modified by tests.
	token        []byte    // Last token returned by split.
	buf          []byte    // Buffer used as argument to split.
	start        int       // First non-processed byte in buf.
	end          int       // End of data in buf.
	err          error     // Sticky error.
	empties      int       // Count of successive empty tokens.
	scanCalled   bool      // Scan has been called; buffer is in use.
	done         bool      // Scan has finished.
}

Here are the main methods provided by bufio.Scanner:

  • func (s *Scanner) Scan() bool advances the Scanner to the next token, which is then available through Text() or Bytes(). It returns true if a token was read, and false when the end of the input has been reached or an error occurred.
  • func (s *Scanner) Text() string returns the most recent token produced by Scan() as a newly allocated string.
  • func (s *Scanner) Bytes() []byte returns the most recent token produced by Scan() as a byte slice. The slice may point into the Scanner's internal buffer and can be overwritten by the next call to Scan(), so copy it if it must be kept.
  • func (s *Scanner) Err() error returns the first error encountered while scanning. It returns nil if no error occurred; reaching the end of the input (io.EOF) is not reported as an error.
  • func (s *Scanner) Buffer(buf []byte, max int) sets the initial buffer to use while scanning (buf) and the maximum size the buffer may grow to (max), which in turn limits how long a single token can be. The default limit is bufio.MaxScanTokenSize (64 KB). Buffer must be called before the first call to Scan().
  • func (s *Scanner) Split(split SplitFunc) sets the split function used to divide the input into tokens; the default is bufio.ScanLines. A SplitFunc has the signature func(data []byte, atEOF bool) (advance int, token []byte, err error): it receives the unprocessed data, and returns how many bytes to consume and the next token, if any. The package also provides bufio.ScanWords, bufio.ScanRunes and bufio.ScanBytes. Split must be called before scanning starts.
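
The effect of Buffer() is easiest to see with a line that is longer than the default limit. The following is a minimal sketch, not part of the original article: the helper name countLines and the sizes used (a 100 KB line, a 1 MB limit) are assumptions chosen for illustration only.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strings"
)

// countLines (hypothetical helper) scans input line by line and returns the
// number of lines read. maxToken is the limit passed to Scanner.Buffer.
func countLines(input string, maxToken int) (int, error) {
	scanner := bufio.NewScanner(strings.NewReader(input))
	// Buffer must be called before the first call to Scan.
	scanner.Buffer(make([]byte, 0, 1024), maxToken)

	n := 0
	for scanner.Scan() {
		n++
	}
	// Scan returning false does not say why; Err does.
	return n, scanner.Err()
}

func main() {
	// One line of 100 KB followed by a short line.
	long := strings.Repeat("x", 100*1024) + "\nshort\n"

	// With the default limit the first line does not fit.
	_, err := countLines(long, bufio.MaxScanTokenSize)
	fmt.Println(errors.Is(err, bufio.ErrTooLong))

	// With a larger limit both lines are read.
	n, err := countLines(long, 1024*1024)
	fmt.Println(n, err)
}
```

With the default limit the scan stops with bufio.ErrTooLong; after raising the limit, both lines are read. If the input may contain very long lines and no sensible upper bound is known, bufio.Reader may be a better fit than Scanner.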

Example of use

A simple usage example is as follows:

package main

import (
	"bufio"
	"fmt"
	"strings"
)

func main() {
	input := "路多辛的所思所想\n很值得一看哦!\n"
	scanner := bufio.NewScanner(strings.NewReader(input))

	// Iterate over the input line by line
	for scanner.Scan() {
		fmt.Println(scanner.Text())
	}

	// Error handling
	if err := scanner.Err(); err != nil {
		fmt.Println("Error:", err)
	}

	// Custom delimiter
	scanner = bufio.NewScanner(strings.NewReader("路多辛,的,所思所想"))
	scanner.Split(func(data []byte, atEOF bool) (advance int, token []byte, err error) {
		// The delimiter is a comma
		for i, d := range data {
			if d == ',' {
				return i + 1, data[:i], nil
			}
		}
		if atEOF && len(data) > 0 {
			return len(data), data, nil
		}
		return 0, nil, nil
	})
	for scanner.Scan() {
		fmt.Println(scanner.Text())
	}
}

Running the program produces the following output:

$ go run main.go
路多辛的所思所想
很值得一看哦!
路多辛
的
所思所想

The first loop uses the default split function, bufio.ScanLines, which reads the input line by line. The second loop installs a custom split function that separates the input at commas.
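
A custom split function is not always necessary, because the bufio package ships several ready-made ones. The sketch below shows bufio.ScanWords and bufio.ScanRunes; the helper names tokens, words and runes are assumptions made for this illustration and do not appear in the original article.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// tokens (hypothetical helper) splits input with the given split function
// and returns every token as a string.
func tokens(input string, split bufio.SplitFunc) []string {
	scanner := bufio.NewScanner(strings.NewReader(input))
	// Split must be called before scanning starts.
	scanner.Split(split)

	var out []string
	for scanner.Scan() {
		out = append(out, scanner.Text())
	}
	return out
}

// words splits input into space-separated words.
func words(input string) []string { return tokens(input, bufio.ScanWords) }

// runes splits input into individual UTF-8 encoded characters.
func runes(input string) []string { return tokens(input, bufio.ScanRunes) }

func main() {
	fmt.Println(words("go  is\tfun\n")) // [go is fun]
	fmt.Println(runes("路多辛"))           // [路 多 辛]
}
```

bufio.ScanWords treats any run of whitespace as a single separator, and bufio.ScanRunes returns whole characters rather than single bytes, which matters for multi-byte text such as the Chinese strings used above.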

Summary

bufio.Scanner reads from the underlying io.Reader into an internal buffer and hands out one token per call to Scan(). Each token stays valid only until the next call to Scan(). To consume the whole input, keep calling Scan() until it returns false, then call Err() to find out whether the loop ended because the input was exhausted or because an error occurred.
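
Because tokens live in the Scanner's internal buffer, a slice obtained from Bytes() should be copied if it needs to outlive the next call to Scan(). The following sketch shows one way to do that; the helper name collectLines is an assumption made for this illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// collectLines (hypothetical helper) returns a private copy of every line
// in input, together with any error reported by the scanner.
func collectLines(input string) ([][]byte, error) {
	scanner := bufio.NewScanner(strings.NewReader(input))

	var lines [][]byte
	for scanner.Scan() {
		// Bytes may return a slice into the scanner's buffer, which a later
		// Scan can overwrite, so copy it before storing it.
		line := append([]byte(nil), scanner.Bytes()...)
		lines = append(lines, line)
	}
	return lines, scanner.Err()
}

func main() {
	lines, err := collectLines("first\nsecond\n")
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	for _, l := range lines {
		fmt.Println(string(l))
	}
}
```

Text() does not need this precaution, because it already returns a newly allocated string.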

Origin blog.csdn.net/luduoyuan/article/details/131362023