Golang's bufio source code analysis

I originally just wanted to use bufio, but none of the articles I found online clearly explained how to use it or what each method does.

Reader Analysis


func NewReaderSize(rd io.Reader, size int) *Reader {
    // Is it already a Reader?
    b, ok := rd.(*Reader)
    if ok && len(b.buf) >= size {
        return b
    }
    if size < minReadBufferSize { //minReadBufferSize==16
        size = minReadBufferSize
    }
    r := new(Reader)
    r.reset(make([]byte, size), rd)
    return r
}

// NewReader returns a new Reader whose buffer has the default size.
func NewReader(rd io.Reader) *Reader {
    return NewReaderSize(rd, defaultBufSize)
}

To create a bufio.Reader, pass an io.Reader to NewReader or NewReaderSize. The default buffer size, defaultBufSize, is 4096 bytes (4 KB).
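
As a minimal sketch of the constructor behavior described above, the following assumes a strings.Reader as the underlying source (any io.Reader would do). The helper name bufferSizes is made up for illustration; Size is the Reader method that reports the buffer length.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// bufferSizes is a hypothetical helper that reports the buffer sizes
// chosen by NewReader and NewReaderSize for a few inputs.
func bufferSizes() (def, small, wrapped int) {
	src := strings.NewReader("hello")

	// NewReader uses defaultBufSize (4096).
	def = bufio.NewReader(src).Size()

	// Sizes below minReadBufferSize (16) are raised to 16.
	small = bufio.NewReaderSize(src, 1).Size()

	// Wrapping an existing *bufio.Reader whose buffer is already large
	// enough returns that same Reader, so the size stays 4096.
	big := bufio.NewReader(src)
	wrapped = bufio.NewReaderSize(big, 64).Size()
	return def, small, wrapped
}

func main() {
	fmt.Println(bufferSizes()) // 4096 16 4096
}
```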


// fill reads a new chunk into the buffer.
func (b *Reader) fill() {
    // Slide existing data to beginning.
    if b.r > 0 { // copy the remaining unread data to the front of buf
        copy(b.buf, b.buf[b.r:b.w])
        b.w -= b.r
        b.r = 0
    }

    if b.w >= len(b.buf) { // the buffer is already full
        panic("bufio: tried to fill full buffer")
    }
    //maxConsecutiveEmptyReads == 100
    // Read new data: try a limited number of times.
    for i := maxConsecutiveEmptyReads; i > 0; i-- {
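        n, err := b.rd.Read(b.buf[b.w:]) // read from the underlying io.Reader into the free space of the buffer
        if n < 0 {
            panic(errNegativeRead)
        }
        b.w += n // advance the write cursor by the number of bytes read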
        if err != nil {
            b.err = err
            return
        }
        if n > 0 {
            return
        }
        // when n == 0, loop and retry the read, up to 100 times
    }
    b.err = io.ErrNoProgress // 100 consecutive reads all returned n == 0
}

fill() first moves any unread data to the front of the buffer and resets r to 0. It then tries to read from the underlying io.Reader into the free space after w. A single call does not necessarily fill the buffer completely: it returns as soon as one Read yields data or an error, and it sets io.ErrNoProgress after 100 consecutive empty reads.
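
The sliding step can be seen in isolation with a simplified model. This is only a sketch: miniReader and demoFill are hypothetical names, and the retry loop and error bookkeeping of the real fill are left out.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// miniReader is a simplified, hypothetical model of bufio.Reader's
// buffer: buf[r:w] holds the unread bytes.
type miniReader struct {
	buf  []byte
	rd   io.Reader
	r, w int
}

// fill slides the unread bytes to the front of buf and then performs a
// single Read into the free space behind them.
func (m *miniReader) fill() error {
	if m.r > 0 {
		copy(m.buf, m.buf[m.r:m.w])
		m.w -= m.r
		m.r = 0
	}
	n, err := m.rd.Read(m.buf[m.w:])
	m.w += n
	return err
}

// demoFill returns the unread bytes before and after a second fill.
func demoFill() (before, after string) {
	m := &miniReader{buf: make([]byte, 8), rd: strings.NewReader("abcdefghijkl")}
	m.fill() // buf holds "abcdefgh", r = 0, w = 8
	m.r = 6  // pretend the caller consumed "abcdef"
	before = string(m.buf[m.r:m.w])
	m.fill() // slides "gh" to the front, then reads "ijkl" behind it
	after = string(m.buf[m.r:m.w])
	return before, after
}

func main() {
	fmt.Println(demoFill()) // gh ghijkl
}
```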


// ReadByte reads and returns a single byte.
// If no byte is available, returns an error.
func (b *Reader) ReadByte() (byte, error) {
    b.lastRuneSize = -1
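    for b.r == b.w { // no unread data in the buffer
        if b.err != nil {
            return 0, b.readErr()
        }
        b.fill() // buffer is empty, fill it from the underlying io.Reader
    }
    c := b.buf[b.r] // there is data now; take the byte at position r
    b.r++ // advance the read cursor by one byte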
    b.lastByte = int(c)
    return c, nil
}

ReadByte() returns one byte from the buffer. If the buffer is empty, it first calls fill() to load data from the underlying io.Reader, and it returns the pending error if there is one.
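
A small sketch of ReadByte in use, assuming a strings.Reader as the source. The helper readAllBytes is hypothetical. It also reports Buffered() right after the first call, which shows that one ReadByte pulled the whole input into the buffer.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// readAllBytes is a hypothetical helper: it reads s one byte at a time
// and reports how many bytes were still buffered right after the first
// ReadByte.
func readAllBytes(s string) (string, int, error) {
	br := bufio.NewReader(strings.NewReader(s))
	var out []byte
	bufferedAfterFirst := 0
	for {
		c, err := br.ReadByte()
		if err == io.EOF {
			return string(out), bufferedAfterFirst, nil
		}
		if err != nil {
			return string(out), bufferedAfterFirst, err
		}
		if len(out) == 0 {
			// The first ReadByte triggered fill, which read all of s.
			bufferedAfterFirst = br.Buffered()
		}
		out = append(out, c)
	}
}

func main() {
	fmt.Println(readAllBytes("hello")) // hello 4 <nil>
}
```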



// Read reads data into p.
// It returns the number of bytes read into p.
// The bytes are taken from at most one Read on the underlying Reader,
// hence n may be less than len(p).
// At EOF, the count will be zero and err will be io.EOF.
func (b *Reader) Read(p []byte) (n int, err error) {
    n = len(p)
    if n == 0 {
        return 0, b.readErr()
    }
    if b.r == b.w { // no unread data in the buffer
        if b.err != nil {
            return 0, b.readErr()
        }
        if len(p) >= len(b.buf) { // p is at least as large as the buffer
            // Large read, empty buffer.
            // Read directly into p to avoid copy.
            n, b.err = b.rd.Read(p) // read straight from the underlying io.Reader into p
            if n < 0 {
                panic(errNegativeRead)
            }
            if n > 0 {
                b.lastByte = int(p[n-1])
                b.lastRuneSize = -1
            }
            return n, b.readErr()
        }
        // One read.
        // Do not use b.fill, which will loop.
        // Nothing left to read, so the buffer contents are stale: reset the r and w cursors.
        b.r = 0
        b.w = 0
        n, b.err = b.rd.Read(b.buf) // read from the underlying io.Reader into the buffer
        if n < 0 {
            panic(errNegativeRead)
        }
        if n == 0 {
            return 0, b.readErr()
        }
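        b.w += n // number of bytes written into the buffer
    }

    // copy as much as we can
    n = copy(p, b.buf[b.r:b.w]) // copy the unread buffered data into p
    b.r += n // advance the read cursor by the number of bytes copied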
    b.lastByte = int(b.buf[b.r-1])
    b.lastRuneSize = -1
    return n, nil
}

Read(p []byte):
When the buffer has no unread data, there are two cases:
Case 1. If len(p) is greater than or equal to the buffer size, the data is read directly from the underlying io.Reader into p, bypassing the buffer.
Case 2. If len(p) is smaller than the buffer size, the cursors are reset, one read from the underlying io.Reader fills the buffer, and the data is then copied from the buffer into p.

When the buffer has unread data, that data is copied straight from the buffer into p, and the underlying io.Reader is not touched.

p is not necessarily filled; the returned n is the number of bytes actually read.
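
The cases above can be observed by recording what the underlying reader is asked for. This is a sketch only: recordingReader and demoRead are hypothetical names, and the 16-byte buffer is chosen just to keep the numbers small.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// recordingReader is a hypothetical wrapper that records the length of
// the slice passed to each Read call on the underlying reader.
type recordingReader struct {
	src   io.Reader
	sizes []int
}

func (r *recordingReader) Read(p []byte) (int, error) {
	r.sizes = append(r.sizes, len(p))
	return r.src.Read(p)
}

// demoRead returns the sizes requested from the underlying reader and
// the byte counts returned by three bufio Read calls.
func demoRead() (sizes []int, counts []int) {
	rec := &recordingReader{src: strings.NewReader(strings.Repeat("x", 100))}
	br := bufio.NewReaderSize(rec, 16)

	// Case 2: p (4 bytes) is smaller than the buffer (16 bytes), so one
	// underlying Read fills the buffer and 4 bytes are copied out.
	n1, _ := br.Read(make([]byte, 4))

	// 12 bytes are still buffered: they are copied out without touching
	// the underlying reader, so fewer than len(p) bytes come back.
	n2, _ := br.Read(make([]byte, 32))

	// Case 1: the buffer is empty and p (32 bytes) is at least as large
	// as the buffer, so the underlying reader reads directly into p.
	n3, _ := br.Read(make([]byte, 32))

	return rec.sizes, []int{n1, n2, n3}
}

func main() {
	fmt.Println(demoRead()) // [16 32] [4 12 32]
}
```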


// Peek returns the next n bytes without advancing the reader. The bytes stop
// being valid at the next read call. If Peek returns fewer than n bytes, it
// also returns an error explaining why the read is short. The error is
// ErrBufferFull if n is larger than b's buffer size.
func (b *Reader) Peek(n int) ([]byte, error) {
    if n < 0 {
        return nil, ErrNegativeCount
    }

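    // While fewer than n bytes are unread and the buffer is not full, fill the buffer from the underlying io.Reader.
    for b.w-b.r < n && b.w-b.r < len(b.buf) && b.err == nil {
        b.fill() // b.w-b.r < len(b.buf) => buffer is not full
    }
    // n is larger than the buffer: return the unread part of the buffer with ErrBufferFull.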
    if n > len(b.buf) {
        return b.buf[b.r:b.w], ErrBufferFull
    }

    // 0 <= n <= len(b.buf)
    var err error
    if avail := b.w - b.r; avail < n {
        // not enough data in buffer
        // Not enough unread data: return what is available with the pending read error, or ErrBufferFull if there is none.
        n = avail
        err = b.readErr()
        if err == nil {
            err = ErrBufferFull
        }
    }
    // If enough data is buffered, return the requested slice and a nil error.
    return b.buf[b.r : b.r+n], err
}

Peek(n int):
Peek first fills the buffer from the underlying io.Reader until n bytes are available, the buffer is full, or an error occurs.
If n is larger than the buffer size, ErrBufferFull is returned.
If n is less than or equal to the buffer size but fewer than n bytes are available, the read error (for example io.EOF) is returned, or ErrBufferFull if there is none.
If enough data is available, the returned error is nil.

In every case a slice of the unread buffer is returned and the read cursor does not move. The slice aliases the buffer, so modifying it changes the buffered data, and it is only valid until the next read call.
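
A short sketch of these Peek outcomes, assuming a strings.Reader source and a 16-byte buffer picked for illustration. The helper demoPeek is hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// demoPeek is a hypothetical helper that exercises the Peek outcomes on
// an 11-byte input with a 16-byte buffer.
func demoPeek() (peeked, read string, errBig, errShort error) {
	br := bufio.NewReaderSize(strings.NewReader("hello world"), 16)

	// Peek does not move the read cursor...
	p, _ := br.Peek(5)
	peeked = string(p)

	// ...so the following Read sees the same bytes again.
	buf := make([]byte, 5)
	n, _ := br.Read(buf)
	read = string(buf[:n])

	// n larger than the buffer (16) can never be satisfied.
	_, errBig = br.Peek(17)

	// n fits in the buffer, but only 6 bytes remain before the end of
	// the input, so the underlying read error (io.EOF) is returned.
	_, errShort = br.Peek(10)
	return peeked, read, errBig, errShort
}

func main() {
	fmt.Println(demoPeek())
}
```

Here errBig is bufio.ErrBufferFull and errShort is io.EOF; both calls still return the bytes that were available.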


// Buffered returns the number of bytes that can be read from the current buffer.
func (b *Reader) Buffered() int { return b.w - b.r } // number of unread bytes

// Discard skips the next n bytes, returning the number of bytes discarded.
//
// If Discard skips fewer than n bytes, it also returns an error.
// If 0 <= n <= b.Buffered(), Discard is guaranteed to succeed without
// reading from the underlying io.Reader.
func (b *Reader) Discard(n int) (discarded int, err error) {
    if n < 0 {
        return 0, ErrNegativeCount
    }
    if n == 0 {
        return
    }
    remain := n
    for {
        skip := b.Buffered()
        if skip == 0 { // nothing buffered, fill first
            b.fill()
            skip = b.Buffered()
        }
        if skip > remain {
            skip = remain
        }
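        b.r += skip // move the read cursor past skip bytes
        remain -= skip
        if remain == 0 { // skipped the requested number of bytes
            return n, nil
        }
        if b.err != nil { // an error occurred: return the number of bytes skipped so far and the error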
            return n - remain, b.readErr()
        }
    }
}

Discard(n int):
Skips n bytes without returning them, looping until all n bytes are skipped or an error occurs.

I rarely use the other Reader methods, so I do not analyze them here.
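
A minimal sketch of Discard, assuming an input made of a 6-byte header followed by a payload; the input string and the helper demoDiscard are invented for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// demoDiscard is a hypothetical helper: it skips a 6-byte header, reads
// the rest, and then tries to discard past the end of the input.
func demoDiscard() (skipped int, rest string, short int, err error) {
	br := bufio.NewReader(strings.NewReader("HEADERpayload"))

	// Skip the header without copying it anywhere.
	skipped, _ = br.Discard(6)

	// Everything after the header is still readable.
	b, _ := io.ReadAll(br)
	rest = string(b)

	// Nothing is left: Discard reports 0 bytes skipped plus io.EOF.
	short, err = br.Discard(3)
	return skipped, rest, short, err
}

func main() {
	fmt.Println(demoDiscard()) // 6 payload 0 EOF
}
```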

Summary:

**When Peek returns a non-nil error, one case is that the length you asked for is larger than the buffer, so the data can never be enough. Be careful not to pass n larger than the buffer size.
The other case is that the underlying io.Reader returned an error before enough data was available, for example io.EOF, or a timeout on a TCP connection that has a read deadline. Peek itself keeps reading (and blocks if the reader blocks) until n bytes are buffered, so without an error it does not return early. The error is cleared once it has been returned, so a later Peek may find enough data.
Even when there is an error, Peek still returns the slice of data that is available.
If the error is nil, the data is ready.
Peek does not move the read cursor. If you use the slice returned by Peek directly, follow it with Discard to skip those bytes, which moves the read cursor so they are not read again.**
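
The Peek plus Discard pattern can be sketched as below. The wire format (one length byte followed by that many payload bytes) and the helpers nextFrame and readFrames are assumptions made up for this example, not something bufio defines.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// nextFrame is a hypothetical helper for an assumed wire format: one
// length byte followed by that many payload bytes. The frame is at most
// 256 bytes, well under the default 4096-byte buffer, so Peek can hold it.
func nextFrame(br *bufio.Reader) (string, error) {
	head, err := br.Peek(1)
	if err != nil {
		return "", err
	}
	size := int(head[0])

	// Wait until the whole frame is buffered; the cursor does not move.
	frame, err := br.Peek(1 + size)
	if err != nil {
		return "", err
	}
	// Copy the payload out before the next read invalidates the slice.
	payload := string(frame[1:])

	// Now consume the frame by moving the read cursor past it.
	if _, err := br.Discard(1 + size); err != nil {
		return "", err
	}
	return payload, nil
}

// readFrames collects frames until io.EOF. For brevity, a truncated
// final frame is also treated as the end of the input.
func readFrames(data string) ([]string, error) {
	br := bufio.NewReader(strings.NewReader(data))
	var frames []string
	for {
		f, err := nextFrame(br)
		if err == io.EOF {
			return frames, nil
		}
		if err != nil {
			return frames, err
		}
		frames = append(frames, f)
	}
}

func main() {
	fmt.Println(readFrames("\x05hello\x02hi")) // [hello hi] <nil>
}
```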

Read first tries to serve data from the buffer. When the buffer has no unread data, it reads from the underlying io.Reader (into the buffer, or directly into p when p is at least as large as the buffer) and then returns. The number of bytes returned is not fixed: it is less than or equal to len(p).
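
Because Read may return fewer bytes than len(p), code that needs an exact count can use io.ReadFull from the standard library. The sketch below compares the two on the same input; the helper names plainReads and fullReads, the 16-byte buffer, and the 12-byte p are assumptions chosen to force a short read.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// newSource returns a 48-byte input behind a 16-byte buffer.
func newSource() *bufio.Reader {
	return bufio.NewReaderSize(strings.NewReader(strings.Repeat("x", 48)), 16)
}

// plainReads calls Read three times with a 12-byte slice and returns
// the byte counts.
func plainReads() []int {
	br := newSource()
	p := make([]byte, 12)
	var counts []int
	for i := 0; i < 3; i++ {
		n, _ := br.Read(p)
		counts = append(counts, n)
	}
	return counts
}

// fullReads does the same with io.ReadFull, which keeps calling Read
// until p is full or an error occurs.
func fullReads() []int {
	br := newSource()
	p := make([]byte, 12)
	var counts []int
	for i := 0; i < 3; i++ {
		n, _ := io.ReadFull(br, p)
		counts = append(counts, n)
	}
	return counts
}

func main() {
	fmt.Println(plainReads()) // [12 4 12]: the second Read only drains the buffer
	fmt.Println(fullReads())  // [12 12 12]
}
```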

To skip n bytes without reading, use Discard.

NewReader allocates the default 4096-byte buffer. In some cases that is larger than needed and wastes memory, and in others it is too small. Use NewReaderSize to choose a buffer size that fits the situation.


Writer


To be written when I have time.
