1 Overview
While reading the JDK source code recently, I found that BufferedInputStream in the IO system is quite interesting. There are many common misconceptions about this class, so I wrote this post as a learning note.
2 BufferedInputStream source code analysis
/**
* This class extends FilterInputStream, which applies the decorator design pattern; the source code of FilterInputStream is very simple
*/
public class BufferedInputStream extends FilterInputStream {
// Default size of the buf[] cache array
private static int DEFAULT_BUFFER_SIZE = 8192;
/**
* The maximum size of array to allocate.
* Some VMs reserve some header words in an array.
* Attempts to allocate larger arrays may result in
* OutOfMemoryError: Requested array size exceeds VM limit
*
* Maximum size of the buf[] cache array. Why subtract 8? Because some JVMs store header data at the start of an array
*/
private static int MAX_BUFFER_SIZE = Integer.MAX_VALUE - 8;
/**
* The internal buffer array where the data is stored. When necessary,
* it may be replaced by another array of
* a different size.
*
* The cache array, the core member variable; every operation revolves around buf[]
*/
protected volatile byte buf[];
/**
* Atomic updater to provide compareAndSet for buf. This is
* necessary because closes can be asynchronous. We use nullness
* of buf[] as primary indicator that this stream is closed. (The
* "in" field is also nulled out on close.)
*
* Concurrency-related: keeps replacement of buf[] thread-safe
*/
private static final
AtomicReferenceFieldUpdater<BufferedInputStream, byte[]> bufUpdater =
AtomicReferenceFieldUpdater.newUpdater
(BufferedInputStream.class, byte[].class, "buf");
/**
* The index one greater than the index of the last valid byte in
* the buffer.
* This value is always
* in the range <code>0</code> through <code>buf.length</code>;
* elements <code>buf[0]</code> through <code>buf[count-1]
* </code>contain buffered input data obtained
* from the underlying input stream.
*
* Total number of valid bytes in buf[]
*/
protected int count;
/**
* The current position in the buffer. This is the index of the next
* character to be read from the <code>buf</code> array.
* <p>
* This value is always in the range <code>0</code>
* through <code>count</code>. If it is less
* than <code>count</code>, then <code>buf[pos]</code>
* is the next byte to be supplied as input;
* if it is equal to <code>count</code>, then
* the next <code>read</code> or <code>skip</code>
* operation will require more bytes to be
* read from the contained input stream.
*
* @see java.io.BufferedInputStream#buf
*
* Current read position in buf[]
*/
protected int pos;
/**
* The value of the <code>pos</code> field at the time the last
* <code>mark</code> method was called.
* <p>
* This value is always
* in the range <code>-1</code> through <code>pos</code>.
* If there is no marked position in the input
* stream, this field is <code>-1</code>. If
* there is a marked position in the input
* stream, then <code>buf[markpos]</code>
* is the first byte to be supplied as input
* after a <code>reset</code> operation. If
* <code>markpos</code> is not <code>-1</code>,
* then all bytes from positions <code>buf[markpos]</code>
* through <code>buf[pos-1]</code> must remain
* in the buffer array (though they may be
* moved to another place in the buffer array,
* with suitable adjustments to the values
* of <code>count</code>, <code>pos</code>,
* and <code>markpos</code>); they may not
* be discarded unless and until the difference
* between <code>pos</code> and <code>markpos</code>
* exceeds <code>marklimit</code>.
*
* @see java.io.BufferedInputStream#mark(int)
* @see java.io.BufferedInputStream#pos
*
* The position recorded by the most recent call to mark()
*/
protected int markpos = -1;
/**
* The maximum read ahead allowed after a call to the
* <code>mark</code> method before subsequent calls to the
* <code>reset</code> method fail.
* Whenever the difference between <code>pos</code>
* and <code>markpos</code> exceeds <code>marklimit</code>,
* then the mark may be dropped by setting
* <code>markpos</code> to <code>-1</code>.
*
* @see java.io.BufferedInputStream#mark(int)
* @see java.io.BufferedInputStream#reset()
*
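* This field is assigned only by mark(int readlimit). For example, after mark(1024), once more than
* 1024 bytes have been read the mark is considered invalid, and a subclass may discard it and start over.
* The exact behavior depends on the subclass; BufferedInputStream may discard the mark in fill() and reset markpos to -1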
*/
protected int marklimit;
/**
* Check to make sure that underlying input stream has not been
* nulled out due to close; if not return it;
*
* Returns the underlying (real) input stream
*/
private InputStream getInIfOpen() throws IOException {
InputStream input = in;
if (input == null)
throw new IOException("Stream closed");
return input;
}
/**
* Check to make sure that buffer has not been nulled out due to
* close; if not return it;
*
* Returns the cache array
*/
private byte[] getBufIfOpen() throws IOException {
byte[] buffer = buf;
if (buffer == null)
throw new IOException("Stream closed");
return buffer;
}
/**
* Creates a <code>BufferedInputStream</code>
* and saves its argument, the input stream
* <code>in</code>, for later use. An internal
* buffer array is created and stored in <code>buf</code>.
*
* @param in the underlying input stream.
*
* The default cache array size is 8 KB
*/
public BufferedInputStream(InputStream in) {
this(in, DEFAULT_BUFFER_SIZE);
}
/**
* Creates a <code>BufferedInputStream</code>
* with the specified buffer size,
* and saves its argument, the input stream
* <code>in</code>, for later use. An internal
* buffer array of length <code>size</code>
* is created and stored in <code>buf</code>.
*
* @param in the underlying input stream.
* @param size the buffer size.
* @exception IllegalArgumentException if {@code size <= 0}.
*/
public BufferedInputStream(InputStream in, int size) {
super(in);
if (size <= 0) {
throw new IllegalArgumentException("Buffer size <= 0");
}
buf = new byte[size];
}
/**
* Fills the buffer with more data, taking into account
* shuffling and other tricks for dealing with marks.
* Assumes that it is being called by a synchronized method.
* This method also assumes that all data has already been read in,
* hence pos > count.
*
* Purpose: make room by discarding data in buf[] or by growing buf[], then store new data from the input stream in buf[]
*/
private void fill() throws IOException {
byte[] buffer = getBufIfOpen();
if (markpos < 0)
// No mark is set, so simply discard the data in buf[]
pos = 0; /* no mark: throw away the buffer */
else if (pos >= buffer.length) /* no room left in buffer */
if (markpos > 0) { /* can throw away early part of the buffer */
int sz = pos - markpos;
System.arraycopy(buffer, markpos, buffer, 0, sz);
pos = sz;
markpos = 0;
// !!! In every branch below, markpos is 0
} else if (buffer.length >= marklimit) {
markpos = -1; /* buffer got too big, invalidate mark */
pos = 0; /* drop buffer contents */
} else if (buffer.length >= MAX_BUFFER_SIZE) {
throw new OutOfMemoryError("Required array size too large");
} else { /* grow buffer */
int nsz = (pos <= MAX_BUFFER_SIZE - pos) ?
pos * 2 : MAX_BUFFER_SIZE;
if (nsz > marklimit)
// Cap the length of buf[] at marklimit so that the mark stays valid
nsz = marklimit;
byte nbuf[] = new byte[nsz];
System.arraycopy(buffer, 0, nbuf, 0, pos);
if (!bufUpdater.compareAndSet(this, buffer, nbuf)) {
// Can't replace buf if there was an async close.
// Note: This would need to be changed if fill()
// is ever made accessible to multiple threads.
// But for now, the only way CAS can fail is via close.
// assert buf == null;
throw new IOException("Stream closed");
}
buffer = nbuf;
}
count = pos;
// Read data from the input stream into buf[]
int n = getInIfOpen().read(buffer, pos, buffer.length - pos);
if (n > 0)
count = n + pos;
}
/**
* See
* the general contract of the <code>read</code>
* method of <code>InputStream</code>.
*
* @return the next byte of data, or <code>-1</code> if the end of the
* stream is reached.
* @exception IOException if this input stream has been closed by
* invoking its {@link #close()} method,
* or an I/O error occurs.
* @see java.io.FilterInputStream#in
*/
public synchronized int read() throws IOException {
// buf[] has no unread bytes left, so fill() is needed
if (pos >= count) {
fill();
// No data was read
if (pos >= count)
return -1;
}
return getBufIfOpen()[pos++] & 0xff;
}
/**
* Read characters into a portion of an array, reading from the underlying
* stream at most once if necessary.
*/
private int read1(byte[] b, int off, int len) throws IOException {
int avail = count - pos;
if (avail <= 0) {
/* If the requested length is at least as large as the buffer, and
if there is no mark/reset activity, do not bother to copy the
bytes into the local buffer. In this way buffered streams will
cascade harmlessly. */
// !!! This code is very important
/**
* When the requested length len is at least the size of the core cache array buf[] and
* markpos < 0, data is read directly from the underlying stream into b, bypassing buf[]
* and saving a redundant copy through it
*/
if (len >= getBufIfOpen().length && markpos < 0) {
return getInIfOpen().read(b, off, len);
}
fill();
avail = count - pos;
if (avail <= 0) return -1;
}
int cnt = (avail < len) ? avail : len;
System.arraycopy(getBufIfOpen(), pos, b, off, cnt);
pos += cnt;
return cnt;
}
/**
* Reads bytes from this byte-input stream into the specified byte array,
* starting at the given offset.
*
* <p> This method implements the general contract of the corresponding
* <code>{@link InputStream#read(byte[], int, int) read}</code> method of
* the <code>{@link InputStream}</code> class. As an additional
* convenience, it attempts to read as many bytes as possible by repeatedly
* invoking the <code>read</code> method of the underlying stream. This
* iterated <code>read</code> continues until one of the following
* conditions becomes true: <ul>
*
* <li> The specified number of bytes have been read,
*
* <li> The <code>read</code> method of the underlying stream returns
* <code>-1</code>, indicating end-of-file, or
*
* <li> The <code>available</code> method of the underlying stream
* returns zero, indicating that further input requests would block.
*
* </ul> If the first <code>read</code> on the underlying stream returns
* <code>-1</code> to indicate end-of-file then this method returns
* <code>-1</code>. Otherwise this method returns the number of bytes
* actually read.
*
* <p> Subclasses of this class are encouraged, but not required, to
* attempt to read as many bytes as possible in the same fashion.
*
* @param b destination buffer.
* @param off offset at which to start storing bytes.
* @param len maximum number of bytes to read.
* @return the number of bytes read, or <code>-1</code> if the end of
* the stream has been reached.
* @exception IOException if this input stream has been closed by
* invoking its {@link #close()} method,
* or an I/O error occurs.
*
* This method mainly delegates to read1(byte[] b, int off, int len)
*/
public synchronized int read(byte b[], int off, int len)
throws IOException
{
getBufIfOpen(); // Check for closed stream
if ((off | len | (off + len) | (b.length - (off + len))) < 0) {
throw new IndexOutOfBoundsException();
} else if (len == 0) {
return 0;
}
int n = 0;
for (;;) {
int nread = read1(b, off + n, len - n);
if (nread <= 0)
return (n == 0) ? nread : n;
n += nread;
if (n >= len)
return n;
// if not closed but no bytes available, return
InputStream input = in;
if (input != null && input.available() <= 0)
return n;
}
}
/**
* See the general contract of the <code>skip</code>
* method of <code>InputStream</code>.
*
* @exception IOException if the stream does not support seek,
* or if this input stream has been closed by
* invoking its {@link #close()} method, or an
* I/O error occurs.
*
* Skips the given number of bytes in the stream. This method does not seem very useful; so far I have never used skip myself
*/
public synchronized long skip(long n) throws IOException {
getBufIfOpen(); // Check for closed stream
if (n <= 0) {
return 0;
}
long avail = count - pos;
if (avail <= 0) {
// If no mark position set then don't keep in buffer
if (markpos <0)
return getInIfOpen().skip(n);
// Fill in buffer to save bytes for reset
fill();
avail = count - pos;
if (avail <= 0)
return 0;
}
long skipped = (avail < n) ? avail : n;
pos += skipped;
return skipped;
}
/**
* Returns an estimate of the number of bytes that can be read (or
* skipped over) from this input stream without blocking by the next
* invocation of a method for this input stream. The next invocation might be
* the same thread or another thread. A single read or skip of this
* many bytes will not block, but may read or skip fewer bytes.
* <p>
* This method returns the sum of the number of bytes remaining to be read in
* the buffer (<code>count - pos</code>) and the result of calling the
* {@link java.io.FilterInputStream#in in}.available().
*
* @return an estimate of the number of bytes that can be read (or skipped
* over) from this input stream without blocking.
* @exception IOException if this input stream has been closed by
* invoking its {@link #close()} method,
* or an I/O error occurs.
*
* Bytes remaining in buf[] plus the bytes available in the underlying input stream
*/
public synchronized int available() throws IOException {
int n = count - pos;
int avail = getInIfOpen().available();
return n > (Integer.MAX_VALUE - avail)
? Integer.MAX_VALUE
: n + avail;
}
/**
* See the general contract of the <code>mark</code>
* method of <code>InputStream</code>.
*
* @param readlimit the maximum limit of bytes that can be read before
* the mark position becomes invalid.
* @see java.io.BufferedInputStream#reset()
*
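* Marks the current position. marklimit is assigned only here. readlimit is the maximum number of bytes
* that can be read from the stream after mark() is called; once that size is exceeded, fill() may treat
* the mark as invalid and reset markpos = -1, pos = 0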
*/
public synchronized void mark(int readlimit) {
marklimit = readlimit;
markpos = pos;
}
/**
* See the general contract of the <code>reset</code>
* method of <code>InputStream</code>.
* <p>
* If <code>markpos</code> is <code>-1</code>
* (no mark has been set or the mark has been
* invalidated), an <code>IOException</code>
* is thrown. Otherwise, <code>pos</code> is
* set equal to <code>markpos</code>.
*
* @exception IOException if this stream has not been marked or,
* if the mark has been invalidated, or the stream
* has been closed by invoking its {@link #close()}
* method, or an I/O error occurs.
* @see java.io.BufferedInputStream#mark(int)
*/
public synchronized void reset() throws IOException {
getBufIfOpen(); // Cause exception if closed
if (markpos < 0)
throw new IOException("Resetting to invalid mark");
pos = markpos;
}
/**
* Tests if this input stream supports the <code>mark</code>
* and <code>reset</code> methods. The <code>markSupported</code>
* method of <code>BufferedInputStream</code> returns
* <code>true</code>.
*
* @return a <code>boolean</code> indicating if this stream type supports
* the <code>mark</code> and <code>reset</code> methods.
* @see java.io.InputStream#mark(int)
* @see java.io.InputStream#reset()
*/
public boolean markSupported() {
return true;
}
/**
* Closes this input stream and releases any system resources
* associated with the stream.
* Once the stream has been closed, further read(), available(), reset(),
* or skip() invocations will throw an IOException.
* Closing a previously closed stream has no effect.
*
* @exception IOException if an I/O error occurs.
*/
public void close() throws IOException {
byte[] buffer;
while ( (buffer = buf) != null) {
if (bufUpdater.compareAndSet(this, buffer, null)) {
InputStream input = in;
in = null;
if (input != null)
input.close();
return;
}
// Else retry in case a new buf was CASed in fill()
}
}
}
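To watch the "grow buffer" and "invalidate mark" branches of fill() in action, here is a minimal sketch. PeekableBuffer, capacity(), and hasMark() are hypothetical names introduced only for this illustration; they rely on buf and markpos being protected fields. The sketch assumes fill() behaves as in the listing above.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class FillGrowthDemo {
    // Hypothetical subclass: exposes the protected buf and markpos fields for inspection.
    static class PeekableBuffer extends BufferedInputStream {
        PeekableBuffer(InputStream in, int size) {
            super(in, size);
        }

        int capacity() {
            return buf.length;
        }

        boolean hasMark() {
            return markpos >= 0;
        }
    }

    // Calls mark(10) on a stream with a 4-byte buf[], reads `reads` single bytes,
    // and reports the buffer length and whether the mark is still set.
    static String stateAfter(int reads) throws IOException {
        byte[] src = new byte[20];
        try (PeekableBuffer in = new PeekableBuffer(new ByteArrayInputStream(src), 4)) {
            in.mark(10);
            for (int i = 0; i < reads; i++) {
                in.read();
            }
            return "capacity=" + in.capacity() + ", markValid=" + in.hasMark();
        }
    }

    public static void main(String[] args) throws IOException {
        for (int reads : new int[] {4, 5, 9, 11}) {
            System.out.println(reads + " reads: " + stateAfter(reads));
        }
    }
}
```

Tracing the listing by hand suggests the buffer stays at 4 bytes for the first 4 reads, doubles to 8 on the 5th, is capped at marklimit (10) on the 9th, and the mark is dropped on the 11th. Other JDK versions may implement fill() differently.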
3 BufferedInputStream is of limited use in practice
Many blog posts claim that BufferedInputStream is very useful: it reads a large amount of data per IO operation and caches it in buf[], which reduces IO overhead. Some bloggers even provide working code to show that BufferedInputStream does improve efficiency. That claim is not wrong in itself, but after studying the source code in depth I found that in practice there is rarely a need for BufferedInputStream.
The code below makes the case more concretely:
// Needs java.io.BufferedInputStream, java.io.FileInputStream, java.io.IOException
// The file is 1 GB in size
private static String file = "D:\\StudySoftware\\VMware_virtualbox\\Data_vmware\\VMwareMachine\\kafka_single\\kafka-single-103-da5cf665.vmem";
private static void file() throws IOException {
    long beginTime = System.currentTimeMillis();
    // try-with-resources closes the stream even if a read fails
    try (FileInputStream input = new FileInputStream(file)) {
        byte[] bytes = new byte[1024 * 1];
        int read = 0;
        while ((read = input.read(bytes, 0, bytes.length)) != -1) {
            // Do nothing; just read the file
        }
    }
    long endTime = System.currentTimeMillis();
    System.out.println("file: elapsed time (ms): " + (endTime - beginTime));
}
private static void buffered() throws IOException {
    long beginTime = System.currentTimeMillis();
    // Closing the BufferedInputStream also closes the wrapped FileInputStream
    try (BufferedInputStream bufferedInput = new BufferedInputStream(new FileInputStream(file))) {
        byte[] bytes = new byte[1024 * 1];
        int read = 0;
        while ((read = bufferedInput.read(bytes, 0, bytes.length)) != -1) {
            // Do nothing; just read the file
        }
    }
    long endTime = System.currentTimeMillis();
    System.out.println("buffered: elapsed time (ms): " + (endTime - beginTime));
}
Note:
When running this code, do not run both methods against the same file in the same run. After the first method has read the whole file, part of it may already be cached (by the JVM or by the operating system) when the second method reads it, which makes the test data inaccurate. To keep the test data as accurate as possible, each JVM launch runs only one test method.
Results (elapsed time in milliseconds):
① byte[] bytes = new byte[1024 * 1]; array size 1024
buffered: 855
file: 3073
② byte[] bytes = new byte[1024 * 2]; array size 2048
buffered: 813
file: 1909
③ byte[] bytes = new byte[1024 * 3]; array size 3072
buffered: 1304
file: 1476
④ byte[] bytes = new byte[1024 * 4]; array size 4096
buffered: 844
file: 1287
⑤ byte[] bytes = new byte[1024 * 5]; array size 5120
buffered: 1343
file: 1061
⑥ byte[] bytes = new byte[1024 * 6]; array size 6144
buffered: 1280
file: 985
⑦ byte[] bytes = new byte[1024 * 7]; array size 7168
buffered: 1443
file: 851
⑧ byte[] bytes = new byte[1024 * 8]; array size 8192
buffered: 774
file: 739
⑨ byte[] bytes = new byte[1024 * 9]; array size 9216
buffered: 734
file: 749
⑩ byte[] bytes = new byte[1024 * 10]; array size 10240
buffered: 739
file: 697
... ... ...
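For readers who want to repeat the measurements with different array sizes, the following is one possible harness, offered as a sketch rather than the code used to produce the numbers above. ReadTimer, drain, open, and the command-line arguments are all hypothetical names. It runs a single variant per JVM launch, in line with the note above.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadTimer {
    // Reads the whole stream in chunks of chunkSize bytes and returns the total byte count.
    static long drain(InputStream in, int chunkSize) throws IOException {
        byte[] bytes = new byte[chunkSize];
        long total = 0;
        int read;
        while ((read = in.read(bytes, 0, bytes.length)) != -1) {
            total += read;
        }
        return total;
    }

    // Opens the file raw, or wrapped in a BufferedInputStream when mode is "buffered".
    static InputStream open(String path, String mode) throws IOException {
        InputStream raw = new FileInputStream(path);
        return "buffered".equals(mode) ? new BufferedInputStream(raw) : raw;
    }

    // Assumed usage: java ReadTimer <file|buffered> <path> <chunkSizeInKB>
    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: ReadTimer <file|buffered> <path> <chunkSizeInKB>");
            return;
        }
        int chunkSize = Integer.parseInt(args[2]) * 1024;
        long begin = System.nanoTime();
        long total;
        try (InputStream in = open(args[1], args[0])) {
            total = drain(in, chunkSize);
        }
        long elapsedMs = (System.nanoTime() - begin) / 1_000_000;
        System.out.println(args[0] + ": " + total + " bytes in " + elapsedMs + " ms");
    }
}
```

Single-shot wall-clock timings like these are noisy, so it is worth repeating each configuration several times before drawing conclusions.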
We can draw the following important conclusions:
When the bytes array is small, BufferedInputStream really is much faster at reading the file. But as bytes grows, and especially once it reaches 8 KB, BufferedInputStream and FileInputStream read the file at about the same speed, with no significant difference.
Digging into the source code explains why: in the read1 method shown earlier, when len >= buf.length and markpos < 0, the data is read directly from the underlying stream and buf[] is bypassed.
So once the bytes array in while ((read = input.read(bytes, 0, bytes.length)) != -1) is large enough, BufferedInputStream has no effect (unless mark/reset is needed).
Some readers will surely ask: can't I just increase the size of buf[] in BufferedInputStream?
You can, but couldn't I just as well increase the size of bytes in while ((read = input.read(bytes, 0, bytes.length)) != -1)? In the end both are byte arrays; one sits outside BufferedInputStream and the other inside it. When we read from a stream today we usually do not need mark/reset, and the outer bytes array is usually fairly large, so in that case BufferedInputStream is unnecessary.
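The bypass in read1 can be observed without any timing. The sketch below uses a hypothetical RecordingStream that logs the len argument of every bulk read reaching the underlying stream; the class and method names are invented for this illustration.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class BypassDemo {
    // Hypothetical helper: records the len argument of every bulk read it receives.
    static class RecordingStream extends ByteArrayInputStream {
        final List<Integer> requested = new ArrayList<>();

        RecordingStream(byte[] data) {
            super(data);
        }

        @Override
        public synchronized int read(byte[] b, int off, int len) {
            requested.add(len);
            return super.read(b, off, len);
        }
    }

    // Drains 32 bytes through a BufferedInputStream whose buf[] holds 8 bytes,
    // using a caller array of callerSize bytes; returns the lengths requested
    // from the underlying stream.
    static List<Integer> requestedLengths(int callerSize) throws IOException {
        RecordingStream source = new RecordingStream(new byte[32]);
        try (BufferedInputStream in = new BufferedInputStream(source, 8)) {
            byte[] bytes = new byte[callerSize];
            while (in.read(bytes, 0, bytes.length) != -1) {
                // Do nothing; only the recorded lengths matter
            }
        }
        return source.requested;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("caller array of 4 bytes:  " + requestedLengths(4));
        System.out.println("caller array of 16 bytes: " + requestedLengths(16));
    }
}
```

With a 4-byte caller array every request to the underlying stream should be for 8 bytes (the size of buf[]), while with a 16-byte caller array every request should be for 16 bytes, which means the data went straight into the caller's array.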
4 The one usage scenario unique to BufferedInputStream
I personally think the only real usage scenario for BufferedInputStream is when we need the mark and reset features. But take special care when using mark and reset: they involve many subtleties, especially when BufferedInputStream executes fill().
// Needs java.io.BufferedInputStream, java.io.ByteArrayInputStream, java.io.IOException
public static void main(String[] args) {
    try {
        final byte[] src = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20};
        final ByteArrayInputStream bis = new ByteArrayInputStream(src);
        // buf[] holds only 5 bytes, so fill() runs often
        final BufferedInputStream bufis = new BufferedInputStream(bis, 5);
        int data = -1;
        int i = 0;
        while ((data = bufis.read()) != -1) {
            if (data == 4) {
                bufis.mark(2);
            }
            if (i++ == 9) {
                bufis.reset();
            }
            // A trailing space keeps multi-digit values readable
            System.out.printf("%d ", data);
        }
    } catch (IOException ioex) {
        ioex.printStackTrace();
    }
}
// Original link: https://blog.csdn.net/qq_26971305/article/details/79472696
Interested readers can debug the code above with each of the following conditions to gain a deeper understanding of BufferedInputStream:
if(i++ == 5)
if(i++ == 6)
if(i++ == 7)
if(i++ == 8)
if(i++ == 9)
if(i++ == 10)
... ... ... Readers with more time can vary both the length of buf[] in BufferedInputStream and the value in the if (i++ == xx) condition to observe the execution flow of the BufferedInputStream class.
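As an alternative to editing the condition by hand, the experiment can be parameterized. The sketch below is a hypothetical helper (MarkResetExperiment and run are names invented here) that replays the demo with a chosen buf[] length and reset index and returns the text that would have been printed, appending the exception message if reset() fails.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;

public class MarkResetExperiment {
    // Hypothetical helper: same loop as the demo above, with the buf[] length
    // and the reset index passed in as parameters.
    static String run(int bufSize, int resetAt) {
        byte[] src = new byte[21];
        for (int k = 0; k < src.length; k++) {
            src[k] = (byte) k;
        }
        StringBuilder out = new StringBuilder();
        try (BufferedInputStream bufis =
                 new BufferedInputStream(new ByteArrayInputStream(src), bufSize)) {
            int data;
            int i = 0;
            while ((data = bufis.read()) != -1) {
                if (data == 4) {
                    bufis.mark(2);
                }
                if (i++ == resetAt) {
                    bufis.reset();
                }
                out.append(data).append(' ');
            }
        } catch (IOException ioex) {
            // reset() throws when the mark is missing or has been invalidated
            out.append("IOException: ").append(ioex.getMessage());
        }
        return out.toString().trim();
    }

    public static void main(String[] args) {
        for (int resetAt = 5; resetAt <= 10; resetAt++) {
            System.out.println("resetAt=" + resetAt + " -> " + run(5, resetAt));
        }
    }
}
```

With bufSize 5, tracing fill() by hand suggests that resetAt = 9 replays the bytes 5 through 9 even though mark(2) was requested, while resetAt = 10 fails because the mark has already been dropped by then.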
The mark and reset features must not be used carelessly, or an exception will be thrown:
public synchronized void reset() throws IOException {
getBufIfOpen(); // Cause exception if closed
if (markpos < 0)
throw new IOException("Resetting to invalid mark");
pos = markpos;
}
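One way to stay on the safe side of this check is to read no more than readlimit bytes between mark() and reset(). The sketch below shows that pattern as a hypothetical peek helper (PeekDemo and peek are invented names); the sample bytes happen to be the two gzip magic bytes, but any header would do.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class PeekDemo {
    // Hypothetical helper: returns up to n leading bytes without consuming them.
    // Assumes in.markSupported() is true; it never reads more than n bytes after mark(n).
    static byte[] peek(InputStream in, int n) throws IOException {
        in.mark(n);
        byte[] head = new byte[n];
        int total = 0;
        while (total < n) {
            int r = in.read(head, total, n - total);
            if (r == -1) {
                break; // the stream ended before n bytes were available
            }
            total += r;
        }
        in.reset();
        return Arrays.copyOf(head, total);
    }

    public static void main(String[] args) throws IOException {
        byte[] src = {0x1f, (byte) 0x8b, 8, 0};
        try (BufferedInputStream in = new BufferedInputStream(new ByteArrayInputStream(src))) {
            byte[] head = peek(in, 2);
            // After reset(), the next read starts again at the first byte
            System.out.println(head.length + " bytes peeked, next read() = " + in.read());
        }
    }
}
```

Because exactly n bytes at most are read after mark(n), the InputStream contract guarantees that reset() succeeds here, whatever the size of buf[].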
Reference Links:
https://blog.csdn.net/qq_26971305/article/details/79472696
Source: https://www.cnblogs.com/AdaiCoffee/
This article is written for learning, research, and sharing; reprints are welcome. If anything here is wrong or inaccurate, please point it out so that later readers are not misled. If you have better ideas or opinions, feel free to discuss them in the comments. Thank you!