Java IO overview (with a large file upload example)

Reference articles

  1. Java large file upload example: https://blog.csdn.net/j7s9usu/article/details/86678534
  2. BIO, NIO and AIO background:
    1) https://www.cnblogs.com/sxkgeek/p/9488703.html
    2) https://www.jianshu.com/p/362b365e1bcc
    3) https://blog.csdn.net/guanghuichenshao/article/details/79375967
  3. Difference between MappedByteBuffer and ByteBuffer:
    1) https://blog.csdn.net/love4amanda/article/details/90412482
    2) https://blog.csdn.net/qq_41969879/article/details/81629469

Key concepts

  1. Synchronous and asynchronous: synchronous execution is a reliable, ordered mechanism. With a synchronous call, the follow-up task waits for the current call to return before it takes the next step. With an asynchronous call, other tasks do not wait for the current call to return; ordering between tasks is usually established through events, for example a callback.
  2. Blocking and non-blocking: during a blocking operation the current thread is blocked and cannot do other work; it continues only once the condition is met, for example when a ServerSocket has established a new connection or a read or write has completed. A non-blocking call returns immediately whether or not the operation has finished, and the operation continues in the background.

1. Classification of IO streams

IO streams are classified into byte streams and character streams:
byte input stream: InputStream; byte output stream: OutputStream
character input stream: Reader; character output stream: Writer

What is a byte stream?

Byte stream: the basic unit of data transferred through the stream is a byte.

What is a character stream?

Character stream: the basic unit of data transferred through the stream is a character.
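
To make the difference in units concrete, here is a minimal sketch that writes the same text through a Writer and then counts what an InputStream and a Reader each see. The class and method names (StreamKindsDemo, writeChars, countBytes, countChars) are made up for illustration, and UTF-8 is an assumed encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class StreamKindsDemo {
    // Writes text through a character stream (Writer), encoding it as UTF-8.
    static byte[] writeChars(String text) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(sink, StandardCharsets.UTF_8)) {
            writer.write(text);
        }
        return sink.toByteArray();
    }

    // Counts the units delivered by a byte stream (InputStream): one byte per read().
    static int countBytes(byte[] data) throws IOException {
        int count = 0;
        try (InputStream in = new ByteArrayInputStream(data)) {
            while (in.read() != -1) {
                count++;
            }
        }
        return count;
    }

    // Counts the units delivered by a character stream (Reader): one char per read().
    static int countChars(byte[] data) throws IOException {
        int count = 0;
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            while (reader.read() != -1) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        byte[] encoded = writeChars("h\u00e9llo"); // the second letter needs two bytes in UTF-8
        System.out.println(countBytes(encoded) + " bytes, " + countChars(encoded) + " chars");
    }
}
```

The two counts differ whenever the text contains characters that take more than one byte in the chosen encoding, which is why text is usually handled with Reader and Writer.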

https://blog.csdn.net/chengyuqiang/article/details/79183748

Character stream buffering
https://blog.csdn.net/chenkaibsw/article/details/81606722
A buffered character stream stores data in an in-memory buffer first and only writes it to the file afterwards.
To push the buffered data out to the file, call:
flush()
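
The effect of flush() can be observed by checking the file size before and after the call. This is a small sketch, assuming a short text that fits in the writer's buffer; FlushDemo and sizesAroundFlush are illustrative names, not part of the original project.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class FlushDemo {
    // Returns {file size before flush, file size after flush}.
    static long[] sizesAroundFlush(Path file, String text) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            writer.write(text);               // short text stays in the in-memory buffer
            long before = Files.size(file);
            writer.flush();                   // pushes the buffered characters to the file
            long after = Files.size(file);
            return new long[] {before, after};
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("flush-demo", ".txt");
        try {
            long[] sizes = sizesAroundFlush(file, "hello");
            System.out.println("before flush: " + sizes[0] + ", after flush: " + sizes[1]);
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Closing the writer also flushes it, so an explicit flush() matters mainly when the data must reach the file while the stream stays open.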

BIO, NIO and AIO

https://blog.csdn.net/ty497122758/article/details/78979302
https://www.cnblogs.com/zedosu/p/6666984.html
BIO: synchronous, blocking
NIO: synchronous, non-blocking
AIO: asynchronous, non-blocking (reads and writes into a buffer are performed asynchronously)

1. Synchronous: with synchronous IO, the Java code handles the reading and writing itself.

2. Asynchronous: with asynchronous IO, Java delegates the read or write to the OS. It passes the OS the address and size of the data buffer, and the OS notifies Java (through a callback) when the operation has finished.

3. Blocking: with blocking IO, the Java call blocks until the read or write has completed, and only then returns.

4. Non-blocking: with non-blocking IO, if the read or write cannot be done right away, the Java call returns immediately. When the IO event dispatcher signals that the channel is readable or writable, the code reads or writes, looping until the operation is complete.

BIO (synchronous, blocking, stream-oriented)

  1. Overview: BIO works on data through streams. By the data they carry, streams are divided into byte streams and character streams; by direction, into input streams and output streams.
  2. Character streams:
    1. Used only for text data; the most common form is a text file.
    2. Note: when writing, call flush() to push buffered data out, and close the stream when you are finished.
  3. Byte streams:
    1. Used for media data, because they operate on bytes and media files are stored as bytes.
    2. An unbuffered byte stream such as FileOutputStream writes straight through, so no flush is needed (a BufferedOutputStream still needs one).
  4. Stream-oriented means you can only read one or more bytes from the stream in sequence. To skip bytes or re-read bytes you have already read, you first have to cache the data read from the stream.
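
The last point, caching in order to re-read, is what BufferedInputStream's mark() and reset() provide. The sketch below assumes Java 11 or later (for readNBytes) and ASCII input; RereadDemo and readPrefixTwice are illustrative names.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class RereadDemo {
    // Reads the first n bytes twice by letting a BufferedInputStream cache them.
    static String readPrefixTwice(InputStream raw, int n) throws IOException {
        try (BufferedInputStream in = new BufferedInputStream(raw)) {
            in.mark(n);                           // remember the current position
            byte[] first = in.readNBytes(n);
            in.reset();                           // rewind to the mark using the cached bytes
            byte[] second = in.readNBytes(n);
            return new String(first, StandardCharsets.US_ASCII)
                    + "|" + new String(second, StandardCharsets.US_ASCII);
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream source = new ByteArrayInputStream("abcdef".getBytes(StandardCharsets.US_ASCII));
        System.out.println(readPrefixTwice(source, 3));
    }
}
```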

NIO (synchronous, non-blocking)

  1. Overview: NIO counts as synchronous because the kernel I/O operation behind its accept / read / write methods still runs on, and blocks, the calling thread while the data is copied. NIO has three core components: Channel, Buffer and Selector.

Channel

A Channel is an object through which data can be read and written. It can be thought of as an IO stream, with these differences:
1) A Channel is bidirectional, so it can be both read and written, whereas a stream is unidirectional.
2) A Channel can be read and written asynchronously (in non-blocking mode).
3) Reads from and writes to a Channel must go through a Buffer object.

As mentioned above, all data is handled through Buffer objects, so you never write bytes directly into a Channel; you write the data into a Buffer and hand the Buffer to the Channel. Likewise, you never read a byte directly from a Channel; you read from the Channel into a Buffer and then get the bytes from the Buffer.
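
As a rough illustration of both points (one channel used in both directions, and every byte passing through a Buffer), the sketch below writes text to a file and reads it back through the same FileChannel. ChannelDemo and roundTrip are hypothetical names, and the example assumes the target file is empty or absent.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ChannelDemo {
    // Writes text through a channel, then reads it back through the same channel.
    static String roundTrip(Path file, String text) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            ByteBuffer out = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
            while (out.hasRemaining()) {
                channel.write(out);               // bytes go Buffer -> Channel
            }
            channel.position(0);                  // rewind the file position for reading
            ByteBuffer in = ByteBuffer.allocate((int) channel.size());
            while (in.hasRemaining() && channel.read(in) != -1) {
                // bytes go Channel -> Buffer until the buffer is full
            }
            in.flip();                            // switch the buffer to read mode
            return StandardCharsets.UTF_8.decode(in).toString();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("channel-demo", ".txt");
        try {
            System.out.println(roundTrip(file, "hello channel"));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```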

The main Channel types in Java NIO are:
FileChannel: reads and writes data in files
DatagramChannel: reads and writes data over UDP
SocketChannel: reads and writes data over TCP
ServerSocketChannel: listens for incoming TCP connections

Buffer

  1. Overview: in NIO, all data is handled through a Buffer, which is the staging area for NIO reads and writes. A Buffer is essentially an array, usually a byte array, though it can be an array of another type. More importantly, a Buffer is not just an array: it provides structured access to the data and keeps track of the read and write progress.
    Reading and writing data with a Buffer generally follows these four steps:
    1. Write data into the Buffer;
    2. Call flip(), which switches the Buffer from writing to reading by changing properties such as position and limit;
    3. Read the data out of the Buffer;
    4. Call clear() or compact().

flip(): switches the Buffer from write mode to read mode. It sets limit to the previous position and resets position to 0.
Clearing the Buffer: call clear() or compact(). clear() makes the whole buffer writable again by resetting position to 0 and limit to capacity; the old bytes are not erased, only forgotten. compact() discards only the data that has already been read: any unread data is moved to the start of the buffer, and new data is written after the unread data.
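
The difference between clear() and compact() is easiest to see on a buffer that still holds unread data. The following sketch uses illustrative names (CompactDemo, afterCompact, afterClear) and an arbitrary capacity of 8.

```java
import java.nio.ByteBuffer;

class CompactDemo {
    // Writes 4 bytes, reads 2 of them, and leaves 2 unread.
    private static ByteBuffer halfRead() {
        ByteBuffer buffer = ByteBuffer.allocate(8);
        buffer.put(new byte[] {1, 2, 3, 4});  // write mode: position = 4
        buffer.flip();                         // read mode: position = 0, limit = 4
        buffer.get();                          // reads 1
        buffer.get();                          // reads 2; bytes 3 and 4 are still unread
        return buffer;
    }

    // compact(): unread bytes move to the front, position lands right after them.
    static ByteBuffer afterCompact() {
        ByteBuffer buffer = halfRead();
        buffer.compact();
        return buffer;
    }

    // clear(): position and limit are reset; the unread bytes are simply forgotten.
    static ByteBuffer afterClear() {
        ByteBuffer buffer = halfRead();
        buffer.clear();
        return buffer;
    }

    public static void main(String[] args) {
        ByteBuffer compacted = afterCompact();
        System.out.println("compact: position=" + compacted.position()
                + " first=" + compacted.get(0) + " second=" + compacted.get(1));
        ByteBuffer cleared = afterClear();
        System.out.println("clear:   position=" + cleared.position()
                + " limit=" + cleared.limit());
    }
}
```

In a copy loop, compact() is the safer choice when a write to the channel may leave part of the buffer unsent.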

The main Buffer types are:

ByteBuffer, CharBuffer, DoubleBuffer and so on.

  1. Important properties:

Buffer size: capacity
As a block of memory, a Buffer has a fixed size, given by capacity.
Current read / write position: position
When writing data into the buffer, position is the index of the next element to be written, and its largest writable index is capacity - 1. When reading data from the buffer, position is the index of the next element to be read.
End marker: limit
In write mode, limit equals capacity. After flip(), limit marks the end of the data that was written, so reads stop there.
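
The three properties can be traced through the four-step cycle described earlier. This is a small sketch with made-up names (BufferStateDemo, state, trace) and an arbitrary capacity of 10.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

class BufferStateDemo {
    static String state(ByteBuffer b) {
        return "position=" + b.position() + " limit=" + b.limit() + " capacity=" + b.capacity();
    }

    // Records the buffer state after each step of write -> flip -> read -> clear.
    static List<String> trace() {
        List<String> states = new ArrayList<>();
        ByteBuffer buffer = ByteBuffer.allocate(10);
        states.add(state(buffer));                                    // freshly allocated
        buffer.put((byte) 'a').put((byte) 'b').put((byte) 'c');
        states.add(state(buffer));                                    // after writing 3 bytes
        buffer.flip();
        states.add(state(buffer));                                    // limit now marks the written data
        buffer.get();
        states.add(state(buffer));                                    // after reading 1 byte
        buffer.clear();
        states.add(state(buffer));                                    // ready for writing again
        return states;
    }

    public static void main(String[] args) {
        trace().forEach(System.out::println);
    }
}
```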

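  1. Example
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public static void copyFileUseNIO(String src, String dst) throws IOException {
    // Open the source and destination files; try-with-resources closes everything
    try (FileInputStream fi = new FileInputStream(src);
         FileOutputStream fo = new FileOutputStream(dst);
         // Get the transfer channels
         FileChannel inChannel = fi.getChannel();
         FileChannel outChannel = fo.getChannel()) {
        // Allocate the buffer that carries the data
        ByteBuffer buffer = ByteBuffer.allocate(1024);
        // read() returns -1 once the end of the file is reached
        while (inChannel.read(buffer) != -1) {
            // Switch to read mode: limit = position, position = 0
            buffer.flip();
            // write() may consume only part of the buffer, so loop until it is drained
            while (buffer.hasRemaining()) {
                outChannel.write(buffer);
            }
            // Reset for the next read: position = 0, limit = capacity
            buffer.clear();
        }
    }
}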

Selector

  1. Overview: under high concurrency the cost of thread context switching becomes significant, which is the weakness of synchronous blocking IO: it scales poorly. A Selector addresses this. You register multiple channels with one selector, which monitors them centrally, so a single thread can handle many channels, similar in spirit to a thread pool.
  2. Creation and registration template

// 1. Create a selector first

Selector selector = Selector.open();
// A Channel must be switched to non-blocking mode before it is registered, otherwise register() fails. This means a FileChannel cannot be registered with a Selector, because FileChannel has no non-blocking mode; a SocketChannel used in network programming can.
channel.configureBlocking(false);
// Register the channel with the selector and have the selector watch for read events on it
SelectionKey key = channel.register(selector, SelectionKey.OP_READ);
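
The template above can be extended into a small select loop in which one thread serves several channels. This sketch uses java.nio.channels.Pipe as a stand-in for SocketChannel so that it runs without a network; that substitution, the names SelectorLoopDemo and readAll, and the one-short-message-per-channel simplification are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class SelectorLoopDemo {
    // One thread reads one short message from each channel through a single Selector.
    static List<String> readAll(List<String> messages) throws IOException {
        List<String> received = new ArrayList<>();
        List<Pipe> pipes = new ArrayList<>();
        try (Selector selector = Selector.open()) {
            for (String message : messages) {
                Pipe pipe = Pipe.open();
                pipes.add(pipe);
                pipe.source().configureBlocking(false);            // required before register()
                pipe.source().register(selector, SelectionKey.OP_READ);
                pipe.sink().write(ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8)));
            }
            ByteBuffer buffer = ByteBuffer.allocate(256);
            while (received.size() < messages.size()) {
                selector.select();                                  // blocks until a channel is ready
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();                                  // selected keys are not removed automatically
                    if (key.isReadable()) {
                        buffer.clear();
                        ((ReadableByteChannel) key.channel()).read(buffer);
                        buffer.flip();
                        received.add(StandardCharsets.UTF_8.decode(buffer).toString());
                        key.cancel();                               // this sketch expects one message per channel
                    }
                }
            }
        } finally {
            for (Pipe pipe : pipes) {
                pipe.sink().close();
                pipe.source().close();
            }
        }
        return received;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readAll(List.of("alpha", "beta", "gamma")));
    }
}
```

The order in which messages arrive depends on the order in which the selector reports the keys, so callers should not rely on it.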
  3. Important properties of SelectionKey

A SelectionKey contains the following:

  1. The interest set: the events you are listening for on the channel, e.g. SelectionKey.OP_READ
  2. The ready set: the operations the channel is ready for
  3. The Channel: returns the registered channel
  4. The Selector: returns the selector
  5. An attached object (optional): an object bound to the key
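
Each of the five items corresponds to an accessor on SelectionKey. The sketch below registers an unbound ServerSocketChannel purely to inspect its key; SelectionKeyDemo, describeKey and the attachment string "listener-1" are illustrative choices.

```java
import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

class SelectionKeyDemo {
    // Registers a channel and reports what its SelectionKey exposes.
    static String describeKey() throws IOException {
        try (Selector selector = Selector.open();
             ServerSocketChannel channel = ServerSocketChannel.open()) {
            channel.configureBlocking(false);
            // The third argument is the optional attached object
            SelectionKey key = channel.register(selector, SelectionKey.OP_ACCEPT, "listener-1");
            boolean interestedInAccept = (key.interestOps() & SelectionKey.OP_ACCEPT) != 0;
            return "interestAccept=" + interestedInAccept
                    + " readyOps=" + key.readyOps()               // stays 0 until a select() finds the channel ready
                    + " sameChannel=" + (key.channel() == channel)
                    + " sameSelector=" + (key.selector() == selector)
                    + " attachment=" + key.attachment();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(describeKey());
    }
}
```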

AIO (asynchronous, non-blocking)

  1. Overview: AIO stands for asynchronous IO. NIO offers non-blocking methods for network operations, but its IO behavior is still synchronous: the business thread is notified when an IO operation is ready, and then that thread performs the IO itself, so the IO operation is synchronous.

AIO goes a step further. The thread is not notified when the IO is ready; it is notified after the IO operation has already completed. AIO therefore does not block at all, and the business logic becomes a callback function that the system triggers automatically once the IO operation is complete.
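
A minimal sketch of the callback style, using AsynchronousFileChannel with a CompletionHandler. The names AioReadDemo and readAsync are hypothetical, the CompletableFuture is only a convenient way to hand the result back, and the JDK may implement the read with a background thread pool rather than true kernel-level asynchronous IO depending on the platform.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.CompletionHandler;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CompletableFuture;

class AioReadDemo {
    // Starts an asynchronous read; the callback completes the future once the read is done.
    static CompletableFuture<String> readAsync(Path file, int maxBytes) throws IOException {
        AsynchronousFileChannel channel =
                AsynchronousFileChannel.open(file, StandardOpenOption.READ);
        ByteBuffer buffer = ByteBuffer.allocate(maxBytes);
        CompletableFuture<String> result = new CompletableFuture<>();
        channel.read(buffer, 0, null, new CompletionHandler<Integer, Void>() {
            @Override
            public void completed(Integer bytesRead, Void attachment) {
                buffer.flip();
                result.complete(StandardCharsets.UTF_8.decode(buffer).toString());
                closeQuietly(channel);
            }

            @Override
            public void failed(Throwable error, Void attachment) {
                result.completeExceptionally(error);
                closeQuietly(channel);
            }
        });
        return result;   // returned immediately; the read continues in the background
    }

    private static void closeQuietly(AsynchronousFileChannel channel) {
        try {
            channel.close();
        } catch (IOException ignored) {
            // nothing useful to do in this sketch
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("aio-demo", ".txt");
        try {
            Files.write(file, "async hello".getBytes(StandardCharsets.UTF_8));
            System.out.println(readAsync(file, 64).join());
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```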

Why NIO

A conventional blocking I/O system runs into problems under high concurrency. With the conventional one-thread-per-request model, a large number of concurrent requests causes the following problems:

1. There are not enough threads; even a thread pool that reuses threads does not help.

2. In blocking I/O mode a large number of threads are blocked waiting for data. While suspended they can only wait, so CPU utilization is very low; in other words, system throughput is poor.

3. If the network is congested, jittery or failing, a thread doing network I/O may stay blocked for a long time, and the whole system becomes unreliable.

Characteristics of MappedByteBuffer and ByteBuffer

How each approach reads a file

  1. BIO: the data is copied several times: file pages -> physical memory (off-heap) -> heap memory (JVM) -> operation
  2. NIO ByteBuffer: file pages -> physical memory (off-heap) -> operation
  3. MappedByteBuffer: file pages -> physical memory (off-heap) -> operation
    From this you can see that NIO already performs well. Leaving memory footprint aside, though, a MappedByteBuffer that brings the file into memory in one go is faster still than NIO.

The map process

FileChannel provides a map method that maps a file into virtual memory. Usually the whole file can be mapped; if the file is large, it can be mapped in segments.

Parameters and fields involved in FileChannel.map:
mode (MapMode): the access mode of the memory-mapped file, one of three:
MapMode.READ_ONLY: read-only; any attempt to modify the resulting buffer throws an exception.
MapMode.READ_WRITE: read / write; changes to the buffer are eventually written to the file, but they are not necessarily visible to other programs that have mapped the same file.
MapMode.PRIVATE: private; the buffer is readable and writable, but modifications are not written to the file, only the buffer itself changes. This is known as "copy on write".
position: the position in the file at which the mapping starts.
allocationGranularity: the memory allocation granularity for mapped buffers, initialized by the native function initIDs.
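
The three modes can be compared with a short experiment on a small, non-empty file. The sketch below is illustrative only (MapModeDemo and its method names are made up); it assumes the platform makes READ_WRITE changes visible to a subsequent ordinary read of the same file, which is the usual behavior but is worth verifying on your own system.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.ReadOnlyBufferException;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MapModeDemo {
    // Returns true if a write to a READ_ONLY mapping is rejected.
    static boolean readOnlyRejectsWrites(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer map = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            try {
                map.put(0, (byte) 'X');
                return false;
            } catch (ReadOnlyBufferException expected) {
                return true;
            }
        }
    }

    // Overwrites the first byte through a mapping in the given mode, then returns the file content.
    static String fileAfterWrite(Path file, FileChannel.MapMode mode) throws IOException {
        // PRIVATE and READ_WRITE mappings both need a channel opened for reading and writing
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer map = channel.map(mode, 0, channel.size());
            map.put(0, (byte) 'X');
            map.force();   // for READ_WRITE, asks the OS to write the change to storage
        }
        return new String(Files.readAllBytes(file), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mapmode-demo", ".txt");
        try {
            Files.write(file, "hello".getBytes(StandardCharsets.US_ASCII));
            System.out.println("READ_ONLY rejects writes: " + readOnlyRejectsWrites(file));
            System.out.println("after PRIVATE write:      " + fileAfterWrite(file, FileChannel.MapMode.PRIVATE));
            System.out.println("after READ_WRITE write:   " + fileAfterWrite(file, FileChannel.MapMode.READ_WRITE));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```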

MappedByteBuffer advantages and disadvantages

  1. MappedByteBuffer uses virtual memory, so the size of the mapped memory is not limited by the JVM -Xmx parameter, but there is still a size limit.
  2. If the file exceeds the 1.5G limit, the later part of the file can be mapped again through the position parameter.
  3. MappedByteBuffer does perform very well on large files, but it has problems, such as memory footprint and uncertainty about when the file is released: the mapped file is only released when the buffer is garbage-collected, and that point in time is not deterministic.
    The javadoc also says: "A mapped byte buffer and the file mapping that it represents remain valid until the buffer itself is garbage-collected."
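
Mapping a large file piece by piece through the position parameter might look like the sketch below. It sums the bytes of a file one segment at a time; SegmentedMapDemo, sumBytes and the segment size are illustrative, and segmentSize is assumed to be positive. Note that a single map call accepts at most Integer.MAX_VALUE bytes, which is one reason for segmenting.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class SegmentedMapDemo {
    // Sums all bytes of a file by mapping it one segment at a time.
    static long sumBytes(Path file, int segmentSize) throws IOException {
        long sum = 0;
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long fileSize = channel.size();
            for (long position = 0; position < fileSize; position += segmentSize) {
                // The last segment may be shorter than segmentSize
                long length = Math.min(segmentSize, fileSize - position);
                MappedByteBuffer segment =
                        channel.map(FileChannel.MapMode.READ_ONLY, position, length);
                while (segment.hasRemaining()) {
                    sum += segment.get() & 0xFF;   // treat each byte as unsigned
                }
            }
        }
        return sum;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("segment-demo", ".bin");
        try {
            Files.write(file, new byte[] {1, 2, 3, 4, 5, 6, 7, 8, 9, 10});
            System.out.println(sumBytes(file, 4));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```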

Large file uploads

Project source: https://github.com/jiaojiaoyow/git_demo/tree/master/big_file

A pitfall encountered:

  1. Executing mappedByteBuffer.put(fileData) failed with the error [with root cause java.nio.ReadOnlyBufferException: null]
    Reason:
    MappedByteBuffer mappedByteBuffer = fileChannel.map(FileChannel.MapMode.READ_ONLY, offset, fileData.length);
    The map mode was mistakenly written as MapMode.READ_ONLY; it should be READ_WRITE.
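
For reference, a chunk write with the corrected mode could be sketched as follows. This is not the linked project's code: ChunkWriterDemo and writeChunk are hypothetical, and the example assumes each upload request carries a non-empty chunk of bytes (fileData) together with its byte offset in the final file.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class ChunkWriterDemo {
    // Writes one uploaded chunk at its offset in the target file (hypothetical helper).
    static void writeChunk(Path target, long offset, byte[] fileData) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(target.toFile(), "rw");
             FileChannel channel = file.getChannel()) {
            // Make sure the file is long enough to hold this chunk
            long requiredLength = offset + fileData.length;
            if (file.length() < requiredLength) {
                file.setLength(requiredLength);
            }
            // READ_WRITE is what makes put() legal; READ_ONLY throws ReadOnlyBufferException
            MappedByteBuffer mapped =
                    channel.map(FileChannel.MapMode.READ_WRITE, offset, fileData.length);
            mapped.put(fileData);
            mapped.force();   // ask the OS to write the mapped changes to storage
        }
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("upload-demo", ".bin");
        try {
            // Chunks may arrive out of order; each one lands at its own offset
            writeChunk(target, 5, "world".getBytes(StandardCharsets.US_ASCII));
            writeChunk(target, 0, "hello".getBytes(StandardCharsets.US_ASCII));
            System.out.println(new String(Files.readAllBytes(target), StandardCharsets.US_ASCII));
        } finally {
            Files.deleteIfExists(target);
        }
    }
}
```

Because every chunk is written at its own offset, chunks can be uploaded in any order or in parallel, as long as no two chunks overlap.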

Origin blog.csdn.net/s_xchenzejian/article/details/104051874