Summary of common Java IO operations

1. Concepts and principles

Java's core library java.io provides a comprehensive IO interface, covering file reading and writing, standard device input and output, and more. IO in Java is stream-based: all data is written to an output stream or read from an input stream. Java IO supports system input and output through data streams, serialization, and the file system.

1.1 What is a stream?

A stream is an abstract concept: an abstraction of input and output devices. In Java programs, data input and output are performed as "streams". The device can be a file, the network, memory, and so on.

A stream is directional. Whether it is an input stream or an output stream is relative, and the program is generally used as the reference point. If data flows from the program to the device, it is an output stream; if data flows from the device to the program, it is an input stream.
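As a minimal sketch of the two directions, the following copies data from an input stream into the program and back out through an output stream. The class name StreamDirectionDemo and the in-memory "devices" (ByteArrayInputStream and ByteArrayOutputStream) are assumptions chosen for illustration; a file or socket stream would be used the same way.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class StreamDirectionDemo {
    // Reads from the input stream (device -> program) and writes to the
    // output stream (program -> device). Returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[4096];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) { // -1 signals end of stream
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] source = "hello stream".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(source), sink);
        String text = new String(sink.toByteArray(), StandardCharsets.UTF_8);
        System.out.println(copied + " bytes copied: " + text);
    }
}
```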

1.2 The working mechanism of disk IO

Because the disk is managed by the operating system, an application can only access the physical device through system calls. Reading and writing correspond to two system calls, read() and write(). Each system call incurs the cost of switching between user space and kernel space and of copying data between the two address spaces. The following are several ways of accessing files on disk:

  • Standard file access
    read() -> check the kernel cache -> on a miss, read from the disk and cache the data in the kernel
    write() -> kernel cache -> from the user's point of view the write is complete; when the data actually reaches the disk is decided by the operating system, unless sync is called explicitly
  • Direct I/O
    Direct I/O means the application accesses disk data directly, bypassing the kernel buffer cache. The purpose is to save the copy of data from the kernel buffer to the application's buffer. However, every read then goes to the disk, which is slow. Direct I/O usually performs better when combined with asynchronous I/O.
  • Synchronous file access
    Reads and writes are both synchronous. The difference from standard access is that success is reported to the application only once the data has actually been written to the disk.
  • Asynchronous file access
    Asynchronous access means that after issuing a request, the thread continues with other work instead of blocking and waiting. When the requested data arrives, the thread carries on with the operations that depend on it.
  • Memory mapping
    Memory mapping means the operating system associates a region of memory with a file on disk, so that accessing data in that memory region is translated into accessing the corresponding data in the file.
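Two of the access modes above can be sketched from Java, assuming the java.nio FileChannel API is used (it is not part of java.io itself). The class DiskAccessDemo and its method names are hypothetical. writeAndSync performs a standard write followed by an explicit sync via force(true), and readMapped reads the file back through a memory mapping.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class DiskAccessDemo {
    // Standard write, then an explicit sync so the data is forced to the device.
    static void writeAndSync(Path file, String text) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buffer = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
            while (buffer.hasRemaining()) {
                channel.write(buffer); // lands in the kernel cache first
            }
            channel.force(true); // flush file content and metadata to storage
        }
    }

    // Memory mapping: the file content is accessed as a region of memory.
    static String readMapped(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            byte[] bytes = new byte[mapped.remaining()];
            mapped.get(bytes);
            return new String(bytes, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("disk-access-demo", ".txt");
        file.toFile().deleteOnExit();
        writeAndSync(file, "mapped hello");
        System.out.println(readMapped(file));
    }
}
```

Whether force(true) really reaches the physical medium depends on the operating system and the storage device, so treat this as an illustration of the API rather than a durability guarantee.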

2. Classification and Objects of IO Streams


Streams can be classified from different perspectives:
1. By the unit of data processed: character streams and byte streams
2. By the direction of data flow: input streams and output streams
3. By function: node streams and processing streams

Classifications 1 and 2 are relatively easy to understand. The classification by function can be understood as follows:
Node stream: a node stream reads and writes data from a specific data source. That is, node streams operate directly on files, the network, and so on. Examples are FileInputStream and FileOutputStream, which read bytes directly from a file or write bytes directly to a file.

Processing stream: a processing stream is "connected" to an existing stream (a node stream or another processing stream) and gives the program more powerful read and write functions by processing the data. Processing streams, also called filter streams, are created by wrapping an existing input or output stream, so a filter stream is one or more layers of wrapping around a node stream. For example, BufferedInputStream and BufferedOutputStream are constructed from existing streams to provide buffered reads and writes, which improves efficiency, and DataInputStream and DataOutputStream are constructed from existing streams to read and write Java primitive data types. They are all filter streams.
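The wrapping relationship can be sketched as follows, with FileOutputStream and FileInputStream as the node streams and the buffered classes as the processing streams around them. The class NodeAndProcessingStreamDemo and the temporary file name are assumptions made for the example.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class NodeAndProcessingStreamDemo {
    static void write(File file, byte[] data) throws IOException {
        // FileOutputStream is the node stream; BufferedOutputStream wraps it.
        try (OutputStream out = new BufferedOutputStream(new FileOutputStream(file))) {
            for (byte b : data) {
                out.write(b); // goes to the in-memory buffer, not straight to the file
            }
        } // closing the wrapper flushes it and closes the wrapped node stream
    }

    static byte[] read(File file) throws IOException {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        // FileInputStream is the node stream; BufferedInputStream wraps it.
        try (InputStream in = new BufferedInputStream(new FileInputStream(file))) {
            int b;
            while ((b = in.read()) != -1) { // served from the buffer when possible
                result.write(b);
            }
        }
        return result.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("node-stream-demo", ".bin");
        file.deleteOnExit();
        write(file, new byte[] {10, 20, 30});
        System.out.println(read(file).length + " bytes read back");
    }
}
```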

2.1 IO stream classification

The following mainly discusses the stream types divided by the processed data unit, that is, byte stream and character stream.

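Note:
1 byte = 8 bits. A Java char is 2 bytes (one UTF-16 code unit), but the number of bytes a character occupies in a file depends on the encoding: a Chinese character typically takes 2 bytes in GBK and 3 bytes in UTF-8.
Byte stream: reads (writes) one byte at a time as the basic unit. If text containing Chinese is handled byte by byte and turned into characters without proper decoding, the result is garbled.
Character stream: reads (writes) one character at a time, decoding the underlying bytes with a charset, so Chinese text is transmitted and displayed correctly, provided the charset matches the encoding of the data.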
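The difference can be checked with a small sketch. The class ByteVersusCharDemo and its methods are hypothetical, and the sample text is written with Unicode escapes (\u4e2d\u6587 is the word for "Chinese") so the source file encoding does not matter.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ByteVersusCharDemo {
    // Number of bytes the text occupies in the given encoding.
    static int byteCount(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    // Naive approach: treats every byte as one character, which garbles
    // any character that is encoded as more than one byte.
    static String decodeByteByByte(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append((char) (b & 0xFF));
        }
        return sb.toString();
    }

    // Character stream: the reader decodes the bytes with the given charset.
    static String decodeWithReader(byte[] bytes) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(bytes), StandardCharsets.UTF_8)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String text = "\u4e2d\u6587";
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        System.out.println("UTF-8 bytes: " + byteCount(text, StandardCharsets.UTF_8));
        System.out.println("UTF-16BE bytes: " + byteCount(text, StandardCharsets.UTF_16BE));
        System.out.println("byte-by-byte intact: " + decodeByteByByte(utf8).equals(text));
        System.out.println("reader intact: " + decodeWithReader(utf8).equals(text));
    }
}
```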

[Figure: class diagram of the Java byte stream and character stream classes]

2.2 IO stream objects

Common types of processing streams are:

  • Buffered streams
    A buffered stream is wrapped around a corresponding node stream to buffer the data being read or written, which improves read and write efficiency and adds some new methods.

    The byte buffered streams are BufferedInputStream/BufferedOutputStream and the character buffered streams are BufferedReader/BufferedWriter. The character buffered streams provide readLine() (on BufferedReader) and newLine() (on BufferedWriter) for reading a line and writing a line separator, respectively.

    For a buffered output stream, written data goes to memory first and is pushed to the underlying stream (for example, a file on disk) when the buffer fills up or flush() is called. close() flushes the buffer before closing, so data is lost only if the stream is neither flushed nor closed. Call flush() explicitly when the data must be visible before the stream is closed.

  • Conversion streams
    Used to convert between byte data and character data.

    There are only character-stream classes for this: InputStreamReader/OutputStreamWriter. InputStreamReader must be wrapped around an InputStream, and OutputStreamWriter must be wrapped around an OutputStream.

  • Data streams
    Provide reading and writing of Java primitive data types.

    DataInputStream and DataOutputStream inherit (via FilterInputStream and FilterOutputStream) from InputStream and OutputStream respectively, and must be wrapped around an existing InputStream or OutputStream, typically a node stream.

  • Object streams
    Used to write and read objects directly.

    The classes are ObjectInputStream and ObjectOutputStream. The classes themselves are straightforward to use, but the object being written must meet a requirement: its class must implement the Serializable interface to declare that it can be serialized. Otherwise, it cannot be read or written with an object stream.

    There is also an important keyword, transient, which is applied to fields of a class that implements Serializable. Fields marked transient are skipped when the object is written through an object stream.
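The four kinds of processing streams listed above can be sketched in one small class. ProcessingStreamsDemo, its nested User class, and all method names are hypothetical, and in-memory byte array streams stand in for files so the example is self-contained.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStreamWriter;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ProcessingStreamsDemo {
    // Hypothetical class used only for this illustration.
    static class User implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
        transient String password; // transient: skipped during serialization

        User(String name, String password) {
            this.name = name;
            this.password = password;
        }
    }

    // Buffered + conversion streams: characters are encoded to UTF-8 bytes.
    static byte[] writeLines(List<String> lines) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (BufferedWriter writer = new BufferedWriter(
                new OutputStreamWriter(bytes, StandardCharsets.UTF_8))) {
            for (String line : lines) {
                writer.write(line);
                writer.newLine(); // writes the platform line separator
            }
        } // close() flushes the buffer before closing
        return bytes.toByteArray();
    }

    // Buffered + conversion streams: UTF-8 bytes are decoded back into lines.
    static List<String> readLines(byte[] data) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) { // null means end of stream
                lines.add(line);
            }
        }
        return lines;
    }

    // Data streams: primitives are written in a fixed binary layout.
    static byte[] writeRecord(int id, double price) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(id);       // 4 bytes
            out.writeDouble(price); // 8 bytes
        }
        return bytes.toByteArray();
    }

    // Values must be read back in the same order they were written.
    static String readRecord(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int id = in.readInt();
            double price = in.readDouble();
            return id + ":" + price;
        }
    }

    // Object streams: serialize an object and read a copy back.
    static User roundTrip(User user) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(user);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (User) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readLines(writeLines(Arrays.asList("first", "second"))));
        System.out.println(readRecord(writeRecord(7, 9.5)));
        User copy = roundTrip(new User("alice", "secret"));
        System.out.println(copy.name + " / " + copy.password); // password was not serialized
    }
}
```

After the round trip the transient password field comes back as null, because it was never written to the stream.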

Precautions for using Java IO streams

  • A raw byte stream such as FileOutputStream has no buffer in Java and hands data to the operating system directly, while character streams and buffered streams collect output in a buffer first. So with a byte stream the data is output even if close() is not called, whereas a character stream may only output its data when close() is called and the buffer is flushed. To output data while a character stream is still open, call flush() manually. Streams should still be closed in both cases to release the underlying resource.
  • All files on disk are stored as bytes; characters are formed only in memory. Prefer character streams only when processing plain text files, and use byte streams otherwise.
  • InputStreamReader/OutputStreamWriter are bridges between byte streams and character streams and must be wrapped around an InputStream/OutputStream.
  • The Java IO API design follows the decorator pattern.
  • Handle the corresponding exceptions (such as IOException) when using IO in a Java program.
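Some of these precautions can be sketched in code. The class IoPrecautionsDemo and its methods are hypothetical. beforeAndAfterFlush shows that short text written to a BufferedWriter is assumed to stay in the buffer until flush() is called, and firstLineOrDefault shows decorator-style layering together with try-with-resources and exception handling.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;

class IoPrecautionsDemo {
    // Returns what the target holds before and after flush().
    // Assumes the text is shorter than the writer's internal buffer.
    static String[] beforeAndAfterFlush(String text) throws IOException {
        StringWriter target = new StringWriter();
        try (BufferedWriter writer = new BufferedWriter(target)) {
            writer.write(text);
            String before = target.toString(); // text is still in the buffer
            writer.flush();                    // push the buffer to the target
            String after = target.toString();
            return new String[] {before, after};
        }
    }

    // Decorator layering: FileInputStream -> InputStreamReader -> BufferedReader.
    // try-with-resources closes the whole chain even if reading fails.
    static String firstLineOrDefault(String path, String fallback) {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new FileInputStream(path), StandardCharsets.UTF_8))) {
            String line = reader.readLine();
            return line != null ? line : fallback;
        } catch (FileNotFoundException e) {
            System.err.println("File not found: " + path);
            return fallback;
        } catch (IOException e) {
            System.err.println("Read failed: " + e.getMessage());
            return fallback;
        }
    }

    public static void main(String[] args) throws IOException {
        String[] result = beforeAndAfterFlush("abc");
        System.out.println("before flush: [" + result[0] + "], after flush: [" + result[1] + "]");
        System.out.println(firstLineOrDefault("no-such-file.txt", "(no data)"));
    }
}
```

Returning a fallback value is only one possible policy; rethrowing or logging the exception may suit other programs better.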

Origin blog.csdn.net/qq_37765808/article/details/109300668