Java memory-mapped files: handling gigabyte-sized files with ease

A memory-mapped file maps a region of virtual memory byte-for-byte onto a file, so an application can work with the file as if it were accessing main memory. Until the data is actually touched, the mapping consumes no physical memory and triggers no disk reads or writes. Reading and writing a file this way can be orders of magnitude faster than ordinary file I/O.

A quick word on virtual memory (which, obviously, is not physical memory): it is a memory-management technique used by computer systems. It lets an application believe it has one contiguous block of available memory, while in reality the physical memory is usually split into many fragments, and parts of the data are temporarily swapped out to disk storage when necessary.

The main purpose of memory-mapped files is to improve I/O performance, especially for large files. For small files, memory mapping can waste space: a mapping is always aligned to page boundaries, and the smallest unit is typically 4 KiB, so mapping a 5 KiB file takes up 8 KiB of memory and wastes 3 KiB.

The java.nio package makes memory mapping very simple. Its core class is MappedByteBuffer, which literally means a mapped byte buffer.
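The page-alignment arithmetic can be checked with a few lines of Java. This is only a sketch: the class name PageRounding and the fixed 4 KiB page size are assumptions for illustration, and the real page size depends on the platform.

```java
class PageRounding {
    // Assumed page size for illustration; real systems may use a different value.
    static final long PAGE_SIZE = 4 * 1024;

    /** Rounds a byte count up to the next multiple of PAGE_SIZE. */
    static long roundUpToPage(long bytes) {
        return (bytes + PAGE_SIZE - 1) / PAGE_SIZE * PAGE_SIZE;
    }

    public static void main(String[] args) {
        long fileSize = 5 * 1024;               // the 5 KiB file from the text
        long mapped = roundUpToPage(fileSize);  // two 4 KiB pages
        System.out.println("mapped = " + mapped + " bytes, wasted = " + (mapped - fileSize) + " bytes");
        // prints "mapped = 8192 bytes, wasted = 3072 bytes"
    }
}
```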

01, Reading a file with MappedByteBuffer

Suppose there is a file named cmower.txt with the following content:

Silence king, an interesting programmer

PS: Sorry, I cannot shake the bad habit of blowing my own trumpet; my articles have been plagiarized so often that I have grown wary.

The file sits in the /resource directory, and we can get hold of it like this:

ClassLoader classLoader = Cmower.class.getClassLoader();
Path path = Paths.get(classLoader.getResource("cmower.txt").getPath());

A Path can represent either a directory or a file, just like File; in fact, Path is meant to replace File.

Next, open a channel on the file (a channel is an abstraction over the disk file).

FileChannel fileChannel = FileChannel.open(path);

Then call the map method of the FileChannel class to obtain a MappedByteBuffer from the channel. This class extends ByteBuffer and provides some basic methods for working with memory-mapped files.

MappedByteBuffer mappedByteBuffer = fileChannel.map(mode, position, size);

A quick explanation of the three parameters of the map method.

1) mode is the file-mapping mode, and there are three of them:

  • MapMode.READ_ONLY (read-only): any attempt to modify the buffer throws a ReadOnlyBufferException.

  • MapMode.READ_WRITE (read/write): any change to the buffer is written to the file at some point. Note that other programs mapping the same file may not see these changes immediately; the behavior when several programs map a file at the same time depends on the operating system.

  • MapMode.PRIVATE (private): changes to the buffer are not written to the file; they are private to this buffer.

2) position is the position in the file at which the mapping starts.

3) size is the size of the region to map; it must be non-negative and no greater than Integer.MAX_VALUE.

Once the file is mapped into a memory buffer, we can decode its data into a CharBuffer and print it. The complete example is as follows.
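To make the modes a little more concrete, here is a minimal sketch. The class MapModeDemo, its method names, and the temporary file are made up for illustration. It checks that a READ_ONLY mapping rejects writes, and that a change made through a PRIVATE mapping of the first 5 bytes never reaches the file.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.ReadOnlyBufferException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileChannel.MapMode;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MapModeDemo {
    /** Returns true if writing to a READ_ONLY mapping is rejected. */
    static boolean readOnlyRejectsWrites(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buffer = channel.map(MapMode.READ_ONLY, 0, channel.size());
            try {
                buffer.put(0, (byte) 'X');
                return false;
            } catch (ReadOnlyBufferException expected) {
                return true;
            }
        }
    }

    /** Modifies a PRIVATE mapping, then returns the content that is actually in the file. */
    static String contentAfterPrivateWrite(Path file) throws IOException {
        // A PRIVATE mapping needs a channel opened for both reading and writing.
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // position = 0, size = 5: map only the first five bytes.
            MappedByteBuffer buffer = channel.map(MapMode.PRIVATE, 0, 5);
            buffer.put(0, (byte) 'J'); // changes this buffer's private copy only
        }
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mapmode-demo", ".txt"); // hypothetical scratch file
        file.toFile().deleteOnExit();
        Files.write(file, "hello mmap".getBytes(StandardCharsets.UTF_8));
        System.out.println(readOnlyRejectsWrites(file));    // true
        System.out.println(contentAfterPrivateWrite(file)); // hello mmap
    }
}
```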

ClassLoader classLoader = Cmower.class.getClassLoader();
Path path = Paths.get(classLoader.getResource("cmower.txt").getPath());
try (FileChannel fileChannel = FileChannel.open(path)) {
    // Map the whole file read-only; map() either returns a buffer or throws, it never returns null
    MappedByteBuffer mappedByteBuffer = fileChannel.map(MapMode.READ_ONLY, 0, fileChannel.size());

    // Decode the mapped bytes as UTF-8 text
    CharBuffer charBuffer = StandardCharsets.UTF_8.decode(mappedByteBuffer);

    System.out.println(charBuffer.toString());
} catch (IOException e) {
    e.printStackTrace();
}

Because the argument to decode() is a MappedByteBuffer, the file content is read from memory rather than through explicit disk reads, so it is very fast.

02, Writing to a file with MappedByteBuffer

Suppose the following content is to be written to a file named cmower1.txt.

Silence king, author of "Web全栈开发进阶之路"

The file has not been created yet; the plan is to put it in the project's classpath directory.

 Path path = Paths.get("cmower1.txt");

[Figure: location of cmower1.txt in the project directory]

Next, open a channel on the file.

FileChannel fileChannel = FileChannel.open(path, StandardOpenOption.CREATE, StandardOpenOption.READ,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING);

The method is still open, but this time it takes four extra options. CREATE creates the file if it does not exist yet, which is needed here because cmower1.txt has not been created; without it, open throws a NoSuchFileException. READ and WRITE are easy to understand: they open the file for reading and writing. TRUNCATE_EXISTING means that if the file already exists and is opened for WRITE access, its length is truncated to 0.

Then, as before, call the map method of the FileChannel class to obtain a MappedByteBuffer from the channel.

 MappedByteBuffer mappedByteBuffer = fileChannel.map(MapMode.READ_WRITE, 0, 1024);

This time the mode is MapMode.READ_WRITE, and the size of the mapped region is 1024 bytes, that is, 1 KB. Then use the put() method of MappedByteBuffer to save the content of the CharBuffer to the file. The complete example is as follows.

CharBuffer charBuffer = CharBuffer.wrap("沉默王二,《Web全栈开发进阶之路》作者");

Path path = Paths.get("cmower1.txt");

// CREATE makes sure the file exists; TRUNCATE_EXISTING empties it if it already does
try (FileChannel fileChannel = FileChannel.open(path, StandardOpenOption.CREATE, StandardOpenOption.READ,
        StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
    // Map the first 1024 bytes of the file for reading and writing
    MappedByteBuffer mappedByteBuffer = fileChannel.map(MapMode.READ_WRITE, 0, 1024);

    // Encode the text as UTF-8 and copy it into the mapped region
    mappedByteBuffer.put(StandardCharsets.UTF_8.encode(charBuffer));

} catch (IOException e) {
    e.printStackTrace();
}

Open cmower1.txt and check its content to confirm that the expected text was written successfully.
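One thing to watch for: a mapping larger than the text may leave the file at the mapped size, with zero bytes after the text. A possible variation, sketched below, sizes the mapping to the encoded bytes instead of a fixed 1024. The class MappedWriteDemo, the method writeMapped, and the sample text are made up for illustration, and the CREATE option is included so the file does not need to exist beforehand.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileChannel.MapMode;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedWriteDemo {
    /** Writes text to the file through a READ_WRITE mapping sized to the encoded bytes. */
    static void writeMapped(Path path, String text) throws IOException {
        ByteBuffer encoded = StandardCharsets.UTF_8.encode(text);
        try (FileChannel channel = FileChannel.open(path,
                StandardOpenOption.CREATE,   // create the file if it is missing
                StandardOpenOption.READ, StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            // Map exactly as many bytes as the encoded text needs.
            MappedByteBuffer buffer = channel.map(MapMode.READ_WRITE, 0, encoded.remaining());
            buffer.put(encoded);
            buffer.force(); // push the change to the storage device before closing
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("mapped-write-demo", ".txt"); // hypothetical scratch file
        path.toFile().deleteOnExit();
        writeMapped(path, "hello, mapped world");
        System.out.println(new String(Files.readAllBytes(path), StandardCharsets.UTF_8));
    }
}
```

Reading the file back with Files.readAllBytes, as main does, is a quick way to confirm the write without opening an editor.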

03, The drawbacks of MappedByteBuffer

Using MappedByteBuffer in Java is said to be quite troublesome and painful, mainly for these reasons:

1) A single mapping is best kept to about 1.5 GB, and mapping repeatedly increases the pressure on virtual memory reclamation and reallocation. In other words, it is not very friendly when the file size is not known in advance.

2) The operating system decides when the mapped memory is flushed to disk, and that timing is not easy to control from the program.

3) The way a MappedByteBuffer is reclaimed is rather odd.

Once again, these three points are only what I have heard. My ability is limited, and unfortunately I cannot confirm how accurate they are.
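For what it is worth, on the second point MappedByteBuffer does have a force() method that asks for the changes to be written to the storage device right away. On the third point, the Javadoc says a mapping stays valid until the buffer itself is garbage-collected, and there is no public method to unmap it. The sketch below only illustrates force(); the class ForceDemo and the method overwriteFirstByte are made up for illustration.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileChannel.MapMode;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ForceDemo {
    /** Overwrites the first byte of a non-empty file via a mapping and flushes it explicitly. */
    static void overwriteFirstByte(Path path, byte value) throws IOException {
        try (FileChannel channel = FileChannel.open(path,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer buffer = channel.map(MapMode.READ_WRITE, 0, 1);
            buffer.put(0, value);
            buffer.force(); // request that the mapped region be written out now
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("force-demo", ".txt"); // hypothetical scratch file
        path.toFile().deleteOnExit();
        Files.write(path, "abc".getBytes(StandardCharsets.UTF_8));
        overwriteFirstByte(path, (byte) 'X');
        System.out.println(new String(Files.readAllBytes(path), StandardCharsets.UTF_8)); // Xbc
    }
}
```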

04, Comparing file-processing times

Hey, friend, after reading the above I think you have a general understanding of memory-mapped files. But I believe that if you are a responsible programmer, you will also want to know exactly how fast reading a memory-mapped file is.

To find out, I called in three other contestants: InputStream (plain input stream), BufferedInputStream (buffered input stream), and RandomAccessFile (random access file).

The file to read is the movie Pirates of the Caribbean 4: On Stranger Tides, an .mkv file of 1.71 GB.

1) Plain input stream

public static void inputStream(Path filename) {
    try (InputStream is = Files.newInputStream(filename)) {
        int c;
        while((c = is.read()) != -1) {
            
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}

2) Buffered input stream

public static void bufferedInputStream(Path filename) {
    try (InputStream is = new BufferedInputStream(Files.newInputStream(filename))) {
        int c;
        while((c = is.read()) != -1) {
            
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}

3) Random access file

public static void randomAccessFile(Path filename) {
    try (RandomAccessFile randomAccessFile  = new RandomAccessFile(filename.toFile(), "r")) {
        for (long i = 0; i < randomAccessFile.length(); i++) {
            randomAccessFile.seek(i);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}

4) Memory-mapped file

public static void mappedFile(Path filename) {
    try (FileChannel fileChannel = FileChannel.open(filename)) {
        long size = fileChannel.size();
        // A single mapping cannot exceed Integer.MAX_VALUE bytes, so this only works for files under 2 GiB
        MappedByteBuffer mappedByteBuffer = fileChannel.map(MapMode.READ_ONLY, 0, size);
        // Read every byte by absolute index
        for (int i = 0; i < size; i++) {
            mappedByteBuffer.get(i);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}

The test program is very simple, as follows:

long start = System.currentTimeMillis();
bufferedInputStream(Paths.get("jialebi.mkv"));
long end = System.currentTimeMillis();
System.out.println(end-start);

The results of the four contestants are shown below.

Method                   Time (ms)
Plain input stream       turtle speed; no patience to wait for the result
Random access file       turtle speed; no patience to wait for the result
Buffered input stream    29966
Memory-mapped file       914

The plain input stream and the random access file are painfully slow, truly turtle speed, and I did not have the patience to wait for their results. The buffered input stream performs reasonably well, but it still falls far behind the memory-mapped file. The conclusion: with memory-mapped files, handling files of a gigabyte or more is easy.

05, Finally

This article introduced memory-mapped files in Java. MappedByteBuffer is at their heart, and reads are as fast as a rocket. In addition, all of the examples and code snippets can be found on GitHub; it is a Maven project, so it is easy to import and run.
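Because a single mapping cannot be larger than Integer.MAX_VALUE bytes, a file beyond roughly 2 GB has to be mapped piece by piece. The following is a minimal sketch of that idea, not the code used for the measurements above; the class ChunkedMappedReader, the method sumBytes, and the 1 GiB window size are assumptions for illustration.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileChannel.MapMode;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

class ChunkedMappedReader {
    /** Sums all bytes of the file, mapping it in windows of at most chunkSize bytes. */
    static long sumBytes(Path file, int chunkSize) throws IOException {
        long sum = 0;
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = channel.size();
            for (long position = 0; position < size; position += chunkSize) {
                // The last window may be shorter than chunkSize.
                long length = Math.min(chunkSize, size - position);
                MappedByteBuffer buffer = channel.map(MapMode.READ_ONLY, position, length);
                while (buffer.hasRemaining()) {
                    sum += buffer.get() & 0xFF; // treat each byte as unsigned
                }
            }
        }
        return sum;
    }

    public static void main(String[] args) throws IOException {
        // Pass the path of a large file as the first argument.
        System.out.println(sumBytes(Paths.get(args[0]), 1 << 30)); // 1 GiB windows
    }
}
```

Summing the bytes is only there to give the loop something to do with the data; any per-byte or per-window processing could take its place.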



Origin juejin.im/post/5d53c0e1f265da0390052545