How do I decompress large files using zstd-jni and ByteBuffers?

Jay Askren :

I am trying to decompress many 40 MB+ files as I download them in parallel, using ByteBuffers and Channels. Channels give me better throughput than Streams, and throughput matters here: we need to process 40 TB of files every day, and this step is currently the bottleneck. The files are compressed with zstd-jni. zstd-jni has APIs for decompressing ByteBuffers, but I get an error when I use them. How do I decompress one ByteBuffer at a time using zstd-jni?

I found these examples in their tests, but unless I am missing something, the examples that use ByteBuffers seem to assume the entire input file fits in one ByteBuffer: https://github.com/luben/zstd-jni/blob/master/src/test/scala/Zstd.scala

Below is my code for compressing and decompressing files. The compression code works great, but the decompression code fails with error code -70.

public static long compressFile(String inFile, String outFolder, ByteBuffer inBuffer, ByteBuffer compressedBuffer, int compressionLevel) throws IOException {
    File file = new File(inFile);
    File outFile = new File(outFolder, file.getName() + ".zs");
    long numBytes = 0L;

    try (RandomAccessFile inRaFile = new RandomAccessFile(file, "r");
         RandomAccessFile outRaFile = new RandomAccessFile(outFile, "rw");
         FileChannel inChannel = inRaFile.getChannel();
         FileChannel outChannel = outRaFile.getChannel()) {
        inBuffer.clear();
        while(inChannel.read(inBuffer) > 0) {
            inBuffer.flip();
            compressedBuffer.clear();

            long compressedSize = Zstd.compressDirectByteBuffer(compressedBuffer, 0, compressedBuffer.capacity(), inBuffer, 0, inBuffer.limit(), compressionLevel);
            numBytes+=compressedSize;
            compressedBuffer.position((int)compressedSize);
            compressedBuffer.flip();
            outChannel.write(compressedBuffer);
            inBuffer.clear(); 
        }
    }

    return numBytes;
}

public static long decompressFile(String originalFilePath, String inFolder, ByteBuffer inBuffer, ByteBuffer decompressedBuffer) throws IOException {
    File outFile = new File(originalFilePath);
    File inFile = new File(inFolder, outFile.getName() + ".zs");
    outFile = new File(inFolder, outFile.getName());

    long numBytes = 0L;

    try (RandomAccessFile inRaFile = new RandomAccessFile(inFile, "r");
         RandomAccessFile outRaFile = new RandomAccessFile(outFile, "rw");
         FileChannel inChannel = inRaFile.getChannel();
         FileChannel outChannel = outRaFile.getChannel()) {

        inBuffer.clear();

        while(inChannel.read(inBuffer) > 0) {
            inBuffer.flip();
            decompressedBuffer.clear();
            long compressedSize = Zstd.decompressDirectByteBuffer(decompressedBuffer, 0, decompressedBuffer.capacity(), inBuffer, 0, inBuffer.limit());
            System.out.println(Zstd.isError(compressedSize) + " " + compressedSize);
            numBytes+=compressedSize;
            decompressedBuffer.position((int)compressedSize);
            decompressedBuffer.flip();
            outChannel.write(decompressedBuffer);
            inBuffer.clear(); 
        }
    }

    return numBytes;
}

karavelov :

Yes, the static methods you use in your example assume the whole compressed file fits in one ByteBuffer. As far as I understand your requirements, you need streaming decompression using ByteBuffers. ZstdDirectBufferDecompressingStream already provides this:

https://static.javadoc.io/com.github.luben/zstd-jni/1.3.7-1/com/github/luben/zstd/ZstdDirectBufferDecompressingStream.html

Here is an example of how to use it (from the tests):

https://github.com/luben/zstd-jni/blob/master/src/test/scala/Zstd.scala#L261-L302

Note that you also have to subclass it and override the "refill" method so that it can pull more compressed data into the source buffer.
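
The "refill" override is where the input channel comes in. Below is a minimal sketch of the buffer handling such an override needs. It uses only java.nio, so it runs without zstd-jni on the classpath. The ChannelRefill class and its refill helper are hypothetical names, not part of the library, and the unchecked exception wrapper assumes that the library's refill method does not declare IOException:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;

/** Hypothetical helper (not part of zstd-jni): tops up a read-mode buffer from a channel. */
final class ChannelRefill {
    private ChannelRefill() {}

    /**
     * Expects buf in read mode (position..limit holds the unread bytes).
     * Keeps any unread bytes, appends whatever a single channel read delivers,
     * and returns the same buffer in read mode again. At end of stream the
     * buffer comes back holding only the leftover bytes (possibly none).
     */
    static ByteBuffer refill(ReadableByteChannel in, ByteBuffer buf) {
        buf.compact();                 // move unread bytes to the front, switch to write mode
        try {
            in.read(buf);              // may add fewer bytes than requested, or none at EOF
        } catch (IOException e) {
            // Assumption: the overriding method cannot throw a checked exception.
            throw new UncheckedIOException(e);
        }
        buf.flip();                    // back to read mode for the decompressor
        return buf;
    }
}
```

In a subclass of ZstdDirectBufferDecompressingStream, the overridden refill(ByteBuffer toRefill) could then just return ChannelRefill.refill(inChannel, toRefill). The outer loop would call read(target) while hasRemaining() is true, then flip the target buffer, write it to the output channel, and clear it before the next call. As far as I know, both the source and the target buffer have to be direct ByteBuffers for this class.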

EDIT: Here is a new test I just added that has exactly the same structure as the code in your question, moving data between channels:

https://github.com/luben/zstd-jni/blob/master/src/test/scala/Zstd.scala#L540-L586
