Repeatedly loading and saving the same image to the file system alters the image data

DragonHawk :

Repeatedly saving and loading an identical image from the file system leads to changed data and thus to a changed hash sum (which I need).

My program performs the following steps:

1. Create a BufferedImage

BufferedImage bufferedImage = new BufferedImage(400, 400, BufferedImage.TYPE_INT_RGB);
Graphics2D graphics = bufferedImage.createGraphics();
graphics.setColor(Color.RED);
graphics.fillRect(100, 100, 200, 200);
graphics.dispose();

2. Calculate MD5 hash of the created BufferedImage

ByteArrayOutputStream baos = new ByteArrayOutputStream();
ImageIO.write(bufferedImage, "jpg", baos);
byte[] bytesOfImage = baos.toByteArray();
DigestUtils.md5Hex(bytesOfImage); // => bebc7da469524057926f3871bdb07a6a

3. Save BufferedImage to file system

Path tempFile = Files.createTempFile(null, ".jpg");
ImageIO.write(bufferedImage, "jpg", tempFile.toFile());

4. Calculate MD5 hash of file

byte[] bytesOfFile = Files.readAllBytes(tempFile);
DigestUtils.md5Hex(bytesOfFile); // => bebc7da469524057926f3871bdb07a6a

5. Load image from file system

BufferedImage bufferedImageFromFilesystem = ImageIO.read(tempFile.toFile());

6. Calculate MD5 hash of image loaded from file system

ByteArrayOutputStream baosFS = new ByteArrayOutputStream();
ImageIO.write(bufferedImageFromFilesystem, "jpg", baosFS);
byte[] bytesOfImageFromFilesystem = baosFS.toByteArray();
DigestUtils.md5Hex(bytesOfImageFromFilesystem); // => 11dc0e49342a1ad15ab1b5a7f8bc271e

(Repeat steps 3 to 6 but re-use image from step 5:)
7. Store BufferedImage to filesystem

Path tempFile2 = Files.createTempFile(null, ".jpg");
ImageIO.write(bufferedImageFromFilesystem, "jpg", tempFile2.toFile());

8. Calculate MD5 hash of file

byte[] bytesOfFile2 = Files.readAllBytes(tempFile2);
DigestUtils.md5Hex(bytesOfFile2); // => 11dc0e49342a1ad15ab1b5a7f8bc271e

9. Load image from file system

BufferedImage bufferedImageFromFilesystem2 = ImageIO.read(tempFile2.toFile());

10. Calculate MD5 hash of image loaded from file system

ByteArrayOutputStream baosFS2 = new ByteArrayOutputStream();
ImageIO.write(bufferedImageFromFilesystem2, "jpg", baosFS2);
byte[] bytesOfImageFromFilesystem2 = baosFS2.toByteArray();
DigestUtils.md5Hex(bytesOfImageFromFilesystem2); // => d1102e4b7efef384623cac915a21e1c2

(org.apache.commons.codec.digest.DigestUtils is used for MD5 calculation)

Every time I save the image to the file system using code snippet #3 and load it back using code snippet #5, the image data is altered. The size of the image shrinks by a few bytes. The image can still be opened by the standard Windows image viewer and still seems to be valid.

I already checked whether the issue is caused by the metadata of the image. Comparing the metadata of the JPEG files with a suitable program does not show any difference.

How can I make sure that loading and saving an identical image does not change the file?

Software Engineer :

You're saving a JPEG, which is a lossy compressed image format, rather than the raw pixel buffer. Lossy means that the process cannot be reversed, because information is lost along the way. Saving as JPEG uses heuristics to compress the pixel data so as to reduce its size. So when you load the file back, you get pixel data that differs from the original, and encoding that data again gives different bytes, hence the changed hash. Then you save it again, which compresses it again, leading to yet another hash when you load it. I suspect that if you repeated this often enough the image would degrade until it stopped changing, at which point the hash would cease to change.
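
If what you need is a hash that survives a save and load cycle, one option is to hash the decoded pixel values and store the image in a lossless format such as PNG. The following is only a minimal sketch under a few assumptions: the class and method names (`PixelHashDemo`, `md5OfPixels`, `roundTrip`) are made up for illustration, the round trip is done in memory instead of through a temp file, and `java.security.MessageDigest` stands in for `DigestUtils` so that the example needs nothing beyond the JDK.

```java
import javax.imageio.ImageIO;
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.math.BigInteger;
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical demo class, not part of the original question.
public class PixelHashDemo {

    // Same test image as in the question: a red square on a black background.
    static BufferedImage createSampleImage() {
        BufferedImage image = new BufferedImage(400, 400, BufferedImage.TYPE_INT_RGB);
        Graphics2D graphics = image.createGraphics();
        graphics.setColor(Color.RED);
        graphics.fillRect(100, 100, 200, 200);
        graphics.dispose();
        return image;
    }

    // Hashes the decoded ARGB pixel values instead of the encoded file bytes.
    static String md5OfPixels(BufferedImage image) throws NoSuchAlgorithmException {
        int width = image.getWidth();
        int height = image.getHeight();
        int[] pixels = image.getRGB(0, 0, width, height, null, 0, width);
        ByteBuffer buffer = ByteBuffer.allocate(pixels.length * Integer.BYTES);
        buffer.asIntBuffer().put(pixels);
        byte[] digest = MessageDigest.getInstance("MD5").digest(buffer.array());
        return String.format("%032x", new BigInteger(1, digest));
    }

    // Encodes the image in the given format and decodes it again, all in memory.
    static BufferedImage roundTrip(BufferedImage image, String format) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (!ImageIO.write(image, format, out)) {
            throw new IOException("No ImageIO writer found for format: " + format);
        }
        return ImageIO.read(new ByteArrayInputStream(out.toByteArray()));
    }

    public static void main(String[] args) throws Exception {
        BufferedImage original = createSampleImage();
        System.out.println("original:   " + md5OfPixels(original));
        System.out.println("after PNG:  " + md5OfPixels(roundTrip(original, "png")));
        System.out.println("after JPEG: " + md5OfPixels(roundTrip(original, "jpg")));
    }
}
```

With PNG the pixel hash should come back unchanged, because the format stores the pixel values exactly. If the file has to stay a JPEG, the other way out is to never re-encode it: hash the bytes of the file once and copy the file as bytes (for example with `Files.copy`) instead of going through `ImageIO.read` and `ImageIO.write` again.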
