How to run `diff` on one compressed file and one uncompressed file? (bash or python)

O.rka :

I have some files in two directories and I want to check whether the files in one directory are the same as the files in the other. The problem is that the files in one of the directories are gzipped. The only way I know to do this is to decompress each file, run diff in bash, then compress the file again. There are ~200 files, each about 5 GB, so I would like to avoid this if possible.

Is there another way to do this? Perhaps in Python (3)? I found this module: https://docs.python.org/3/library/filecmp.html

I'm not sure how I can compare a gzip file with a regular file since one will be read in as bytes and the other as unicode?

import gzip, filecmp

path_1 = "path/to/query_1.txt"
path_2 = "path/to/query_2.txt.gz"

Shawn :

In bash

diff path/to/query_1.txt <(zcat path/to/query_2.txt.gz)

<(command) is a process substitution: it connects the enclosed command's standard output to a filename that can then be opened and read from by another process. Here zcat decompresses the file on the fly, so nothing is written back to disk.

It's not understood by a bare-bones /bin/sh, but bash, zsh and ksh all understand it.
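
For the Python side of the question, here is a minimal sketch rather than anything taken from the answer above. It assumes you only need a yes/no answer (like cmp) and not a line-by-line diff. Opening both files in binary mode sidesteps the bytes vs unicode concern, and reading in fixed-size chunks keeps memory use flat even for 5 GB files. The names same_content, _read_exact and chunk_size are made up for illustration.

```python
import gzip


def _read_exact(stream, size):
    """Read up to `size` bytes, looping in case a read comes back short."""
    parts = []
    remaining = size
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:  # end of stream
            break
        parts.append(chunk)
        remaining -= len(chunk)
    return b"".join(parts)


def same_content(plain_path, gz_path, chunk_size=1 << 20):
    """Return True if plain_path and the decompressed gz_path hold the same bytes."""
    # Both streams yield bytes: open(..., "rb") reads raw bytes and
    # gzip.open(..., "rb") yields the decompressed bytes.
    with open(plain_path, "rb") as plain, gzip.open(gz_path, "rb") as packed:
        while True:
            a = _read_exact(plain, chunk_size)
            b = _read_exact(packed, chunk_size)
            if a != b:
                return False
            if not a:  # both streams ended at the same point
                return True


# Example call, using the paths from the question:
# same_content("path/to/query_1.txt", "path/to/query_2.txt.gz")
```

This stops at the first differing chunk, so mismatched files are usually rejected quickly. Identical files still have to be decompressed in full.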
