Comparing whether the contents of two files are the same in C#

While reviewing the project's file-comparison code today, I saw that it read all the bytes first and then compared them one by one, so I wondered whether it could be optimized. Searching online, I found approaches that compare hash codes instead. After testing both, the byte-by-byte comparison was faster regardless of whether the files differ near the beginning or near the end.

C# compares whether the contents of two text files are equal - five points - Blog Park (cnblogs.com)

Test file size: 10,000 KB (about 10 MB)

1. Comparing hash codes

First compute a hash of each file's contents, then compare the two hashes.

    public static bool CompareFile(string sourceFilePath, string destFilePath)
    {
        if (string.IsNullOrEmpty(sourceFilePath) || string.IsNullOrEmpty(destFilePath))
        {
            return false;
        }
        if (!File.Exists(sourceFilePath) || !File.Exists(destFilePath))
        {
            return false;
        }
        // The parameterless HashAlgorithm.Create() is obsolete in .NET 5+; use a concrete algorithm.
        using (HashAlgorithm hash = SHA256.Create())
        using (FileStream file1 = new FileStream(sourceFilePath, FileMode.Open, FileAccess.Read))
        using (FileStream file2 = new FileStream(destFilePath, FileMode.Open, FileAccess.Read))
        {
            byte[] hashByte1 = hash.ComputeHash(file1); // hash the contents of each file
            byte[] hashByte2 = hash.ComputeHash(file2);
            string str1 = BitConverter.ToString(hashByte1); // convert each hash byte array to a string
            string str2 = BitConverter.ToString(hashByte2);
            return str1 == str2;
        }
    }

Adding timing code around the two hash computations shows that computing the hashes accounts for most of the elapsed time.
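A minimal way to add that timing is `System.Diagnostics.Stopwatch`. The sketch below creates two small temporary files so it runs as-is; the file sizes and paths are placeholders, not the original test setup:

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Security.Cryptography;

class HashTiming
{
    static void Main()
    {
        // Create two small throwaway files so the example is self-contained.
        string path1 = Path.GetTempFileName();
        string path2 = Path.GetTempFileName();
        File.WriteAllBytes(path1, new byte[1024 * 1024]); // 1 MB of zeros
        File.WriteAllBytes(path2, new byte[1024 * 1024]);

        var sw = Stopwatch.StartNew();
        using (HashAlgorithm hash = SHA256.Create())
        using (FileStream f1 = File.OpenRead(path1))
        using (FileStream f2 = File.OpenRead(path2))
        {
            byte[] h1 = hash.ComputeHash(f1);
            byte[] h2 = hash.ComputeHash(f2);
            Console.WriteLine($"Hashes equal: {BitConverter.ToString(h1) == BitConverter.ToString(h2)}");
        }
        sw.Stop();
        Console.WriteLine($"Hashing both files took {sw.ElapsedMilliseconds} ms");

        File.Delete(path1);
        File.Delete(path2);
    }
}
```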


2. Byte-by-byte comparison

    public static bool CompareFile(string sourceFilePath, string destFilePath)
    {
        if (string.IsNullOrEmpty(sourceFilePath) || string.IsNullOrEmpty(destFilePath))
        {
            return false;
        }
        if (!File.Exists(sourceFilePath) || !File.Exists(destFilePath))
        {
            return false;
        }
        byte[] source = File.ReadAllBytes(sourceFilePath);
        byte[] dest = File.ReadAllBytes(destFilePath);
        if (source.Length != dest.Length)
        {
            return false; // files of different lengths can never match
        }
        for (int i = 0; i < source.Length; ++i)
        {
            if (source[i] != dest[i])
            {
                return false; // early exit at the first differing byte
            }
        }
        return true;
    }

Adding the same timing code to the byte-by-byte comparison function gives the following results.

If the files differ near the beginning, the comparison time is roughly just the time needed to read both files.

If the files differ only near the end, the comparison takes longer, but it is still much faster than computing the hash codes first.
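On .NET Core 3.0+ / .NET 5+, the manual loop above can also be replaced by `MemoryExtensions.SequenceEqual` on spans, which the runtime vectorizes; this is a sketch of that variant, not the original author's code:

```csharp
using System;
using System.IO;

class SpanCompare
{
    // Same semantics as the byte-by-byte loop: length check, then content check.
    // SequenceEqual over spans compares several bytes per instruction.
    public static bool CompareFile(string sourceFilePath, string destFilePath)
    {
        if (!File.Exists(sourceFilePath) || !File.Exists(destFilePath))
        {
            return false;
        }
        byte[] source = File.ReadAllBytes(sourceFilePath);
        byte[] dest = File.ReadAllBytes(destFilePath);
        // SequenceEqual checks lengths first, then contents.
        return source.AsSpan().SequenceEqual(dest);
    }

    static void Main()
    {
        string a = Path.GetTempFileName();
        string b = Path.GetTempFileName();
        File.WriteAllBytes(a, new byte[] { 1, 2, 3 });
        File.WriteAllBytes(b, new byte[] { 1, 2, 3 });
        Console.WriteLine(CompareFile(a, b)); // True
        File.WriteAllBytes(b, new byte[] { 1, 2, 4 });
        Console.WriteLine(CompareFile(a, b)); // False
        File.Delete(a);
        File.Delete(b);
    }
}
```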

Summary:

The test file is a 10,000 KB text file of stored digits; larger files have not been tested. For this case, reading the files' raw bytes and comparing them directly is more efficient than converting each file to a hash code and comparing the hashes.
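Both versions above load each file entirely into memory, which does not scale to very large files. A chunked stream comparison keeps the early-exit advantage while holding only one buffer per file; the buffer size and helper names below are my own choices, not from the original post:

```csharp
using System;
using System.IO;

class ChunkedCompare
{
    public static bool CompareFile(string sourceFilePath, string destFilePath)
    {
        if (!File.Exists(sourceFilePath) || !File.Exists(destFilePath))
        {
            return false;
        }
        const int BufferSize = 64 * 1024; // 64 KB per read; tune as needed
        using (FileStream f1 = File.OpenRead(sourceFilePath))
        using (FileStream f2 = File.OpenRead(destFilePath))
        {
            if (f1.Length != f2.Length)
            {
                return false; // different sizes can never match
            }
            byte[] buf1 = new byte[BufferSize];
            byte[] buf2 = new byte[BufferSize];
            while (true)
            {
                int read1 = f1.Read(buf1, 0, BufferSize);
                if (read1 == 0)
                {
                    return true; // both streams exhausted without a mismatch
                }
                // FileStream.Read may return fewer bytes than requested,
                // so loop until buf2 holds the same read1 bytes.
                int read2 = 0;
                while (read2 < read1)
                {
                    int n = f2.Read(buf2, read2, read1 - read2);
                    if (n == 0) return false; // unexpected early EOF
                    read2 += n;
                }
                if (!buf1.AsSpan(0, read1).SequenceEqual(buf2.AsSpan(0, read1)))
                {
                    return false; // early exit at the first differing block
                }
            }
        }
    }

    static void Main()
    {
        string a = Path.GetTempFileName();
        string b = Path.GetTempFileName();
        var data = new byte[200_000]; // spans multiple 64 KB chunks
        new Random(42).NextBytes(data);
        File.WriteAllBytes(a, data);
        File.WriteAllBytes(b, data);
        Console.WriteLine(CompareFile(a, b)); // True
        data[199_999] ^= 0xFF;                // flip one byte near the end
        File.WriteAllBytes(b, data);
        Console.WriteLine(CompareFile(a, b)); // False
        File.Delete(a);
        File.Delete(b);
    }
}
```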

Origin: blog.csdn.net/qq_33461689/article/details/121499130