Java: deleting duplicate files based on file hash and file size

Foreword

Files received through WeChat often include duplicates, and over time they waste a lot of disk space. Finding and deleting duplicates by hand is tedious, so I wrote a small tool to do it for us. The code is simple; the key step is computing each file's hash value.

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

/**
 * Removes duplicate files under a given folder.
 */
public class RemoveDuplicateFile {

    /** Map keyed by the combination of file hash and file size, used to decide whether a file is a duplicate */
    private static Map<String, Long> fileMap = new HashMap<>();

    /**
     * Computes the MD5 hash of a file.
     */
    private static String getFileHash(File file) {
        // try-with-resources so the stream is closed even if reading fails
        try (FileInputStream fis = new FileInputStream(file)) {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] buffer = new byte[1024];
            int length;
            while ((length = fis.read(buffer)) != -1) {
                md.update(buffer, 0, length);
            }
            byte[] digest = md.digest();
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException | IOException e) {
            e.printStackTrace();
            return null;
        }
    }

    /**
     * Decides whether each file is a duplicate based on its hash and size, and deletes duplicates.
     * @param folderPath path of the folder to clean
     */
    public static void removeDuplicateFile(String folderPath) {
        // List everything in the folder
        File folder = new File(folderPath);
        File[] files = folder.listFiles();
        if (files == null) {
            // Not a directory, or the directory could not be read
            return;
        }
        for (File file : files) {
            if (file.isFile()) {
                String fileHash = getFileHash(file);
                if (fileHash == null) {
                    // Skip files whose hash could not be computed
                    continue;
                }
                long fileSize = file.length();
                String key = fileHash + "_" + fileSize;
                if (fileMap.containsKey(key)) {
                    // A file with the same hash and size already exists, so delete this one
                    if (file.delete()) {
                        System.out.println("File " + file.getName() + " is a duplicate and has been deleted!");
                    }
                } else {
                    fileMap.put(key, fileSize);
                }
            } else {
                // Recurse into subdirectories
                removeDuplicateFile(file.getAbsolutePath());
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("Enter the path of the folder to clean of duplicate files:");
        Scanner sc = new Scanner(System.in);
        // nextLine() rather than next(), so paths containing spaces work
        String folderPath = sc.nextLine();
        removeDuplicateFile(folderPath);
    }
}
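The same idea can also be sketched with the NIO file-walking API instead of recursive `listFiles()` calls. This is not part of the original post; the class and method names below are mine, and the traversal is sorted so that the "first" copy of each file is deterministic:

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class RemoveDuplicatesNio {

    // Compute the MD5 hex digest of a file by streaming its contents.
    static String md5Hex(Path p) throws Exception {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (InputStream in = Files.newInputStream(p)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest()) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    // Walk the whole tree, key each file by "hash_size", and delete every
    // file after the first occurrence of a key. Returns the deleted paths.
    static List<Path> removeDuplicates(Path root) throws Exception {
        Set<String> seen = new HashSet<>();
        List<Path> deleted = new ArrayList<>();
        try (Stream<Path> walk = Files.walk(root)) {
            List<Path> regular = walk.filter(Files::isRegularFile)
                                     .sorted()
                                     .collect(Collectors.toList());
            for (Path p : regular) {
                String key = md5Hex(p) + "_" + Files.size(p);
                if (!seen.add(key)) {
                    Files.delete(p);
                    deleted.add(p);
                }
            }
        }
        return deleted;
    }

    public static void main(String[] args) throws Exception {
        // Small self-contained demo in a temporary directory
        Path dir = Files.createTempDirectory("dedup");
        Files.writeString(dir.resolve("a.txt"), "hello");
        Files.writeString(dir.resolve("b.txt"), "hello"); // duplicate of a.txt
        Files.writeString(dir.resolve("c.txt"), "world");
        List<Path> deleted = removeDuplicates(dir);
        System.out.println("deleted " + deleted.size() + " duplicate(s)");
    }
}
```

A `HashSet` of keys is enough here, since only membership matters; the original version's `Map` value (the file size) is never read back.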

For convenience, I packaged the code as an .exe program.
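The post does not say how the .exe was produced; one possible way, assuming a recent JDK (16+) with the `jpackage` tool on a Windows machine, is roughly:

```shell
# Compile, bundle into a jar, then build a Windows executable with jpackage.
# The jar/app names here are placeholders, not from the original post.
javac RemoveDuplicateFile.java
jar --create --file dedup.jar --main-class RemoveDuplicateFile RemoveDuplicateFile*.class
jpackage --type app-image --name DedupTool --main-jar dedup.jar --input .
```

Note that the program reads the folder path from standard input, so a console launcher (`--win-console` on Windows) would be needed for the prompt to be visible.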


Origin blog.csdn.net/weixin_43165220/article/details/130898737