Python: reading images in order of file size

Recently I was working on an image processing task. After splitting up the pictures, I found that many of them were the same or nearly the same, but the file list returned by Python's os.walk() comes back in no particular order, so duplicates do not end up next to each other. The code I wrote before compared every picture against all the others. The accuracy was very high, but it was extremely slow.
Then I noticed that when the pictures in a folder are sorted by file size, similar pictures end up next to each other. So it is enough to compare adjacent pictures (1 with 2, 2 with 3, and so on) and move the duplicates out. There is no need to compare every pair. It cannot be 100% accurate, but it is much faster! I am happy! Before, going through all six thousand pictures took a whole day to run ...
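
To get a rough feel for the saving, here is a small sketch that counts the comparisons each strategy needs. It assumes the earlier version compared every picture with every other one. The function names are illustrative only and do not appear in the attached script.

```python
def all_pairs_count(n):
    """Comparisons needed when every image is compared with every other one."""
    return n * (n - 1) // 2


def adjacent_pairs_count(n):
    """Comparisons needed when only neighbours in the sorted list are compared."""
    return max(n - 1, 0)


def adjacent_pairs(items):
    """Return an iterator over (items[0], items[1]), (items[1], items[2]), ..."""
    return zip(items, items[1:])


if __name__ == '__main__':
    n = 6000  # the round figure mentioned above
    print(all_pairs_count(n))       # 17997000
    print(adjacent_pairs_count(n))  # 5999
```

For six thousand pictures that is about eighteen million image comparisons against about six thousand, which is why the run time drops so sharply.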

The code is attached below:

import os
import shutil

import cv2
# compare_ssim was removed from skimage.measure in scikit-image 0.18;
# skimage.metrics.structural_similarity is its replacement.
from skimage.metrics import structural_similarity


def delete(filename):
    """Permanently delete a file."""
    os.remove(filename)


def move(src, dst):
    """Move a file into the destination folder."""
    shutil.move(src, dst)


if __name__ == '__main__':
    path = r'C:\Users\lenovo\Desktop\image_all\0'
    save_path_img = r'C:\Users\lenovo\Desktop\image_all\delete1'
    os.makedirs(save_path_img, exist_ok=True)

    # Walk every file under path (including subdirectories) and record its size.
    file_sizes = {}
    for parent, dirnames, filenames in os.walk(path):
        for filename in filenames:
            full_path = os.path.join(parent, filename)
            file_sizes[full_path] = os.path.getsize(full_path)

    # Sort the paths by file size, smallest first.
    img_files = sorted(file_sizes, key=file_sizes.get)
    print(img_files)

    # Compare each image with its neighbour in the size-sorted list.
    duplicates = []
    for first, second in zip(img_files, img_files[1:]):
        img = cv2.imread(first)
        img1 = cv2.imread(second)
        # Skip files OpenCV cannot read and pairs whose dimensions differ,
        # because SSIM needs two arrays of the same shape.
        if img is None or img1 is None or img.shape != img1.shape:
            continue
        # channel_axis=-1 replaces multichannel=True (scikit-image >= 0.19).
        ssim = structural_similarity(img, img1, channel_axis=-1)
        if ssim > 0.9:
            duplicates.append(second)
            print(first, second, ssim)
        else:
            print('small_ssim', first, second, ssim)

    # Move the second image of every similar pair out of the data set.
    for image in duplicates:
        move(image, save_path_img)
        # delete(image)
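
As a side note, the size-sorted list can also be built more compactly. The sketch below is one possible way to do it, not the script's own method. The helper name images_sorted_by_size and the extension filter are assumptions added for illustration, since the attached script accepts every file it finds.

```python
import os

# Hypothetical extension filter; the attached script does not filter files.
IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.png', '.bmp')


def images_sorted_by_size(root, extensions=IMAGE_EXTENSIONS):
    """Return image paths under root (recursively), smallest file first."""
    paths = []
    for parent, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            # str.endswith accepts a tuple of suffixes.
            if filename.lower().endswith(extensions):
                paths.append(os.path.join(parent, filename))
    # os.path.getsize returns the size in bytes; sorted() is stable, so files
    # of equal size keep the order in which os.walk returned them.
    return sorted(paths, key=os.path.getsize)
```

Filtering by extension keeps stray files such as Thumbs.db out of the list, so cv2.imread is less likely to be handed something it cannot decode.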

Parts of the code were adapted from other blog posts, but I can no longer find where I saw them. My apologies to the original authors. If you see this, send me a private message and I will add a reference link.
Hee hee. This is my first article on CSDN.
Article link: https://blog.csdn.net/weixin_42385606/article/details/104718533
