Using Python's glob module and the ordering problem of glob.glob

Introduction:

     glob is a file-handling module that ships with Python. It has only a few functions, so it is easy to learn. It finds file pathnames that match a given pattern, and a pattern needs only three kinds of wildcard: "*", "?", and "[]".

  1. An asterisk "*" matches 0 or more characters

  2. A question mark "?" matches any single character

  3. "[]" matches one character from the specified set or range, for example: [0-9] matches a digit, while [a-z] and [A-Z] match a lowercase or an uppercase letter
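
The three wildcards can be tried out without touching the file system. The sketch below uses fnmatch.fnmatchcase from the standard library, which applies the same shell-style wildcard rules to plain strings. The file names and the matching helper are made up for illustration and are not part of glob.

```python
from fnmatch import fnmatchcase

# Hypothetical file names, modeled on the examples in this article.
NAMES = ["file.txt", "file1.txt", "file2.txt", "fileA.txt", "fileB.txt", "notes.md"]

def matching(pattern, names=NAMES):
    """Return the names that match a shell-style pattern (case-sensitive)."""
    return [name for name in names if fnmatchcase(name, pattern)]

print(matching("*.txt"))          # "*" matches zero or more characters
print(matching("file?.txt"))      # "?" matches exactly one character
print(matching("file[0-9].txt"))  # "[0-9]" matches one digit
print(matching("file[A-Z].txt"))  # "[A-Z]" matches one uppercase letter
```

Note that "file?.txt" does not match file.txt, because "?" must consume exactly one character.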

First, glob.glob:

     Returns a list of all matching file paths. Its main parameter, pathname, is the pattern to match, and it can be an absolute path or a relative path. Python 3.5 and later also accept a keyword-only recursive flag.
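
In Python 3.5 and later, glob.glob additionally accepts a keyword-only recursive flag; when it is True, the pattern "**" matches any number of nested directories, including none. The sketch below builds a throwaway directory tree so that it can run anywhere. The find_txt helper and the file names are assumptions made for illustration.

```python
import glob
import os
import tempfile

def find_txt(root):
    """Return the .txt files under root at any depth, as sorted '/'-separated relative paths."""
    pattern = os.path.join(root, "**", "*.txt")
    hits = glob.glob(pattern, recursive=True)
    return sorted(os.path.relpath(p, root).replace(os.sep, "/") for p in hits)

with tempfile.TemporaryDirectory() as root:
    # Build a small hypothetical tree: one file at the top, others nested.
    os.makedirs(os.path.join(root, "dir", "sub"))
    for rel in ("top.txt", "dir/file1.txt", "dir/sub/deep.txt", "dir/skip.md"):
        open(os.path.join(root, *rel.split("/")), "w").close()
    found = find_txt(root)
    print(found)
```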

1. Wildcards

An asterisk "*" matches 0 or more characters

import glob
path = glob.glob('C:/Users/huangzh/Desktop/dir/*.txt')
print(path)
"""
Result:
['C:/Users/huangzh/Desktop/dir\\file.txt', 'C:/Users/huangzh/Desktop/dir\\file1.txt', 'C:/Users/huangzh/Desktop/dir\\file2.txt', 'C:/Users/huangzh/Desktop/dir\\fileA.txt', 'C:/Users/huangzh/Desktop/dir\\fileB.txt']
"""

A directory level in the path can also be replaced by an asterisk:

import glob
path = glob.glob('C:/Users/huangzh/Desktop/*/*.*')
for file in path:
    print(file)
"""
Result:
C:/Users/huangzh/Desktop\dir\file.txt
C:/Users/huangzh/Desktop\dir\file1.txt
C:/Users/huangzh/Desktop\dir\file2.txt
C:/Users/huangzh/Desktop\dir\fileA.txt
C:/Users/huangzh/Desktop\dir\fileB.txt
"""

2. Single character wildcard

A question mark "?" matches any single character

import glob
path = glob.glob('C:/Users/huangzh/Desktop/dir/file?.txt')
for file in path:
    print(file)
"""
Result:
C:/Users/huangzh/Desktop/dir\file1.txt
C:/Users/huangzh/Desktop/dir\file2.txt
C:/Users/huangzh/Desktop/dir\fileA.txt
C:/Users/huangzh/Desktop/dir\fileB.txt
"""

3. Character range

"[]" matches one character from the specified set or range, for example: [0-9] matches a digit, while [a-z] and [A-Z] match a lowercase or an uppercase letter

import glob
path = glob.glob('C:/Users/huangzh/Desktop/dir/file[0-9].txt')
for file in path:
    print(file)
"""
Result:
C:/Users/huangzh/Desktop/dir\file1.txt
C:/Users/huangzh/Desktop/dir\file2.txt
"""

path = glob.glob('C:/Users/huangzh/Desktop/dir/file[A-Z].txt')
for file in path:
    print(file)
"""
Result:
C:/Users/huangzh/Desktop/dir\fileA.txt
C:/Users/huangzh/Desktop/dir\fileB.txt
"""
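
A bracket expression can also list individual characters, or be negated with a leading "!". The sketch below checks this on plain strings with fnmatch.fnmatchcase from the standard library, which follows the same shell-style rules; the file names are hypothetical.

```python
from fnmatch import fnmatchcase

names = ["file1.txt", "file2.txt", "fileA.txt", "fileB.txt"]

# "[12]" lists the allowed characters explicitly.
listed = [n for n in names if fnmatchcase(n, "file[12].txt")]
# "[!0-9]" matches any single character that is NOT a digit.
non_digit = [n for n in names if fnmatchcase(n, "file[!0-9].txt")]

print(listed)
print(non_digit)
```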

The wildcards can also be combined:

import glob
path = glob.glob('C:/Users/huangzh/Desktop/dir/*?.t[a-z]t')
for file in path:
    print(file)
"""
Result:
C:/Users/huangzh/Desktop/dir\file.txt
C:/Users/huangzh/Desktop/dir\file1.txt
C:/Users/huangzh/Desktop/dir\file2.txt
C:/Users/huangzh/Desktop/dir\fileA.txt
C:/Users/huangzh/Desktop/dir\fileB.txt
"""

Second, the sorting problem of glob.glob

glob.glob returns sequentially numbered files in an order like this:

import glob
path = glob.glob('C:/Users/huangzh/Desktop/dir/*.txt')
for file in path:
    print(file)
"""
Result:
C:/Users/huangzh/Desktop/dir\file1.txt
C:/Users/huangzh/Desktop/dir\file10.txt
C:/Users/huangzh/Desktop/dir\file100.txt
C:/Users/huangzh/Desktop/dir\file1000.txt
C:/Users/huangzh/Desktop/dir\file2.txt
C:/Users/huangzh/Desktop/dir\file3.txt
"""

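This is string (lexicographic) order rather than numeric order, so file10.txt comes before file2.txt, which can affect any processing that depends on file order. In addition, glob.glob itself does not guarantee any order; whether the results are sorted depends on the file system.

Therefore, use sorted with an explicit key: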

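1. Sort by creation time:

Note that os.path.getctime returns the creation time on Windows; on Unix it returns the time of the last metadata change.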

import glob
import os
path = glob.glob('C:/Users/huangzh/Desktop/dir/*.txt')
print(sorted(path, key=os.path.getctime))
"""
Result:
['C:/Users/huangzh/Desktop/dir\\file1.txt', 'C:/Users/huangzh/Desktop/dir\\file2.txt', 'C:/Users/huangzh/Desktop/dir\\file3.txt', 'C:/Users/huangzh/Desktop/dir\\file10.txt', 'C:/Users/huangzh/Desktop/dir\\file100.txt', 'C:/Users/huangzh/Desktop/dir\\file1000.txt']
"""
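
Modification time (os.path.getmtime) is a common alternative key that means the same thing on every platform. The sketch below is an illustration under stated assumptions: it creates temporary files and sets their timestamps explicitly with os.utime so that the outcome is reproducible. The sorted_by_mtime helper, the file names, and the timestamps are all hypothetical.

```python
import glob
import os
import tempfile

def sorted_by_mtime(pattern):
    """Return the paths matching pattern, oldest modification time first."""
    return sorted(glob.glob(pattern), key=os.path.getmtime)

with tempfile.TemporaryDirectory() as root:
    base = 1_600_000_000  # arbitrary reference time, in seconds since the epoch
    # File name -> modification time; deliberately not created in time order.
    stamps = {"file10.txt": base + 120, "file2.txt": base + 60, "file1.txt": base}
    for name, mtime in stamps.items():
        path = os.path.join(root, name)
        open(path, "w").close()
        os.utime(path, (mtime, mtime))  # set (access time, modification time)
    ordered = [os.path.basename(p)
               for p in sorted_by_mtime(os.path.join(root, "*.txt"))]
    print(ordered)
```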

2. Sort by size:

import glob
import os
path = glob.glob('C:/Users/huangzh/Desktop/dir/*.txt')
print(sorted(path, key=os.path.getsize))
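
When the goal is the numeric order of the file names themselves, rather than file time or size, a natural-sort key is another option. The sketch below is one possible approach, not something glob provides: natural_key is a hypothetical helper that splits each name into text and integer chunks so that the numbers are compared as numbers.

```python
import re

def natural_key(path):
    """Build a sort key in which digit runs compare as integers ('file2' < 'file10')."""
    # re.split with a capturing group alternates text, digits, text, ...
    # so every odd index holds a run of digits.
    parts = re.split(r"(\d+)", path)
    return [int(part) if index % 2 else part.lower()
            for index, part in enumerate(parts)]

# The same names as in the listing above, in plain string order.
names = ["file1.txt", "file10.txt", "file100.txt",
         "file1000.txt", "file2.txt", "file3.txt"]
ordered = sorted(names, key=natural_key)
print(ordered)
```

With real files the same key can be passed directly: sorted(glob.glob(pattern), key=natural_key).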

 


Origin blog.csdn.net/weixin_41611054/article/details/102708817