Batch-rename the files in a specified path with Python
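1. Read the file names in the specified path
# Import the standard library
import os
# Read the file names
filesDir = "path……"  # placeholder for the directory path
fileNameList = os.listdir(filesDir)
# Print every entry in the directory
for filename in fileNameList:
    print(filename)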
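Note that os.listdir() returns the names of subdirectories as well as files. The following is a minimal sketch, assuming only regular files should be renamed; the helper name list_files is hypothetical and not part of the original script.

```python
import os

def list_files(files_dir):
    """Return the sorted names of regular files in files_dir, skipping subdirectories."""
    return sorted(
        name for name in os.listdir(files_dir)
        if os.path.isfile(os.path.join(files_dir, name))
    )
```

If this filter is wanted, the returned list could be used in place of fileNameList in the later loops.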
2. Use a regular expression to extract the part to keep
1. Import the re library
This step requires the re library, which is part of Python's standard library and is used for pattern matching on strings.
function name | effect | return type |
re.findall() | Search the string and return all matching substrings | list |
re.match() | Match only at the beginning of the string | match object, or None if there is no match |
re.search() | Scan the string and return the first match | match object, or None if there is no match |
re.split() | Split the string at each match | list |
re.finditer() | Search the string and return an iterator over the matches | iterator of match objects |
re.sub() | Replace every substring that matches the regular expression and return the resulting string | string |
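To make the differences between these functions concrete, here is a small sketch that applies each of them to the same sample string used later in this article. The variable names are illustrative only.

```python
import re

text = "BIT100081 TUS444567"
pattern = r"[1-9]\d{5}"  # six digits, the first of which is not 0

# re.match() only succeeds at the very start of the string, so this is None
at_start = re.match(pattern, text)

# re.search() scans forward and returns the first match object
first = re.search(pattern, text)
first_text = first.group(0)

# re.split() cuts the string wherever the pattern matches
pieces = re.split(pattern, text)

# re.finditer() yields one match object per match
found = [m.group(0) for m in re.finditer(pattern, text)]

# re.sub() replaces every match and returns the new string
masked = re.sub(pattern, "#", text)

print(at_start, first_text, pieces, found, masked)
```

Because the string starts with letters, re.match() finds nothing here while re.search() does; this is the most common source of confusion between the two.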
2. Using the functions in the re library
(1) re.findall(), the most commonly used
The first argument is the regular expression to match, and the second is the string to search.
import re
# Find every six-digit number that does not start with 0
ls = re.findall(r'[1-9]\d{5}', 'BIT100081 TUS444567')
print(ls)
# Output: ['100081', '444567']
(2) re.sub(pattern, repl, string, count)
parameter | effect |
pattern | The regular expression to match |
repl | The string that replaces each match |
string | The string to search |
count | Maximum number of replacements; the default, 0, replaces every match |
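As a brief illustration of the count parameter, the sketch below removes parenthesized parts from a made-up file name, first without a limit and then with count=1. The file name is hypothetical.

```python
import re

name = "lesson(draft)intro(v2).mp4"  # hypothetical file name
pattern = r"[(](.*?)[)]"

# Without count, every parenthesized part is removed
all_removed = re.sub(pattern, "", name)

# With count=1, only the first parenthesized part is removed
first_removed = re.sub(pattern, "", name, count=1)

print(all_removed)    # lessonintro.mp4
print(first_removed)  # lessonintro(v2).mp4
```

Passing count as a keyword argument keeps the call readable and avoids mixing it up with the flags parameter.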
Further details of the re library are left for readers to explore. Because file names vary so much, the regular expression needed differs from case to case. For details, study the official documentation or the rookie tutorial: Python3 regular expressions.
3. Examples
- In this example, the file names all share a part enclosed in parentheses, so first write a regular expression that matches the parentheses.
- The compile method in the re library returns a compiled regular expression object. In the pattern, (.*?) matches any character any number of times, as few times as possible (non-greedy).
# Regular expression for the part to remove: the parentheses and their contents
rules = re.compile(r'[(](.*?)[)]', re.S)
- Replacing each match with an empty string deletes the parentheses and their contents.
- The pattern passed to compile must be a string; the r prefix makes it a raw string so that backslashes are not treated as escape characters. Test the match before renaming to avoid mistakes.
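# Regular expression for the part to remove: the parentheses and their contents
rules = re.compile(r'[(](.*?)[)]', re.S)
# Loop over the file names and preview the new names
for filename in fileNameList:
    print("Old name:\t" + filename)
    print("Start trimming!")
    newFilename = re.sub(rules, '', str(filename))
    # Print the part that is kept
    print("New name:\t" + newFilename)
    print("\n\n")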
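One thing worth checking during this preview is whether two different old names collapse to the same new name. The sketch below is one possible check, not part of the original script; the helper name find_collisions and the sample file names are hypothetical.

```python
import re
from collections import defaultdict

def find_collisions(names, rules):
    """Group old names by the new name they would get; keep groups larger than one."""
    groups = defaultdict(list)
    for old in names:
        groups[rules.sub("", old)].append(old)
    return {new: olds for new, olds in groups.items() if len(olds) > 1}

rules = re.compile(r"[(](.*?)[)]", re.S)
sample = ["a(1).txt", "a(2).txt", "b(1).txt"]  # hypothetical file names
collisions = find_collisions(sample, rules)
print(collisions)  # {'a.txt': ['a(1).txt', 'a(2).txt']}
```

An empty result suggests the rename is safe with respect to name clashes among the files being processed.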
3. Rename the files
1. Use the rename function in the os library
os.rename(os.path.join(filesDir, filename), os.path.join(filesDir, newFilename))
2. Test that the result is correct
Note: Different file names call for different regular expressions, so each case should be analyzed on its own.
# Loop over the file names and rename each file
for filename in fileNameList:
    print("Old name:\t" + filename)
    print("Start trimming!")
    newFilename = re.sub(rules, '', str(filename))
    # Print the part that is kept
    print("New name:\t" + newFilename)
    print("Start renaming...")
    os.rename(os.path.join(filesDir, filename), os.path.join(filesDir, newFilename))
    print("Renaming finished!")
    print("======================================================================================")
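On POSIX systems os.rename() silently replaces an existing file at the target path, so a more defensive variant of the loop may be worth considering. The following is a sketch under that assumption; the function name rename_matching and its skip rules are hypothetical choices, not the author's method.

```python
import os
import re

def rename_matching(files_dir, rules, replacement=""):
    """Rename entries in files_dir; return the (old, new) pairs actually renamed."""
    renamed = []
    for old in sorted(os.listdir(files_dir)):
        new = rules.sub(replacement, old)
        if new == old or not new:
            continue  # nothing to change, or the name would become empty
        target = os.path.join(files_dir, new)
        if os.path.exists(target):
            continue  # do not overwrite an existing file
        os.rename(os.path.join(files_dir, old), target)
        renamed.append((old, new))
    return renamed
```

Returning the list of renamed pairs makes it possible to log the changes or to undo them later by renaming in the opposite direction.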
3. Other matching situations
(1) Match the serial number of the file name
# Delete the duplicated serial number at the start of the file name
rules = re.compile(r'^\d', re.S)
- Use this rule when a file name carries two serial numbers that repeat each other. It cannot be used when the two serial numbers are not the same.
- Because ^\d removes only one leading digit per pass, run it once for serial numbers 0-9, twice for 10-99, and three times for three-digit serial numbers.
- For ASCII digits, \d is equivalent to [0-9].
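As an alternative to running the rule several times, a backreference can collapse a doubled serial number of any length in one pass. This is a sketch that assumes the two copies of the serial number sit directly next to each other at the start of the name (for example "1010 intro.mp4"); the helper name and the sample names are hypothetical.

```python
import re

# Assumption: the duplicated serial numbers are adjacent, e.g. "1010 intro.mp4"
dup_serial = re.compile(r"^(\d+)\1")

def drop_duplicate_serial(name):
    """Collapse a doubled leading serial number into a single copy."""
    return dup_serial.sub(r"\1", name)

print(drop_duplicate_serial("1010 intro.mp4"))  # 10 intro.mp4
print(drop_duplicate_serial("12 intro.mp4"))    # 12 intro.mp4 (unchanged)
```

A name whose genuine serial number happens to begin with a repeated digit, such as "112 intro.mp4", would also be shortened, so the result should still be previewed before renaming.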
(2) Match special characters
# Delete a special character at the start of the file name: to delete ".", match "."; to delete "#", match "#"
rules = re.compile(r'^\.', re.S)
- Note: Pay attention to where the special character sits. If it is at the beginning, add the ^ anchor. If it is elsewhere and you are deleting dots, remember the dot in the file extension: remove the extension first, then add the desired extension back once the renaming is done.
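The advice about protecting the dot in the extension can be sketched with os.path.splitext(), which separates the extension from the rest of the name. The helper name strip_dots_keep_extension and the sample names are hypothetical.

```python
import os
import re

def strip_dots_keep_extension(name):
    """Remove every '.' from the stem of name while keeping its extension."""
    stem, ext = os.path.splitext(name)
    return re.sub(r"\.", "", stem) + ext

print(strip_dots_keep_extension("1.2.intro.mp4"))  # 12intro.mp4
```

Splitting first means the pattern never sees the extension, so there is no need to add it back by hand afterwards.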
(3) Delete the specified part of the string
# Regular expression that deletes the specified part
rules = re.compile(r'Swagger-[0-9][0-9]:', re.S)
- Note: Copy and paste the fixed part of the pattern from the file name itself to avoid introducing stray spaces.
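When the copied text contains characters that have a special meaning in regular expressions, re.escape() can turn it into a pattern that matches it literally. The sketch below contrasts the escaped and unescaped forms; the literal text and the file name are hypothetical.

```python
import re

literal = "[1.0]"  # hypothetical text to delete; '[', '.', ']' are regex metacharacters
name = "Swagger-01[1.0].mp4"  # hypothetical file name

# Escaped: the pattern matches the text "[1.0]" exactly
escaped_rule = re.compile(re.escape(literal))
cleaned = escaped_rule.sub("", name)

# Unescaped: "[1.0]" is read as a character class and removes every '1', '.', '0'
unescaped_rule = re.compile(literal)
damaged = unescaped_rule.sub("", name)

print(cleaned)  # Swagger-01.mp4
print(damaged)  # Swagger-[]mp4
```

The unescaped result shows why pasting text straight into a pattern can delete far more than intended.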
(4) Add the file extension in the last step
# Add a suffix to the file name: this appends a part to the end of the string, which can be a number or a file extension
newFilename = newFilename + ".mp4"
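If the script may be run more than once, appending unconditionally would produce names such as "intro.mp4.mp4". A small guard can avoid that; the function name ensure_extension is hypothetical, and the case-insensitive comparison is an assumption about what counts as the same extension.

```python
def ensure_extension(name, ext=".mp4"):
    """Append ext unless name already ends with it (compared case-insensitively)."""
    if name.lower().endswith(ext.lower()):
        return name
    return name + ext

print(ensure_extension("intro"))      # intro.mp4
print(ensure_extension("intro.mp4"))  # intro.mp4
```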