[Reprint] Python: Replace or remove characters that cannot be used in file names

My crawler got stuck inexplicably today after crawling 20 programs. I assumed the server had blocked me, so I added a User-Agent pool and picked a random User-Agent to build the headers for each request. That did not help. The real problem turned out to be in the last step, file naming: the string used as the file name contained an illegal character. After searching online, I put together a function that uses a regular expression to replace the illegal characters in a string:

import re

def validateTitle(title):
    # Characters that are not allowed in Windows file names: / \ : * ? " < > |
    rstr = r'[/\\:*?"<>|]'
    new_title = re.sub(rstr, "_", title)  # replace each one with an underscore
    return new_title
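
To make the behavior concrete, here is a small usage sketch. The title string is a made-up example rather than data from the actual crawl, and the function is repeated so the snippet runs on its own:

```python
import re

def validateTitle(title):
    # Same replacement as above: / \ : * ? " < > | each become an underscore
    return re.sub(r'[/\\:*?"<>|]', "_", title)

# Hypothetical scraped title containing several illegal characters
raw_title = 'Episode 3: "What/If?" <final>'
safe_title = validateTitle(raw_title)
print(safe_title)  # Episode 3_ _What_If__ _final_

# The cleaned title can then be used to build a file name
file_name = safe_title + ".txt"
```

Note that every illegal character is replaced one for one, so consecutive illegal characters produce consecutive underscores.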


Successfully solved the problem!
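
The function above only handles the nine reserved punctuation characters. If titles come from arbitrary web pages, a slightly more defensive variant may be worth considering. The sketch below is not from the original article: the name `sanitize_filename`, the `replacement` and `max_length` parameters, the 255-character limit, and the `"untitled"` fallback are all assumptions chosen for illustration. It additionally replaces ASCII control characters, strips trailing spaces and dots, and avoids Windows reserved device names such as CON or NUL.

```python
import re

# Windows reserved device names (matched case-insensitively on the part before the first dot)
_RESERVED = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)

def sanitize_filename(title, replacement="_", max_length=255):
    # Replace reserved punctuation and ASCII control characters (0-31)
    name = re.sub(r'[/\\:*?"<>|\x00-\x1f]', replacement, title)
    # Truncate first (assumed limit), then drop trailing spaces and dots,
    # which Windows does not allow at the end of a name
    name = name[:max_length].rstrip(" .")
    # Prefix reserved device names so they no longer match
    stem = name.split(".", 1)[0]
    if stem.upper() in _RESERVED:
        name = replacement + name
    # Fall back to a placeholder if nothing usable is left
    return name or "untitled"

print(sanitize_filename("a/b:c"))     # a_b_c
print(sanitize_filename("report. "))  # report
print(sanitize_filename("con.txt"))   # _con.txt
```

Whether these extra rules are needed depends on the target file system; on Linux, for example, only `/` and the null byte are forbidden, so the original function is already stricter than required there.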

Reference:

https://www.polarxiong.com/archives/Python-%E6%9B%BF%E6%8D%A2%E6%88%96%E5%8E%BB%E9%99%A4%E4%B8%8D%E8%83%BD%E7%94%A8%E4%BA%8E%E6%96%87%E4%BB%B6%E5%90%8D%E7%9A%84%E5%AD%97%E7%AC%A6.html


————————————————
Copyright statement: This article is an original work by CSDN blogger "Burette_Lee" and is licensed under the CC 4.0 BY-SA agreement. When reprinting, please include the original source link and this statement.
Original link: https://blog.csdn.net/qq_29303759/article/details/81944733

Origin blog.csdn.net/u010472858/article/details/103459511