This is my first post, so please forgive any poor or unclear writing.
Anyone who writes a lot of Python 3 runs into encoding problems: if you process a web page of unknown encoding directly and it is not UTF-8, the text comes out garbled. Below is a way to convert a string of unknown encoding to UTF-8 so that it is no longer garbled.
It can be used in many Python transcoding scenarios.
This part is extracted from a crawler I wrote:
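import requests
from bs4 import BeautifulSoup

s = requests.Session()                    # session used by the crawler (assumed; not shown in the excerpt)
header = {'User-Agent': 'Mozilla/5.0'}    # placeholder request headers (assumed)

# Request the page and convert its encoding
def getHtmlAndDealCode(url):
    # html = requests.get(url, verify=False)
    html = s.get(url, headers=header)
    code = html.encoding           # encoding requests used to decode the response
    html = html.text               # text decoded with that encoding
    html = html.encode(code)       # encode again to get the raw bytes back
    html = html.decode('utf-8')    # decode the raw bytes as UTF-8
    parser = 'html.parser'
    soup = BeautifulSoup(html, parser)
    return soup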
The principle: `encoding` gives the encoding that was used to decode the response, `encode` with that encoding turns the string back into bytes, and `decode('utf8')` then decodes those bytes as UTF-8. After that, the rest of the processing can proceed.
Simple and practical, isn't it?
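The encode-then-decode round trip described above can be tried without any network access. The sketch below is only an illustration, not part of the original crawler: the helper names `reencode_to_utf8` and `decode_with_fallback` and the candidate list `('utf-8', 'gb18030')` are assumptions made for this example. It also shows one caveat: the round trip only helps when the page's bytes really are UTF-8, so a fallback is sketched for bytes in another encoding.

```python
def reencode_to_utf8(text, guessed_encoding):
    """Undo a wrong decode: turn the text back into bytes, then read them as UTF-8.

    Hypothetical helper mirroring the encode()/decode('utf-8') steps in the post.
    """
    raw = text.encode(guessed_encoding)   # recover the bytes that were received
    return raw.decode('utf-8')            # raises UnicodeDecodeError if they are not UTF-8


def decode_with_fallback(raw, candidates=('utf-8', 'gb18030')):
    """Try each candidate encoding in turn (the candidate list is an assumption)."""
    for name in candidates:
        try:
            return raw.decode(name)
        except UnicodeDecodeError:
            continue
    # Last resort: keep going, but mark undecodable bytes with U+FFFD.
    return raw.decode('utf-8', errors='replace')


# Simulate a UTF-8 page that was decoded with the wrong encoding.
original = '你好, Python'
raw_bytes = original.encode('utf-8')        # the bytes the server sent
garbled = raw_bytes.decode('iso-8859-1')    # what a wrong guess produces
fixed = reencode_to_utf8(garbled, 'iso-8859-1')
print(fixed == original)

# Bytes that are not UTF-8 (here GBK) need the fallback instead.
gbk_bytes = '你好'.encode('gbk')
print(decode_with_fallback(gbk_bytes))
```

With requests, the raw bytes are available as `response.content`, so a helper like `decode_with_fallback` could be applied to them directly instead of going through `response.text`.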