Python: fixing the "illegal multibyte sequence" error

Reading an HTML page and decoding it can raise an "illegal multibyte sequence" error.

1. The first case: change the encoding

View the page source, find the charset declaration, and decode the page with that encoding:

<meta http-equiv="Content-Type" content="text/html; charset=gb2312" />
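A minimal sketch of this step, assuming the page has already been read as bytes. The helper name `detect_charset` and the sample `raw` bytes are hypothetical, made up for illustration; in practice `raw` would come from something like `urllib.request.urlopen(url).read()`.

```python
import re

def detect_charset(raw: bytes, default: str = "utf-8") -> str:
    """Return the charset declared in the page's <meta> tag, or a default."""
    # Charset names are plain ASCII, so the undecoded bytes can be searched directly.
    match = re.search(rb'charset=["\']?([\w-]+)', raw, re.IGNORECASE)
    return match.group(1).decode("ascii") if match else default

# Hypothetical stand-in for the bytes of a downloaded page.
raw = (
    '<meta http-equiv="Content-Type" content="text/html; charset=gb2312" />'
    "<p>中文</p>"
).encode("gb2312")

charset = detect_charset(raw)   # the encoding the page declares
html = raw.decode(charset)      # decode with it instead of a guessed encoding
```

Searching the bytes before decoding avoids the chicken-and-egg problem of needing the encoding in order to read the tag that names it.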

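2. The second case: the error persists even after changing the encoding

Replace gb2312 with gb18030. GB18030 is a superset of GB2312, so it can decode characters that GB2312 does not cover.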
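A small sketch of that fallback, under the assumption that the page declares gb2312 but contains characters outside it. The function name `decode_chinese` and the sample text are hypothetical; the character 镕 is used because it is not part of GB2312.

```python
def decode_chinese(raw: bytes, declared: str = "gb2312") -> str:
    """Decode with the declared charset, falling back to gb18030 on failure."""
    try:
        return raw.decode(declared)
    except UnicodeDecodeError:
        # gb18030 covers everything in gb2312 plus many more characters.
        return raw.decode("gb18030")

# Hypothetical page bytes: labelled gb2312 but containing a character outside it.
raw = "中文镕".encode("gb18030")
text = decode_chinese(raw)
```

Because gb18030 decodes every valid gb2312 byte sequence to the same characters, passing `"gb18030"` directly whenever a page declares gb2312 is also a reasonable choice.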

Origin www.cnblogs.com/aliex/p/11365791.html