Reading HTML fails with "illegal multibyte sequence"
1. The first case: decode with the page's declared encoding
View the page source, find the charset declaration, and use it as the encoding when decoding the page:
<meta http-equiv="Content-Type" content="text/html; charset=gb2312" />
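A minimal sketch of this step, assuming the page is read in Python (where "illegal multibyte sequence" is the message of a UnicodeDecodeError from the multibyte codecs). The helper name `detect_charset`, the 4096-byte scan window, and the `utf-8` default are illustrative assumptions, not part of the original note.

```python
import re

def detect_charset(raw: bytes, default: str = "utf-8") -> str:
    """Return the charset named in the page's <meta> tag, or `default`.

    Hypothetical helper: only the first 4096 bytes are scanned, and
    non-ASCII bytes are dropped because the declaration itself is ASCII.
    """
    head = raw[:4096].decode("ascii", errors="ignore")
    match = re.search(r"charset\s*=\s*[\"']?([\w-]+)", head, re.IGNORECASE)
    return match.group(1).lower() if match else default

# Sample page bytes: the meta tag from above followed by gb2312-encoded text.
sample = (
    b'<meta http-equiv="Content-Type" content="text/html; charset=gb2312" />'
    + "<p>中文</p>".encode("gb2312")
)

charset = detect_charset(sample)   # encoding declared by the page
text = sample.decode(charset)      # decode with that encoding
```

In practice `sample` would be the raw bytes returned by whatever HTTP client fetches the page.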
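2. The second case: the error persists even after switching to the declared encoding
Replace gb2312 with gb18030. gb18030 is a superset of gb2312, so it can decode characters that gb2312 cannot.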
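The fallback can be sketched as follows, again assuming Python. The function name `decode_html` is hypothetical, and the sample text is chosen only because the character 镕 is not part of gb2312, so a page containing it fails to decode even though it declares gb2312.

```python
def decode_html(raw: bytes, declared: str) -> str:
    """Decode with the declared charset; retry gb2312/gbk pages as gb18030."""
    try:
        return raw.decode(declared)
    except UnicodeDecodeError:
        # gb18030 is a superset of gb2312 and gbk, so it is a safe retry
        # for pages that declare either of those encodings.
        if declared.lower() in ("gb2312", "gbk"):
            return raw.decode("gb18030")
        raise

# Bytes that a page declaring gb2312 might actually contain.
raw = "朱镕基".encode("gb18030")
text = decode_html(raw, "gb2312")
```

Trying the declared encoding first keeps the behaviour unchanged for pages that decode cleanly; gb18030 is used only when the strict decode fails.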