bs4 notes

1. Fixing garbled (mojibake) page output

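import requests
from bs4 import BeautifulSoup

r = requests.get('https://www.baidu.com/')

# Set the encoding before reading r.text. 'gb2312' is only an example:
# use the charset the page really uses (r.apparent_encoding is requests' guess).
r.encoding = 'gb2312'

soup = BeautifulSoup(r.text, "html.parser")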

Source: https://blog.csdn.net/w839687571/article/details/81414433
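
Why the text comes out garbled in the first place can be shown without any network access. The sketch below is only an illustration: the sample string and the variable names are made up for this note, and it assumes the usual cause, namely that requests falls back to ISO-8859-1 when a text response carries no charset in its Content-Type header.

```python
# Bytes as a server might send them: an HTML snippet encoded as GB2312.
# (The sample markup is hypothetical, chosen only for the demonstration.)
raw = '<title>百度一下</title>'.encode('gb2312')

# Decoding with the wrong codec never fails for ISO-8859-1,
# it just silently produces unreadable characters.
wrong = raw.decode('iso-8859-1')

# Decoding with the codec the page was actually written in restores the text.
right = raw.decode('gb2312')

print(wrong)
print(right)
```

Setting r.encoding has the same effect as picking the codec in the second decode call: it tells requests how to turn the response bytes into r.text.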

2. Opening a local HTML file

from bs4 import BeautifulSoup

path = '/Users/lucax/Desktop/素材/html/123.html'
# Read the local file as UTF-8 text; the with block closes it automatically.
with open(path, 'r', encoding='utf-8') as htmlfile:
    htmlhandle = htmlfile.read()
soup = BeautifulSoup(htmlhandle, "html.parser")
print(soup)
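
As a possible variation, BeautifulSoup also accepts an open file handle directly, so the separate read() step can be skipped. The sketch below is an assumption-laden example rather than the original author's method: it writes a small made-up page to a temporary directory (sample.html and its contents are hypothetical) so that it runs anywhere, then parses the file in binary mode with from_encoding.

```python
import os
import tempfile

from bs4 import BeautifulSoup

# Hypothetical sample page, used only so the example is self-contained.
html = ('<html><head><title>测试页面</title></head>'
        '<body><p class="note">你好</p></body></html>')

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'sample.html')
    with open(path, 'w', encoding='utf-8') as f:
        f.write(html)

    # Pass the binary file handle straight to BeautifulSoup and state the
    # encoding explicitly; the file is read inside the constructor.
    with open(path, 'rb') as f:
        soup = BeautifulSoup(f, 'html.parser', from_encoding='utf-8')

title_text = soup.title.string
note_text = soup.find('p', class_='note').get_text()
print(title_text, note_text)
```

If the encoding of the file is unknown, from_encoding can be left out and Beautiful Soup will try to detect it from the bytes.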

Reposted from www.cnblogs.com/kaibindirver/p/11372186.html