Article URL: https://goodgoodstudy.blog.csdn.net/article/details/108585966
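from bs4 import BeautifulSoup
# `html` is assumed to hold the page source as a string
bs = BeautifulSoup(html, 'html.parser')  # naming the parser avoids bs4's "no parser specified" warning
col = bs.find('div', {'class': 'col'})  # the <div class="col"> container
col.find_all('a')  # every <a> tag inside it (find_all is the current name of findAll)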
"""
[<a href="/paper/2020">Proceedings of the International Conference on Machine Learning 1 pre-proceedings (ICML 2020)</a>,
<a class="btn btn-light btn-sm btn-spacer disabled" download="" href="/paper/2020/file/ec7f346604f518906d35ef0492709f78-Bibtex.bib">Bibtex »</a>,
<a class="btn btn-light btn-sm btn-spacer" href="/paper/2020/file/ec7f346604f518906d35ef0492709f78-Metadata.json">Metadata »</a>,
<a class="btn btn-light btn-sm btn-spacer" href="/paper/2020/file/ec7f346604f518906d35ef0492709f78-Paper.pdf">Paper »</a>,
<a class="btn btn-light btn-sm btn-spacer" href="/paper/2020/file/ec7f346604f518906d35ef0492709f78-Supplemental.pdf">Supplemental »</a>]
"""
Now we need only the a tag whose text contains "Supplemental".
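import re
# Filter on the tag's text with a regex; `string` is the current name of the older `text` argument
col.find_all('a', string=re.compile('Supplemental'))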
"""
[<a class="btn btn-light btn-sm btn-spacer" href="/paper/2020/file/ec7f346604f518906d35ef0492709f78-Supplemental.pdf">Supplemental »</a>]
"""
Success!
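
The same lookup can be tried without downloading the page. The following is a minimal sketch, not the original script: the HTML fragment and its href values are made up for illustration, and it assumes a bs4 version recent enough to accept the `string` argument. It also shows one caveat worth knowing: the regex filter compares against a tag's `.string`, which is `None` when the tag contains nested tags, so a function filter based on `get_text()` is one possible fallback.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical fragment modelled on the listing above (hrefs are invented).
html = """
<div class="col">
  <a href="/paper/2020">Proceedings</a>
  <a class="btn" href="/files/paper.pdf">Paper »</a>
  <a class="btn" href="/files/supplemental.pdf">Supplemental »</a>
  <a class="btn" href="/files/extra.pdf"><b>Supplemental</b> extra »</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
col = soup.find("div", {"class": "col"})

# Regex filter: matches only <a> tags whose single text child matches.
by_string = col.find_all("a", string=re.compile("Supplemental"))

# Function filter: also catches <a> tags whose text is split across nested tags.
def text_has_supplemental(tag):
    return tag.name == "a" and "Supplemental" in tag.get_text()

by_text = col.find_all(text_has_supplemental)

hrefs_by_string = [a["href"] for a in by_string]
hrefs_by_text = [a["href"] for a in by_text]
print(hrefs_by_string)
print(hrefs_by_text)
```

For a page like the one above, where each button holds plain text only, the regex form is enough; the function filter only matters if the link text is wrapped in extra markup.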