Converting an Element to HTML and Cleaning HTML Tags

Others 2021-11-20 17:03:18

Converting an Element to HTML

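from lxml import etree, html
import requests

# Fetch the page and parse the raw bytes into an lxml element tree
response = requests.get('https://www.baidu.com')
html_element = etree.HTML(response.content)
# Serialize the first child of <html> (normally <head>) back to an HTML string
html_text = html.tostring(html_element[0], encoding='utf-8').decode('utf-8')
print(html_text)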
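
To try the conversion without a network request, here is a minimal sketch that parses a small inline document instead of a live page. The SAMPLE string and the variable names are made up for illustration; only etree.HTML and lxml.html.tostring come from the snippet above. Passing encoding="unicode" returns a str directly, so no decode step is needed.

```python
from lxml import etree, html

# Hypothetical sample document standing in for a downloaded page
SAMPLE = "<html><head><title>Demo</title></head><body><p>Hello</p></body></html>"

# etree.HTML parses the markup and returns the root <html> element
root = etree.HTML(SAMPLE)

# Children of the root can be indexed like a list: [0] is <head>, [1] is <body>
head_html = html.tostring(root[0], encoding="unicode")
body_html = html.tostring(root[1], encoding="unicode")

print(head_html)
print(body_html)
```

Indexing the root picks out one subtree; passing the root itself to tostring should serialize the whole document.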

Cleaning HTML Tags

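from lxml import etree, html
import requests
# Note: since lxml 5.2 the cleaner ships in the separate lxml_html_clean package
from lxml.html.clean import Cleaner

# Fetch the page and parse it with lxml.html so the Cleaner receives HtmlElement objects
response = requests.get('https://www.baidu.com')
html_element = html.fromstring(response.content)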

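1. Cleaning with etree (serialize, then unescape entities)
from html import unescape  # standard library; the name `html` above is lxml.html

content = etree.tostring(html_element, pretty_print=True).decode('utf-8')
desc = unescape(content)
print(desc)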
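
The unescape step matters because etree.tostring, when called without an encoding argument, emits ASCII and writes non-ASCII characters as numeric character references. A minimal sketch of what the standard library's html.unescape does to such output follows; the serialized string is a made-up example, and the alias std_html is only there to avoid clashing with the lxml.html import used elsewhere in this post (which, as far as I can tell, has no unescape function of its own).

```python
import html as std_html  # standard-library module, aliased to avoid shadowing by lxml.html

# Hypothetical serialized text containing named and numeric character references
serialized = "&lt;b&gt;price &amp; tax&lt;/b&gt; &#20013;&#25991;"

# unescape converts the references back into the characters they stand for
restored = std_html.unescape(serialized)
print(restored)
```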

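2. Cleaning with lxml.html (Cleaner)
cleaner = Cleaner()
cleaner.javascript = True  # strip <script> content and other JavaScript
cleaner.style = True       # strip <style> blocks and style attributes
result = cleaner.clean_html(html_element)
print(html.tostring(result, encoding='unicode'))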

There are many other ways to clean tags (for example, the standard html module, re, or XPath); just pick whichever you like best.
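
As a rough illustration of two of those alternatives, the sketch below strips tags once with re and once with the standard library's HTMLParser. The SAMPLE string, the function names, and the TextExtractor class are all hypothetical, and the regex version is deliberately crude: it will misbehave on malformed markup or on a ">" inside an attribute value, so treat it as a quick hack rather than a robust cleaner.

```python
import re
from html import unescape
from html.parser import HTMLParser

# Hypothetical input used only for this demonstration
SAMPLE = "<div><p>Tom &amp; Jerry</p><script>alert(1)</script><p>The end</p></div>"


def strip_tags_regex(markup):
    """Crude approach: drop script/style blocks, then anything that looks like a tag."""
    no_scripts = re.sub(r"(?is)<(script|style)\b.*?</\1\s*>", "", markup)
    no_tags = re.sub(r"(?s)<[^>]+>", " ", no_scripts)
    # Decode entities and collapse runs of whitespace
    return " ".join(unescape(no_tags).split())


class TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def strip_tags_parser(markup):
    """Parser-based approach: feed the markup and join the collected text."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    return " ".join(parser.chunks)


print(strip_tags_regex(SAMPLE))
print(strip_tags_parser(SAMPLE))
```

The parser-based version is usually the safer of the two because it follows the tag structure instead of pattern-matching on angle brackets.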

Origin blog.csdn.net/weixin_44388373/article/details/119205117
