Python crawler notes (4): Information extraction with Beautiful Soup basics (2), traversing HTML content with the bs4 library

1. Traversing HTML content with the bs4 library

 

 

 

(1) Example of .contents (the list of a tag's direct child nodes)
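
The original screenshots for this example are not available, so the following is only a minimal sketch of how .contents can be used. The HTML snippet, the example.com links, and all variable names are assumptions made for illustration, and the built-in "html.parser" is used so that no extra parser is required.

```python
from bs4 import BeautifulSoup

# Hypothetical demo page (not from the original notes).
html = (
    "<html><head><title>Demo</title></head>"
    "<body><p class='title'><b>Bold</b></p>"
    "<p class='course'>Text <a href='http://example.com/1' id='link1'>One</a>"
    " and <a href='http://example.com/2' id='link2'>Two</a>.</p>"
    "</body></html>"
)
soup = BeautifulSoup(html, "html.parser")

# .contents returns a list holding the direct children of a tag.
head_children = soup.head.contents
body_children = soup.body.contents

# Text between tags also counts as a child node.
course_children = soup.find("p", class_="course").contents

print(head_children)         # [<title>Demo</title>]
print(len(body_children))    # 2
print(len(course_children))  # 5
```

Because .contents is an ordinary list, it can be indexed (for example body_children[1]) or measured with len(). For lazy iteration over the same nodes, .children gives an iterator instead of a list.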

 

 

 

 

 

 

(2) The parent node of a tag (.parent)
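
A small sketch of .parent, assuming a hypothetical HTML snippet (the markup and variable names are illustrative, not taken from the original notes):

```python
from bs4 import BeautifulSoup

# Hypothetical demo page.
html = (
    "<html><head><title>Demo</title></head>"
    "<body><p><a href='#'>link</a></p></body></html>"
)
soup = BeautifulSoup(html, "html.parser")

# .parent returns the node that directly contains the tag.
title_parent = soup.title.parent.name  # 'head'

# The parent of <html> is the BeautifulSoup object itself.
html_parent = soup.html.parent.name    # '[document]'

# The BeautifulSoup object is the root, so it has no parent.
soup_parent = soup.parent              # None

print(title_parent, html_parent, soup_parent)
```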

 

(4) Upward traversal of the tag tree (.parents)
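
Upward traversal can be sketched as a loop over .parents, which yields every ancestor from the direct parent up to the document root. The HTML and names below are assumptions for illustration only.

```python
from bs4 import BeautifulSoup

# Hypothetical demo page.
html = (
    "<html><head><title>Demo</title></head>"
    "<body><p><a href='#'>link</a></p></body></html>"
)
soup = BeautifulSoup(html, "html.parser")

# Collect the name of each ancestor of the first <a> tag.
ancestor_names = []
for parent in soup.a.parents:
    ancestor_names.append(parent.name)

print(ancestor_names)  # ['p', 'body', 'html', '[document]']
```

The last entry, '[document]', is the name of the BeautifulSoup object itself, which sits above the <html> tag.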

 

(5) Parallel (sibling) traversal of the tag tree
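
Parallel traversal moves between nodes that share the same parent. The sketch below uses .next_sibling, .previous_sibling, .next_siblings and .previous_siblings on a hypothetical paragraph; the markup and variable names are assumptions.

```python
from bs4 import BeautifulSoup

# Hypothetical paragraph mixing text and two links.
html = "<p>Text <a id='link1'>One</a> and <a id='link2'>Two</a>.</p>"
soup = BeautifulSoup(html, "html.parser")

first_a = soup.a

# The immediate neighbours of the first <a> are plain text, not tags.
after_first = first_a.next_sibling        # ' and '
before_first = first_a.previous_sibling   # 'Text '

# Stepping twice to the right reaches the second <a> tag.
second_a = first_a.next_sibling.next_sibling

# The plural forms are iterators over all following / preceding siblings.
following = list(first_a.next_siblings)
preceding = list(first_a.previous_siblings)

print(repr(after_first), repr(before_first))
print(second_a["id"])                  # link2
print(len(following), len(preceding))  # 3 1
```

Sibling traversal only happens under one parent node, so the links above never reach nodes outside the <p> tag.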

 

 

 

 

Note: a child node is not necessarily a tag; it may be a NavigableString (a piece of text, including whitespace such as '\n').
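
One way to handle this, shown here only as a sketch, is to check each node's type with isinstance before treating it as a tag. The HTML snippet and variable names are assumptions for illustration.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString, Tag

# Hypothetical markup with line breaks between the tags.
html = """<body>
<p>first</p>
<p>second</p>
</body>"""
soup = BeautifulSoup(html, "html.parser")

children = soup.body.contents

# The line breaks show up as NavigableString children next to the <p> tags.
tag_children = [node for node in children if isinstance(node, Tag)]
text_children = [node for node in children if isinstance(node, NavigableString)]

print(len(children))                         # 5
print([tag.name for tag in tag_children])    # ['p', 'p']
print([str(text) for text in text_children]) # ['\n', '\n', '\n']
```

Filtering on Tag before accessing attributes such as .name or ["href"] avoids errors when a text node is encountered during traversal.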

 

 

Source: www.cnblogs.com/douzujun/p/12229160.html