Strategies for general web crawlers

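Two commonly used crawling strategies are the depth-first strategy and the breadth-first strategy.
1) Depth-first strategy: Starting from a seed page, the crawler follows a link to the next level, then a link from that page to the level below it, and so on from shallow to deep until it can go no deeper. It then backs up to the most recent page that still has unvisited links and continues from there.
2) Breadth-first strategy: The crawler visits pages level by level, ordered by their depth in the site's directory structure. Pages at a shallower level are crawled first, and the crawler moves on to the next level only after every page at the current level has been crawled.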
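
The two visiting orders can be compared with a minimal sketch. It assumes a small in-memory link graph in place of real HTTP requests. The LINKS table and the function names crawl_depth_first and crawl_breadth_first are hypothetical, chosen only for illustration.

```python
from collections import deque

# Hypothetical link graph standing in for real pages: page -> outgoing links.
LINKS = {
    "/": ["/a", "/b"],
    "/a": ["/a/1", "/a/2"],
    "/b": ["/b/1"],
    "/a/1": [],
    "/a/2": ["/"],  # a link back to the seed; the visited set prevents a loop
    "/b/1": [],
}


def crawl_depth_first(seed):
    """Follow each link as deep as possible before trying the next sibling."""
    seen = set()
    order = []

    def visit(page):
        seen.add(page)
        order.append(page)
        for link in LINKS.get(page, []):
            if link not in seen:
                visit(link)

    visit(seed)
    return order


def crawl_breadth_first(seed):
    """Visit every page at one level before moving to the next level."""
    seen = {seed}
    order = []
    frontier = deque([seed])  # FIFO queue of pages waiting to be crawled
    while frontier:
        page = frontier.popleft()
        order.append(page)
        for link in LINKS.get(page, []):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return order


print(crawl_depth_first("/"))    # ['/', '/a', '/a/1', '/a/2', '/b', '/b/1']
print(crawl_breadth_first("/"))  # ['/', '/a', '/b', '/a/1', '/a/2', '/b/1']
```

In this sketch the depth-first crawler finishes everything under /a before it touches /b, while the breadth-first crawler visits /a and /b before any second-level page. A real crawler would replace the LINKS lookup with page fetching and link extraction, and would usually add a depth limit to the depth-first version.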

Origin blog.csdn.net/weixin_55323026/article/details/115229414