Python crawler --- project walkthrough (3): analyzing Meituan food with Selenium

  In the previous post I wanted to crawl Meituan's food listings, but the request headers were too complicated and several of the parameters could not be cracked, so I gave up on that approach. This time we fetch the data by using selenium to simulate a browser. First, a quick look at the process:

  1, Use selenium to drive the browser and load the food list

  2, Analyze the page and work out how to reach the following pages of the list

  3, Extract the data (pyquery)

Project: Meituan food

Project Address: https://gitee.com/dwyui/pyQuery_selenium.git

Because Meituan blocks crawlers aggressively, I only managed to crawl part of the data. You can try a few more times and adjust the interval between requests.
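One way to "try several times with a different interval" is a small retry helper that waits longer after each empty result. This is only a sketch under assumptions: the name `fetch_with_backoff`, the default of three attempts and the 5-second base interval are all made up for illustration, and `fetch` stands for whatever function loads and parses one page.

```python
import random
import time


def fetch_with_backoff(fetch, attempts=3, base_interval=5.0, sleep=time.sleep):
    """Call fetch() until it returns a non-empty result or attempts run out."""
    for attempt in range(attempts):
        result = fetch()
        if result:
            return result
        if attempt < attempts - 1:
            # Wait longer after each empty page, plus a little random jitter
            # so the requests are not evenly spaced.
            sleep(base_interval * (attempt + 1) + random.uniform(0, 1))
    return []
```

The `sleep` parameter exists only so the waiting can be replaced when trying the helper out; in normal use the default `time.sleep` is fine.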

You can also try crawling the data with PhantomJS; the code is almost identical to the original.

 


Origin www.cnblogs.com/cxiaocai/p/10962761.html