Collecting brand data with the Python Scrapy framework

Brand data collection

Collection requirements

Address: http://www.winshangdata.com/brandList

Requirements: Use the Scrapy framework to collect data from this site, covering at least 5 categories, with a total of more than 5000 records

Collection fields: title, creation time, store opening method, cooperation period, area requirements
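The five fields above can be written down as an item definition before any spider code exists. The sketch below is a minimal illustration, not the original author's code: it uses a standard-library dataclass (Scrapy accepts dataclass objects as items), and the class name `BrandItem` and all attribute names are assumptions chosen for this example.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class BrandItem:
    """One brand record. Attribute names are illustrative assumptions."""

    title: Optional[str] = None               # available from the list data
    created_time: Optional[str] = None        # creation time
    opening_method: Optional[str] = None      # store opening method
    cooperation_period: Optional[str] = None  # cooperation period
    area_requirement: Optional[str] = None    # available from the list data


# Fields that are not yet known simply stay None until they are filled in.
item = BrandItem(title="Example Brand", area_requirement="80-120")
print(asdict(item))
```

Defaulting every field to None lets a record be created from partial data and completed later.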

Web page analysis

Opening the address above shows the brand list page.

Press F12 to open the developer tools and switch to the Network tab, then refresh the page or click the next page to capture the request

Analyzing the returned JSON data shows that it contains only two of the fields we need: the title and the area requirements
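Pulling the two available fields out of such a JSON response might look like the sketch below. The article does not show the response body, so the payload and every key name in it (`data`, `list`, `brandId`, `brandName`, `areaRequirement`) are made up for illustration and would need to be replaced with whatever the Network tab actually shows.

```python
import json

# Made-up sample payload: the real key names are NOT given in the article.
SAMPLE_RESPONSE = """
{"data": {"list": [
  {"brandId": 1, "brandName": "Brand A", "areaRequirement": "80-120"},
  {"brandId": 2, "brandName": "Brand B", "areaRequirement": "200-300"}
]}}
"""


def parse_list_page(text):
    """Return one dict per brand with the fields the list data provides."""
    payload = json.loads(text)
    rows = payload.get("data", {}).get("list", [])
    return [
        {
            "brand_id": row.get("brandId"),  # hypothetical id for the detail page
            "title": row.get("brandName"),
            "area_requirement": row.get("areaRequirement"),
        }
        for row in rows
    ]


records = parse_list_page(SAMPLE_RESPONSE)
for record in records:
    print(record)
```

Using `.get()` with defaults keeps the parser from raising when a page comes back empty or a key is missing.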

So we also need to analyze the detail page. On the detail page, the remaining fields we need are found in li tags, so they can be extracted with XPath or a similar method, and the webpage jumps
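The li-tag extraction described above can be sketched with the limited XPath support in the standard library. The markup is a made-up stand-in, since the article does not show the real detail page: the `detail` class and the `span`/`em` layout are assumptions, and only the idea of reading label/value pairs out of li elements comes from the text.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed stand-in for the detail page markup.
SAMPLE_DETAIL = """
<div>
  <ul class="detail">
    <li><span>Creation time</span><em>2010</em></li>
    <li><span>Store opening method</span><em>Direct</em></li>
    <li><span>Cooperation period</span><em>3-5 years</em></li>
  </ul>
</div>
"""


def parse_detail(markup):
    """Map each li's label text to its value text."""
    root = ET.fromstring(markup)
    fields = {}
    for li in root.findall(".//ul[@class='detail']/li"):
        label = li.findtext("span")
        value = li.findtext("em")
        if label:
            fields[label.strip()] = (value or "").strip()
    return fields


detail = parse_detail(SAMPLE_DETAIL)
print(detail)
```

Inside a Scrapy callback a similar expression would normally be passed to `response.xpath(...)` instead, which also copes with real-world HTML that is not well-formed XML.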

Origin blog.csdn.net/m0_46467017/article/details/131984551