The logic of writing a web crawler

First define a class, then write a run() method that lays out the overall logic as an ordered list of steps. After that, implement each step as its own small method, and have run() call those methods in order.
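A minimal sketch of that structure might look like the following. The class name `Spider` and the method names `get_url_list`, `parse_url`, `extract_data`, and `save` are illustrative assumptions, not names from the original post; the step methods are left as stubs to be filled in one at a time.

```python
class Spider:
    """Skeleton crawler: run() lists the steps, each step is its own method."""

    def __init__(self, start_url):
        self.start_url = start_url

    def get_url_list(self):
        # Step 1: build the list of URLs to visit (here just the start URL).
        return [self.start_url]

    def parse_url(self, url):
        # Step 2: send the request and return the response body as a string.
        raise NotImplementedError

    def extract_data(self, text):
        # Step 3: pull the wanted items out of the response text.
        raise NotImplementedError

    def save(self, items):
        # Step 4: persist the extracted items.
        raise NotImplementedError

    def run(self):
        # The main logic, written first as an ordered list of steps.
        for url in self.get_url_list():
            text = self.parse_url(url)
            items = self.extract_data(text)
            self.save(items)
```

Writing run() first keeps the overall flow visible in one place, and each stub can then be completed and checked on its own.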

1. URLs

  • If the URL pattern and the total number of pages are known: build a list of URLs up front
  • Otherwise, take a start_url, request it first, and then follow some rule (such as a next-page link) to iterate over the remaining URLs
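Both ways of collecting URLs can be sketched roughly as below. The helper names `build_url_list` and `iter_urls`, the `next_url_of` callback, and the `example.com` URL pattern are hypothetical and only serve the illustration; it also assumes pages are numbered consecutively.

```python
def build_url_list(template, first_page, last_page):
    """Known pattern and page count: return one URL per page number."""
    return [template.format(page) for page in range(first_page, last_page + 1)]


def iter_urls(start_url, next_url_of):
    """Unknown page count: start at start_url and keep following a rule.

    next_url_of is a caller-supplied function that returns the next URL,
    or None when there is nothing left to visit.
    """
    url = start_url
    while url is not None:
        yield url
        url = next_url_of(url)


# Hypothetical URL pattern, for illustration only.
url_list = build_url_list("https://example.com/list?page={}", 1, 3)
```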

2. Send the request and get the response

  • requests.get()
  • response.content.decode()
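Put together, the two calls above could be wrapped in one small method, roughly as follows. The function name `parse_url`, the User-Agent value, and the 10-second timeout are assumptions added for the sketch; `decode()` with no argument assumes the page is UTF-8.

```python
import requests


def parse_url(url, timeout=10):
    """Send a GET request and return the response body as a string."""
    # Assumed header: some sites reject requests without a browser-like User-Agent.
    headers = {"User-Agent": "Mozilla/5.0"}
    response = requests.get(url, headers=headers, timeout=timeout)
    # Raise an exception for 4xx/5xx responses instead of parsing an error page.
    response.raise_for_status()
    # content is raw bytes; decode() defaults to UTF-8.
    return response.content.decode()
```

If a site uses another encoding, passing it explicitly (for example `response.content.decode("gbk")`) is the usual adjustment.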

3. Extract data

  • If the response is a JSON string: parse it with the json module
  • If the response is an HTML string: parse it with the lxml module and extract the data with XPath
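The two cases might be handled as in the sketch below. The JSON layout (an `items` list whose entries have a `title` key) and the XPath expression `//li/a/text()` are made-up examples; a real site needs its own keys and paths.

```python
import json

from lxml import etree


def extract_from_json(json_str):
    """Parse a JSON string and return the titles it contains."""
    data = json.loads(json_str)
    # Assumed layout: {"items": [{"title": ...}, ...]}
    return [item["title"] for item in data["items"]]


def extract_from_html(html_str):
    """Parse an HTML string and return the link texts selected by XPath."""
    html = etree.HTML(html_str)
    # Assumed page structure: links inside list items.
    return html.xpath("//li/a/text()")
```

For instance, `extract_from_json('{"items": [{"title": "a"}]}')` would give a one-element list holding `"a"`.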

4. Save

with open("filename", "a", encoding="utf-8") as f:
    f.write(content)  # content: the string to save
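As one possible way to fill in the save step, each extracted item can be appended as a line of JSON. The function name `save` and the one-item-per-line format are assumptions, not something the original post specifies.

```python
import json


def save(items, path):
    """Append each item to the file at path as one JSON document per line."""
    # "a" appends, so results from earlier pages are not overwritten.
    with open(path, "a", encoding="utf-8") as f:
        for item in items:
            # ensure_ascii=False keeps non-ASCII text readable in the file.
            f.write(json.dumps(item, ensure_ascii=False))
            f.write("\n")
```

Because the file is opened in append mode, calling `save` once per page accumulates all results in the same file.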

 


Origin www.cnblogs.com/-chenxs/p/11415757.html