[Scrapy: A Website in Five Minutes] Organizing Crawler Targets and Preparing the Data

Introduction

Many people have written plenty of crawlers only to discover, when they later try to organize, reuse, or query the results, that they never did any reasonable planning up front. To avoid that awkward situation, prepare as much as possible in advance; this also makes later management easier and lays the groundwork for the framework we will build.

This chapter is therefore important: once you understand what this article sets up and why, you will see how crawling a website can take only minutes.

This approach is especially convenient when tens of thousands of pages need to be organized for deployment and management.

To learn about all the Scrapy modules, follow the portal below:
[Scrapy 2.4.0 Article Directory] Source Code Analysis: All Configuration Directory Index

Organizing the targets

Whether you are building a Django website or have some other goal, tidying up the data matters. Start by building a table in Excel that defines the columns, so that a Python script can later map the captured content into those columns automatically; this also makes it easier to sort and filter the content captured later.
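The column-planning step above can be sketched in plain Python. This is a minimal illustration, not the article's actual script: the field names (`title`, `url`, `category`, `publish_date`) are hypothetical placeholders for whatever columns you define in your Excel table.

```python
from dataclasses import dataclass, asdict, fields
import csv
import io

# Hypothetical columns planned in the Excel table; adjust to your own schema.
@dataclass
class ResourceItem:
    title: str = ""
    url: str = ""
    category: str = ""
    publish_date: str = ""

def items_to_csv(items):
    """Write captured items into columns matching the planned table."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=[f.name for f in fields(ResourceItem)])
    writer.writeheader()
    for item in items:
        writer.writerow(asdict(item))
    return buf.getvalue()

sample = [ResourceItem(title="Scrapy guide", url="https://example.com", category="docs")]
print(items_to_csv(sample))
```

Defining the columns once, in code, keeps the crawler output and the spreadsheet (or a later Django model) in sync.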

Here is a look at the Django-based resource management website.

Origin blog.csdn.net/qq_20288327/article/details/113626985