The following are some crawler projects suitable for beginners. Their code is relatively simple and easy to understand, which can help you get started with crawler development:
Scrapy Tutorial:
Scrapy is a Python crawler framework. This project provides sample code and documentation that can help you learn how to write crawlers with the Scrapy framework.
Link: https://github.com/Python3WebSpider/ScrapyTutorial
Python crawler practice:
This is a GitHub project containing several real-world crawler projects, with sample code and documentation for crawling data from sites such as Douban Movies, NetEase Cloud Music, and 58.com, and for performing data analysis.
Link: https://github.com/wistbean/learn_python3_spider
Python crawler cases:
This is a GitHub project containing multiple crawler case studies, with sample code and documentation for crawling data from Douban Movies, Zhihu, Baidu Tieba, and other sites, and for performing data analysis.
Python crawler study notes:
This is a GitHub project containing crawler learning notes and sample code, covering web crawling with the Requests library, the BeautifulSoup library, the Selenium library, and other tools.
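The Requests/BeautifulSoup workflow those notes cover boils down to "fetch a page, then parse the HTML". As a minimal, dependency-free sketch of the parsing half, the standard library's html.parser can extract links from a page (the sample HTML below is made up for illustration):

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href attribute of every <a> tag encountered."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag's attributes.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html: str) -> list:
    """Return all link targets found in an HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    page = '<html><body><a href="/movie/1">A</a> <a href="/movie/2">B</a></body></html>'
    print(extract_links(page))  # ['/movie/1', '/movie/2']
```

BeautifulSoup does the same job with a friendlier API (`soup.find_all("a")`), which is why the notes above recommend it for real projects.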
Scrapy:
Scrapy is a fast, high-level web crawling and scraping framework for crawling websites and extracting structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.
Link: https://github.com/scrapy/scrapy
Example-of-web-crowlers:
Some very interesting Python crawler examples that are friendly to novices, mainly crawling sites such as Taobao, Tmall, WeChat, WeChat Read, Douban, and QQ. Because the examples target common websites, the code is broadly reusable and stays relevant longer. The project code is written with beginners in mind, using simple Python with plenty of comments.
These projects are all open source: you can view the code and documentation directly on GitHub and learn how to write crawlers in Python. When using this code, be sure to abide by each site's rules and terms of service, and crawl ethically.