Want to learn Python crawler technology? Several beginner projects on GitHub

Table of contents

Scrapy Tutorial:

Python crawler hands-on projects:

Python crawler cases:

Python crawler study notes:

Scrapy:

examples-of-web-crawlers:


The following crawler projects are suitable for beginners. Their code is relatively simple and easy to follow, so they can help you get started with crawler development:

Scrapy Tutorial:

Scrapy is a Python crawler framework. This project provides sample code and documentation that can help you learn how to write crawlers with Scrapy.

Link: https://github.com/Python3WebSpider/ScrapyTutorial

Python crawler hands-on projects:

This GitHub project contains several real-world crawler projects, with sample code and documentation for crawling sites such as Douban Movies, NetEase Cloud Music, and 58.com and then analyzing the collected data.

Link: https://github.com/wistbean/learn_python3_spider

Python crawler cases:

This GitHub project contains multiple crawler case studies, with sample code and documentation for crawling Douban Movies, Zhihu, Baidu Tieba, and other websites and then analyzing the collected data.

Link: GitHub - Largefreedom/python_zeroing-: small examples written in Python, covering crawlers and visualization, intended to help Python beginners.

Python crawler study notes:

This GitHub project collects crawler study notes together with sample code and documentation for web crawling with Requests, BeautifulSoup, Selenium, and other tools.

Link: GitHub - ZhuoZhuoCrayon/pythonCrawler: Python 3 web crawler notes and hands-on source code. It records notes, reference materials, and common mistakes from learning to write crawlers in Python, with about 40 crawling examples and walkthroughs of the approach, covering common libraries such as urllib, requests, bs4, jsonpath, re, pytesseract, and PIL.
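To give a feel for the parsing step these notes cover, here is a minimal sketch of pulling links out of a page. It is not taken from the repository above. It deliberately uses only the standard library (`html.parser` and `urllib.parse`) instead of Requests or BeautifulSoup so that it runs without installing anything, and it parses an inline HTML string rather than downloading a page. The names `LinkCollector`, `extract_links`, and `SAMPLE_HTML`, as well as the `example.com` URLs, are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the absolute URL of every <a href="..."> tag in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return all link targets found in the given HTML text."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page content standing in for a downloaded response body.
SAMPLE_HTML = """
<html><body>
  <a href="/movies/1">First movie</a>
  <a href="https://example.org/about">About</a>
  <a>No href here</a>
</body></html>
"""

if __name__ == "__main__":
    for link in extract_links(SAMPLE_HTML, "https://example.com/"):
        print(link)
```

In a real crawler the HTML would come from an HTTP response, and a library such as BeautifulSoup would usually replace the hand-written parser class.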

Scrapy:

Scrapy is a fast, high-level web crawling and web scraping framework for crawling websites and extracting structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.

Link: GitHub - scrapy/scrapy: Scrapy, a fast high-level web crawling & scraping framework for Python.

examples-of-web-crawlers:

A collection of interesting, beginner-friendly Python crawler examples, mainly targeting sites such as Taobao, Tmall, WeChat, WeChat Reading, Douban, and QQ. The examples cover commonly crawled websites, and the project aims for code that is reusable and stays usable over time. The code is kept simple and heavily commented so that novices can follow it.

Link: GitHub - shengqiangzhang/examples-of-web-crawlers: Some interesting examples of Python crawlers that are friendly to beginners, mainly crawling Taobao, Tmall, WeChat, WeChat Reading, Douban, QQ, and other websites.

These projects are all open source. You can read the code and documentation directly on GitHub and learn how to write crawlers in Python. When using this code, be sure to follow each site's rules and terms of service and to crawl responsibly.
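One concrete way to respect a site's rules is to check its robots.txt before fetching a page. The sketch below uses the standard library's `urllib.robotparser` for this. The robots.txt content, the `my-crawler` user agent, and the `example.com` URLs are invented for illustration, and the rules are parsed from an inline string so the example runs without network access.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would download the file
# from the target site (for example with RobotFileParser.set_url and read).
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def build_robot_rules(robots_txt):
    """Parse robots.txt text and return a parser that can answer queries."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


rules = build_robot_rules(EXAMPLE_ROBOTS_TXT)

if __name__ == "__main__":
    user_agent = "my-crawler"  # assumed name for our crawler
    for url in (
        "https://example.com/public/page",
        "https://example.com/private/data",
    ):
        allowed = rules.can_fetch(user_agent, url)
        print(url, "allowed" if allowed else "disallowed")
    # Seconds to wait between requests, if the site asks for a delay.
    print("crawl delay:", rules.crawl_delay(user_agent))
```

robots.txt is only one part of a site's rules. Terms of service and rate limits still apply even where robots.txt allows a path.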

Origin blog.csdn.net/weixin_46481662/article/details/130038335