This Python crawler column explains the principles and applications of web crawlers in plain language. Before starting to learn about Python crawlers, it is recommended that you first learn the basics of Python; a related column has been set up on this blog and can be accessed directly by clicking.
Content index:
- [Python crawler] 1. Crawler principles: HTTP and HTTPS requests and responses
- [Python crawler] 2. Crawler principles: definition, classification, workflow, and encoding formats
- [Python crawler] 3. The Requests HTTP library for data capture
- [Python crawler] 4. The HTTP/HTTPS capture tool Fiddler for data capture
- [Python crawler] 5. The regular expression re module for data extraction
- [Python crawler] 6. XPath and the lxml library for data extraction
- [Python crawler] 7. JSON and JSONPath for structured data extraction
- [Python crawler] 8. Selenium and PhantomJS for dynamic HTML processing
- [Python crawler] 9. Tesseract for machine vision and image recognition
- [Python crawler] 10. The Scrapy framework