Getting started with Python crawler basics: using requests and BeautifulSoup

(This article is a short set of notes and impressions from teaching myself to write crawlers.)

After a first pass through Python, you should have a general grasp of concepts such as strings, lists, dictionaries, tuples, conditional statements, and loops, and you can start doing small exercises to consolidate them. Writing a crawler is a good exercise for this.

1. Web Basics

The essence of a crawler is obtaining the information you need from a web page, so a little knowledge of how web pages work is still necessary. Baidu Encyclopedia defines HTML as follows: HTML, Hypertext Markup Language, is a markup language. It consists of a series of tags that give documents on the network a uniform format and connect scattered Internet resources into a logical whole. HTML text is descriptive text made up of HTML commands, which can describe text, graphics, animations, sounds, tables, links, and so on.
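To make the idea of "a series of tags" concrete, here is a minimal sketch that walks a small hand-written page and lists the tags and link targets it contains. The sample HTML, the class name TagLister, and the URL are made up for illustration. It uses the standard library's html.parser so that it runs without installing anything; BeautifulSoup offers a more convenient interface for the same kind of work.

```python
from html.parser import HTMLParser

# A tiny hand-written page used only for illustration (not from a real site).
SAMPLE_HTML = """
<html>
  <head><title>Demo page</title></head>
  <body>
    <h1>Hello</h1>
    <p>Some text with a <a href="https://example.com">link</a>.</p>
  </body>
</html>
"""


class TagLister(HTMLParser):
    """Record every opening tag and every href attribute encountered."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for this opening tag.
        self.tags.append(tag)
        for name, value in attrs:
            if name == "href":
                self.links.append(value)


parser = TagLister()
parser.feed(SAMPLE_HTML)
print(parser.tags)
print(parser.links)
```

Each tag marks what a piece of content is (a heading, a paragraph, a link), and that is what lets a crawler pick out the part it wants.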

Of course, a web page is more than HTML, which on its own only produces static content. The pages we usually see also use CSS for styling and JavaScript for dynamic effects. A crawler does not demand much front-end knowledge; it is enough to be able to locate the information you want to crawl. That said, readers with a front-end background will find writing crawlers easier.
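As a rough illustration of "finding only the information you need", the sketch below takes a made-up page that mixes HTML, CSS, and JavaScript and keeps only the visible text, skipping the contents of style and script blocks. The page content and the class name VisibleText are assumptions for this example, and it again relies on the standard library rather than BeautifulSoup.

```python
from html.parser import HTMLParser

# Hypothetical page mixing HTML with CSS and JavaScript.
SAMPLE_PAGE = """
<html>
  <head>
    <style>p { color: red; }</style>
    <script>console.log("dynamic effect");</script>
  </head>
  <body>
    <p class="price">42 yuan</p>
    <p class="note">Free shipping</p>
  </body>
</html>
"""


class VisibleText(HTMLParser):
    """Collect text that sits outside <style> and <script> blocks."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # greater than 0 while inside style/script
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("style", "script"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("style", "script") and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.chunks.append(text)


extractor = VisibleText()
extractor.feed(SAMPLE_PAGE)
print(extractor.chunks)
```

The CSS rule and the JavaScript call are still in the page source, but the crawler can simply ignore them and keep the two paragraphs of text.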


Origin blog.csdn.net/qq_25439417/article/details/131898114