Cat Brother Teaches You to Write Crawlers 031: Crawler Basics - HTML

Review

How a browser works

How a crawler works

requests.get() can fetch data from the web

HTML Review

Learning HTML comes in three levels, in this order: reading it, modifying it, and writing it

Only once you can read HTML can you understand a page's structure, and only then can you use other Python modules to parse the page and extract data

What is HTML

HTML (HyperText Markup Language) is a language used to describe web pages

HTML is to a web page what architectural drawings are to a building

The relationship between HTML, CSS, and JS

HTML tags

The head and body of a web page
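
To make the head/body split concrete, here is a minimal sketch. It uses the standard library's html.parser rather than BeautifulSoup so that it runs without extra installs, and the sample page, the SectionParser class, and its attribute names are all made up for illustration.

```python
from html.parser import HTMLParser

# Hypothetical sample page, invented for this illustration.
SAMPLE_HTML = """<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>Sample page</title>
</head>
<body>
  <h1>Book list</h1>
  <p>Three books are listed here.</p>
</body>
</html>"""


class SectionParser(HTMLParser):
    """Record which tags open inside <head> and which inside <body>."""

    def __init__(self):
        super().__init__()
        self.current = None  # "head", "body", or None when outside both
        self.tags = {"head": [], "body": []}

    def handle_starttag(self, tag, attrs):
        if tag in self.tags:
            # Entering <head> or <body>
            self.current = tag
        elif self.current is not None:
            # A tag nested somewhere inside the current section
            self.tags[self.current].append(tag)

    def handle_endtag(self, tag):
        if tag == self.current:
            # Leaving </head> or </body>
            self.current = None


parser = SectionParser()
parser.feed(SAMPLE_HTML)
print("head contains:", parser.tags["head"])
print("body contains:", parser.tags["body"])
```

The head holds information about the page (encoding, title), while the body holds what the reader actually sees, which is usually the part a crawler extracts data from.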

Attributes
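
As a rough illustration of what an attribute is, the sketch below reads the attributes of a single tag. It again relies on the standard library's html.parser instead of BeautifulSoup, and the sample link, the example.com URL, and the AttrParser class are assumptions made for the example.

```python
from html.parser import HTMLParser

# Hypothetical tag: "a" is the tag name; href, class and id are its attributes.
SAMPLE_HTML = '<a href="https://example.com/book1" class="title" id="first">Book One</a>'


class AttrParser(HTMLParser):
    """Collect every start tag together with its attributes as a dict."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs
        self.found.append((tag, dict(attrs)))


parser = AttrParser()
parser.feed(SAMPLE_HTML)

tag, attrs = parser.found[0]
print(tag)            # the tag name
print(attrs["href"])  # the value of the href attribute
print(attrs)          # all attributes of the tag
```

Attributes are written as name="value" pairs inside the opening tag. The link target that a crawler usually wants sits in the href attribute, not in the text between the tags.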

id and class

An id is one-to-one: it identifies a single element. A class is one-to-many: one class can be shared by many elements
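
The one-to-one versus one-to-many distinction can be sketched as follows. This is an illustrative example built on the standard library's html.parser, not BeautifulSoup, and the three div elements and the IdClassParser class are hypothetical.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Hypothetical markup: every div has its own id, but all three share the class "box".
SAMPLE_HTML = """
<div id="header" class="box">Header</div>
<div id="main" class="box books">Main</div>
<div id="footer" class="box">Footer</div>
"""


class IdClassParser(HTMLParser):
    """Group elements by id and by class name."""

    def __init__(self):
        super().__init__()
        self.by_id = defaultdict(list)     # id -> tag names carrying that id
        self.by_class = defaultdict(list)  # class name -> ids of elements using it

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if attrs.get("id"):
            self.by_id[attrs["id"]].append(tag)
        # One class attribute may list several class names separated by spaces
        for name in (attrs.get("class") or "").split():
            self.by_class[name].append(attrs.get("id"))


parser = IdClassParser()
parser.feed(SAMPLE_HTML)

for element_id, tags in parser.by_id.items():
    print("id", element_id, "is used by", len(tags), "element")
for class_name, ids in parser.by_class.items():
    print("class", class_name, "is used by", ids)
```

In this sample each id appears on exactly one element, while the class box is shared by all three. That is why a crawler typically searches by class when it wants every item of a kind and by id when it wants one specific element.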

Exercise: get the page source code ...

localprod.pandateacher.com/python-manu…

Summary

import requests  # import the requests library
from bs4 import BeautifulSoup

# Fetch the data
res = requests.get('https://localprod.pandateacher.com/python-manuscript/crawler-html/spider-men5.0.html')
# res.status_code: HTTP status code
# res.content: response body as bytes
# res.text: response body as a string (the HTML source)
# res.encoding: encoding used to decode res.text

# Parse the data
# soup is a BeautifulSoup object
soup = BeautifulSoup(res.text, 'html.parser')
# soup.find(tag_name, attribute=value) returns the first match
# soup.find_all(tag_name, attribute=value) returns all matches

# Extract the data: find_all returns a list-like result of Tag objects
items = soup.find_all('div', class_='books')
for item in items:
    # Each item is a Tag object, so find and find_all can be called on it too,
    # and the results are Tag objects again. Calls can be chained level by level:
    # item.find().find().find()
    title = item.find('a', class_='title')
    print(title.text)      # text inside the tag
    print(title['href'])   # value of the href attribute
    print(item.find('p', class_='info').text)  # text inside the tag

Reproduced from: https://juejin.im/post/5cfc4ada6fb9a07ef63fcfd0

Origin blog.csdn.net/weixin_34367845/article/details/91416933