Educode--Crawling Experiment of Forbidden City Wallpaper Webpage Pictures

Level 1: Web crawler multiple choice questions

Level 2: Obtain the information of the first picture on the Forbidden City wallpaper page

Implementation code:

import requests

from bs4 import   BeautifulSoup

url = "https://www.dpm.org.cn/lights/royal/p/1.html"

# code start

r = requests.get(url)

r.encoding = "utf-8"

soup = BeautifulSoup(r.text,"html.parser")

pic=soup.find("div",class_="pic")

x=pic.find("img")

imglj=x.attrs["src"]

imgmz=x.attrs["title"].strip()

# end of code

print(imglj)

print(imgmz)

f1=open("tpxx.txt","w")

f1.write(imglj+"\n")

f1.write(imgmz+"\n")

f1.close()

Level 3: Download the first image from the Forbidden City wallpaper page

Implementation code: 

import requests

from bs4 import   BeautifulSoup

url = "https://www.dpm.org.cn/lights/royal/p/1.html"

r=requests.get(url)

r.encoding = 'utf-8'

soup=BeautifulSoup(r.text,"html.parser")

pic=soup.find("div", class_="pic")

x=pic.find("img")

imglj=x.attrs["src"]

imgmz=x.attrs["title"].strip()

# code start

r=requests.get(imglj)  

cpmc="image//"+imgmz+".jpg"  

f1=open(cpmc,"bw")  

f1.write(r.content)  

f1.close()  

# end of code

Level 4: Download multiple images from a single page of the Forbidden City wallpaper page

Implementation code:

import requests

from bs4 import   BeautifulSoup

url = "https://www.dpm.org.cn/lights/royal/p/1.html"

# code start

r=requests.get(url)  

r.encoding = 'utf-8'  

soup=BeautifulSoup(r.text,"html.parser")  

pics=soup.find_all("div", class_="pic")  

i=1  

for pic in pics:  

    x=pic.find("img")  

    imglj=x.attrs["src"]  

    imgmz=x.attrs["title"].strip()  

    r=requests.get(imglj)  

    f=open("image/"+str(i)+imgmz+".jpg","wb")  

    i=i+1  

    f.write(r.content)  

    f.close()

# end of code

 

Level 5: Obtain the information of multiple pictures of the first five pages of the Forbidden City wallpaper

Implementation code: 

import requests

from bs4 import   BeautifulSoup

# code start

i=1  

for j in range(1,6):      

    url = "https://www.dpm.org.cn/lights/royal/p/"+str(j)+".html"  

    r=requests.get(url)  

    r.encoding = 'utf-8'  

    soup=BeautifulSoup(r.text,"html.parser")  

    pics=soup.find_all("div", class_="pic")  

    for pic in pics:  

        x=pic.find("img")  

        imglj=x.attrs["src"]  

        imgmz=x.attrs["title"].strip()  

        r=requests.get(imglj)  

        f1=open("image/"+str(i)+imgmz+".jpg",'wb')  

        f1.write(r.content)  

        f1.close()  

        i=i+1

# end of code

 

Supongo que te gusta

Origin blog.csdn.net/qq_57409899/article/details/124849738
Recomendado
Clasificación