Use Python to easily scrape Liu Haoran's photos

17 lines of Python code to download Liu Haoran's photos from Picture Home (tupianzj.com)

It is very easy to crawl web pages with Python, because it has many libraries that do the heavy lifting for us. This experiment uses BeautifulSoup (from the bs4 package), requests, and re. Among them, re ships with Python, so it does not need to be installed. The other two are installed as follows:

# Press Win+R, open cmd, then run in turn:
pip install bs4
pip install requests

If you crawl under Windows, you should also check whether lxml is installed. If it is not, you can install it directly with pip as well:

pip install lxml
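To double-check the installs from Python itself, here is a small sketch that uses only the standard library's importlib, so it runs even when a package is missing (the helper name `is_installed` is just for illustration):

```python
import importlib.util

def is_installed(name):
    # find_spec locates a package without actually importing it,
    # so this never raises ImportError for a missing package
    return importlib.util.find_spec(name) is not None

# Check the three packages this tutorial needs
for pkg in ('bs4', 'requests', 'lxml'):
    status = 'OK' if is_installed(pkg) else 'missing -- run: pip install ' + pkg
    print(pkg, status)
```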

After installing the libraries, you can start scraping Liu Haoran's photos.

  • First find Liu Haoran's wallpaper page on Picture Home: http://www.tupianzj.com/mingxing/xiezhen/liuhaoran/
    The path segments of this URL translate as: http://picture-home/star/photo/liu-haoran
    So, if you want to grab another celebrity's photos, just change the last part of the URL!

  • Open the URL, right-click, choose "Inspect", and you can see the page's source code. Analyzing the source shows that the pictures all sit inside the element marked 1 in the figure below, and each picture is stored in the format shown at 2 and 3:
    [screenshot of the page source with positions 1, 2 and 3 marked]

  • Once the pattern is found, we can write the code:

# Import the libraries
from bs4 import BeautifulSoup
import requests
import os

# The target URL
URL = "http://www.tupianzj.com/mingxing/xiezhen/liuhaoran/"
# Fetch the content at that URL
html = requests.get(URL).text
# Parse the HTML and keep the result in soup
soup = BeautifulSoup(html, 'lxml')
# Find position 1 from the figure above, since all the pictures live inside it
img_ul = soup.find('div', {"id": "main"})
# Create an img folder to hold the downloaded pictures
os.makedirs('./img/', exist_ok=True)
# By 2 and 3 in the figure, each picture's address is in its 'img src',
# so collect all the img tags first, then visit them one by one
imgs = img_ul.find_all('img')
# Visit and download each picture
for img in imgs:
    url = img['src']
    r = requests.get(url, stream=True)
    image_name = url.split('/')[-1]
    with open('./img/%s' % image_name, 'wb') as f:
        for chunk in r.iter_content(chunk_size=128):  # write in 128-byte chunks
            f.write(chunk)
    print('Saved %s' % image_name)
Saved 9-1P31G623590-L.jpg
Saved 9-1P3131419500-L.jpg
Saved 9-1P3031414430-L.jpg
Saved 9-1P3021543180-L.jpg
Saved 9-1P3021123440-L.jpg
Saved 9-1P22G043450-L.jpg
Saved 9-1P1291JR50-L.jpg
Saved 9-1P1221131480-L.jpg
Saved 9-1P1051036070-L.jpg
Saved 9-1P1051001240-L.jpg
Saved 9-1G219115I70-L.jpg
Saved 9-1G1151100100-L.jpg
Saved 9-1G0301436130-L.jpg
Saved 9-1G0041543170-L.png
Saved 9-1FZ91523210-L.png
Saved 9-1FHG13P60-L.png
Saved 9-1F5201911020-L.jpg
Saved 16-1612191430140-L.jpg
Saved 16-160P11A0460-L.jpg
Saved 9-16062G41001227.jpg
Saved 16-1605301305090-L.jpg
Saved 16-16051Q442070-L.jpg
Saved 16-16051Q416050-L.jpg
Saved 16-1605161012270-L.jpg
Saved 16-1604131Z0240-L.jpg
Saved 9-16012Q120410-L.jpg
Saved 9-151224200S00-L.jpg
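Two details in the download loop are worth isolating: the filename is simply the last path segment of the image URL, and iter_content hands back the response body in fixed-size chunks so a large image never has to sit fully in memory. A minimal offline sketch of both steps, using io.BytesIO to stand in for the network response (the example URL path is made up for illustration):

```python
import io

def image_name(url):
    # Filename = last segment of the URL path, as in the loop above
    return url.split('/')[-1]

def copy_in_chunks(src, dst, chunk_size=128):
    # Mimic iter_content: read and write fixed-size chunks until exhausted
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)

print(image_name('http://example.com/uploads/9-1P31G623590-L.jpg'))
# -> 9-1P31G623590-L.jpg

data = b'x' * 1000                      # pretend image bytes
src, dst = io.BytesIO(data), io.BytesIO()
copy_in_chunks(src, dst)
print(dst.getvalue() == data)           # -> True
```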

At this point, the pictures have all been captured. Open the img folder and look:
[screenshot of the img folder]
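If you would rather confirm the result from code than from the file manager, a quick listing works too (a sketch; it simply recreates the folder if needed and prints what is inside):

```python
import os

os.makedirs('./img/', exist_ok=True)   # same folder the scraper writes into
saved = sorted(os.listdir('./img/'))
print('%d files in ./img/' % len(saved))
for name in saved[:5]:                 # show the first few names
    print(' ', name)
```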
Look, the pictures are all downloaded here. Just click on one:
[one of the downloaded photos]
The handsome Liu Haoran!
