Bypassing User-Agent anti-crawler checks

User-Agent anti-crawler bypass in practice

A User-Agent anti-crawler check is a server-side measure that distinguishes normal users from crawlers by inspecting the value of the User-Agent request header. It is one of the most basic anti-crawler measures.
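To make the idea concrete, here is a minimal sketch of what such a server-side check might look like. The keyword list and the function name `looks_like_crawler` are hypothetical and chosen only for illustration; a real site may use entirely different rules.

```python
from typing import Optional

# Hypothetical substrings that a server might treat as crawler signatures.
# This list is an assumption made for illustration only.
BLOCKED_KEYWORDS = ("python-requests", "scrapy", "curl", "postman")


def looks_like_crawler(user_agent: Optional[str]) -> bool:
    """Return True if the User-Agent header value looks like a crawler."""
    # A missing or empty header is treated as suspicious in this sketch.
    if not user_agent:
        return True
    lowered = user_agent.lower()
    return any(keyword in lowered for keyword in BLOCKED_KEYWORDS)
```

A server using a rule like this would reject the request (for example with a 403 status code) whenever the function returns True, and serve the page otherwise.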

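"""
User-Agent anti-crawler bypass in practice
Example 1. User-Agent anti-crawler check on the campus news list page
Task: scrape the news titles in the "This Week's Hot Topics" list on the right side of the campus news page
URL: http://www.porters.vip/verify/uas/index.html
"""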

import requests
from parsel import Selector

url = 'http://www.porters.vip/verify/uas/index.html'

# Send a request to the target site
resp = requests.get(url=url)
# Print the status code
print(resp.status_code)
# Continue only if the status code is 200; otherwise report the failure
if resp.status_code == 200:
    sel = Selector(resp.text)
    # Extract the news titles from the response body by HTML tag and attribute
    res = sel.css('.list-group-item::text').extract()
    print(res)
else:
    print('This request failed!')

The request does not succeed, yet the same page opens normally in a browser. Why? To check whether the site itself is at fault, we can send the same request with Postman. Postman's request returns the following result:
html>
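A common way around a check of this kind is to send a browser-like User-Agent header with the request. The sketch below is an assumption about how that could be done with `requests`, not the original author's solution: the `BROWSER_UA` string is an illustrative value, and `build_request` and `fetch` are hypothetical helper names. It also shows the default value that `requests` sends when no header is set, which is presumably what the server rejects.

```python
import requests

URL = "http://www.porters.vip/verify/uas/index.html"

# Illustrative browser User-Agent string (an assumption, not taken from the source).
BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/120.0.0.0 Safari/537.36"
)
HEADERS = {"User-Agent": BROWSER_UA}

# The value requests sends when no User-Agent is supplied,
# in the form "python-requests/<version>".
DEFAULT_UA = requests.utils.default_user_agent()


def build_request(url: str = URL) -> requests.PreparedRequest:
    """Prepare (but do not send) a GET request carrying the browser-like header."""
    return requests.Request("GET", url, headers=HEADERS).prepare()


def fetch(url: str = URL, timeout: float = 10.0) -> requests.Response:
    """Send the GET request with the browser-like User-Agent header."""
    return requests.get(url, headers=HEADERS, timeout=timeout)
```

Calling `fetch()` and passing `resp.text` to `Selector`, as in the script above, would then allow the titles to be extracted, provided the server checks nothing beyond the User-Agent header.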


Origin blog.csdn.net/weixin_43870646/article/details/105117331