fake_useragent:

Introduction

When a crawler sends requests, we very often need to add request headers; otherwise the server treats the request as illegitimate and denies access.

import requests
url = 'https://www.zhihu.com/question/315387406/answer/812734512'
response = requests.get(url=url)
print(response.status_code)  # 400

Of the request headers you can add, the most commonly used is user-agent, which disguises the request as coming from a browser.

User Agent, abbreviated UA, is a special header string that lets the server identify the client's operating system and version, CPU type, browser and version, browser rendering engine, browser language, and browser plug-ins.
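To make the contents of a UA string more concrete, here is a minimal sketch that pulls the platform details and the product/version tokens out of the Chrome string used in this post. The function name summarize_user_agent and the two regular expressions are my own illustration, not how any real server parses the header.

```python
import re


def summarize_user_agent(ua_string):
    """Split a UA string into its platform comment and product/version tokens."""
    # The first parenthesised group usually carries OS and CPU details.
    platform = re.search(r"\(([^)]*)\)", ua_string)
    # Product tokens have the form name/version, e.g. Chrome/76.0.3809.132.
    products = re.findall(r"([A-Za-z]+)/([\d.]+)", ua_string)
    return {
        "platform": platform.group(1) if platform else None,
        "products": products,
    }


ua_string = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36"
)
info = summarize_user_agent(ua_string)
print(info["platform"])  # Windows NT 10.0; Win64; x64
print(info["products"])
```

The product list contains Mozilla, AppleWebKit, Chrome and Safari with their versions, which is the kind of information a server can use to decide whether a request looks like a browser.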

So how do people obtain a user-agent? Everyone has their own method, but most of the time it is simply copied in by hand:

import requests

url = 'https://www.zhihu.com/question/315387406/answer/812734512'
headers = {
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36"
}
response = requests.get(url=url, headers=headers)
print(response.status_code)  # 200

However, with fake_useragent you no longer have to worry about any of this:

import requests
from fake_useragent import UserAgent

# Instantiate the UserAgent object
ua = UserAgent()

url = 'https://www.zhihu.com/question/315387406/answer/812734512'
headers = {"user-agent": ua.chrome}  # user-agent for a specific browser
# Alternatively, generate a random user-agent in one step:
# headers = {"user-agent": UserAgent().random}
response = requests.get(url=url, headers=headers)
print(response.status_code)  # 200

About

What is fake_useragent?

In short, fake_useragent is a library that generates user-agent strings for you, so you no longer have to copy them in by hand.

Install

pip install fake_useragent

Update

pip install -U fake-useragent

View version

import fake_useragent
print(fake_useragent.VERSION)  # 0.1.11

Usage

Generate a user-agent for a specific browser

import fake_useragent

# Instantiate the UserAgent object
ua = fake_useragent.UserAgent()

# ua.ie
print(ua.ie)  # Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0; chromeframe/13.0.782.215)

# ua.msie
print(ua['Internet Explorer'])  # Mozilla/5.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; InfoPath.2; SLCC1; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET CLR 2.0.50727)

# ua.opera
print(ua.opera)  # Opera/9.80 (Windows NT 6.1; U; en-US) Presto/2.7.62 Version/11.01

# ua.chrome
print(ua.chrome)  # Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.16 Safari/537.36

# ua.google
print(ua['google chrome'])  # Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2227.0 Safari/537.36

# ua.firefox
print(ua.firefox)  # Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:16.0.1) Gecko/20121011 Firefox/21.0.1

# ua.ff
print(ua.ff)  # Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:25.0) Gecko/20100101 Firefox/29.0

# ua.safari
print(ua.safari)  # Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-TW) AppleWebKit/533.19.4 (KHTML, like Gecko) Version/5.0.2 Safari/533.18.5

Generate a random user-agent

import fake_useragent

# Instantiate the UserAgent object
ua = fake_useragent.UserAgent()
print(ua.random)  # Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.67 Safari/537.36
print(ua.random)  # Mozilla/5.0 (compatible; MSIE 10.0; Macintosh; Intel Mac OS X 10_7_3; Trident/6.0)
print(ua.random)  # Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36

Each call returns a randomly chosen UA, which makes the crawler's traffic look much more like that of real browsers.
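As a rough sketch of how a fresh UA per request might be wired into a crawler, the generator below asks a callable for a new string each time it builds a header dict. The name rotating_headers and the local UA_POOL (filled with the three sample outputs above) are assumptions made for illustration, so that the snippet runs without the library or a network connection.

```python
import random

# Stand-in pool built from the sample outputs shown above (illustrative only).
UA_POOL = [
    "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.67 Safari/537.36",
    "Mozilla/5.0 (compatible; MSIE 10.0; Macintosh; Intel Mac OS X 10_7_3; Trident/6.0)",
    "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36",
]


def rotating_headers(pick_ua, count):
    """Yield `count` header dicts, calling pick_ua() for a fresh UA each time."""
    for _ in range(count):
        yield {"User-Agent": pick_ua()}


rng = random.Random(0)  # seeded only so that repeated runs behave the same
for headers in rotating_headers(lambda: rng.choice(UA_POOL), 3):
    print(headers["User-Agent"])
```

With the library installed, the same generator should work with `lambda: ua.random` as pick_ua, and each yielded dict can be passed to requests.get as headers.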

Other Uses

Download the remote user-agent JSON file to a local path

fake_useragent maintains its user-agent database as a JSON file hosted online:

import fake_useragent
print(fake_useragent.settings.CACHE_SERVER)
'''
# The URL, which is actually a JSON file
https://fake-useragent.herokuapp.com/browsers/0.1.11
'''

Since the JSON file is hosted online, we can download it to a local path:

from fake_useragent import UserAgent, VERSION

# VERSION is imported directly, so it is used without the module prefix
location = './fake_useragent%s.json' % VERSION
ua = UserAgent(path=location)

If this raises fake_useragent.errors.FakeUserAgentError: Maximum amount of retries reached, simply re-run the code.

Afterwards you will find the JSON file next to the script, provided the script was run from its own directory.
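Because an interrupted download can leave a partial file behind, it may be worth checking the local copy before relying on it. The following is a minimal sketch using only the standard library; the helper name cache_is_usable is hypothetical, and it only tests that the file parses as non-empty JSON, since the exact layout of the library's data file is not assumed here.

```python
import json
import os
import tempfile


def cache_is_usable(path):
    """Return True if `path` holds a non-empty, parseable JSON document."""
    try:
        with open(path, encoding="utf-8") as handle:
            data = json.load(handle)
    except (OSError, ValueError):
        # Missing or unreadable file, or truncated/invalid JSON.
        return False
    return bool(data)


# Demonstration with throwaway files; the payload is a made-up placeholder.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.json")
    with open(good, "w", encoding="utf-8") as handle:
        json.dump({"placeholder": ["example"]}, handle)

    truncated = os.path.join(tmp, "truncated.json")
    with open(truncated, "w", encoding="utf-8") as handle:
        handle.write('{"placeholder": ["exam')

    print(cache_is_usable(good))       # True
    print(cache_is_usable(truncated))  # False
    print(cache_is_usable(os.path.join(tmp, "missing.json")))  # False
```

If the check fails, deleting the file and re-running the download code above is the simplest recovery.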

If you only want to refresh the JSON file saved locally:

from fake_useragent import UserAgent
ua = UserAgent()
ua.update()

If you do not want to cache the database, or the file system is not writable:

from fake_useragent import UserAgent
ua = UserAgent(cache=False)

If you do not want to use the hosted cache server:

from fake_useragent import UserAgent
ua = UserAgent(use_cache_server=False)

Handling Exceptions

fake_useragent.errors.FakeUserAgentError: Maximum amount of retries

import requests
from fake_useragent import UserAgent

url = 'https://www.zhihu.com/question/315387406/answer/812734512'
# Disable the hosted cache server: use_cache_server=False
headers = {"User-Agent": UserAgent(use_cache_server=False).chrome}
response = requests.get(url=url, headers=headers)
print(response.status_code)  # 200

FakeUserAgentError('Maximum amount of retries reached')

import requests
from fake_useragent import UserAgent

url = 'https://www.zhihu.com/question/315387406/answer/812734512'
# Use any one of the following three options.
# Option 1: disable the hosted cache server (use_cache_server=False)
headers = {"User-Agent": UserAgent(use_cache_server=False).chrome}
# Option 2: skip SSL verification
headers = {"User-Agent": UserAgent(verify_ssl=False).chrome}
# Option 3: do not cache the data
headers = {"User-Agent": UserAgent(cache=False).chrome}
response = requests.get(url=url, headers=headers)
print(response.status_code)  # 200

fake_useragent.errors.FakeUserAgentError: Maximum amount of retries reached

from fake_useragent import UserAgent, VERSION

# VERSION is imported directly, so it is used without the module prefix
location = './fake_useragent%s.json' % VERSION
ua = UserAgent(path=location)

While writing the online JSON file to the local path, I hit urllib.error.URLError: <urlopen error timed out>. Re-running the code fixed it, and the local file then downloaded completely.
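Since these errors come and go with network conditions, one defensive option is to fall back to a fixed UA string whenever the library cannot deliver one. This is only a sketch under my own assumptions: safe_user_agent and from_fake_useragent are hypothetical helper names, and the fallback value is the hand-copied Chrome string from the start of this post.

```python
# Hand-copied Chrome UA from the start of this post, used as the last resort.
FALLBACK_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36"
)


def safe_user_agent(factory, fallback=FALLBACK_UA):
    """Return factory() if it yields a non-empty string, else the fallback."""
    try:
        value = factory()
    except Exception:
        # Deliberately broad: covers a missing package, FakeUserAgentError
        # and network errors such as URLError alike.
        return fallback
    return value if isinstance(value, str) and value else fallback


def from_fake_useragent():
    """Ask fake_useragent for a random UA (imported lazily on first use)."""
    from fake_useragent import UserAgent
    return UserAgent().random


def always_fails():
    """Stand-in for a library call that raises, used for the demo below."""
    raise RuntimeError("Maximum amount of retries reached")


print(safe_user_agent(always_fails) == FALLBACK_UA)  # True
```

In a crawler this would be used as `headers = {"User-Agent": safe_user_agent(from_fake_useragent)}`, so a failed lookup degrades to the fixed string instead of stopping the run.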

Origin www.cnblogs.com/zhang-da/p/12207392.html