Proxy: getting around IP-based anti-crawling mechanisms.
What is a proxy:
- A proxy server: an intermediary that forwards requests to the target site on the client's behalf.
The role of a proxy:
- Get around access restrictions placed on your own IP.
- Hide your real IP.
Proxy-related websites:
- Kuaidaili
- Xici Proxy (Xicidaili)
- www.goubanjia.com
- https://ip.jiangxianli.com/?page=1
Types of proxy IP:
- http: used for requests to URLs that use the http protocol
- https: used for requests to URLs that use the https protocol
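The scheme-based matching above can be sketched as follows. This is a simplified illustration, not the actual lookup performed by requests; the helper name select_proxy_for and the address 203.0.113.10:8080 (a documentation-only IP) are assumptions made for the example.

```python
from urllib.parse import urlparse


def select_proxy_for(url, proxies):
    """Return the proxy entry whose key matches the URL's scheme, or None."""
    scheme = urlparse(url).scheme.lower()
    return proxies.get(scheme)


# Hypothetical placeholder address; replace with a working proxy.
proxies = {
    "http": "http://203.0.113.10:8080",   # used for http:// URLs
    "https": "http://203.0.113.10:8080",  # used for https:// URLs
}

print(select_proxy_for("http://example.com/page", proxies))
print(select_proxy_for("https://example.com/page", proxies))
```

If the dictionary has no entry for the URL's scheme, no proxy is selected and the request would go out directly from your own IP.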
Anonymity levels of proxy IPs:
- Transparent: the server knows the request came through a proxy and also knows the real IP behind it
- Anonymous: the server knows a proxy is used, but does not know the real IP
- High anonymity (elite): the server does not know a proxy is used, let alone the real IP
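The three levels can be told apart by what the target server sees in the incoming request headers. A rough sketch, assuming the proxy reveals itself through commonly used headers such as Via or X-Forwarded-For; the header list and the function name classify_anonymity are illustrative assumptions, not an exhaustive or standard test.

```python
# Headers that proxies commonly add (assumed list, not exhaustive).
PROXY_HEADERS = ("via", "x-forwarded-for", "forwarded", "x-real-ip", "proxy-connection")


def classify_anonymity(received_headers, real_ip):
    """Classify a proxy from the headers the target server received."""
    lowered = {name.lower(): value for name, value in received_headers.items()}
    # Crude substring check: the real IP leaked in some header value.
    if any(real_ip in value for value in lowered.values()):
        return "transparent"
    # No real IP, but a header still gives away that a proxy is in use.
    if any(name in lowered for name in PROXY_HEADERS):
        return "anonymous"
    # Nothing reveals the proxy or the real IP.
    return "high anonymity"


print(classify_anonymity({"Via": "1.1 proxy"}, "198.51.100.7"))
```

In practice the received headers would come from an echo service that reports back what it saw; the dictionaries here stand in for that response.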
Using a proxy in a crawler:
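import requests

# Target URL (http, so the 'http' proxy entry applies)
url = 'http://ip.293.net'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36'
}
# Without a proxy, for comparison:
# page_text = requests.get(url=url, headers=headers).text
# With a proxy: the key is the URL scheme, the value is the proxy address
page_text = requests.get(url=url, headers=headers,
                         proxies={'http': 'http://51.91.122.208:80'}).text
# Save the response so the result can be inspected in a browser
with open('ip.html', 'w', encoding='utf-8') as fp:
    fp.write(page_text)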
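Free proxies often stop working, so a crawler usually needs to try more than one. A minimal sketch of falling back through a list of proxies, assuming a hypothetical helper fetch_with_fallback and a caller-supplied fetch function; the addresses are documentation-only placeholders.

```python
def fetch_with_fallback(url, proxy_pool, fetch):
    """Try each proxy in order and return the first successful result."""
    last_error = None
    for address in proxy_pool:
        # Route both http and https URLs through the same proxy address.
        proxies = {"http": f"http://{address}", "https": f"http://{address}"}
        try:
            return fetch(url, proxies)
        except Exception as exc:  # dead proxy, timeout, refused connection, ...
            last_error = exc
    raise RuntimeError("all proxies failed") from last_error


# Hypothetical pool; replace with addresses collected from a proxy site.
proxy_pool = ["203.0.113.1:8080", "203.0.113.2:8080"]

# With requests, the fetch function could look like this:
# fetch = lambda url, proxies: requests.get(
#     url, headers=headers, proxies=proxies, timeout=5).text
```

Passing the fetch function in keeps the retry logic separate from the HTTP library, and a timeout on each request keeps a dead proxy from stalling the crawler.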