Python crawler: proxy IPs

Proxy: a way around IP-based anti-crawling mechanisms.

What is a proxy:

  • A proxy server, which forwards requests to the target site on the client's behalf.

What a proxy does:

  • Gets around access restrictions placed on your own IP.
  • Hides your real IP.

Proxy-related websites:

  • Fast Acting

Types of proxy IP:

  • http: applies to URLs that use the http protocol
  • https: applies to URLs that use the https protocol
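
As a minimal sketch of the rule above, the scheme of the target URL decides which proxy entry applies. The helper name `pick_proxy` and the proxy addresses are hypothetical and only for illustration; requests does a comparable lookup on the keys of its `proxies` argument.

```python
from urllib.parse import urlsplit

# Hypothetical proxy addresses, for illustration only.
PROXIES = {
    "http": "http://10.10.1.10:3128",
    "https": "http://10.10.1.10:1080",
}


def pick_proxy(url, proxies):
    """Return the proxy registered for the URL's scheme, or None if there is none."""
    scheme = urlsplit(url).scheme.lower()
    return proxies.get(scheme)


# An http URL uses the "http" entry, an https URL uses the "https" entry.
print(pick_proxy("http://example.com/page", PROXIES))
print(pick_proxy("https://example.com/page", PROXIES))
```

If the dictionary has no entry for the URL's scheme, no proxy is selected and the request would go out directly from your own IP.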

Anonymity of proxy IPs:

  • Transparent: the server knows the request came through a proxy and also knows the real IP behind it
  • Anonymous: the server knows a proxy is used, but does not know the real IP
  • High anonymity (elite): the server does not know a proxy is used, let alone the real IP
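
The three levels can be told apart by what the target server sees in the request headers. The sketch below is a common heuristic, not a definitive test: the function name `classify_anonymity` and the list of proxy-revealing headers are assumptions, and it only handles IPv4 addresses.

```python
import re

# Headers that proxies commonly add. This list is an assumption and not exhaustive.
PROXY_HEADERS = ("via", "x-forwarded-for", "forwarded", "x-real-ip", "proxy-connection")

IPV4_PATTERN = re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")


def classify_anonymity(received_headers, real_ip):
    """Classify a proxy from the headers the target server received."""
    headers = {name.lower(): value for name, value in received_headers.items()}

    # Transparent: the client's real IP leaks through some header.
    for value in headers.values():
        if real_ip in IPV4_PATTERN.findall(value):
            return "transparent"

    # Anonymous: the real IP is hidden, but a header gives the proxy away.
    if any(name in PROXY_HEADERS for name in headers):
        return "anonymous"

    # High anonymity: nothing in the headers hints at a proxy.
    return "high anonymity"


print(classify_anonymity({"Via": "1.1 proxy", "X-Forwarded-For": "203.0.113.7"}, "203.0.113.7"))
print(classify_anonymity({"Via": "1.1 proxy"}, "203.0.113.7"))
print(classify_anonymity({"User-Agent": "Mozilla/5.0"}, "203.0.113.7"))
```

To use this in practice you would need a page that echoes back the headers it received; which page to use is left open here.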

Using a proxy in a crawler:

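import requests

# Target page to request through the proxy
url = 'http://ip.293.net'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36'
}

# Without a proxy, the request goes out from your own IP:
# page_text = requests.get(url=url, headers=headers).text

# The "http" key matches the protocol of the target URL
proxies = {'http': 'http://51.91.122.208:80'}
# A timeout keeps the script from hanging if the proxy is dead
page_text = requests.get(url=url, headers=headers, proxies=proxies, timeout=10).text

# Save the response so it can be opened in a browser
with open('ip.html', 'w', encoding='utf-8') as fp:
    fp.write(page_text)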
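
Free proxies often stop working, so a crawler usually needs a fallback. The following is a minimal sketch, assuming you already have a list of candidate proxy addresses; the function names `fetch_with_requests` and `fetch_via_first_working_proxy` are hypothetical helpers, not part of requests.

```python
def fetch_with_requests(url, proxy, timeout=5):
    """Fetch url through a single proxy with requests; raises on any failure."""
    import requests  # imported here so the helper below can be tried without it

    response = requests.get(
        url,
        proxies={"http": proxy, "https": proxy},
        timeout=timeout,
    )
    response.raise_for_status()
    return response.text


def fetch_via_first_working_proxy(url, proxy_list, fetch=fetch_with_requests):
    """Try each proxy in order and return (proxy, body) for the first success."""
    errors = []
    for proxy in proxy_list:
        try:
            return proxy, fetch(url, proxy)
        except Exception as exc:  # dead proxies fail in many different ways
            errors.append((proxy, repr(exc)))
    raise RuntimeError(f"all {len(proxy_list)} proxies failed: {errors}")
```

A call such as `fetch_via_first_working_proxy('http://ip.293.net', ['http://51.91.122.208:80'])` would try the proxy from the example above. The `fetch` parameter is there so the retry logic can be exercised with a stand-in function instead of real network traffic.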







Origin blog.csdn.net/weixin_44827418/article/details/113975188