Web crawler: using a proxy

Using a proxy IP

1. Using a proxy with requests

  To use a proxy with requests, build a dictionary of proxy addresses and pass it through the proxies parameter.

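import requests

# Proxy address; a port is normally appended as 'ip:port'
proxy = '60.186.9.233'
# Keys are the scheme of the target URL; values are the proxy URL.
# A plain HTTP proxy is assumed, so both entries connect to it over http://.
proxies = {
    'http': 'http://' + proxy,
    'https': 'http://' + proxy
}
try:
    res = requests.get('http://httpbin.org/get', proxies=proxies)
    print(res.text)
except requests.exceptions.ConnectionError as e:
    # Raised when the proxy is unreachable or refuses the connection
    print('error', e.args)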

Output:

{
  "args": {}, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Host": "httpbin.org", 
    "User-Agent": "python-requests/2.18.4"
  }, 
  "origin": "60.186.9.233", 
  "url": "https://httpbin.org/get"
}

  The origin field in the output is the proxy IP, which shows that the proxy was applied successfully. If the proxy requires authentication, put the username and password in front of the proxy address.

proxy = 'username:password@60.186.9.233'
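
  Credentials that contain characters such as @ or : would break the proxy URL if concatenated directly. A minimal sketch, assuming a hypothetical helper named build_proxies (it is not part of requests) and a plain HTTP proxy, percent-encodes the username and password with the standard library before building the dictionary:

```python
from urllib.parse import quote


def build_proxies(host, username=None, password=None):
    """Return a proxies dict for requests (hypothetical helper, not a requests API).

    host is 'ip' or 'ip:port'; the credentials are optional.
    """
    auth = ''
    if username is not None and password is not None:
        # Percent-encode so that characters like '@' or ':' inside the
        # credentials cannot be mistaken for URL delimiters.
        auth = quote(username, safe='') + ':' + quote(password, safe='') + '@'
    proxy_url = 'http://' + auth + host
    # Assumption: the same plain HTTP proxy serves http and https targets.
    return {'http': proxy_url, 'https': proxy_url}


print(build_proxies('60.186.9.233', 'username', 'p@ss:word'))
```

  The returned dictionary can be passed as proxies=... to requests.get, exactly like the hand-built dictionary in the first example.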

2. Using a proxy with Selenium

  Selenium can also be configured to use a proxy. Two cases are covered here: a browser with a graphical interface, using Chrome as the example, and a headless browser, using PhantomJS as the example.

Chrome browser settings

  Set the proxy through a ChromeOptions object and pass that object when creating the Chrome driver. Running the code opens a Chrome window, and once the page has loaded it shows the result below.

# Chrome proxy settings
from selenium import webdriver

proxy = '60.186.9.233'
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--proxy-server=http://' + proxy)
# Recent Selenium releases take the options object as `options`;
# the older `chrome_options` keyword is deprecated.
browser = webdriver.Chrome(options=chrome_options)
# get() returns None, so its result is not assigned
browser.get('http://httpbin.org/get')

Result shown in the browser:

{
  "args": {}, 
  "headers": {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8", 
    "Accept-Encoding": "gzip, deflate", 
    "Accept-Language": "zh-CN,zh;q=0.9", 
    "Host": "httpbin.org", 
    "Upgrade-Insecure-Requests": "1", 
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36"
  }, 
  "origin": "60.186.9.233", 
  "url": "https://httpbin.org/get"
}
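
  As a sketch of how the Chrome setup could be wrapped for reuse, the helpers below build the --proxy-server argument and create the driver. The names proxy_argument and make_chrome and the headless switch are illustrative assumptions, not Selenium APIs. Selenium is imported only when a driver is actually created.

```python
def proxy_argument(proxy, scheme='http'):
    """Build Chrome's --proxy-server command-line argument (hypothetical helper)."""
    return '--proxy-server=' + scheme + '://' + proxy


def make_chrome(proxy, headless=False):
    """Create a Chrome driver that sends its traffic through the given proxy."""
    # Imported lazily; needs the selenium package and a matching chromedriver.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument(proxy_argument(proxy))
    if headless:
        # Run Chrome without opening a window.
        options.add_argument('--headless')
    return webdriver.Chrome(options=options)


print(proxy_argument('60.186.9.233'))
```

  Calling make_chrome('60.186.9.233') and then browser.get('http://httpbin.org/get') on the returned driver should behave like the Chrome example; call browser.quit() when finished.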

 

PhantomJS settings

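  For PhantomJS, define the command-line options as a list named service_args and pass it to PhantomJS at initialization. Note that PhantomJS support has been dropped from recent Selenium releases, so this example needs an older Selenium version.

# PhantomJS proxy settings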
from selenium import webdriver

service_args = [
    '--proxy=60.186.9.233',
    '--proxy-type=http'
]
browser = webdriver.PhantomJS(service_args=service_args)
browser.get('http://httpbin.org/get')
print(browser.page_source)

Output:

{
  "args": {}, 
  "headers": {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8", 
    "Accept-Encoding": "gzip, deflate", 
    "Accept-Language": "zh-CN,zh;q=0.9", 
    "Host": "httpbin.org", 
    "Upgrade-Insecure-Requests": "1", 
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36"
  }, 
  "origin": "60.186.9.233", 
  "url": "https://httpbin.org/get"
}
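
  Since the browser returns the JSON wrapped in page markup, checking the origin field in code takes one extra step. The sketch below, built around a hypothetical helper named extract_origin, pulls the JSON object out of a page_source string and reads its origin field. It assumes the page contains a single JSON object and no other braces.

```python
import html
import json
import re


def extract_origin(page_source):
    """Return the 'origin' field of an httpbin.org/get page, or None."""
    # The browser may wrap the JSON in HTML tags, so take the text from
    # the first '{' to the last '}' and parse only that part.
    match = re.search(r'\{.*\}', page_source, re.DOTALL)
    if match is None:
        return None
    try:
        # Undo HTML entity escaping before parsing the JSON text.
        return json.loads(html.unescape(match.group(0))).get('origin')
    except ValueError:
        # Not valid JSON after all.
        return None


sample = '<html><body><pre>{"args": {}, "origin": "60.186.9.233"}</pre></body></html>'
print(extract_origin(sample))
```

  Comparing the returned value with the proxy IP confirms that the proxy is in effect, in the same way as reading the origin field by eye.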

If authentication is required, add the --proxy-auth option to service_args.

service_args = [
    '--proxy=60.186.9.233',
    '--proxy-type=http',
    '--proxy-auth=username:password'
]
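
  The two service_args variants can be produced by one small function. This is a sketch only; phantomjs_service_args is a hypothetical helper name, and it emits just the three command-line options used in this article.

```python
def phantomjs_service_args(proxy, proxy_type='http', auth=None):
    """Build the service_args list for PhantomJS (hypothetical helper).

    auth is an optional (username, password) tuple.
    """
    args = ['--proxy=' + proxy, '--proxy-type=' + proxy_type]
    if auth is not None:
        username, password = auth
        args.append('--proxy-auth=' + username + ':' + password)
    return args


print(phantomjs_service_args('60.186.9.233', auth=('username', 'password')))
```

  The returned list can be passed as service_args when creating the PhantomJS driver.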

 


Origin www.cnblogs.com/zivli/p/11060183.html