Python3 Web crawler: Selenium chrome configure the proxy version of the Python method

This article introduces Selenium chrome version of the Python proxy configuration methods, Xiao Bian feel very good, now for everyone to share, but also to be a reference. Follow Xiaobian to see it come together
environment: windows 7 + Python 3.5.2 + Selenium 3.4.2 + Chrome Driver 2.29 + Chrome 58.0.3029.110 (64-bit)

Selenium official to Firefox proxy configuration does not take effect, did not see the appropriate configuration for Chrome Selenium authorities did not inform how to configure, but it is effective in two ways:

  1. Connection no user name password authentication proxy
chromeOptions = webdriver.ChromeOptions()
chromeOptions.add_argument('--proxy-server=http://ip:port') 
driver = webdriver.Chrome(chrome_options=chromeOptions)
  1. Connection user name and password
from selenium import webdriverdef create_proxyauth_extension(proxy_host, proxy_port,
                proxy_username, proxy_password,
                scheme='http', plugin_path=None):
  """Proxy Auth Extension
 
  args:
    proxy_host (str): domain or ip address, ie proxy.domain.com
    proxy_port (int): port
    proxy_username (str): auth username
    proxy_password (str): auth password
  kwargs:
    scheme (str): proxy scheme, default http
    plugin_path (str): absolute path of the extension    
 
  return str -> plugin_path
  """
  import string
  import zipfile
 
  if plugin_path is None:
    plugin_path = 'd:/webdriver/vimm_chrome_proxyauth_plugin.zip'
 
  manifest_json = """
  {
    "version": "1.0.0",
    "manifest_version": 2,
    "name": "Chrome Proxy",
    "permissions": [
      "proxy",
      "tabs",
      "unlimitedStorage",
      "storage",
      "<all_urls>",
      "webRequest",
      "webRequestBlocking"
    ],
    "background": {
      "scripts": ["background.js"]
    },
    "minimum_chrome_version":"22.0.0"
  }
  """
 
  background_js = string.Template(
  """
  var config = {
      mode: "fixed_servers",
      rules: {
       singleProxy: {
        scheme: "${scheme}",
        host: "${host}",
        port: parseInt(${port})
       },
       bypassList: ["foobar.com"]
      }
     };
 
  chrome.proxy.settings.set({value: config, scope: "regular"}, function() {});
 
  function callbackFn(details) {
    return {
      authCredentials: {
        username: "${username}",
        password: "${password}"
      }
    };
  }
 
  chrome.webRequest.onAuthRequired.addListener(
        callbackFn,
        {urls: ["<all_urls>"]},
        ['blocking']
  );
  """
  ).substitute(
    host=proxy_host,
    port=proxy_port,
    username=proxy_username,
    password=proxy_password,
    scheme=scheme,
  )
  with zipfile.ZipFile(plugin_path, 'w') as zp:
    zp.writestr("manifest.json", manifest_json)
    zp.writestr("background.js", background_js)
 
  return plugin_path
 
proxyauth_plugin_path = create_proxyauth_extension(
  proxy_host="proxy.crawlera.com",
  proxy_port=8010,
  proxy_username="fea687a8b2d448d5a5925ef1dca2ebe9",
  proxy_password=""
)
 
 
co = webdriver.ChromeOptions()
co.add_argument("--start-maximized")
co.add_extension(proxyauth_plugin_path)
 
 
driver = webdriver.Chrome(chrome_options=co)
driver.get(<a href="http://www.amazon.com/" rel="external nofollow">http://www.amazon.com/</a>)

I write to you, for everyone to recommend a very wide python learning resource gathering, click to enter , there is a senior programmer before learning to share experiences, study notes, there is a chance of business experience, and for everyone to carefully organize a python zero the basis of the actual project data, daily python to you on the latest technology, prospects, learning to leave a message of small details

Published 46 original articles · won praise 30 · views 70000 +

Guess you like

Origin blog.csdn.net/haoxun09/article/details/104828272