Python Web Scraping: Using a Proxy with Selenium + Chrome - 代码天地

Category: Other · 2020-01-15 11:31:01

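Here is what this article covers:

Installing the Python selenium library
Downloading and installing the Chrome WebDriver (ChromeDriver)
Using a proxy with Selenium + Chrome
Further study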

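Setting up the development environment:

PS: If you already have everything installed, skip ahead to the next step. If not, follow the steps below.

Installing the selenium library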

pip install selenium

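Installing the Chrome WebDriver

The driver has to be reachable through your system environment: put the ChromeDriver executable in the Scripts directory of your Python installation, the same directory that contains pip.
Here is a command for finding where Python is installed:

# Windows: open cmd
where python
# Linux
whereis python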
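
The same lookup can also be done from inside Python, which avoids guessing which interpreter `where` or `whereis` found. The following is a minimal sketch using only the standard library; the function names `python_locations` and `find_chromedriver` are made up for this illustration, and the executable name `chromedriver` is an assumption about how the downloaded driver is named on your system.

```python
import shutil
import sys
import sysconfig


def python_locations():
    """Return the running interpreter's path and its scripts directory."""
    return {
        "executable": sys.executable,
        # "Scripts" on Windows, typically "bin" on Linux and macOS.
        "scripts": sysconfig.get_path("scripts"),
    }


def find_chromedriver(name="chromedriver"):
    """Return the full path of the driver if it is on PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    for label, path in python_locations().items():
        print(label, "->", path)
    print("chromedriver ->", find_chromedriver() or "not found on PATH")
```

If `find_chromedriver()` returns `None`, the driver is probably not in a directory listed on PATH, and copying it into the reported scripts directory is one way to fix that.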

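Google Chrome

Note that the Chrome version has to be compatible with the ChromeDriver downloaded earlier. The original post gives the driver version as 7.9 (most likely meaning 79), so Chrome should be at least that version. Installing the browser itself is left to you.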
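
Because ChromeDriver releases are tied to Chrome releases, a quick sanity check is to compare the major version numbers of the two. The sketch below is only an illustration: the helper names are hypothetical, and the version strings in the usage example are sample values, not output taken from the original post.

```python
def major_version(version):
    """Return the leading integer of a dotted version string."""
    return int(version.strip().split(".")[0])


def versions_compatible(chrome_version, driver_version):
    """Treat browser and driver as compatible when their major versions match."""
    return major_version(chrome_version) == major_version(driver_version)


if __name__ == "__main__":
    # Sample strings for illustration only.
    print(versions_compatible("79.0.3945.88", "79.0.3945.36"))
    print(versions_compatible("80.0.3987.16", "79.0.3945.36"))
```

With a running session, strings like these can usually be read from `driver.capabilities`, although the exact keys depend on the Selenium and driver versions in use.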

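Code example

Now that the environment is set up, let's write a few lines of code to try it out, using a request to Baidu as the example.

from selenium import webdriver
# Open a Chrome browser through webdriver
chrome = webdriver.Chrome()
chrome.get('https://www.baidu.com')
print(chrome.page_source)
chrome.quit()  # Close the browser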

Run it: Chrome opens first, then loads Baidu, then the page source is printed, and finally the browser closes.
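
One thing the example above does not handle is an exception between opening the browser and calling `quit()`, which would leave a Chrome window and a driver process behind. A possible way to guard against that is a small context manager, sketched here. The names `browser_session` and `fetch_page_source` are hypothetical, and the `make_driver` parameter is an assumption added so that the helper can be exercised without a real browser.

```python
from contextlib import contextmanager


@contextmanager
def browser_session(make_driver):
    """Create a driver, hand it to the caller, and always quit it afterwards."""
    driver = make_driver()
    try:
        yield driver
    finally:
        driver.quit()  # Runs even if the body of the with-block raises.


def fetch_page_source(url, make_driver=None):
    """Open url in a browser and return the page source."""
    if make_driver is None:
        # Imported lazily so this module loads even without selenium installed.
        from selenium import webdriver
        make_driver = webdriver.Chrome
    with browser_session(make_driver) as driver:
        driver.get(url)
        return driver.page_source
```

Calling `fetch_page_source('https://www.baidu.com')` should behave like the original script, except that the browser is closed even when `get` fails.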

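Using a proxy

To send requests through a proxy IP, one extra argument is needed. The code is as follows:

from selenium import webdriver

chrome_options = webdriver.ChromeOptions()
# Proxy IP, provided by Kuaidaili (快代理)
proxy = '60.17.254.157:21222'
# Set the proxy
chrome_options.add_argument('--proxy-server=%s' % proxy)
# Note: pass the chrome_options object defined above as the options argument
chrome = webdriver.Chrome(options=chrome_options)
# Look up the current IP through a Baidu search
chrome.get('https://www.baidu.com/s?ie=UTF-8&wd=ip')
print(chrome.page_source)
chrome.quit()  # Close the browser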
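
A malformed proxy string tends to fail quietly, so it can help to validate the `host:port` value before handing it to Chrome. The following sketch wraps the steps above in two helpers. Both function names are hypothetical, and the explicit `http://` scheme is an assumption about the proxy type; the original code passes the bare `host:port` form.

```python
def proxy_argument(proxy, scheme="http"):
    """Build a --proxy-server argument from a 'host:port' string."""
    host, sep, port = proxy.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError("expected 'host:port', got %r" % proxy)
    return "--proxy-server=%s://%s:%s" % (scheme, host, port)


def make_proxied_chrome(proxy):
    """Start Chrome with all traffic routed through the given proxy."""
    # Imported lazily so proxy_argument can be used without selenium installed.
    from selenium import webdriver
    options = webdriver.ChromeOptions()
    options.add_argument(proxy_argument(proxy))
    return webdriver.Chrome(options=options)
```

For example, `proxy_argument('60.17.254.157:21222')` produces `--proxy-server=http://60.17.254.157:21222`, and `make_proxied_chrome` returns a driver that should still be closed with `quit()` when you are done.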

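Run it and check the IP address shown on the result page. If the proxy is in effect, it should be the proxy's address rather than your own.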
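
Reading through the whole printed page to find the IP is tedious, so the check can be automated. This is a rough sketch that assumes the address appears as plain text somewhere in `page_source`; the helper names are hypothetical, and a page may contain other IPv4-looking strings, so treat a match as a hint rather than proof.

```python
import re

# Matches dotted IPv4 addresses whose four parts are each in the range 0-255.
_OCTET = r"(?:25[0-5]|2[0-4]\d|1?\d?\d)"
IPV4_PATTERN = re.compile(r"\b(?:%s\.){3}%s\b" % (_OCTET, _OCTET))


def find_ipv4(page_source):
    """Return the distinct IPv4 addresses found in the text, in order of appearance."""
    seen = []
    for match in IPV4_PATTERN.findall(page_source):
        if match not in seen:
            seen.append(match)
    return seen


def proxy_in_use(page_source, proxy):
    """Check whether the host part of 'host:port' shows up in the page."""
    host = proxy.rpartition(":")[0]
    return host in find_ipv4(page_source)
```

In the script above, `proxy_in_use(chrome.page_source, proxy)` could replace the `print` call before `chrome.quit()`.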

Going further

Don't want to use Chrome and would rather use Firefox? No problem: webdriver supports Firefox too. Take a look at webdriver's built-in help:

from selenium import webdriver
help(webdriver)

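The help output shows that webdriver supports not only Firefox and Chrome but also IE, Opera, and others. Which browsers are listed depends on the installed Selenium version.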
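
For Firefox, the `--proxy-server` flag does not apply; one common approach is to set Firefox's own proxy preferences instead. The sketch below is not from the original post. The helper names are hypothetical, it assumes an HTTP/HTTPS proxy given as `host:port`, and it needs geckodriver installed in the same way ChromeDriver was installed above.

```python
def firefox_proxy_prefs(proxy):
    """Translate 'host:port' into Firefox manual-proxy preferences."""
    host, sep, port = proxy.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError("expected 'host:port', got %r" % proxy)
    return {
        "network.proxy.type": 1,  # 1 means manual proxy configuration
        "network.proxy.http": host,
        "network.proxy.http_port": int(port),
        "network.proxy.ssl": host,
        "network.proxy.ssl_port": int(port),
    }


def make_proxied_firefox(proxy):
    """Start Firefox with HTTP and HTTPS traffic routed through the proxy."""
    # Imported lazily so firefox_proxy_prefs can be used without selenium installed.
    from selenium import webdriver
    options = webdriver.FirefoxOptions()
    for name, value in firefox_proxy_prefs(proxy).items():
        options.set_preference(name, value)
    return webdriver.Firefox(options=options)
```

The rest of the workflow (`get`, `page_source`, `quit`) is the same as in the Chrome examples.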

Further study

kdl_csdn

Reposted from blog.csdn.net/kdl_csdn/article/details/103985282
