The previous chapter simulated a Baidu login in python using get and post requests. That approach has many steps, and a mistake in any one of them makes the login fail, so it is hard going for a novice. This chapter describes a simpler way to log in and obtain the post-login cookie.
python has a very powerful library called selenium. It can drive a real browser, use that browser to log in to Baidu, and extract the cookie, so that the next session can reuse this cookie and be treated as logged in. With selenium there is no need to worry about how to simulate the browser's get and post requests, so where conditions permit, the author still prefers to use selenium. Before using selenium, some preparation is needed:
- Download and extract the installer selenium-2.45.0.tar.gz
- Download the browsers: Chrome version 47.0.2526.106 m, firefox version 32.0.1
- Download the browser plug-ins (drivers) that support python. The author's Python34 is installed on the C drive, so the plug-ins go under C:\Python34\: chromedriver.exe for Chrome, firefoxdriver.exe for firefox
After extracting selenium-2.45.0.tar.gz, find setup.py, run python setup.py install, and wait for the installation to complete. Why not use a higher version of selenium? Because a high version of selenium does not support a low version of the browser, and the plug-ins for high versions of the browser do not seem to be updated in time, which leads to unexpected bugs. That is why a lower version of selenium is used here. Remember to turn off automatic updates after installing the browser. For Chrome, the ways to turn off updates are:
- Under C:\Program Files (x86)\Google\, delete the update folder directly
- Right-click Computer -> Manage -> Services -> disable the update service
For firefox it is similar: delete the upgrade exe directly, then disable the update service. In addition, you can find the update setting inside the browser and turn it off there as well.
Finally, put the browser plug-in into the Python34 directory and try the following code:
from selenium import webdriver
# browser = webdriver.Chrome()
browser = webdriver.Firefox()
browser.get("https://www.baidu.com")
If the browser opens successfully, the installation worked.
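If the browser does not open, one thing worth checking is whether python can find the driver executables at all. The following is only a minimal sketch and not part of the original setup: it assumes the drivers live in a directory that is on PATH (for example C:\Python34\, if that directory was added to PATH), and the helper name find_drivers is made up for illustration.

```python
import shutil

# Driver file names as used in this chapter, without the .exe suffix
# (on Windows, shutil.which tries the PATHEXT extensions itself).
DRIVER_NAMES = ["chromedriver", "firefoxdriver"]


def find_drivers(names=DRIVER_NAMES):
    """Return {driver name: full path, or None if not found on PATH}."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_drivers().items():
        print(name, "->", path if path else "not found on PATH")
```

A None entry means the corresponding driver could not be found on PATH, which is a common reason for the browser failing to start.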
selenium can locate elements on a front-end page by id, class, tag name, and the like. Open the Baidu home page, press F12, select the "click to view elements on the page" tool, move the pointer over the search box, and click it. This reveals the html code of the search box:
<input id="kw" class="s_ipt" autocomplete="off" maxlength="255" value="" name="wd">
Similarly, place the pointer over the "百度一下" (Baidu search) button to obtain the button's html code:
<input id="su" class="bg s_btn" type="submit" value="百度一下">
You can see that the id of the search box is kw and the id of the Baidu search button is su. With selenium you can locate the search box, fill in whatever you want to search for, and finally click the search button. The code is:
url = "https://www.baidu.com/"
browser = webdriver.Chrome()
# browser = webdriver.Firefox()
browser.get(url)
# Clear the search box
browser.find_element_by_id("kw").clear()
# Locate the search box by id and type the query
browser.find_element_by_id("kw").send_keys("TTyb")
# Click the "百度一下" (search) button
browser.find_element_by_id("su").click()
You can also locate elements by name, class name, CSS selector, or xpath:
# Locate by name
browser.find_element_by_name("wd").send_keys("TTyb")
# Locate by class name
# browser.find_element_by_class_name("s_ipt").send_keys("TTyb")
# Locate by CSS selector
# browser.find_element_by_css_selector("#kw").send_keys("TTyb")
# Locate by xpath
# browser.find_element_by_xpath("//input[@id='kw']").send_keys("TTyb")
# Click the "百度一下" (search) button
browser.find_element_by_id("su").click()
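The locating strategies above can be gathered behind one small helper. This is only a sketch under assumptions: the names STRATEGIES, locate and search are made up for illustration, and browser is assumed to be a selenium 2.x webdriver object that offers the find_element_by_* methods used in this chapter.

```python
# Locating strategies from this chapter, mapped to the selenium 2.x
# method that implements each one.
STRATEGIES = {
    "id": "find_element_by_id",
    "name": "find_element_by_name",
    "class name": "find_element_by_class_name",
    "css": "find_element_by_css_selector",
    "xpath": "find_element_by_xpath",
}


def locate(browser, strategy, value):
    """Look up a single element using the named strategy."""
    if strategy not in STRATEGIES:
        raise ValueError("unknown strategy: %r" % strategy)
    finder = getattr(browser, STRATEGIES[strategy])
    return finder(value)


def search(browser, text, strategy="id", value="kw"):
    """Type text into the Baidu search box, then click the search button."""
    box = locate(browser, strategy, value)
    box.clear()
    box.send_keys(text)
    locate(browser, "id", "su").click()
```

For example, search(browser, "TTyb", "xpath", "//input[@id='kw']") would do the same as the xpath variant above, and the element is only looked up once instead of once per action.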
Isn't that neat? Logging in to Baidu with selenium is even simpler, and you do not have to care about postdata at all. You only need to:
- Open the login page https://passport.baidu.com/v2/?login
- Find the html of the account field, the password field, and the login button
- Fill in the account and password, and log in directly
On the Baidu login page, the html of the account field is:
<input id="TANGRAM__PSP_3__userName" class="pass-text-input pass-text-input-userName pass-text-input-hover" type="text" autocomplete="off" name="userName" placeholder="手机/邮箱/用户名">
The html of the password field is:
<input id="TANGRAM__PSP_3__password" class="pass-text-input pass-text-input-password" type="password" name="password" placeholder="密码" autocomplete="off">
The html of the login button is:
<input id="TANGRAM__PSP_3__submit" class="pass-button pass-button-submit" type="submit" value="登录">
So the id of the account field is TANGRAM__PSP_3__userName, the id of the password field is TANGRAM__PSP_3__password, and the id of the login button is TANGRAM__PSP_3__submit. In code:
# Clear the account input box
browser.find_element_by_id("TANGRAM__PSP_3__userName").clear()
# Locate the account field by id and type the account name
browser.find_element_by_id("TANGRAM__PSP_3__userName").send_keys("username")
# Clear the password input box
browser.find_element_by_id("TANGRAM__PSP_3__password").clear()
# Locate the password field by id and type the password
browser.find_element_by_id("TANGRAM__PSP_3__password").send_keys("password")
# Click "登录" (log in)
browser.find_element_by_id("TANGRAM__PSP_3__submit").click()
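The three login steps can be wrapped into one function. This is a sketch only: the names fill and login are made up for illustration, the ids are the ones read from the page html above (they may differ if Baidu changes its page), and browser is assumed to be a selenium 2.x webdriver object.

```python
LOGIN_URL = "https://passport.baidu.com/v2/?login"

# Element ids taken from the login page html shown above.
USERNAME_ID = "TANGRAM__PSP_3__userName"
PASSWORD_ID = "TANGRAM__PSP_3__password"
SUBMIT_ID = "TANGRAM__PSP_3__submit"


def fill(browser, element_id, text):
    """Clear the input with the given id, then type text into it."""
    field = browser.find_element_by_id(element_id)
    field.clear()
    field.send_keys(text)


def login(browser, username, password):
    """Open the login page, fill in the account and password, and submit."""
    browser.get(LOGIN_URL)
    fill(browser, USERNAME_ID, username)
    fill(browser, PASSWORD_ID, password)
    browser.find_element_by_id(SUBMIT_ID).click()
```

With a browser created as before, login(browser, "username", "password") would carry out the whole sequence in one call.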
What if a verification code appears? Locate the html of the verification code field, then enter the code by hand:
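verifycode = input("Verification code: ")
# Fill in the verification code ("captcha_id" is a placeholder for the real element id)
browser.find_element_by_id("captcha_id").send_keys(verifycode)
# Click "登录" (log in)
browser.find_element_by_id("TANGRAM__PSP_3__submit").click()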
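Since the verification code does not always appear, the manual step can be made conditional. The sketch below rests on assumptions: the function name submit_with_captcha is made up, captcha_id stands for whatever id you find in the page html, and the verification code input is assumed to be present in the page only when a code is required. It relies on find_elements_by_id, which returns an empty list rather than raising an error when nothing matches.

```python
def submit_with_captcha(browser, captcha_id, submit_id, read_code=input):
    """Fill in the verification code if its input exists, then submit.

    captcha_id is a placeholder: use the id found in the page html.
    Returns True if a verification code was entered, False otherwise.
    """
    # An empty list means there is no verification code field on the page.
    fields = browser.find_elements_by_id(captcha_id)
    if fields:
        code = read_code("Verification code: ")
        fields[0].send_keys(code)
    browser.find_element_by_id(submit_id).click()
    return bool(fields)
```

Passing read_code as a parameter keeps the manual input() prompt as the default while allowing another source for the code to be swapped in.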
After logging in successfully, you can get the post-login cookie:
cookie = [item["name"] + "=" + item["value"] for item in browser.get_cookies()]
The cookie obtained this way is a list of "name=value" strings. Convert the list to a dict and save it locally as json, so that the next time you can load it directly and log in with requests:
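import json
cookie_dict = {}
for item in cookie:
    # Split on the first "=" only, because a cookie value may itself contain "="
    name, value = item.split("=", 1)
    cookie_dict[name] = value
# Save the cookies to a local json file
with open("cookie.json", "w") as file:
    file.write(json.dumps(cookie_dict))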
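To reuse the saved cookie later, the json file can be loaded back and handed to requests. The following is a sketch under assumptions: the function names are made up for illustration, cookie.json is the file written above, and requests is a third-party library that has to be installed separately. It also builds the dict straight from the list that browser.get_cookies() returns, which avoids the intermediate "name=value" strings.

```python
import json


def cookies_to_dict(selenium_cookies):
    """Turn the list returned by browser.get_cookies() into {name: value}."""
    return {item["name"]: item["value"] for item in selenium_cookies}


def save_cookies(cookies, path="cookie.json"):
    """Write the cookie dict to a local json file."""
    with open(path, "w") as handle:
        json.dump(cookies, handle)


def load_cookies(path="cookie.json"):
    """Read the cookie dict back from the json file."""
    with open(path) as handle:
        return json.load(handle)


def fetch_with_cookies(url, path="cookie.json"):
    """Request url with the saved cookies attached."""
    # Imported here so the helpers above work without requests installed.
    import requests
    return requests.get(url, cookies=load_cookies(path))
```

For example, fetch_with_cookies("https://www.baidu.com/") sends the saved cookies along with the request. Whether the site still treats the session as logged in depends on the cookies not having expired.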
Finally, remember to close the browser with browser.quit().