Python crawler learning: logging in with a browser

The previous chapter simulated logging in to Baidu with Python. That approach involves many get and post steps, and a mistake in any one of them makes the login fail, which is hard on beginners. This chapter describes a simpler way to log in and obtain the post-login cookie.

Python has a very powerful library, selenium, which can drive a real browser. You can use the browser to log in to Baidu and extract the cookie, then reuse that cookie next time to stay logged in. With selenium there is also no need to hand-craft the browser's get and post requests; where conditions permit, the author prefers to use selenium. Some preparation is needed before using selenium:

  1. Download and extract the installer selenium-2.45.0.tar.gz
  2. Download the browsers: Chrome version 47.0.2526.106 m, firefox version 32.0.1
  3. Download the browser drivers that let Python control the browser. The author's Python34 is installed on the C drive, so the drivers go under C:\Python34\. For Chrome this is chromedriver.exe; for firefox, firefoxdriver.exe

After extracting selenium-2.45.0.tar.gz, find setup.py and run python setup.py install, then wait for the installation to complete. Why not use a newer version of selenium? Because newer versions of selenium do not support older browsers, and the drivers for newer browsers do not always seem to be updated in time, which leads to unexpected bugs. That is why an older selenium is used here. After installing the browser, remember to turn off its automatic updates. For Chrome:

  1. Delete the update folder directly under C:\Program Files (x86)\Google\
  2. Right-click Computer -> Manage -> Services -> disable Google Update Service

firefox is similar: delete the updater exe, then disable the update service. You can also find the update setting in the browser's own options and turn it off there.

Finally, put the browser drivers into the Python34 directory and try the following code:

from selenium import webdriver

# browser = webdriver.Chrome()
browser = webdriver.Firefox()
browser.get("https://www.baidu.com")

If the browser opens successfully, the installation worked.
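If the browser does not open, one common cause is that the driver executable cannot be found. The following is a minimal sketch of a check, assuming the driver is expected either on the system PATH or in the Python install directory described above. The helper name find_driver is hypothetical, and only the standard library is used.

```python
import shutil
import sys


def find_driver(name, extra_dirs=()):
    """Return the full path of a driver executable, or None if not found.

    Looks on the system PATH first, then in any extra directories
    (for example the Python install directory).
    """
    found = shutil.which(name)
    if found:
        return found
    for directory in extra_dirs:
        # shutil.which accepts a custom search path via the `path` argument
        found = shutil.which(name, path=directory)
        if found:
            return found
    return None


if __name__ == "__main__":
    # sys.prefix is the Python install directory, e.g. C:\Python34
    print(find_driver("chromedriver", extra_dirs=[sys.prefix]))
```

A result of None suggests the driver still needs to be copied into one of the searched directories.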

selenium can locate elements in the page's front end by id, class, tag name, and so on. Open the Baidu home page, press F12, choose the tool for selecting an element on the page, move the pointer over the search box, and click:

Clicking gives you the HTML for the search box:

<input id="kw" class="s_ipt" autocomplete="off" maxlength="255" value="" name="wd">

Similarly, place the pointer over the 百度一下 (Baidu search) button to get its HTML:

<input id="su" class="bg s_btn" type="submit" value="百度一下">

You can see that the search box's id is kw and the 百度一下 button's id is su. With selenium you can locate the search box, type in what you want to search for, and click the button. The code is:

url = "https://www.baidu.com/"

browser = webdriver.Chrome()
# browser = webdriver.Firefox()
browser.get(url)

# Clear the search box
browser.find_element_by_id("kw").clear()

# Locate the search box by id and type the keyword
browser.find_element_by_id("kw").send_keys("TTyb")

# Click the "百度一下" (Baidu search) button
browser.find_element_by_id("su").click()

You can also locate elements by name, class name, CSS selector, or xpath:

# Locate by name
browser.find_element_by_name("wd").send_keys("TTyb")
# Locate by class name
# browser.find_element_by_class_name("s_ipt").send_keys("TTyb")
# Locate by CSS selector
# browser.find_element_by_css_selector("#kw").send_keys("TTyb")
# Locate by xpath
# browser.find_element_by_xpath("//input[@id='kw']").send_keys("TTyb")

# Click the "百度一下" (Baidu search) button
browser.find_element_by_id("su").click()
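Since every strategy is just a different find_element_by_* method, the choice can be made by name. The following is a rough sketch, assuming the selenium 2.x method names used in this chapter; LOCATORS, locate, and search are hypothetical helper names, and browser is any webdriver object created as shown earlier.

```python
# Map a short strategy name to the selenium 2.x method that implements it.
LOCATORS = {
    "id": "find_element_by_id",
    "name": "find_element_by_name",
    "class name": "find_element_by_class_name",
    "css": "find_element_by_css_selector",
    "xpath": "find_element_by_xpath",
}


def locate(browser, how, value):
    """Find one element using the named strategy (hypothetical helper)."""
    if how not in LOCATORS:
        raise ValueError("unknown locator strategy: %r" % how)
    return getattr(browser, LOCATORS[how])(value)


def search(browser, keyword, how="id", value="kw"):
    """Type a keyword into the Baidu search box and click the search button."""
    box = locate(browser, how, value)  # look the element up once and reuse it
    box.clear()
    box.send_keys(keyword)
    locate(browser, "id", "su").click()
```

For example, search(browser, "TTyb", how="xpath", value="//input[@id='kw']") should behave like the xpath line in the snippet before it.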

Amazing, isn't it? Logging in to Baidu with selenium is even simpler. You do not need to care about postdata at all; you only need to:

  1. Open the login page https://passport.baidu.com/v2/?login
  2. Find the HTML for the account, password, and login elements
  3. Fill in the account and password, then log in directly

On the Baidu login page, the HTML for the account field is:

<input id="TANGRAM__PSP_3__userName" class="pass-text-input pass-text-input-userName pass-text-input-hover" type="text" autocomplete="off" name="userName" placeholder="手机/邮箱/用户名">

The HTML for the password field is:

<input id="TANGRAM__PSP_3__password" class="pass-text-input pass-text-input-password" type="password" name="password" placeholder="密码" autocomplete="off">

The HTML for the login button is:

<input id="TANGRAM__PSP_3__submit" class="pass-button pass-button-submit" type="submit" value="登录">

So the account id is TANGRAM__PSP_3__userName, the password id is TANGRAM__PSP_3__password, and the login button id is TANGRAM__PSP_3__submit. In code:

# Clear the account input box
browser.find_element_by_id("TANGRAM__PSP_3__userName").clear()
# Locate the account field by id and fill it in
browser.find_element_by_id("TANGRAM__PSP_3__userName").send_keys("username")

# Clear the password input box
browser.find_element_by_id("TANGRAM__PSP_3__password").clear()
# Locate the password field by id and fill it in
browser.find_element_by_id("TANGRAM__PSP_3__password").send_keys("password")

# Click "登录" (log in)
browser.find_element_by_id("TANGRAM__PSP_3__submit").click()
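The three login steps can be wrapped into one function. This is only a sketch: it assumes the element ids from the HTML shown above are still what the page uses, and the names fill and login are hypothetical. browser is a webdriver object created as shown earlier.

```python
LOGIN_URL = "https://passport.baidu.com/v2/?login"

# Element ids copied from the HTML shown earlier in this chapter
USERNAME_ID = "TANGRAM__PSP_3__userName"
PASSWORD_ID = "TANGRAM__PSP_3__password"
SUBMIT_ID = "TANGRAM__PSP_3__submit"


def fill(browser, element_id, text):
    """Clear an input box located by id, then type text into it."""
    field = browser.find_element_by_id(element_id)
    field.clear()
    field.send_keys(text)


def login(browser, username, password):
    """Open the login page, fill in the account and password, and submit."""
    browser.get(LOGIN_URL)
    fill(browser, USERNAME_ID, username)
    fill(browser, PASSWORD_ID, password)
    browser.find_element_by_id(SUBMIT_ID).click()
```

Looking each field up once in fill avoids repeating the long id for the clear and send_keys calls.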

What if a verification code appears? Locate the verification code's HTML element and enter the code manually:

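verifycode = input("Verification code: ")
# Fill in the verification code ("verification_code_id" is a placeholder for the element's actual id)
browser.find_element_by_id("verification_code_id").send_keys(verifycode)
# Click "登录" (log in)
browser.find_element_by_id("TANGRAM__PSP_3__submit").click()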
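The verification code does not always appear, so it may be convenient to ask for it only when its input box is present. This is a sketch under assumptions: the id of the verification code box is not given in this chapter, so code_input_id must be supplied by the reader, and submit_with_optional_code is a hypothetical name. It relies on find_elements_by_id (plural), which returns an empty list when nothing matches instead of raising an error. An element that is present but hidden would still be treated as present here.

```python
def submit_with_optional_code(browser, code_input_id, submit_id, ask=input):
    """Fill in the verification code if its input box exists, then submit.

    Returns True if a verification code was asked for, otherwise False.
    `ask` is the function used to prompt the user (input by default).
    """
    # The plural find_elements_* form returns [] when no element matches
    boxes = browser.find_elements_by_id(code_input_id)
    if boxes:
        code = ask("Verification code: ")
        boxes[0].send_keys(code)
    browser.find_element_by_id(submit_id).click()
    return bool(boxes)
```

Passing a different ask function makes it possible to plug in another source for the code later without changing the rest of the logic.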

After logging in successfully, you can get the post-login cookie:

cookie = [item["name"] + "=" + item["value"] for item in browser.get_cookies()]

The cookie obtained this way is a list. Convert it to a dict and save it locally as JSON, so that next time requests can use it to log in directly:

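import json

# Convert the "name=value" strings into a dict.
# Split only on the first "=", because a cookie value may itself contain "=".
cookie_dict = {}
for item in cookie:
    name, value = item.split("=", 1)
    cookie_dict[name] = value

# Save the cookies to a local JSON file
with open("cookie.json", "w") as file:
    file.write(json.dumps(cookie_dict))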
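To reuse the saved cookie next time, the JSON file can be loaded back and handed to requests. The sketch below assumes the list-of-dicts shape returned by browser.get_cookies(), with "name" and "value" keys, and uses the cookies parameter of requests.get. The function names are hypothetical, and requests is imported inside fetch_with_cookies so the other helpers work without it installed.

```python
import json


def cookies_to_dict(cookie_list):
    """Build a name -> value dict straight from browser.get_cookies()."""
    return {item["name"]: item["value"] for item in cookie_list}


def save_cookies(cookie_dict, path="cookie.json"):
    """Write the cookie dict to a local JSON file."""
    with open(path, "w") as f:
        json.dump(cookie_dict, f)


def load_cookies(path="cookie.json"):
    """Read the cookie dict back from the JSON file."""
    with open(path) as f:
        return json.load(f)


def fetch_with_cookies(url, cookie_dict):
    """Request a page with the saved login cookies attached."""
    import requests  # third-party library, needed only for this call
    return requests.get(url, cookies=cookie_dict)
```

A later run could then call fetch_with_cookies("https://www.baidu.com/", load_cookies()) without opening a browser. Whether the server still accepts the cookie depends on how long the login session stays valid.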

Finally, remember to close the browser with browser.quit().

Origin blog.csdn.net/qq_40925239/article/details/90674169