Automating the browser: a headless (interfaceless) selenium crawler
I previously covered how to use selenium to drive a browser and perform operations in it.
For details, please see: selenium automated operation browser
But used that way, selenium always opens a browser window. This time we will look at a way to automate the same operations without opening one.
Note: this builds on the environment set up in the previous article, so make sure you can already automate a browser with an interface before continuing.
01, Get web content without an interface
In fact, the interfaceless (headless) operation only adds one piece of code on top of the operation with an interface, though the result looks a bit fancier. Headless operation is also used more often for crawling: the desired element can be obtained directly through the various positioning methods.
Add one important line of code to the browser operation with an interface. It relies on a plug-in, PhantomJS:
driver = webdriver.PhantomJS("path/to/phantomjs")  # replace with the path to your PhantomJS plug-in
You can then use it to get the page content:
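from selenium import webdriver
# start PhantomJS; pass the plug-in path as the first argument if it is not on PATH
driver = webdriver.PhantomJS()
# url is the address of the page to open
driver.get(url="http://www.baidu.com")
# page_source is a property, so it is read without parentheses
html = driver.page_source
print(html)
driver.quit()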
After that, everything works the same as before; there is just that one extra line of code.
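The fetch above can be wrapped in a small helper so the PhantomJS process is always shut down, even when loading the page fails. This is only a sketch: the names make_phantomjs_driver and fetch_page_source are made up for illustration, and it assumes Selenium 3.x (webdriver.PhantomJS was removed in Selenium 4) with the PhantomJS executable reachable on PATH.

```python
def make_phantomjs_driver(executable_path="phantomjs"):
    """Create a PhantomJS driver (assumes Selenium 3.x and an installed PhantomJS)."""
    # Imported lazily so fetch_page_source can be exercised without selenium installed.
    from selenium import webdriver
    return webdriver.PhantomJS(executable_path)


def fetch_page_source(url, driver_factory=make_phantomjs_driver):
    """Open url without an interface, return the page HTML, always quit the driver."""
    driver = driver_factory()
    try:
        driver.get(url)
        return driver.page_source  # a property, so no parentheses
    finally:
        driver.quit()  # otherwise the background browser process may keep running


# Hypothetical usage:
# html = fetch_page_source("http://www.baidu.com")
# print(html[:200])
```

Passing the driver in through driver_factory is a design choice of this sketch, not something selenium requires; it just makes it easy to swap in a different driver later.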
02, Extract content with selenium
Getting the page content without an interface is done. So how do we get a specific piece of content?
Everything still works as in the previous operations: screenshots, positioning, these are all the same.
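Since there is no window to look at, a screenshot is a handy way to check what the interfaceless browser actually rendered before positioning an element. A minimal sketch, assuming a Selenium 3.x driver like the one created above; capture_and_locate and the file name page.png are hypothetical names chosen for this example.

```python
def capture_and_locate(driver, element_id, screenshot_path="page.png"):
    """Save a screenshot of the current page, then locate one element by its id."""
    # save_screenshot writes a PNG file and returns True on success.
    saved = driver.save_screenshot(screenshot_path)
    # find_element_by_id is the Selenium 3.x locator used throughout this article.
    element = driver.find_element_by_id(element_id)
    return saved, element


# Hypothetical usage with an already opened page:
# saved, button = capture_and_locate(driver, "su")
```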
To get the specified content, locate the element with one of the positioning methods from before, then append:
.text
Note that text is a property, so it is written without parentheses.
Let's look at a concrete example:
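from selenium import webdriver
driver = webdriver.PhantomJS()
# url is the address of the page to open
driver.get(url="http://www.baidu.com")
html = driver.page_source  # get the page HTML (a property, not a method)
# locate the element by id, then read its content
element = driver.find_element_by_id("su")
print(element.text)  # content of the element whose id is 'su'; text is also a property
driver.quit()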
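One thing to watch for: .text only returns the visible text inside an element. For form controls such as an input button, the label usually lives in the value attribute and .text may come back empty. The sketch below, with the made-up helper name element_content, reads .text first and falls back to get_attribute("value"). Whether the element with id 'su' needs the fallback depends on the page's markup at the time you run it, so treat that as an assumption.

```python
def element_content(element):
    """Return an element's visible text, falling back to its value attribute."""
    text = element.text  # a property, so no parentheses
    if text:
        return text
    # Input-style elements keep their label in the "value" attribute instead.
    return element.get_attribute("value") or ""


# Hypothetical usage with the driver from the example above:
# print(element_content(driver.find_element_by_id("su")))
```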
Is that problem solved?