What is an iframe?
The <iframe> tag defines an inline frame, which embeds another document inside the current HTML page. All major browsers support the iframe tag.
Put simply, it is a page embedded inside another page, even though the result looks like a single page. In selenium, however, elements inside an iframe cannot be located directly from the main document.
Example:
Print the titles of the top 10 songs on the NetEase Cloud Music hot songs chart.
First we locate the elements for the top 10 songs in the ranking.
Briefly, here is why the id shown in the red box is the one we use:
1. When locating elements, we usually start by targeting a specific song's element directly. Here we find that the song's id attribute is a long number, so we should suspect that the number is randomly generated.
2. Verification: copy the element's id attribute value, refresh the page, and check whether the element's id is still the same as the copied value. If it is not, the suspicion in item 1 is confirmed.
3. Once item 1 is confirmed, we look upward through the ancestor elements to see whether one of them has an attribute value that can be used to locate it directly.
4. The class attribute value of the parent element locates it directly and precisely, so that element is usable.
5. Looking further up, the id attribute value in the blue box contains what looks like a random string, so our first reaction is that this attribute value is unusable.
6. One level further up we find the id attribute value song-list-pre-cache, which is clearly a meaningful name, so it can be used to locate the element directly.
7. At this point, both item 4 and item 6 give a precise locator. The author habitually uses the id attribute.
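The suspicion and the refresh check in the first two steps above can be roughed out in code. This is only a heuristic sketch: the helper names `looks_generated` and `ids_are_stable` are hypothetical, the six-digit threshold is an arbitrary assumption, and the sample id values are made up for illustration.

```python
import re


def looks_generated(id_value, min_digits=6):
    """Heuristic: a long run of digits suggests an auto-generated id."""
    return re.search(r"\d{%d,}" % min_digits, id_value) is not None


def ids_are_stable(ids_before_refresh, ids_after_refresh):
    """Compare id values copied before and after a page refresh."""
    return list(ids_before_refresh) == list(ids_after_refresh)


# Made-up sample values, for illustration only
print(looks_generated("1594238400123"))        # a long number: suspicious
print(looks_generated("song-list-pre-cache"))  # a readable name: likely stable
```

A pattern check like this only raises a suspicion. The refresh comparison is what actually confirms whether an id can be relied on.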
With the elements found, we can go straight to the code and print the song information:
from selenium import webdriver

driver = webdriver.Chrome()
driver.implicitly_wait(10)

# Fetch the chart page
driver.get('https://music.163.com/#/discover/toplist?id=3778678')
div = driver.find_elements_by_css_selector('#song-list-pre-cache .m-table-rank tbody tr:nth-child(-n+10) b')

# Loop over the song elements and print each one's title attribute
for one in div:
    print(one.get_attribute('title'))

driver.quit()
Output:
# After the long implicit wait (10 s), the output is empty and no error is raised. Why?
Key point
- find_elements: note the s (plural). It returns a list; if no element is found, it returns an empty list and raises no error.
- find_element: locates a single element; if the element is not found within the specified time, it raises an error.
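Because find_elements fails silently, it can help to turn an empty result into an explicit error. The following is a minimal sketch, not part of selenium: `require_elements` is a hypothetical helper name, and it assumes the Selenium 3 style `find_elements_by_css_selector` method used in this article (on Selenium 4 the equivalent call is `driver.find_elements(By.CSS_SELECTOR, css)`).

```python
def require_elements(driver, css):
    """Return the elements matching css, or raise if the list is empty.

    driver is assumed to be a selenium WebDriver (Selenium 3 style API).
    """
    elements = driver.find_elements_by_css_selector(css)
    if not elements:
        # find_elements returns [] instead of raising, so raise ourselves
        raise LookupError("no element matched %r" % css)
    return elements
```

With a helper like this, a selector that matches nothing stops the script with a clear message instead of printing nothing.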
Modify our code:
from selenium import webdriver

driver = webdriver.Chrome()
driver.implicitly_wait(10)

# Fetch the chart page
driver.get('https://music.163.com/#/discover/toplist?id=3778678')
songList = driver.find_element_by_id('song-list-pre-cache')
div = songList.find_elements_by_css_selector('.m-table-rank tbody tr:nth-child(-n+10) b')

# Loop over the song elements and print each one's title attribute
for one in div:
    print(one.get_attribute('title'))

driver.quit()
Let's look at the output:
Sure enough, this time an error is raised, on line 8: songList = driver.find_element_by_id('song-list-pre-cache'). But we can clearly locate this element in the browser, so what is going on?
Locating the iframe
The locator itself is fine, so we need to consider whether an iframe is causing the trouble.
Let's look at the element information again.
In the red box we see the id="song-list-pre-cache" element mentioned above, together with its parent and child elements. Looking further up, in the blue box, there is an ancestor element #g_iframe.
Click the blue area and take a look at the iframe.
We can now confirm that the chart's song information is written inside an iframe. So how do we operate on elements inside an iframe?
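When an element is visible in the browser but selenium cannot find it, listing the iframes on the page is a quick check. A minimal sketch, assuming the Selenium 3 style API used in this article; `list_frames` is a hypothetical helper name, not a selenium function.

```python
def list_frames(driver):
    """Return (id, name) for every iframe in the current browsing context.

    Only iframes directly visible from the current context are listed;
    iframes nested inside other iframes are not included.
    """
    frames = driver.find_elements_by_tag_name('iframe')
    return [(f.get_attribute('id'), f.get_attribute('name')) for f in frames]
```

Going by the id seen in the blue box, the list for the chart page should include an entry whose id is g_iframe.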
Switching to an iframe
- Switch by id, when the iframe has a unique id
driver.switch_to.frame('g_iframe')
- Switch by name, when the iframe has a unique name
driver.switch_to.frame('contentFrame')
- With neither an id nor a name, locate the iframe element first, then switch
iframe = driver.find_elements_by_tag_name('iframe')[0]
driver.switch_to.frame(iframe)
- Switch back to the main document
After switching into an iframe, you have to switch back to the main document before you can operate on elements there again.
driver.switch_to.default_content()
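Forgetting to switch back is an easy mistake. One way to make it automatic is a context manager, sketched below. `inside_frame` is a hypothetical helper, not a selenium API; it relies only on `driver.switch_to.frame` and `driver.switch_to.default_content`.

```python
from contextlib import contextmanager


@contextmanager
def inside_frame(driver, frame_reference):
    """Switch into a frame for the with-block, then return to the main document.

    frame_reference may be an id, a name, or an iframe WebElement.
    """
    driver.switch_to.frame(frame_reference)
    try:
        yield driver
    finally:
        # Runs even if the block raises, so later lookups start from the main document
        driver.switch_to.default_content()
```

Usage would look like `with inside_frame(driver, 'g_iframe'):` followed by the element lookups inside the block. Note that this sketch always returns to the top-level document, not to an enclosing iframe.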
Nested iframes
Since the main document can embed an iframe, an iframe can likewise embed another iframe. How do we handle this kind of multi-level nesting?
<html>
  <iframe id="iframe1">
    <iframe id="iframe2"></iframe>
  </iframe>
</html>
To operate on elements inside iframe2, we need to switch twice:
- First switch from the main document to iframe1
driver.switch_to.frame('iframe1')
- Then switch from iframe1 to iframe2
driver.switch_to.frame('iframe2')
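After finishing our work inside iframe2, we may need to go back to iframe1. selenium provides a better way to do this, which avoids the hassle of switching from iframe2 to the main document and then into iframe1 again:
- Switch from iframe2 back to iframe1
driver.switch_to.parent_frame()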
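For deeper nesting, the individual switches can be chained. A minimal sketch: `switch_through` is a hypothetical helper, not a selenium API, that starts from the main document and enters each frame in order, outermost first.

```python
def switch_through(driver, *frame_references):
    """Enter nested frames one level at a time, outermost first."""
    # Start from the main document so the path is always resolved from the top
    driver.switch_to.default_content()
    for reference in frame_references:
        driver.switch_to.frame(reference)
```

With the example ids above, `switch_through(driver, 'iframe1', 'iframe2')` would end up inside iframe2, and a following `driver.switch_to.parent_frame()` would move back up to iframe1.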
Now let's complete the original task:
from selenium import webdriver

driver = webdriver.Chrome()
driver.implicitly_wait(10)

# Fetch the chart page
driver.get('https://music.163.com/#/discover/toplist?id=3778678')

# Locate the iframe element
iframe = driver.find_elements_by_tag_name('iframe')[0]
# Switch into the iframe
driver.switch_to.frame(iframe)

songList = driver.find_element_by_id('song-list-pre-cache')
div = songList.find_elements_by_css_selector('.m-table-rank tbody tr:nth-child(-n+10) b')

# Loop over the song elements and print each one's title attribute
for one in div:
    print(one.get_attribute('title'))

# Switch up one level from the iframe, which here is the main document
driver.switch_to.parent_frame()

# Print a piece of content from the main document
meta = driver.find_element_by_css_selector('meta[name="description"]')
print('\n' + meta.get_attribute('content'))

driver.quit()
The output is as follows:
心如止水
多想在平庸的生活拥抱你
归去来兮
晚安
我曾
四块五
出山
Monsters
那个女孩
像鱼
网易云音乐是一款专注于发现与分享的音乐产品,依托专业音乐人、DJ、好友推荐及社交功能,为用户打造全新的音乐生活。
Reference:
Switching to an iframe in selenium: https://www.cnblogs.com/xiaoxiaolvdou/p/9316805.html