Python: Scraping Academic Discipline Data with Selenium

Category: Other, 2020-02-27 15:23:31

import time
import pandas as pd
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

browser = webdriver.Chrome()  # drive the Chrome browser

wait = WebDriverWait(browser, 3)
try:
    browser.get("https://souky.eol.cn/api/newapi/assess_result")
    wait.until(
        EC.presence_of_element_located((By.XPATH, '/html/body/div[4]/div[1]/ul/li[1]/div')),
    )
except TimeoutException:
    print('Timeout')


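def click_according_text(text):
    """Click the link whose visible text equals `text`."""
    try:
        button = browser.find_element(By.LINK_TEXT, text)
        button.click()
    except Exception:
        print(text + ' is not clickable')
# click_according_text("理学")  # "理学" is the Science category link on the page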

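# Index ranges of the sub-disciplines in each of the seven categories.
# They are not used below; the scraping loop uses the list k instead.
a = [1, 17]
b = [1, 14]
c = [1, 36]
d = [1, 9]
e = [1, 9]
f = [1, 5]
g = [1, 5]
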
def click_locatin_element(element, text):
    """Click the element located by the XPath `element`."""
    try:
        button = browser.find_element(By.XPATH, element)
        button.click()
    except Exception:
        print(text + " is not clickable")

from io import StringIO

CSV_PATH = 'C:/Users/Administrator/Desktop/学科2014.csv'

def get_secien(element):
    # Append the discipline name to the CSV first
    button = browser.find_element(By.XPATH, element)
    text = pd.DataFrame([button.text])
    text.to_csv(CSV_PATH, sep=',', mode='a', header=False, index=False)

    # Click the discipline and give the page time to update
    click_locatin_element(element, element)
    time.sleep(3)

    # Parse the first table from the page as rendered after the click
    data = pd.read_html(StringIO(browser.page_source))[0]
    data.to_csv(CSV_PATH, sep=',', mode='a', header=False, index=False)


# k[i-1] is one more than the number of sub-disciplines in category i
k = [18, 15, 37, 10, 10, 6, 6]
for i in range(1, 8):
    for j in range(1, k[i-1]):
        element = "/html/body/div[4]/div[1]/ul/li["+str(i)+"]/ul/li["+str(j)+"]"
        get_secien(element)
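
The nested loop above builds one XPath per sub-discipline. As a minimal sketch, the same list can be generated up front and inspected without opening a browser. The names `SUBJECT_COUNTS`, `XPATH_TEMPLATE`, and `build_subject_xpaths` are hypothetical helpers introduced for illustration; the counts are simply the loop bounds in `k` minus one, and the XPath layout is the one used in the script, which may not match the live page.

```python
from typing import List

# Sub-disciplines per category: the exclusive upper bounds in k, minus one.
SUBJECT_COUNTS = [17, 14, 36, 9, 9, 5, 5]

# Same XPath layout as the script; assumed, not verified against the live site.
XPATH_TEMPLATE = "/html/body/div[4]/div[1]/ul/li[{category}]/ul/li[{subject}]"


def build_subject_xpaths(counts: List[int]) -> List[str]:
    """Return one XPath per sub-discipline (XPath indices are 1-based)."""
    return [
        XPATH_TEMPLATE.format(category=i, subject=j)
        for i, count in enumerate(counts, start=1)
        for j in range(1, count + 1)
    ]


xpaths = build_subject_xpaths(SUBJECT_COUNTS)
print(len(xpaths))
print(xpaths[0])
print(xpaths[-1])
```

Each entry of `xpaths` could then be passed to `get_secien` in turn, which keeps the page structure assumptions in one place if the site layout changes.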

Author: 乔眉


Reposted from blog.csdn.net/weixin_43213658/article/details/88673290
