Fetching dynamic web page data with Python and Selenium

Other 2019-01-09 23:50:01

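# -*- coding:utf-8 -*-
# Python 3 version: reload(sys), sys.setdefaultencoding and the print
# statement from the original only work on Python 2.

import json
import sys
import time

from bs4 import BeautifulSoup
from selenium import webdriver

# Directory of the running script; output files are written here.
curpath = sys.path[0]
print(curpath)


def getData(url):
    """Render url[0] in Chrome and append each title/detail pair to <url[1]>.json."""
    driver = webdriver.Chrome()
    driver.set_page_load_timeout(40)
    try:
        driver.get(url[0])  # get() returns None; the HTML comes from page_source
        time.sleep(3)       # crude wait for the page's JavaScript to fill in the data
        html = driver.page_source
    finally:
        driver.quit()       # always close the browser, even if loading fails

    soup = BeautifulSoup(html, 'lxml')
    table = soup.find('div', class_="unit_loan_prj_detail")
    if table is None:
        print('div.unit_loan_prj_detail not found; the page may not have finished loading')
        return

    # Field names and field values sit in parallel <span> elements.
    names = [th.get_text() for th in table.find_all('span', class_="prolist_info_title")]
    details = table.find_all('span', class_="prolist_info_detail")
    # get_text() already includes the text of nested <span>s, so no fallback is needed.
    # zip() stops at the shorter list, which avoids an IndexError when the counts differ.
    for name, tr in zip(names, details):
        jsonDump({name: tr.get_text()}, url[1])


def jsonDump(_json, name):
    """Append one JSON object to <name>.json, followed by a comma and a newline."""
    with open(curpath + '/' + name + '.json', 'a', encoding='utf-8') as outfile:
        json.dump(_json, outfile, ensure_ascii=False)
        outfile.write(',\n')


if __name__ == '__main__':
    url = ['http://www.powerec.net/gdwz-web/html/xjxx/inquiry_detail.html?inq_h_id=ZGFmNTM2ZjctOWFlYi00ZDEyLWEyZjItNDFjNjAxYmY4MTZj', 'test']
    getData(url)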
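
Note that jsonDump appends each object followed by a comma and a newline, so the output file is a sequence of objects rather than one valid JSON document. If one object per line (JSON Lines) suits your use, the following is a minimal sketch of writing and reading such a file. The names append_record, read_records and demo, as well as the sample field names, are hypothetical and not part of the original script. The reader also strips a trailing comma, so it should cope with files produced by the script as written.

```python
import json
import os
import tempfile


def append_record(path, record):
    """Append one JSON object per line, keeping non-ASCII text readable."""
    with open(path, "a", encoding="utf-8") as outfile:
        outfile.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_records(path):
    """Read the objects back, tolerating a trailing ',' at the end of a line."""
    records = []
    with open(path, encoding="utf-8") as infile:
        for line in infile:
            line = line.strip().rstrip(",")
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records


def demo():
    """Round-trip two made-up fields through a temporary file."""
    path = os.path.join(tempfile.mkdtemp(), "test.json")
    append_record(path, {"项目名称": "示例"})
    append_record(path, {"数量": "10"})
    return read_records(path)
```

Calling demo() writes the two sample records and returns them as a list of dicts. Writing a newline instead of a comma is a design choice: each line can then be parsed on its own with json.loads, without any cleanup.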

Reposted from blog.csdn.net/u012406790/article/details/75115536
