Python Web Scraper for 电影天堂 (dytt8.net)

Category: Other · 2020-04-21 01:41:33

import requests
from lxml import etree


BASE_URL = "https://www.dytt8.net"
url = "https://www.dytt8.net/html/gndy/dyzz/list_23_1.html"
headers1 = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36"
}

# Fetch the list page. The pages are decoded as GBK; errors="ignore" skips
# any stray bytes that would otherwise raise UnicodeDecodeError.
response = requests.get(url, headers=headers1, timeout=10)
html = etree.HTML(response.content.decode("gbk", errors="ignore"))
# print(etree.tostring(html, encoding="gbk").decode("gbk"))  # uncomment to inspect the parsed page

# Collect the detail-page links (relative hrefs) and make them absolute.
hrefs = html.xpath("//div[@class='co_content8']//ul//a/@href")
ul = [BASE_URL + href for href in hrefs]

# Fetch the first detail page and extract its title and poster image.
url2 = ul[0]
result = {}
response = requests.get(url2, headers=headers1, timeout=10)
content = etree.HTML(response.content.decode("gbk", errors="ignore"))
title = content.xpath("//div[@class='title_all']//font[@color='#07519a']/text()")[0].strip()
result["title"] = title
other_content = content.xpath("//div[@id='Zoom']//p")[0]
img = other_content.xpath(".//img/@src")[0]
result["img"] = img

# Print every text node inside the detail block.
text = other_content.xpath(".//text()")
for val in text:
    print(val)
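
The script above builds detail-page URLs by concatenating strings and prints every text node, including whitespace-only ones. Here is a minimal sketch of two small helpers that could tidy both steps, using only the standard library. The function names `absolutize` and `clean_lines` are hypothetical and not part of the original post, and the sample inputs are made up for illustration rather than taken from the live site.

```python
from urllib.parse import urljoin

BASE_URL = "https://www.dytt8.net"


def absolutize(hrefs, base=BASE_URL):
    """Resolve each href against base; already-absolute URLs are left unchanged."""
    return [urljoin(base, href) for href in hrefs]


def clean_lines(text_nodes):
    """Strip surrounding whitespace (full-width spaces included) and drop empty nodes."""
    stripped = (node.strip() for node in text_nodes)
    return [line for line in stripped if line]


# Illustrative inputs (hypothetical, not real paths from the site)
sample_hrefs = ["/html/example/detail.html", "https://www.dytt8.net/index.html"]
print(absolutize(sample_hrefs))
print(clean_lines(["\r\n", "  first line ", "\u3000", "second line"]))
```

In the script, `absolutize(hrefs)` could stand in for the `BASE_URL + href` step, and `clean_lines(text)` could be applied before the final `print` loop. Unlike plain concatenation, `urljoin` does not produce a broken URL if a link on the page happens to be absolute already.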


Reposted from blog.csdn.net/qq_29983883/article/details/105588432
