import csv

import requests
import parsel
base_url = 'https://nc.lianjia.com/ershoufang/pg1'
headers = {
'User-Agent': "User-Agent:Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0"
}
response = requests.get(base_url,headers=headers).text
htmls = parsel.Selector(response)
urls = htmls.xpath('//ul[@class="sellListContent"]/li/a/@href').extract()
for url in urls:
response = requests.get(url,headers=headers).text
html = parsel.Selector(response)
title_main = html.xpath('//div[@class="title-wrapper"]//div[@class="title"]/h1/text()').extract()
title_sub = html.xpath('//div[@class="title-wrapper"]//div[@class="title"]/div/text()').extract()
price = html.xpath('//div[@class="price "]/span/text()').extract()
label = html.xpath('//div[@class="communityName"]/a[@class="info "]/text()').extract()
areaName = html.xpath('//div[@class="areaName"]/span[@class="info"]/a/text()').extract()
print(title_main)
with open(r"C:\Users\Administrator\Desktop\03\lianjia.csv", 'a', encoding='utf-8')as f:
f.write("{},{},{},{},{}\n".format(title_main, title_sub, price, label, areaName))
# 爬虫03-爬取房源 (Crawler 03 — scrape housing listings)
# Reposted from: blog.csdn.net/qq_41458842/article/details/106223739
# (The remaining lines — "猜你喜欢" / "今日推荐" / "周排行" — were blog-page
# navigation chrome accidentally pasted in with the code.)