[Selenium+Pyecharts] Crawl Sydney rental information, clean the data and visualize it

1. Main purpose:

For international students who have just arrived in Sydney, rental prices are not transparent and it is easy to be cheated by unscrupulous agents or private landlords (I still remember renting in Sydney for the first time with my brother Shi about six years ago: 930 dollars a week for one bedroom and one bathroom, and we were ripped off). This tool may help international students quickly see the average price of each house type in a given Sydney suburb, and how many of each type are currently listed for rent.
With Australia's borders currently closed, rental demand keeps falling. How do you negotiate a new rent with a landlord or agent in a reasonable, well-founded way? This data may help.

2. Function implementation:

1 Users can query any suburb they want to learn about

2 Crawl the latest rental listings for that suburb from Sydney Today, then clean and visualize the data automatically

Sydney Today was chosen as the crawl target because, as Australia's largest Chinese-language platform, it is where most Chinese landlords and tenants list and search for rentals, and the way Chinese landlords furnish their properties tends to be popular with Chinese international students.

3. Code display:

The comments were added while writing this post. If any specific line of code is unclear, please leave a message and we can discuss it.

from selenium import webdriver
import pandas as pd
from selenium.webdriver.common.keys import Keys
import time
from pyecharts.charts import Pie,Bar
from pyecharts import options as opts
from pyecharts.globals import ThemeType

# Get the suburb the user wants to query and how many times to click "load more"
distinct = str(input("Please enter the suburb you want to query: ").lower())
times = int(input("Please enter how many times to load more listings: "))

class SydneyTodayRent():
    # Initialise the driver and the query parameters
    def __init__(self):
        self.url = 'https://www.sydneytoday.com/house_rent'
        self.wd = webdriver.Chrome()
        self.wd.implicitly_wait(10)
        self.distinct = distinct
        self.distinct_1 = distinct[:-1]  # all but the last character, so the autocomplete dropdown is triggered
        self.times = times
    # Open the Sydney Today rental page
    def open(self):
        self.wd.get(self.url)
    # Type the suburb, select it from the autocomplete dropdown, then load more listings
    def input(self):
        element = self.wd.find_element_by_id('autocomplete_suburb')
        element.send_keys(self.distinct_1)
        time.sleep(2)
        element.send_keys(Keys.DOWN)
        time.sleep(1)
        element.send_keys(Keys.ENTER)
        time.sleep(1)
        for i in range(int(self.times)):
            print(f'Loading more listings, round {i + 1}')
            self.wd.find_element_by_css_selector('.btn.btn-default.btn-lg.mtg-loadmore').send_keys(Keys.ENTER)
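The code above stops after the "load more" loop, so the pandas and pyecharts imports are not used yet. As a rough sketch (not from the original post) of how the cleaning and visualization step could look, the function below collects each loaded listing card, extracts a numeric weekly price with pandas, and renders a pie chart of listing counts plus a bar chart of average rent per house type. The CSS selectors '.rent-item', '.rent-type' and '.rent-price', and the helper name scrape_and_visualize, are hypothetical placeholders; the real class names on Sydney Today's page need to be checked in the browser's developer tools.

# Sketch only: the listing-card selectors below are hypothetical placeholders,
# not confirmed against Sydney Today's actual page structure.
def scrape_and_visualize(wd, suburb):
    rows = []
    for card in wd.find_elements_by_css_selector('.rent-item'):
        rows.append({
            'type': card.find_element_by_css_selector('.rent-type').text.strip(),
            'price': card.find_element_by_css_selector('.rent-price').text,
        })
    df = pd.DataFrame(rows)

    # Clean the price text, e.g. "$930/week" -> 930.0, and drop listings without a usable price
    df['price'] = pd.to_numeric(df['price'].str.extract(r'(\d+)')[0], errors='coerce')
    df = df.dropna(subset=['price'])

    counts = df['type'].value_counts()
    avg_price = df.groupby('type')['price'].mean().round(1)

    # Pie chart: how many listings of each house type
    pie = (
        Pie(init_opts=opts.InitOpts(theme=ThemeType.LIGHT))
        .add('Listings', list(zip(counts.index.tolist(), counts.tolist())))
        .set_global_opts(title_opts=opts.TitleOpts(title=f'{suburb}: listings by house type'))
    )
    # Bar chart: average weekly rent for each house type
    bar = (
        Bar(init_opts=opts.InitOpts(theme=ThemeType.LIGHT))
        .add_xaxis(avg_price.index.tolist())
        .add_yaxis('Average weekly rent (AUD)', avg_price.tolist())
        .set_global_opts(title_opts=opts.TitleOpts(title=f'{suburb}: average weekly rent by type'))
    )
    pie.render('rent_pie.html')
    bar.render('rent_bar.html')
    return df

# Possible driver code for the class above plus this sketch:
# spider = SydneyTodayRent()
# spider.open()
# spider.input()
# scrape_and_visualize(spider.wd, spider.distinct)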
