Python introductory example: Get real reviews of tourist attractions - Code World

Enterprise 2023-10-05 16:49:28

Preface

TripAdvisor is a travel review website. Before crawling data from it, you need to understand the site's access rules and crawling restrictions, for example its robots.txt file and terms of use.
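
One way to check a site's access rules programmatically is the standard library's `urllib.robotparser`. The snippet below is a minimal sketch, not part of the original article: the helper name `is_allowed`, the user agent `my-crawler`, the `example.com` URLs, and the sample robots.txt text are all hypothetical and only illustrate the idea. A real check would first download the target site's own robots.txt.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content, for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

print(is_allowed(SAMPLE_ROBOTS_TXT, "my-crawler", "https://example.com/Attractions"))
print(is_allowed(SAMPLE_ROBOTS_TXT, "my-crawler", "https://example.com/private/data"))
```

robots.txt only covers part of a site's rules; rate limits and terms of use still have to be checked by hand.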

Code

For the TripAdvisor website, you can use Python's third-party library Selenium to drive a real browser and simulate user operations on the site in order to obtain data. The following is a simple implementation process:

1. Install the necessary libraries: Selenium, BeautifulSoup, and pandas (used in step 5 to save the data)

pip install selenium beautifulsoup4 pandas

2. Download the webdriver corresponding to the browser and install it into the system

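# Example using the Chrome browser
# Download the chromedriver build that matches your Chrome version first
from selenium import webdriver
from selenium.webdriver.chrome.service import Service

driver_path = "/path/to/chromedriver"
options = webdriver.ChromeOptions()
# --no-sandbox is usually needed when running as root (for example in a container); otherwise it can be omitted
options.add_argument("--no-sandbox")
# Selenium 4 takes the driver path through a Service object instead of executable_path
browser = webdriver.Chrome(service=Service(driver_path), options=options)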

3. Send an HTTP request to obtain the target page data

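from bs4 import BeautifulSoup

url = "https://www.tripadvisor.cn/Attractions-g186338-Activities-London_England.html#FILTERED_LIST"
browser.get(url)  # load the page in the controlled browser
html = browser.page_source  # HTML after the browser has rendered the page
soup = BeautifulSoup(html, "html.parser")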

4. Parse the HTML page and obtain the required data

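results = []
# The class names below reflect the page layout at the time of writing and may need updating.
for element in soup.find_all("div", class_="listItem"):
    name = element.find("div", class_="listing_title").text.strip()
    # Assumes the rating is encoded in the second class name, such as "bubble_45" for 4.5.
    rating_class = element.find("span", class_="ui_bubble_rating")["class"][1]
    rating = int(rating_class.split("_")[1]) / 10
    # Keep only the leading number from the review count text.
    review_count = element.find("a", class_="review_count").text.split(" ")[0]
    results.append((name, rating, review_count))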
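
Because the rating and review count are pulled out of raw strings, it can help to isolate that string handling in small functions that return `None` instead of raising when the markup does not match. The following is a minimal sketch under stated assumptions: the helper names `parse_rating` and `parse_review_count` are hypothetical, and the `bubble_45` class format and the "1,234 reviews" text format are assumptions about the page, not something the site guarantees.

```python
import re
from typing import Optional


def parse_rating(class_name: str) -> Optional[float]:
    """Convert a class such as "bubble_45" into 4.5; return None if it does not match."""
    match = re.fullmatch(r"bubble_(\d)(\d)", class_name)
    if match is None:
        return None
    return int(match.group(1)) + int(match.group(2)) / 10


def parse_review_count(text: str) -> Optional[int]:
    """Extract the first integer from text such as "1,234 reviews"."""
    match = re.search(r"\d[\d,]*", text)
    if match is None:
        return None
    # Drop thousands separators before converting.
    return int(match.group(0).replace(",", ""))


print(parse_rating("bubble_45"))
print(parse_review_count("1,234 reviews"))
```

Returning `None` for unexpected input lets the scraping loop skip or log a listing rather than stop on the first element whose markup differs.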

5. Collect data and save it for later processing and analysis

import pandas as pd
df = pd.DataFrame(results, columns=["name", "rating", "review_count"])
df.to_csv("tripadvisor_data.csv", index=False)
browser.quit()  # close the browser once the data has been saved
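
If pandas is not available, the same save step can be done with the standard library's `csv` module, and the file can be read back later for processing. This is an alternative sketch rather than the article's method; the function names `save_results` and `load_results` and the `FIELDS` constant are hypothetical.

```python
import csv

# Column names matching the tuples collected in step 4.
FIELDS = ["name", "rating", "review_count"]


def save_results(path, rows):
    """Write (name, rating, review_count) tuples to a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(FIELDS)
        writer.writerows(rows)


def load_results(path):
    """Read the CSV back as a list of dicts keyed by column name."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))
```

Values read back through `csv.DictReader` are strings, so numeric columns such as `rating` need converting before analysis.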

Please note that the page structure, and therefore the class names used above, may change as the website changes, so inspect the current page and adjust the selectors accordingly. The steps here are only a simple implementation for reference.

Origin blog.csdn.net/m0_48405781/article/details/131069146
