Transportation Data Technology Assignment 5 - Cluster Analysis Based on Shared Bicycle Travel Data

Data set

The data set contains shared bicycle trip data, where each record represents a shared bicycle trip. Here are the columns in the dataset and their meanings:

1. 'bike ID': Unique identifier of the shared bicycle.
2. 'otime': Departure time, i.e. the start time of the trip.
3. 'olgt': Longitude of the origin (point O).
4. 'olat': Latitude of the origin (point O).
5. 'dlgt': Longitude of the destination (point D).
6. 'dlat': Latitude of the destination (point D).
7. 'time': Travel time, i.e. the duration of the trip.

Together, these fields give the departure time, the origin and destination coordinates, and the duration of each trip. They can be used in a cluster analysis to uncover underlying patterns and structure in shared bicycle travel.

Data preprocessing

First, the travel distance of each record (in km) is estimated from the latitude and longitude of its origin (point O) and destination (point D) using the Haversine formula. The formula gives the great-circle distance between two points on a sphere, so it accounts for the curvature of the Earth.
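As a minimal sketch of this step, the Haversine distance can be written in plain Python as follows. The function name haversine_km and the mean Earth radius of 6371 km are assumptions made for illustration; the original post does not show which implementation or radius it used. The argument names mirror the 'olgt', 'olat', 'dlgt' and 'dlat' columns.

```python
import math

# Mean Earth radius in km. This value is an assumption; the post does not state one.
EARTH_RADIUS_KM = 6371.0


def haversine_km(olgt, olat, dlgt, dlat):
    """Great-circle distance in km between origin and destination.

    All four arguments are coordinates in decimal degrees.
    """
    lon1, lat1, lon2, lat2 = map(math.radians, (olgt, olat, dlgt, dlat))
    delta_lon = lon2 - lon1
    delta_lat = lat2 - lat1
    a = (
        math.sin(delta_lat / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin(delta_lon / 2) ** 2
    )
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))
```

Note that this is a straight-line estimate over the Earth's surface, so it will generally be shorter than the route actually ridden.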


Second, the riding speed (in km/h) is computed from the travel time and the estimated travel distance. The travel time is parsed from a time string into a time type and then converted to a number of seconds. The riding speed is the estimated distance divided by the travel time in seconds, multiplied by 3600 to convert from km/s to km/h.
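The conversion can be sketched as below. The helper names duration_to_seconds and speed_kmh are hypothetical, and the 'H:MM:SS' layout of the 'time' string is an assumption, since the post does not show the raw format. Returning None for a non-positive duration is also a choice made here to avoid dividing by zero, not something taken from the original.

```python
def duration_to_seconds(time_str):
    """Convert a duration string to seconds.

    Assumes an 'H:MM:SS' layout, which is a guess at the 'time' column format.
    """
    hours, minutes, seconds = (int(part) for part in time_str.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def speed_kmh(distance_km, duration_s):
    """Riding speed in km/h from a distance in km and a duration in seconds."""
    if duration_s <= 0:
        # No meaningful speed for a zero or negative duration.
        return None
    # km per second, times 3600 seconds per hour, gives km/h.
    return distance_km / duration_s * 3600
```

For example, a trip of 2 km that takes "0:10:00" (600 seconds) corresponds to a riding speed of about 12 km/h.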

Finally, reasonable limits are imposed based on riding speed.


Origin blog.csdn.net/weixin_54707168/article/details/132661164