Dataset notes: GeoLife GPS data (user guide)

Data link: https://www.microsoft.com/en-us/download/details.aspx?id=52367

1 Basic data information

1.1 Data introduction

  • Collected by 182 users over a period of more than five years (April 2007 to August 2012) as part of the Microsoft Research Asia Geolife project.
  • Each GPS trajectory in this dataset is a sequence of timestamped points, and each point records latitude, longitude, and altitude.
  • The dataset contains 17,621 trajectories with a total distance of about 1.29 million kilometers and a total duration of 50,176 hours.
  • The trajectories were recorded by a variety of GPS loggers and GPS phones, at a variety of sampling rates.
  • 91.5% of the trajectories were logged in a dense representation, for example one point every 1 to 5 seconds or every 5 to 10 meters.

  • This dataset records a wide range of users’ outdoor activities, including not only daily routines such as going home and going to work, but also recreational and sports activities such as shopping, sightseeing, dining, hiking, and cycling.
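The distance and duration totals quoted above can be reproduced per trajectory from the timestamped points. The following is a minimal sketch, not the method used by the dataset authors: it assumes each point is available as a `(datetime, latitude, longitude)` tuple sorted by time, and it uses the haversine great-circle formula with an assumed mean Earth radius. The names `haversine_km` and `trajectory_stats` are hypothetical.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0088  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against floating-point rounding before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def trajectory_stats(points):
    """Return (distance_km, duration_seconds) for time-sorted (datetime, lat, lon) points."""
    if len(points) < 2:
        return 0.0, 0.0
    distance = 0.0
    # Sum the distance of each consecutive pair of points.
    for (_, lat1, lon1), (_, lat2, lon2) in zip(points, points[1:]):
        distance += haversine_km(lat1, lon1, lat2, lon2)
    duration = (points[-1][0] - points[0][0]).total_seconds()
    return distance, duration
```

Dividing the duration by the number of point-to-point steps gives the mean sampling interval, which is one way to check whether a trajectory falls in the dense category described above.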

1.2 Distribution of data

  • Although the dataset spans more than 30 cities in China as well as some cities in the United States and Europe, the majority of the data was generated in Beijing, China.
  • Figure 1 shows the distribution of the dataset in Beijing as a heat map. The numbers to the right of the color bar indicate how many points were recorded at a location.

1.3 Trajectory distance and duration

  • The distributions of distance and duration of trajectories are presented in Figures 2 and 3 respectively.

1.4 Amount of GPS data per user

  • During data collection, some users carried GPS loggers for years, while others contributed only a few weeks of trajectory data.
  • The distribution of collection periods across users is presented in Figure 4, and the distribution of the number of trajectories collected per user is shown in Figure 5.

2 What is new in this dataset: transportation mode labels

  • 73 users have tagged their trajectories with transportation modes, such as driving, taking the bus, bicycling, and walking.
  • Although the labels cover only part of the dataset, there is still enough labeled data to support learning transportation modes.

3 Data format

3.1 Trajectory data

3.2 Transportation mode labels

Transportation modes include: walk, bike, bus, car, subway, train, airplane, boat, run, and motorcycle.
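As an illustration of how these labels might be used, the sketch below parses a label file and looks up the mode in effect at a given time. It rests on assumptions that this note does not spell out: that each labeled user has a tab-separated label file with one header row followed by rows of start time, end time, and mode, and that times are written as `YYYY/MM/DD HH:MM:SS`. The function names `parse_labels` and `mode_at` are hypothetical.

```python
import csv
from datetime import datetime

TIME_FORMAT = "%Y/%m/%d %H:%M:%S"  # assumed timestamp layout in the label file


def parse_labels(lines):
    """Parse tab-separated label rows into (start, end, mode) tuples."""
    reader = csv.reader(lines, delimiter="\t")
    next(reader, None)  # skip the assumed header row
    labels = []
    for row in reader:
        if len(row) < 3:
            continue  # ignore blank or malformed rows
        start = datetime.strptime(row[0], TIME_FORMAT)
        end = datetime.strptime(row[1], TIME_FORMAT)
        labels.append((start, end, row[2].strip()))
    return labels


def mode_at(labels, moment):
    """Return the mode whose labeled interval contains `moment`, or None."""
    for start, end, mode in labels:
        if start <= moment <= end:
            return mode
    return None
```

With an open file, `parse_labels(f)` works directly because a file object iterates over its lines. A linear scan in `mode_at` is adequate for a handful of labels; for tagging every point of a long trajectory, sorting the intervals and using binary search would be a reasonable refinement.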

4 File organization

There are 182 folders in total, each holding all the trajectories of one user.
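Given this one-folder-per-user layout, the number of trajectories per user can be tallied by counting PLT files under each folder. This is a rough sketch: `data_root` is a hypothetical path to the directory that contains the user folders, and the search is recursive so that it does not depend on how the PLT files are nested inside each user folder.

```python
from pathlib import Path


def trajectories_per_user(data_root):
    """Map each user folder name to the number of .plt files found beneath it."""
    counts = {}
    for user_dir in sorted(Path(data_root).iterdir()):
        if not user_dir.is_dir():
            continue  # skip stray files next to the user folders
        counts[user_dir.name] = sum(1 for _ in user_dir.rglob("*.plt"))
    return counts
```

Calling `trajectories_per_user` on the extracted dataset should yield 182 entries if the layout matches the description above.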

Each PLT file contains a single trajectory and is named after its start time. To avoid confusion over time zones, the date/time fields of every point use Greenwich Mean Time (GMT).
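A PLT file can be read with a few lines of code. The sketch below relies on details that this note does not state and that should be checked against the official user guide: that each file begins with six header lines, that each following line is comma-separated with latitude, longitude, an unused field, altitude, a numeric date, a date string, and a time string, and that file names follow the pattern `YYYYMMDDHHMMSS.plt`. The names `parse_plt` and `start_time_from_name` are hypothetical. Timestamps are marked as UTC, which is treated here as equivalent to GMT.

```python
from datetime import datetime, timezone

HEADER_LINES = 6  # assumed number of header lines to skip


def parse_plt(lines):
    """Return a list of (timestamp, latitude, longitude, altitude) tuples."""
    points = []
    for line in list(lines)[HEADER_LINES:]:
        fields = line.strip().split(",")
        if len(fields) < 7:
            continue  # ignore blank or malformed lines
        # Fields 5 and 6 are assumed to hold the date and time strings.
        stamp = datetime.strptime(fields[5] + " " + fields[6], "%Y-%m-%d %H:%M:%S")
        stamp = stamp.replace(tzinfo=timezone.utc)
        points.append((stamp, float(fields[0]), float(fields[1]), float(fields[3])))
    return points


def start_time_from_name(filename):
    """Derive the trajectory start time from an assumed YYYYMMDDHHMMSS.plt name."""
    stem = filename.rsplit(".", 1)[0]
    return datetime.strptime(stem, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
```

Attaching an explicit UTC time zone keeps later conversions honest: local Beijing time, for example, can be obtained with `astimezone` rather than by adding eight hours by hand.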
