Time series forecasting: forecasting data sets (load, wind power, photovoltaic, sales, etc.)

Table of contents

Dataset 1: GEFCom2014 load data

Dataset 2: iQiyi User Retention Prediction Challenge Dataset

Dataset 3: Power Transformer Dataset (ETDataset)

Dataset 4: 2016 Electrician Mathematical Modeling Competition Load Forecasting Dataset

Dataset 5: Wind turbine operation data set

Dataset 6: Australian electricity load and price forecast data

Dataset 7: Changzhou Bridgestone Photovoltaic Dataset

Dataset 8: Xinjiang photovoltaic wind power data set



 

Dataset 1: GEFCom2014 load data

Data set download:

Dataset introduction:

The GEFCom2014 load forecasting data is the public data set of the Global Energy Forecasting Competition 2014 (GEFCom2014), whose load forecasting track is a probabilistic load forecasting task. (Figure: visualization of the data set.)

Dataset 2: iQiyi User Retention Prediction Challenge Dataset

Data set download:

Competition description:

iQiyi is the leading high-quality video entertainment streaming platform in China and worldwide. More than 500 million users enjoy its entertainment services every month. Under the brand slogan "Enjoy Quality", iQiyi has built a professional, licensed video library covering movies, TV series, variety shows, and animation, along with massive user-generated content such as "Sui Ke", to provide users with a rich, professional video experience.

The iQiyi mobile app uses the latest AI technologies, such as deep learning, to personalize the product experience and give users customized entertainment services. The key indicator "N-day retention score" is used to measure user satisfaction. For example, if a user's "7-day retention score" on October 1st equals 3, the user visits the iQiyi app on 3 of the following 7 days (October 2nd to 8th). Predicting a user's retention score is a challenging problem: different users have very different preferences and activity levels. In addition, other factors, such as the leisure time a user has available and the popularity of trending content, show strong cyclical patterns.
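The retention score defined above can be computed from launch records. The following is a minimal sketch, assuming the launch log is available as (user_id, date) pairs with integer day indices; the function name `retention_score` and the sample records are hypothetical and not part of the competition materials.

```python
def retention_score(launch_log, user_id, day, horizon=7):
    """Count the distinct days in (day, day + horizon] on which the user launched the app."""
    # Repeated launches on the same day collapse into one active day.
    active_days = {d for u, d in launch_log if u == user_id}
    return sum(1 for d in range(day + 1, day + horizon + 1) if d in active_days)


# Hypothetical launch log: (user_id, date) pairs, dates are integer day indices.
launch_log = [
    ("u1", 0), ("u1", 2), ("u1", 2), ("u1", 5), ("u1", 7), ("u1", 9),
    ("u2", 1),
]

# For user u1 on day 0 the window is days 1..7; u1 is active on days 2, 5 and 7.
score_u1_day0 = retention_score(launch_log, "u1", 0)
score_u2_day0 = retention_score(launch_log, "u2", 0)
```

With these sample records, u1 gets a 7-day retention score of 3 on day 0, matching the October 1st example in the text.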

This competition uses desensitized, sampled data from the iQiyi app to predict each user's 7-day retention score. Participating teams need to design algorithms for data analysis and prediction.

Data description:

This competition provides a rich data set, including video data, user portrait data, app launch logs, and user viewing and interaction logs. For each user in the test set, the task is to predict that user's "7-day retention score" on a given day. The 7-day retention score ranges from 0 to 7, and predictions are rounded to 2 decimal places.
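A model's raw output may fall outside the valid range, so a post-processing step is often useful. The sketch below clips a prediction to [0, 7] and formats it with 2 decimal places. Whether the competition expects clipping, and the exact submission format, are assumptions; `format_prediction` is a hypothetical helper.

```python
def format_prediction(raw_score, low=0.0, high=7.0):
    """Clip a raw model output to the valid score range and keep 2 decimal places."""
    clipped = min(max(raw_score, low), high)
    return f"{clipped:.2f}"


# Hypothetical raw model outputs.
formatted = [format_prediction(x) for x in (3.14159, -0.3, 7.8)]
```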

User portrait data

Fields (name: description):

  • user_id
  • device_type: iOS, Android
  • device_rom: rom of the device
  • device_ram: ram of the device
  • sex
  • age
  • education
  • occupation_status
  • territory_code

App launch logs

Fields (name: description):

  • user_id
  • date: desensitized, starting from 0
  • launch_type: spontaneous or launched by other apps & deep-links

Video related data

Fields (name: description):

  • item_id: id of the video
  • father_id: album id, if the video is an episode of an album collection
  • cast: a list of actors/actresses
  • duration: video length
  • tag_list: a list of tags

User playback data

Fields (name: description):

  • user_id
  • item_id
  • playtime: video playback time
  • date: timestamp of the behavior

User interaction data

Fields (name: description):

  • user_id
  • item_id
  • interact_type: interaction types such as posting comments, etc.
  • date: timestamp of the behavior

Dataset 3: Power Transformer Dataset (ETDataset)

Data set download:

Data description:

The data set covers two years. In the variants marked with m, a data point is recorded every 15 minutes; these come from two different regions of the same province in China and are named ETT-small-m1 and ETT-small-m2. Each contains 2 years * 365 days * 24 hours * 4 = 70,080 data points. Hourly variants (marked with h), namely ETT-small-h1 and ETT-small-h2, are also provided. Each data point has 8 features: the recording date, the prediction target "oil temperature", and 6 different types of external load values.
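The stated count of 70,080 only works out with four records per hour, that is, one record every 15 minutes. A quick arithmetic check, with the hourly count derived on the same assumption of two 365-day years:

```python
DAYS = 2 * 365          # two years, as stated in the description
HOURS_PER_DAY = 24
RECORDS_PER_HOUR = 4    # one record every 15 minutes

points_m = DAYS * HOURS_PER_DAY * RECORDS_PER_HOUR  # m variants
points_h = DAYS * HOURS_PER_DAY                     # h variants, one record per hour

print(points_m, points_h)
```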

Dataset 4: 2016 Electrician Mathematical Modeling Competition Load Forecasting Dataset

Data set download:

Data introduction:

Dataset 5: Wind turbine operation data set

Data set download:

Data introduction:

The data set contains more than 300,000 records, with features including wind speed, wind direction, temperature, humidity, air pressure, and actual power.

  • WINDSPEED: forecast wind speed
  • WINDDIRECTION: wind direction
  • TEMPERATURE: temperature
  • HUMIDITY: humidity
  • PRESSURE: air pressure
  • PREPOWER: predicted power
  • ROUND(A.WS,1): actual wind speed
  • ROUND(A.POWER,0): actual power
  • YD15: available actual power (the prediction target)
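Because the data set carries both a predicted power column (PREPOWER) and an actual power column (ROUND(A.POWER,0)), a baseline error can be measured directly. The sketch below computes MAE and RMSE on a few made-up rows; the sample values are hypothetical, and the file is assumed to be a CSV with these column names. Note that the column names containing commas must be quoted in the header.

```python
import csv
import io
import math

# Hypothetical sample in CSV form; real values will differ.
SAMPLE_CSV = (
    'WINDSPEED,PREPOWER,"ROUND(A.WS,1)","ROUND(A.POWER,0)",YD15\n'
    "5.2,1200,5.0,1100,1090\n"
    "4.1,900,4.4,1000,1005\n"
    "6.3,1500,6.2,1500,1498\n"
)

reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
# Error of the provided forecast against the measured power, row by row.
errors = [float(r["PREPOWER"]) - float(r["ROUND(A.POWER,0)"]) for r in reader]

mae = sum(abs(e) for e in errors) / len(errors)
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
print(f"MAE={mae:.2f}, RMSE={rmse:.2f}")
```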

Dataset 6: Australian electricity load and price forecast data

Data set download:

Data introduction:

The data set includes the following features: date, hour, dry bulb temperature, dew point temperature, wet bulb temperature, humidity, electricity price, and power load, with a time interval of 30 minutes.

Dataset 7: Changzhou Bridgestone Photovoltaic Dataset

Data set download:

Data introduction:

The data set includes five features: time, station name, irradiation intensity (Wh/㎡), ambient temperature (℃), and full-field power (kW), with a time interval of 5 minutes. (Note: the feature names irradiation intensity (Wh/㎡), ambient temperature (℃), and full-field power (kW) each begin with a space.)
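The leading space in those feature names is easy to trip over when looking up columns by name. A minimal sketch of one way to normalize them, assuming the file is a CSV; the English column names and the sample row here are hypothetical stand-ins for the real header:

```python
import csv
import io

# Hypothetical sample: three of the column names start with a space, as in the note.
SAMPLE_CSV = (
    "time,station name, irradiation intensity, ambient temperature, full-field power\n"
    "2019-01-01 08:00,A,120.5,3.2,850.0\n"
)

reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
# Strip surrounding whitespace from every column name and every value.
rows = [{key.strip(): value.strip() for key, value in row.items()} for row in reader]

print(rows[0]["full-field power"])
```

Without the stripping step, looking up `"full-field power"` would raise a `KeyError`, because the actual key would be `" full-field power"`.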

Dataset 8: Xinjiang photovoltaic wind power data set

Data set download:

  • Baidu Netdisk: Link: https://pan.baidu.com/s/1e3NkiNC_dg3CaZWe9TA1TA?pwd=loam Extraction code: loam 

Introduction to photovoltaic data:

The photovoltaic data set includes the following features: component temperature (℃), temperature (℃), air pressure (hPa), humidity (%), total radiation (W/m2), direct radiation (W/m2), scattered radiation (W/m2), and actual power generation (MW), with a time interval of 15 minutes.


Origin blog.csdn.net/qq_41921826/article/details/134372421