[Data analysis] Predictive analysis using machine learning algorithms (5): Prophet (2021-01-21)

Machine learning methods in time series forecasting (5): Prophet

This is the fifth article in the series "Machine Learning Methods in Time Series Forecasting". If you are interested, you may want to read the previous articles first:
[Data analysis] Predictive analysis using machine learning algorithms (1): Moving Average
[Data analysis] Predictive analysis using machine learning algorithms (2): Linear Regression
[Data analysis] Predictive analysis using machine learning algorithms (3): K-Nearest Neighbours
[Data analysis] Predictive analysis using machine learning algorithms (4): Autoregressive integrated moving average model (AutoARIMA)

1. Introduction to Prophet

Prophet is a procedure for forecasting time series data based on an additive model in which a non-linear trend is fitted together with yearly, weekly, and daily seasonality, plus holiday effects. It works best on time series with strong seasonal effects and several seasons of historical data. Prophet is robust to missing data and shifts in the trend, and typically handles outliers well. Its advantages are:

  • Accurate and fast. Prophet is used in many applications across Facebook to produce reliable forecasts for planning and goal setting. According to its developers, it performs better than other approaches in the majority of cases.
  • Fully automatic. Reasonable forecasts can be obtained on messy data with no manual effort. Prophet is robust to outliers, missing data, and dramatic changes in the time series.
  • Tunable. Prophet gives users many ways to tweak and adjust forecasts. You can use its parameters to bring in domain knowledge and improve the forecast.
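
To make the additive structure concrete, here is a minimal sketch that builds a synthetic daily series as the sum of a trend, seasonal components, a holiday effect, and noise. It is not Prophet's actual implementation; the dates, coefficients, and variable names (trend, weekly, yearly, holiday) are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Two years of daily timestamps (illustrative data, not the stock data set)
ds = pd.date_range("2019-01-01", periods=730, freq="D")
t = np.arange(len(ds))

# g(t): a piecewise-linear trend whose slope changes after one year
trend = np.where(t < 365, 0.05 * t, 0.05 * 365 + 0.01 * (t - 365))

# s(t): weekly and yearly seasonality modelled as simple sine waves
weekly = 1.5 * np.sin(2 * np.pi * t / 7)
yearly = 4.0 * np.sin(2 * np.pi * t / 365.25)

# h(t): a fixed bump on one hypothetical holiday date each year
holiday = np.where((ds.month == 12) & (ds.day == 25), 6.0, 0.0)

# Random noise with a fixed seed so the example is reproducible
rng = np.random.default_rng(0)
noise = rng.normal(scale=0.5, size=len(ds))

# The additive model: y(t) = g(t) + s(t) + h(t) + noise
y = trend + weekly + yearly + holiday + noise
toy = pd.DataFrame({"ds": ds, "y": y})
print(toy.head())
```

Fitting Prophet to a series amounts to estimating components like these from the data, which is why it needs several seasons of history to separate the seasonal terms from the trend.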

Prophet official website: https://facebook.github.io/prophet/
Tutorial: https://facebook.github.io/prophet/docs/quick_start.html#python-api

2. Stock price prediction example

The data set is the same one used in the previous four articles, so that the forecasts of the different algorithms can be compared on the same data. The data set and code are on my GitHub, where anyone who needs them can download them.

Import the packages and read in the data. The fbprophet package can be difficult to install on Windows, so installing it in a conda virtual environment is recommended. If the installation fails, check whether the pystan package was installed successfully; the installation guide on the official website has the details. Note that from version 1.0 onward the package is published under the name prophet, in which case the import becomes from prophet import Prophet.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from fbprophet import Prophet
df = pd.read_csv('NSE-TATAGLOBAL11.csv')

Set the date as the index. To leave the original data untouched, build a new dataframe, new_data, that holds only the date and the closing price.

# setting the index as date
df['Date'] = pd.to_datetime(df['Date'], format='%Y-%m-%d')
df.index = df['Date']

# sort chronologically, then copy only the date and the target variable
# (a column selection replaces the row-by-row loop, which relied on chained assignment)
data = df.sort_index(ascending=True, axis=0)
new_data = data[['Date', 'Close']].copy()

# preparing data: Prophet expects the columns to be named ds (date) and y (target)
new_data.rename(columns={'Close': 'y', 'Date': 'ds'}, inplace=True)
new_data

Split the data into a training set and a validation set.

#train and validation
train = new_data[:987]
valid = new_data[987:]
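
The split above is positional: the first 987 rows form the training set and the rest the validation set. Because the data were sorted by date beforehand, every validation row falls after every training row. The sketch below shows the same idea on made-up data; the helper name chronological_split is hypothetical and the prices are invented.

```python
import numpy as np
import pandas as pd

def chronological_split(frame, n_train):
    """Split a date-sorted frame into the first n_train rows and the rest."""
    if not frame["ds"].is_monotonic_increasing:
        raise ValueError("frame must be sorted by ds before splitting")
    # .copy() so later column assignments do not touch the original frame
    return frame.iloc[:n_train].copy(), frame.iloc[n_train:].copy()

# Illustrative data: ten business days with made-up closing prices
toy = pd.DataFrame({
    "ds": pd.bdate_range("2021-01-04", periods=10),
    "y": np.linspace(100.0, 109.0, 10),
})
train_toy, valid_toy = chronological_split(toy, n_train=7)

print(len(train_toy), len(valid_toy))
# No validation date precedes a training date
print(train_toy["ds"].max() < valid_toy["ds"].min())
```

A random shuffle, common for other machine learning tasks, would leak future prices into the training set, so a time-ordered split is the safer default for forecasting.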

Fit the model.

#fit the model
model = Prophet()
model.fit(train)

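Make predictions. The future dataframe contains the training dates plus as many additional periods as there are rows in the validation set.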

#predictions
close_prices = model.make_future_dataframe(periods=len(valid))
forecast = model.predict(close_prices)
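
One thing to keep in mind: by default make_future_dataframe extends the history with consecutive calendar days (daily frequency), whereas a stock data set only has rows for trading days. The sketch below uses plain pandas on made-up dates to show how the two sets of dates can drift apart. It imitates the default behaviour rather than calling Prophet, and all dates are illustrative assumptions.

```python
import pandas as pd

# Illustrative: the last training date and five validation trading days
last_train_date = pd.Timestamp("2021-01-08")            # a Friday
valid_dates = pd.bdate_range("2021-01-11", periods=5)   # Monday to Friday

# Consecutive calendar days after the last training date (daily frequency)
calendar_days = pd.date_range(
    last_train_date, periods=len(valid_dates) + 1, freq="D"
)[1:]

print(list(calendar_days.strftime("%a %d")))
print(list(valid_dates.strftime("%a %d")))

# How many forecast dates actually coincide with validation dates
overlap = calendar_days.intersection(valid_dates)
print(len(overlap), "of", len(valid_dates), "dates match")
```

If a like-for-like comparison is the aim, one option is to skip the future dataframe and pass the validation dates themselves, for example model.predict(valid[['ds']]), since predict accepts any dataframe with a ds column.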

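Extract the forecast values for the validation period. The first 987 rows of the forecast cover the training history, so only the rows after them are kept.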

forecast_valid = forecast['yhat'][987:]

Compute the RMSE (root mean squared error) on the validation set.

rmse = np.sqrt(np.mean(np.power((np.array(valid['y'])-np.array(forecast_valid)),2)))
rmse 
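
For reference, the RMSE is the square root of the mean of the squared differences between actual and predicted values. The same calculation can be wrapped in a small helper; the name rmse_score is hypothetical and the numbers below are made up for illustration.

```python
import numpy as np

def rmse_score(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    if actual.shape != predicted.shape:
        raise ValueError("actual and predicted must have the same shape")
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Made-up numbers: the errors are 0, 0 and 2, so RMSE = sqrt(4 / 3)
print(rmse_score([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

Because the errors are squared before averaging, a few large misses weigh more heavily than many small ones, and the result is in the same units as the closing price.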

Plot the forecast against the actual values.

#plot
valid = valid.copy()  # work on a copy to avoid pandas' SettingWithCopyWarning
valid['Predictions'] = forecast_valid.values

plt.figure(figsize=(16,8))
plt.plot(train['y'])
plt.plot(valid[['y', 'Predictions']])
plt.show()

As the figure shows, Prophet's forecast for this data set is not very good.


Origin blog.csdn.net/be_racle/article/details/112973932