Prophet Algorithm Framework: Trend Model and Seasonality Model, Principles Explained with Application Practice

This article was completed with the assistance of ChatGPT, which improved writing speed and efficiency.

1. Trend model

1.1. Overview of trend models

When we talk about the trend model in Prophet, we mean the component that describes the overall trend of the time series data. The trend model tells us how the data changes over time: whether it grows, declines, or remains stable.

In Prophet, there are two common forms of trend models:

  • Linear trend model: assumes that the data grows or declines at a constant rate, so it changes along a straight line over time. It suits data with a steady increasing or decreasing trend, for example product sales that rise linearly over time.

  • Logistic growth model: assumes that growth saturates, that is, after some point the growth rate gradually slows and the series levels off. It suits data with saturating growth, such as population growth or market penetration. The model uses an S-shaped curve: growth is rapid at first and then slows over time.

By choosing an appropriate trend form, Prophet can capture trend changes in the data more accurately and produce more reliable forecasts. Depending on the characteristics of the data and the situation behind it, we can choose either the linear trend model or the logistic growth model for the overall trend, which helps us understand and predict how the data evolves over time.

In practical applications of Prophet, the nonlinear trend model takes two forms: the saturating growth model and the piecewise logistic growth model.

  • Saturating growth model: assumes that the trend saturates, that is, it gradually slows down and levels off over time. This kind of model is typically used to describe natural growth processes such as population growth or product market penetration. It models the trend with an S-curve, usually a logistic function.

  • Piecewise logistic growth model: assumes that the trend has different growth rates in different time periods. It divides the trend into segments and applies a logistic growth curve to each segment. This model suits trends whose growth rate changes abruptly, for example under the impact of marketing campaigns or policy changes.

1.2. Principle of trend model

1.2.1. Principle of nonlinear trend model

The logistic growth model uses the logistic function to describe the growth process of the data. The logistic function is also known as the sigmoid function, and its mathematical expression is as follows:

$$g(t)=\frac{C}{1+\exp(-k(t-m))}$$

where $g(t)$ is the predicted value of the growth trend at time $t$, $C$ is the upper limit of the trend, $k$ is the growth rate parameter, and $m$ is the midpoint location parameter of the trend.
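
To make the formula concrete, here is a minimal sketch that evaluates the logistic curve directly. The function name and the parameter values (capacity 100, rate 0.5, midpoint 10) are illustrative assumptions, not values from Prophet.

```python
import math

def logistic_growth(t, cap, k, m):
    """Logistic trend g(t) = cap / (1 + exp(-k * (t - m)))."""
    return cap / (1.0 + math.exp(-k * (t - m)))

# Illustrative values (assumptions): capacity 100, rate 0.5, midpoint at t = 10.
cap, k, m = 100.0, 0.5, 10.0
curve = [logistic_growth(t, cap, k, m) for t in range(0, 21, 5)]
print([round(v, 2) for v in curve])
```

At the midpoint t = m the curve reaches exactly half of the capacity, and for large t it approaches the capacity without exceeding it.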

In the real world, however, the three parameters $C$, $k$, and $m$ cannot all be constant; they are likely to change over time. Therefore, in Prophet the authors consider replacing all three with functions of time, $C(t)$, $k(t)$, and $m(t)$. In addition, the trend of a real time series does not stay the same throughout: at certain times, or under some underlying periodic pattern, the curve changes. Locating such changes is the subject of change point detection.

Assume there are $S$ change points at timestamps $s_j$, $j = 1, \dots, S$, and a vector of rate adjustments $\delta \in \mathbb{R}^S$, where $\delta_j$ is the change in the growth rate at timestamp $s_j$. If the initial growth rate is $k$, the growth rate at time $t$ is $k + \sum_{j:t>s_j}\delta_j$. With the indicator vector $a(t) \in \{0,1\}^S$, where $a_j(t) = 1$ if $t \ge s_j$ and $0$ otherwise, this rate can be written as $k + a(t)^T\delta$.

When the rate changes, the offset $m$ must be adjusted as well so that the segments connect at the change point. The adjustment at change point $j$ is:

$$\gamma_j = \left(s_j - m - \sum_{l<j}\gamma_l\right)\left(1 - \frac{k+\sum_{l<j}\delta_l}{k+\sum_{l\le j}\delta_l}\right)$$

The piecewise logistic growth model is:

$$g(t)=\frac{C(t)}{1+\exp\left(-(k+a(t)^T\delta)\,(t-(m+a(t)^T\gamma))\right)}$$

When using Prophet with growth='logistic', the capacity $C(t)$ must be set in advance. In practice this is done by adding a cap column to the training dataframe and to the future dataframe.
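
The role of the offset adjustments $\gamma_j$ is easier to see in code. The following is a minimal sketch of the piecewise logistic formula, not Prophet's implementation: the function name is hypothetical, the capacity is assumed constant, the change points are assumed sorted, and every rate $k + \sum\delta_l$ is assumed non-zero.

```python
import math

def piecewise_logistic(t, cap, k, m, changepoints, deltas):
    """Evaluate the piecewise logistic trend at time t (constant capacity assumed)."""
    rate, offset = k, m
    for s_j, delta_j in zip(changepoints, deltas):
        if t < s_j:          # a_j(t) = 0 for this and all later change points
            break
        new_rate = rate + delta_j
        # gamma_j = (s_j - m - sum of earlier gammas) * (1 - old rate / new rate)
        gamma_j = (s_j - offset) * (1.0 - rate / new_rate)
        rate, offset = new_rate, offset + gamma_j
    return cap / (1.0 + math.exp(-rate * (t - offset)))

# Illustrative values (assumptions): two change points with rate changes +0.4 and -0.5.
changepoints, deltas = [12.0, 20.0], [0.4, -0.5]
for t in (11.999, 12.0, 19.999, 20.0):
    print(t, round(piecewise_logistic(t, 100.0, 0.3, 10.0, changepoints, deltas), 3))
```

The printed values just before and at each change point are nearly identical, which is exactly what the $\gamma_j$ adjustment is for: the growth rate changes, but the curve itself does not jump.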

1.2.2. Principle of the piecewise linear trend model

In the piecewise linear function model, the trend of the data is divided into multiple time periods, and each time period has a corresponding linear function to describe the change of the data. The advantage of this model is that it can more flexibly capture the trend of data changes and adapt to the speed of change in different time periods.

$$g(t)=(k+a(t)^T\delta)\,t + (m+a(t)^T\gamma)$$

Here $k$ is the growth rate, $\delta$ is the vector of changes in the growth rate, and $m$ is the offset parameter of the trend.

To keep the function continuous, $\gamma_j = -s_j\delta_j$. Note: this is not the same definition of $\gamma_j$ as in the logistic growth function above.
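
As a worked example of the piecewise linear formula, the sketch below accumulates the rate changes $\delta_j$ and the offsets $\gamma_j = -s_j\delta_j$ for all change points at or before $t$. The function name and the numbers are illustrative assumptions.

```python
def piecewise_linear(t, k, m, changepoints, deltas):
    """Evaluate g(t) = (k + a(t)^T delta) * t + (m + a(t)^T gamma), gamma_j = -s_j * delta_j."""
    rate, offset = k, m
    for s_j, delta_j in zip(changepoints, deltas):
        if t >= s_j:                     # a_j(t) = 1
            rate += delta_j
            offset += -s_j * delta_j     # keeps the line continuous at s_j
    return rate * t + offset

# Illustrative values (assumptions): slope 1 until t = 10, then slope 0.5.
for t in (5.0, 10.0, 20.0):
    print(t, piecewise_linear(t, 1.0, 0.0, [10.0], [-0.5]))
```

With these values the trend rises with slope 1 up to t = 10 and with slope 0.5 afterwards, and the two segments meet at the change point.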

How are the change points chosen?

  • Analysts specify them based on experience.
  • Place them at equal intervals, for example by month or by year.
  • Place a large number of candidate change points and give the rate changes a sparse prior, $\delta_j \sim \mathrm{Laplace}(0, \tau)$, so that most rate changes are shrunk toward zero and only a few change points take effect.
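
To get a feel for how the scale $\tau$ of the Laplace prior affects the rate changes, the sketch below draws samples from $\mathrm{Laplace}(0, \tau)$ and measures how many fall very close to zero. The function names, the threshold 0.01, the sample size, and the seed are assumptions chosen for illustration; this only samples from the prior and does not fit anything.

```python
import random

def sample_laplace(tau, rng):
    """Draw from Laplace(0, tau) as a scaled difference of two Exp(1) draws."""
    return tau * (rng.expovariate(1.0) - rng.expovariate(1.0))

def share_near_zero(tau, threshold=0.01, n=20000, seed=0):
    """Fraction of prior draws with |delta| below the threshold."""
    rng = random.Random(seed)
    draws = [sample_laplace(tau, rng) for _ in range(n)]
    return sum(abs(d) < threshold for d in draws) / n

for tau in (0.05, 0.5):
    print(tau, round(share_near_zero(tau), 3))
```

A smaller $\tau$ concentrates more of the prior mass near zero, so fewer change points end up with a noticeable rate change and the trend is less flexible.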

1.3. Trend model application method

1.3.1. Basic method

The form of Prophet's trend model is determined by the growth parameter. When growth is set to linear, the trend is linear, growing or declining at a constant rate between change points. When growth is set to logistic, the trend is a logistic growth model, that is, an S-shaped curve is used to model the trend.

In Prophet, the form of the trend model can be specified by setting the growth parameter when the model is defined.

  • linear: Indicates that a linear trend model is used. This means that the trend is modeled as a linear function, i.e. increasing or decreasing at a constant rate over time.

  • logistic: Indicates the use of a logistic growth model. This means that the trend is modeled as an S-curve, characterized by saturating growth.

In practice, choose the trend form according to the characteristics of your data and your forecasting needs. If your data shows clear saturating growth, consider the logistic growth model (growth='logistic'). If your data grows or declines roughly linearly, possibly with changes in rate at break points, use the linear trend model (growth='linear').

For example, the trend form is set when defining a Prophet model:

from prophet import Prophet

# Define a linear trend model
model_linear = Prophet(growth='linear')

# Define a logistic growth model
model_logistic = Prophet(growth='logistic')

In the above example, model_linear uses a linear trend model, while model_logistic uses a logistic growth model. Choose the trend form that matches your situation to obtain more accurate forecasts.

1.3.2. Piecewise linear models

There are two ways to set the change points in code:

  • Specify the positions of the change points manually.
  • Let the algorithm select them automatically.

By default, Prophet places n_changepoints = 25 potential change points within the first 80% of the time series (changepoint_range = 0.8). The set_changepoints function in forecaster.py first checks boundary conditions, for example whether the number of points in the time series is smaller than n_changepoints. If the conditions are met, the change points are distributed evenly over that range, as the call to np.linspace shows.
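
The placement rule described above can be sketched in plain Python. This is a hedged re-creation of the described behavior, not Prophet's actual code: the function name is hypothetical, and the evenly spaced indexes imitate what np.linspace followed by rounding would produce.

```python
import math

def candidate_changepoint_indexes(n_history, n_changepoints=25, changepoint_range=0.8):
    """Row indexes of evenly spaced candidate change points in the first part of the history."""
    hist_size = int(math.floor(n_history * changepoint_range))
    if n_changepoints + 1 > hist_size:       # boundary check: too few observations
        n_changepoints = hist_size - 1
    if n_changepoints <= 0:
        return []
    step = (hist_size - 1) / n_changepoints
    indexes = [int(round(i * step)) for i in range(n_changepoints + 1)]
    return indexes[1:]                        # the very first observation is not a change point

print(candidate_changepoint_indexes(100))     # 25 indexes, all below 80
print(candidate_changepoint_indexes(10))      # fewer change points for a short history
```

With 100 observations, all 25 candidates fall within the first 80 rows. With only 10 observations, the number of candidates is reduced to fit the available history.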

Parameter | Description
growth | The trend function of the model. There are currently two values, linear and logistic.
changepoints | The dates at which potential change points are placed. If not specified, potential change points are selected automatically.
n_changepoints | The number of potential change points to place. If changepoints is not None, this parameter does not take effect.
changepoint_range | The proportion of the historical data, counted from the start, in which change points may be placed. It is used together with n_changepoints. The larger changepoint_range is, the closer change points can be to the most recent observation. When changepoints is specified, this parameter does not take effect.
changepoint_prior_scale | The flexibility of automatic change point selection. The larger the value, the more readily change points appear.

1.3.3. Example: manually setting change points by month

import numpy as np
from prophet import Prophet
from prophet.plot import add_changepoints_to_plot

# train (columns ds, y) and holidays are assumed to be prepared beforehand.
changepoints = ['2023-04-01', '2023-05-01', '2023-06-01']
model = Prophet(growth='linear', changepoint_prior_scale=5, yearly_seasonality=False,
                holidays=holidays, changepoints=changepoints)

model.fit(train)
# 48 periods of 30 minutes = 1 day
future = model.make_future_dataframe(periods=48, freq='30min', include_history=True)
forecast = model.predict(future)

fig = model.plot(forecast)
a = add_changepoints_to_plot(fig.gca(), model, forecast)

# Find the change points whose mean rate change exceeds the threshold
threshold = 0.01
signif_changepoints = model.changepoints[
    np.abs(np.nanmean(model.params['delta'], axis=0)) >= threshold
] if len(model.changepoints) > 0 else []
signif_changepoints

The detected change points are marked on the forecast plot:
[Figure: forecast plot with change points marked]
Next, draw the component plot:

fig = model.plot_components(forecast)

[Figure: component plot of the forecast, including the trend]
As shown in the trend chart above, electricity consumption was stable in April, electricity consumption decreased in May when the weather became warmer, and electricity consumption increased in June when the weather became hotter.

2. Seasonality model

2.1. Principle of the seasonality model

In Prophet, in order to capture periodic changes in time series data, the authors introduce a seasonal component based on the idea of Fourier series. By representing periodic variation as a linear combination of sine and cosine functions, Prophet is able to model and predict periodic patterns in the data.

Specifically, Prophet uses a truncated Fourier series to represent each seasonality of the time series. Each Fourier term consists of a sine and a cosine function with their own coefficients, which together determine the amplitude and phase of that harmonic. The overall seasonal component is obtained by linearly combining these Fourier terms.

This Fourier series approach lets Prophet adapt flexibly to the periodic patterns of different time series and capture the main periodic features for each specified period. By combining seasonality modeling with trend modeling, Prophet provides a comprehensive time series forecasting framework.

$$s(t)=\sum_{n=1}^{N}\left(a_n\cos\left(\frac{2\pi nt}{P}\right)+b_n\sin\left(\frac{2\pi nt}{P}\right)\right)$$

$P=365.25$ corresponds to a yearly cycle and $P=7$ to a weekly cycle, with $t$ measured in days. Truncating the Fourier series at $N$ terms controls the frequency range of the seasonality, and therefore the ability to fit rapidly changing seasonal patterns: a larger $N$ can follow faster changes but increases the risk of overfitting. For yearly and weekly seasonality, the authors recommend $N=10$ and $N=3$ respectively; these values were found experimentally to perform well in most problems. The choice of $N$ can also be automated with a model selection procedure such as AIC.
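
The effect of the truncation parameter $N$ can be illustrated with a toy target. The sketch below approximates a unit square wave, a deliberately abrupt seasonal pattern chosen here as an assumption, with its Fourier series truncated at $N = 3$ and $N = 10$, and compares the mean squared error. The function names are hypothetical.

```python
import math

def partial_fourier_sum(t, period, n_terms):
    """Fourier series of a unit square wave, truncated at harmonic n_terms."""
    total = 0.0
    for n in range(1, n_terms + 1, 2):          # only odd harmonics are non-zero
        total += math.sin(2.0 * math.pi * n * t / period) / n
    return 4.0 / math.pi * total

def mean_squared_error(period, n_terms, n_points=1000):
    """Average squared gap between the truncated series and the square wave."""
    error = 0.0
    for i in range(n_points):
        t = (i + 0.5) * period / n_points       # midpoints avoid the jump itself
        target = 1.0 if t < period / 2.0 else -1.0
        error += (partial_fourier_sum(t, period, n_terms) - target) ** 2
    return error / n_points

for n_terms in (3, 10):
    print(n_terms, round(mean_squared_error(7.0, n_terms), 4))
```

The error is lower with $N = 10$ than with $N = 3$, which matches the statement above: more Fourier terms can follow sharper seasonal changes, at the cost of more parameters to estimate.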

For example, for yearly seasonality with $N=10$, the feature vector is:

$$X(t)=\left[\cos\left(\frac{2\pi (1)t}{365.25}\right),\dots,\sin\left(\frac{2\pi (10)t}{365.25}\right)\right]$$

Therefore, the seasonal term of the time series is $s(t)=X(t)\beta$, where $\beta$ is given the prior $\beta \sim \mathrm{Normal}(0,\sigma^2)$. Here $\sigma$ is controlled by seasonality_prior_scale, that is, $\sigma = \text{seasonality\_prior\_scale}$. The larger the value, the more pronounced the seasonal effect can be; the smaller the value, the more the seasonal effect is damped. In the code, seasonality_mode has two modes, additive and multiplicative, and the default is additive. In the open source code, $X(t)$ is constructed by the fourier_series function.
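
The construction $s(t) = X(t)\beta$ can be sketched as follows. This is an illustration under stated assumptions rather than Prophet's fourier_series function: the function names are hypothetical, the column order follows the formula above (cosine, then sine, for each $n$), and $\beta$ is simply drawn from its prior instead of being fitted.

```python
import math
import random

def fourier_features(t, period, order):
    """Row X(t): [cos(2*pi*1*t/P), sin(2*pi*1*t/P), ..., cos(2*pi*N*t/P), sin(2*pi*N*t/P)]."""
    row = []
    for n in range(1, order + 1):
        angle = 2.0 * math.pi * n * t / period
        row.extend([math.cos(angle), math.sin(angle)])
    return row

def seasonal_term(t, period, beta):
    """s(t) = X(t) . beta, with the Fourier order inferred from len(beta)."""
    x = fourier_features(t, period, len(beta) // 2)
    return sum(x_i * b_i for x_i, b_i in zip(x, beta))

# Weekly seasonality with N = 3; beta drawn from Normal(0, sigma^2) with sigma = 10 (assumed).
rng = random.Random(0)
beta_weekly = [rng.gauss(0.0, 10.0) for _ in range(2 * 3)]
print(len(fourier_features(3.0, 365.25, 10)))                 # 2 * N features for yearly
print(round(seasonal_term(2.0, 7.0, beta_weekly), 6),
      round(seasonal_term(9.0, 7.0, beta_weekly), 6))         # same value one period later
```

Each seasonality of order $N$ contributes $2N$ columns, and the resulting $s(t)$ repeats exactly every $P$ time units.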

2.2. Default parameters of Prophet

def __init__(
    self,
    growth='linear',
    changepoints=None,
    n_changepoints=25, 
    changepoint_range=0.8,
    yearly_seasonality='auto',
    weekly_seasonality='auto',
    daily_seasonality='auto',
    holidays=None,
    seasonality_mode='additive',
    seasonality_prior_scale=10.0,
    holidays_prior_scale=10.0,
    changepoint_prior_scale=0.05,
    mcmc_samples=0,
    interval_width=0.80,
    uncertainty_samples=1000,
):

The explanation is as follows:

  • growth: This parameter controls the type of trend. The optional values are linear and logistic, and the default value is linear. Use linear if the trend of the data is linear, and logistic if the trend follows a saturating S-shaped curve.
  • changepoints: This parameter specifies the positions of the trend change points as a list of dates. The default value is None, in which case potential change points are selected automatically.
  • n_changepoints: This parameter specifies the number of potential trend change points. The default value is 25, which can be adjusted according to the characteristics of the data. It does not take effect when changepoints is specified.
  • changepoint_range: This parameter specifies the range in which trend change points are placed. The default value is 0.8, which means that change points are placed in the first 80% of the data. It can be set to other values such as 0.5.
  • changepoint_prior_scale: This parameter controls the flexibility of the trend, which determines how closely it can adapt to the data. Smaller values make the trend smoother, and larger values make it more sensitive. The default value is 0.05.
  • yearly_seasonality: This parameter controls whether to include the yearly seasonality component. The default value is 'auto', which lets Prophet decide based on the data. It can also be True, False, or a number of Fourier terms.
  • weekly_seasonality: This parameter controls whether to include the weekly seasonality component. The default value is 'auto'. It can also be True, False, or a number of Fourier terms.
  • daily_seasonality: This parameter controls whether to include the daily seasonality component. The default value is 'auto'. It can also be True, False, or a number of Fourier terms.
  • holidays: This parameter specifies the holiday dates. Pass in a DataFrame with the columns holiday and ds. Built-in country holidays, such as US holidays, can be added with the add_country_holidays method.
  • seasonality_mode: This parameter controls how the seasonality component is combined with the trend. The optional values are additive and multiplicative, and the default value is additive. Use additive if the size of the seasonal effect is roughly constant over time; use multiplicative if it grows or shrinks with the trend.
  • seasonality_prior_scale: This parameter controls the flexibility of the seasonality component, which determines the impact of the seasonality component on the forecast. Smaller values reduce the influence of the seasonal component, and larger values increase it. The default value is 10.0.
  • holidays_prior_scale: This parameter controls the flexibility of the holiday component, which determines the impact of holidays on the forecast. Smaller values reduce the effect of the holiday component, and larger values increase it. The default value is 10.0.
  • mcmc_samples: If greater than 0, full Bayesian inference is performed with the given number of MCMC samples. If 0, MAP estimation is used. The default value is 0.
  • interval_width: This parameter controls the width of the uncertainty interval. The default value is 0.8, indicating an 80% interval. It can be set to 0.95 or other values.
  • uncertainty_samples: The number of simulated draws used to estimate the uncertainty intervals. The default value is 1000.

Prior scales can also be set per component: add_seasonality accepts a prior_scale argument for an individual seasonal component, and the holidays DataFrame accepts an optional prior_scale column for individual holidays.
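
The difference between the two values of seasonality_mode can be shown with a small calculation. This is a simplified sketch with a hypothetical function name and made-up numbers: in additive mode the seasonal term is in the units of the data, while in multiplicative mode it is treated as a fraction of the trend.

```python
def combine(trend, seasonal, mode="additive"):
    """Combine trend and seasonal term as additive or multiplicative seasonality."""
    if mode == "additive":
        return trend + seasonal
    if mode == "multiplicative":
        return trend * (1.0 + seasonal)
    raise ValueError("mode must be 'additive' or 'multiplicative'")

# Seasonal swing (peak minus trough) at two trend levels; all numbers are illustrative.
for trend in (100.0, 200.0):
    additive_swing = combine(trend, 10.0) - combine(trend, -10.0)
    multiplicative_swing = (combine(trend, 0.1, "multiplicative")
                            - combine(trend, -0.1, "multiplicative"))
    print(trend, additive_swing, round(multiplicative_swing, 6))
```

In additive mode the swing stays the same when the trend doubles, whereas in multiplicative mode the swing doubles with the trend.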

2.3. Custom seasonality period

When using Prophet for time series forecasting, specific periodic patterns can be captured by adding custom periods. The following is a case where the custom period is 10 days:

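from prophet import Prophet

# Data initialization omitted; train is assumed to be a DataFrame with columns ds and y.
# Create the Prophet model
model = Prophet()
model.add_seasonality(name='custom_seasonality', period=10, fourier_order=5)

# Fit the model
model.fit(train)

# Create future dates
future_dates = model.make_future_dataframe(periods=48)

# Predict
forecast = model.predict(future_dates)

# Plot the forecast
model.plot(forecast)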

In the above example, we use time series data from a real project. We create a Prophet model and add a seasonal component with a custom period of 10 days: the period parameter specifies the length of the period, set to 10 here, and the fourier_order parameter specifies the order of the Fourier series, set to 5 here.

Next, we fit the model on the training data and create 48 future periods with the make_future_dataframe method (the default frequency is daily, so this is 48 days). Finally, the forecast is obtained by calling the predict method, and the result is plotted with the plot method.
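
The relationship between periods, the frequency, and the forecast horizon is simple arithmetic, sketched below with the standard library only. The helper name and the last history date are hypothetical; the sketch assumes that future timestamps start one step after the last observation.

```python
from datetime import datetime, timedelta

def future_timestamps(last_history, periods, step):
    """Timestamps for `periods` steps of size `step` after the last observation."""
    return [last_history + step * i for i in range(1, periods + 1)]

last = datetime(2023, 6, 30)                                   # hypothetical last history date
daily = future_timestamps(last, 48, timedelta(days=1))         # periods=48 with daily frequency
half_hourly = future_timestamps(last, 48, timedelta(minutes=30))  # periods=48, freq='30min'

print(daily[-1] - last)          # horizon with daily data
print(half_hourly[-1] - last)    # horizon with 30-minute data
```

The same periods=48 covers 48 days with daily data but only one day with 30-minute data, so the horizon always has to be read together with the frequency.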

This allows us to use custom periods in our forecasts and model and forecast 10-day periodicity in the data.
[Figure: forecast result]
The seasonal components are shown in the figure below:
[Figure: seasonal components]
The newly added 10-day seasonal component is shown in the figure below:
[Figure: custom 10-day seasonal component]
Note that you can choose different period lengths and Fourier orders according to your actual data and needs.

3. Notes

3.1. Laplace distribution

A distribution with the density function

$$f(x) = \frac{1}{2\gamma}e^{-\frac{|x-\mu|}{\gamma}}$$

is called the Laplace distribution.

Here $\mu$ is the location parameter and $\gamma$ is the scale parameter. The expectation of the Laplace distribution is $\mu$, and the variance is $2\gamma^2$.

The probability density of the Laplace distribution looks a lot like that of the normal distribution. The figure below plots the density of the standard Laplace distribution ($\gamma=1$) together with the density of the standard normal distribution:
[Figure: density of the standard Laplace distribution and the standard normal distribution]
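
The comparison can also be checked numerically. The sketch below evaluates both densities and approximates the variance of the standard Laplace distribution with a simple midpoint rule. The function names and the integration range are assumptions made for illustration.

```python
import math

def laplace_pdf(x, mu=0.0, gamma=1.0):
    """Laplace density f(x) = exp(-|x - mu| / gamma) / (2 * gamma)."""
    return math.exp(-abs(x - mu) / gamma) / (2.0 * gamma)

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def numeric_variance(pdf, lo=-40.0, hi=40.0, steps=100000):
    """Midpoint-rule approximation of the variance of a zero-mean density."""
    h = (hi - lo) / steps
    return sum((lo + (i + 0.5) * h) ** 2 * pdf(lo + (i + 0.5) * h) * h for i in range(steps))

for x in (0.0, 1.0, 4.0):
    print(x, round(laplace_pdf(x), 5), round(normal_pdf(x), 5))
print(round(numeric_variance(laplace_pdf), 3))
```

The Laplace density has a sharper peak at zero and heavier tails than the normal density, which is why it works as a sparsity-inducing prior, and the numeric variance agrees with $2\gamma^2 = 2$ for $\gamma = 1$.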

3.2. AIC

The Akaike information criterion, or AIC for short, is a measure of the goodness of fit of a statistical model, created and developed by the Japanese statistician Hirotugu Akaike. It is based on the concept of entropy and weighs the complexity of the estimated model against how well the model fits the data.

In general, AIC can be expressed as:

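$$AIC=\frac{2k-2L}{n}$$

where $k$ is the number of parameters in the fitted model, $L$ is the log-likelihood, and $n$ is the number of observations. The assumption is that the errors of the model are independent and normally distributed. Dividing by $n$ gives a per-observation value; the more common unnormalized form is $AIC = 2k - 2L$. In both forms, a lower value indicates a better trade-off between fit and complexity.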
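
As a small worked example of the formula above, the sketch below computes the per-observation AIC for two hypothetical models from their residuals, using the Gaussian log-likelihood evaluated at the maximum-likelihood variance. The function names, residuals, and parameter counts are all made up for illustration.

```python
import math

def gaussian_log_likelihood(residuals):
    """Log-likelihood of i.i.d. normal errors at the maximum-likelihood variance."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n
    return -0.5 * n * (math.log(2.0 * math.pi * sigma2) + 1.0)

def aic_per_observation(k, log_likelihood, n):
    """AIC in the per-observation form (2k - 2L) / n used above."""
    return (2.0 * k - 2.0 * log_likelihood) / n

# Hypothetical residuals: a simple model (k = 2) and a more complex one (k = 4).
simple = [1.0, -1.2, 0.8, -0.9, 1.1, -1.0]
complex_fit = [0.5, -0.4, 0.6, -0.5, 0.4, -0.6]
for name, k, res in (("simple", 2, simple), ("complex", 4, complex_fit)):
    aic = aic_per_observation(k, gaussian_log_likelihood(res), len(res))
    print(name, round(aic, 3))
```

The model with the lower value would be preferred; the extra parameters of the more complex model are only worthwhile if they improve the log-likelihood enough to offset the $2k$ penalty.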



Origin blog.csdn.net/xiaoyw/article/details/131453662