Apologies that the earlier write-up was not detailed enough, leaving some beginners unsure where the data came from. This article annotates the previous mathematical model, the population growth model (in Python), and explains how each parameter that needs to be estimated was obtained in that article.
The libraries that need to be imported are as follows:
from scipy.integrate import odeint
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
The data used in the previous article and in this one are as follows, where t represents the year.
data = pd.DataFrame({
    "x": [3.9, 5.3, 7.2, 9.6, 12.9, 17.1, 23.2, 31.4, 38.6, 50.2, 62.9, 76, 92, 105.7,
          122.8, 131.7, 150.7, 179.3, 203.2, 226.5, 248.7, 281.4],
    "t": range(22),
    "r": [0.2949, 0.3113, 0.2986, 0.2969, 0.2907, 0.3012, 0.3082, 0.2452, 0.2435, 0.242,
          0.2051, 0.1914, 0.1614, 0.1457, 0.1059, 0.1059, 0.1579, 0.1464, 0.1161, 0.1004,
          0.1104, 0.1349]
})
1. The initial growth model
Here the population growth rate is assumed to be constant (which it certainly is not in reality; this assumption will be relaxed step by step later). The parameters can then be obtained by ordinary least squares regression. The algorithm function is as follows. (The detailed derivation of the algorithm is not included here; a dedicated article will follow.)
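Before defining the fitting routine, it may help to see why taking logarithms turns this model into a linear regression problem: under a constant rate r, the model solution is x(t) = x0·e^(r·t), so ln(x) is linear in t. A minimal sketch, with illustrative values for x0 and r (not the article's fitted values):

```python
import numpy as np

# Under a constant growth rate r, x(t) = x0 * exp(r * t), so
# ln(x) = ln(x0) + r * t -- a straight line in t whose slope is r
# and whose intercept is ln(x0). x0 and r below are illustrative.
x0, r = 3.9, 0.2
t = np.arange(5)
x = x0 * np.exp(r * t)
y = np.log(x)
slope = y[1] - y[0]   # constant first difference equals r
intercept = y[0]      # value at t = 0 equals ln(x0)
print(slope, intercept)
```

This is exactly why the code below regresses y = np.log(x) on t rather than x itself.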
def ols(x, y):
    x_mean = np.mean(x)  # mean of x
    y_mean = np.mean(y)  # mean of y
    x_mean_square = np.array([(i - x_mean) ** 2 for i in x])  # (xi - x_bar)^2
    xy_mean = np.array([(i - x_mean) * (j - y_mean) for i, j in zip(x, y)])  # (xi - x_bar)(yi - y_bar)
    b = xy_mean.sum() / x_mean_square.sum()  # regression coefficient (slope)
    a = y_mean - b * x_mean  # intercept
    fitvalue = np.array([a + b * i for i in x])  # fitted values
    y_pred_square = np.array([(i - j) ** 2 for i, j in zip(y, fitvalue)])  # squared residuals
    SE = np.sqrt(y_pred_square.sum() / (len(y) - 2))  # standard error of the regression
    R_square = 1 - y_pred_square.sum() / np.sum([(i - y_mean) ** 2 for i in y])  # coefficient of determination
    return a, b, fitvalue, SE, R_square
This function takes an independent variable x and a dependent variable y. In the output, a is the intercept, b is the regression coefficient, and fitvalue contains the values of the dependent variable fitted by the model. SE is the standard error, which you should examine carefully when using the model for prediction or when assessing the significance of the regression coefficient (significance: whether the independent variable truly affects the dependent variable). R_square is the coefficient of determination of the fitted model: the closer it is to 1, the better the fit, and the clearer the linear relationship between x and y.
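As a quick sanity check, the closed-form slope/intercept formulas inside ols can be compared against NumPy's own np.polyfit on a small made-up dataset (the toy data below is illustrative, not from the article):

```python
import numpy as np

# Sanity check: the slope/intercept formulas used in ols() should
# agree with np.polyfit(deg=1) on the same data. Toy data only.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.8, 5.05, 7.0, 8.9])  # roughly y = 1 + 2x

b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()

b_np, a_np = np.polyfit(x, y, 1)  # returns (slope, intercept)
print(a, b)
```

Both routes compute the same least-squares line, so any discrepancy beyond floating-point noise would point to a bug.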
Running the following code yields the parameters:
x = data.x
y = np.log(x)
x0 = 6.0496
t = data.t
a, b, fitvalue, SE, r2 = ols(t, y)
print("Intercept: {}\nRegression coefficient: {}\nStandard error: {}\nR-squared: {}".format(a, b, SE, r2))
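The value x0 = 6.0496 used above is not arbitrary: since the fitted line is ln(x) = a + b·t, the intercept a estimates ln(x0), so x0 = e^a. A sketch of this recovery, using np.polyfit in place of ols (both fit the same least-squares line):

```python
import numpy as np

# Recovering x0 from the log-linear fit ln(x) = a + b*t: the
# intercept a estimates ln(x0), so x0 = exp(a).
x_data = np.array([3.9, 5.3, 7.2, 9.6, 12.9, 17.1, 23.2, 31.4, 38.6,
                   50.2, 62.9, 76, 92, 105.7, 122.8, 131.7, 150.7,
                   179.3, 203.2, 226.5, 248.7, 281.4])
t = np.arange(22)
b, a = np.polyfit(t, np.log(x_data), 1)  # slope, intercept
x0 = np.exp(a)
print(x0, b)  # x0 comes out close to the 6.0496 used above
```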
2. Growth model with a varying growth rate
Here the earlier assumption that the growth rate is constant is relaxed; the variable r in the data set above is the population growth rate as it changes over time. Running the code below, you will find that the fitted coefficient is negative, but this is not a calculation error: the rate equation assumed in the previous article already carries a negative sign, so the result is correct.
r = data.r
r0, r1, f1, se, R2 = ols(y=r, x=t)
print(r0, r1)
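For reference, the same fit can be reproduced with np.polyfit and the sign of the slope checked directly. Assuming the rate is modelled as r(t) = r0 + r1·t with r1 < 0 (a declining growth rate), integrating dx/dt = r(t)·x gives x(t) = x0·exp(r0·t + r1·t²/2); the sign convention in the previous article may differ, so treat this form as a sketch:

```python
import numpy as np

# Sketch: fit r as a linear function of t and check the slope's sign.
# Under the assumption r(t) = r0 + r1*t, integrating dx/dt = r(t)*x
# yields x(t) = x0 * exp(r0*t + r1*t**2 / 2).
r_data = np.array([0.2949, 0.3113, 0.2986, 0.2969, 0.2907, 0.3012,
                   0.3082, 0.2452, 0.2435, 0.242, 0.2051, 0.1914,
                   0.1614, 0.1457, 0.1059, 0.1059, 0.1579, 0.1464,
                   0.1161, 0.1004, 0.1104, 0.1349])
t = np.arange(22)
r1, r0 = np.polyfit(t, r_data, 1)  # slope, intercept
print(r0, r1)  # r1 is negative: the growth rate declines over time
```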
3. The logistic model
Unfortunately, due to the choice of initial values and other factors, I cannot exactly reproduce the values given in the book; still, the nonlinear least-squares estimation process is given below, and I hope readers will forgive the discrepancy.
The code is as follows:
from scipy.optimize import curve_fit

def func(x, R, xm):
    return R * (1 - x / xm)

x = data['x']
r = data['r']
popt, pcov = curve_fit(func, x, r)  # popt holds the fitted parameters
r0 = popt[0]
xm = popt[1]
print(r0, xm)
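One thing worth knowing about curve_fit is that it starts from an all-ones initial guess by default, and for a nonlinear model the result can depend on that starting point. Supplying a rough initial guess via p0 makes the fit more robust; the guesses below (R ≈ 0.3, xm ≈ 400) are illustrative assumptions, not values from the article:

```python
import numpy as np
from scipy.optimize import curve_fit

# Same fit as above, but with an explicit initial guess p0.
def func(x, R, xm):
    return R * (1 - x / xm)

x = np.array([3.9, 5.3, 7.2, 9.6, 12.9, 17.1, 23.2, 31.4, 38.6,
              50.2, 62.9, 76, 92, 105.7, 122.8, 131.7, 150.7,
              179.3, 203.2, 226.5, 248.7, 281.4])
r = np.array([0.2949, 0.3113, 0.2986, 0.2969, 0.2907, 0.3012,
              0.3082, 0.2452, 0.2435, 0.242, 0.2051, 0.1914,
              0.1614, 0.1457, 0.1059, 0.1059, 0.1579, 0.1464,
              0.1161, 0.1004, 0.1104, 0.1349])
popt, pcov = curve_fit(func, x, r, p0=[0.3, 400])  # p0 is an assumed rough guess
R_fit, xm_fit = popt
print(R_fit, xm_fit)
```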
def logistics(x, t):
    return r0 * x * (1 - x / xm)  # t does not appear explicitly in the ODE

T = np.arange(0, 30, 1)
x = odeint(logistics, 3.9, T)
plt.scatter(data['t'], data['x'], c='r')  # observed data
plt.plot(T, x)  # fitted logistic curve
plt.show()
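Since the logistic ODE has a closed-form solution, x(t) = xm / (1 + (xm/x0 − 1)·e^(−r0·t)), a useful cross-check is to compare it against the odeint output. The parameter values below are illustrative, not the fitted ones from above:

```python
import numpy as np
from scipy.integrate import odeint

# Cross-check: the logistic ODE dx/dt = r0*x*(1 - x/xm) has the
# closed-form solution x(t) = xm / (1 + (xm/x0 - 1)*exp(-r0*t)).
# r0, xm, and x_init below are illustrative values.
r0, xm, x_init = 0.28, 350.0, 3.9

def logistic(x, t):
    return r0 * x * (1 - x / xm)

T = np.arange(0, 30, 1.0)
num = odeint(logistic, x_init, T).ravel()
ana = xm / (1 + (xm / x_init - 1) * np.exp(-r0 * T))
err = np.max(np.abs(num - ana))
print(err)  # tiny if the numerical setup is right
```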
Substituting the two parameters computed by the code above into the model gives the result shown in the figure below.
The fitted curve does not differ greatly from the figure given in the previous article, although the fit is not quite as good.
Summary
Due to the author's limited ability, the final logistic model differs somewhat from the original text, but readers should focus on the process of fitting the model parameters and on the modeling ideas themselves.
That's all for this article! If you have any questions, please leave a comment or send a private message. Thank you for your support.