Interpolation concepts and methods

Interpolation uses known data points to construct a curve or surface that passes through them, so that values between those points can be estimated. Because the interpolated curve passes through every known data point, it reproduces the data exactly at those points. It does not guarantee accuracy anywhere else: away from the known data points, and especially outside their range, the curve may oscillate or behave unreasonably.

Common interpolation methods are as follows:

Linear interpolation: Connects each pair of adjacent data points with a straight line. It is simple and intuitive, and the resulting curve is continuous, but its slope changes abruptly at the data points, so the curve has corners and looks rough.
Polynomial interpolation: Uses a single polynomial that passes through all the data points. Common forms include Lagrange interpolation and Newton interpolation.
Lagrange interpolation constructs a polynomial that equals the original function at every known data point, written as a weighted sum of basis polynomials. The curve passes through the known data points, but with many evenly spaced points (a high-degree polynomial) it may oscillate strongly near the ends of the interval, which is known as Runge's phenomenon.
Newton interpolation builds the polynomial recursively from divided differences so that it equals the original function at the known data points. For the same data points it is the same polynomial as the Lagrange form, so it oscillates in the same way at high degree. Its advantage is that a new data point can be added without recomputing the existing coefficients.
Quadratic interpolation: Constructs a quadratic polynomial through three known data points. The curve is smoother than a linear interpolant and usually follows a curved function more closely between the points. However, a single quadratic can pass through only three data points, so it may not be accurate enough for more complex data. When choosing an interpolation method, weigh its advantages and disadvantages case by case.
Spline interpolation: Approximates the data with piecewise low-degree polynomials that are joined smoothly to obtain a smooth curve. The interpolation interval is divided into small segments, and a low-degree polynomial is used on each segment. The most common variant is cubic spline interpolation. The spline curve is smooth and continuous and usually approximates the actual curve well, and its behavior at the ends, such as slope and curvature, can be controlled through the boundary conditions. However, spline interpolation costs more to compute than linear interpolation, especially for large data sets or high-dimensional data. It also requires choosing boundary conditions, and different boundary conditions may lead to different interpolation results.
Cubic spline interpolation: A commonly used method that approximates the original data with a set of cubic polynomials, one on each segment between adjacent data points. The coefficients of these polynomials are determined by the interpolation condition (each polynomial passes through the data points at the ends of its segment) and the smoothness condition (the function value, first derivative, and second derivative of adjacent polynomials agree at the interior data points), together with boundary conditions at the two ends. Solving these conditions gives the coefficients on every segment and completes the interpolation. The advantage of cubic spline interpolation is that it preserves the characteristics of the original data well and produces a smooth curve. However, it requires solving a linear system for the coefficients (a tridiagonal system in the standard one-dimensional case), so the balance between computational efficiency and accuracy needs to be considered in practical applications.
Least squares interpolation: Finds a function that minimizes the sum of squared errors at the given data points. Strictly speaking this is a fitting method: unlike the methods above, the resulting curve generally does not pass through every data point. Its advantage is that the form of the function can be chosen flexibly, and its parameters are optimized by the least squares method to obtain a good fit. It also has limitations: when the data points are few or unevenly distributed, the fitted function may be inaccurate. In addition, the result should only be trusted within the range of the known data points, because estimates beyond that range are unreliable.
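The relationship between the Lagrange and Newton forms described above can be checked with a minimal sketch. The function names (`lagrange_eval`, `newton_coeffs`, `newton_eval`) and the sample data are illustrative assumptions, not part of the original article; the data are samples of y = x^2 + 1.

```python
import numpy as np


def lagrange_eval(x_nodes, y_nodes, x):
    """Evaluate the Lagrange form of the interpolating polynomial at a scalar x."""
    total = 0.0
    n = len(x_nodes)
    for i in range(n):
        # Basis polynomial L_i(x): equals 1 at x_nodes[i] and 0 at every other node
        basis = 1.0
        for j in range(n):
            if j != i:
                basis *= (x - x_nodes[j]) / (x_nodes[i] - x_nodes[j])
        total += y_nodes[i] * basis
    return total


def newton_coeffs(x_nodes, y_nodes):
    """Compute the divided-difference coefficients of the Newton form."""
    x_arr = np.asarray(x_nodes, dtype=float)
    coeffs = np.array(y_nodes, dtype=float)
    for k in range(1, len(x_arr)):
        # After step k, coeffs[i] holds f[x_{i-k}, ..., x_i] for every i >= k
        coeffs[k:] = (coeffs[k:] - coeffs[k - 1:-1]) / (x_arr[k:] - x_arr[:-k])
    return coeffs


def newton_eval(x_nodes, coeffs, x):
    """Evaluate the Newton form with a Horner-style nested multiplication."""
    result = coeffs[-1]
    for k in range(len(coeffs) - 2, -1, -1):
        result = result * (x - x_nodes[k]) + coeffs[k]
    return result


# Illustrative sample data: y = x^2 + 1 at four nodes
x_nodes = [0, 1, 2, 3]
y_nodes = [1, 2, 5, 10]

coeffs = newton_coeffs(x_nodes, y_nodes)
print(lagrange_eval(x_nodes, y_nodes, 1.5))  # 3.25
print(newton_eval(x_nodes, coeffs, 1.5))     # 3.25
```

Both forms give the same value because they describe the same polynomial. The Newton form only changes how the coefficients are organized, which is what makes adding a further data point cheap.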

Code demonstration of several interpolation methods

Cubic spline interpolation

import numpy as np
from scipy.interpolate import interp1d


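# wavelength and absorbance are aligned lists of data, assumed to be defined earlier
wavelength_arr = np.array(wavelength)  # convert to a NumPy array
absorbance_arr = np.array(absorbance)  # convert to a NumPy array

# Build the interpolation function; 'cubic' is cubic spline interpolation, 'linear' is linear interpolation, 'quadratic' is quadratic interpolation
interpolator = interp1d(wavelength_arr, absorbance_arr, kind='cubic')  # arguments: independent variable, dependent variable, interpolation kind

# Independent-variable values at which to interpolate
# (they must lie within the range of wavelength_arr; by default interp1d raises ValueError otherwise)
interpolated_wavelength_arr = np.linspace(470, 645, 175)  # 175 evenly spaced values from 470 to 645

# Evaluate the interpolant to get the interpolated absorbance data
interpolated_absorbance_arr = interpolator(interpolated_wavelength_arr)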
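The effect of boundary conditions on a cubic spline, mentioned in the method descriptions, can be sketched with `scipy.interpolate.CubicSpline`, which takes a `bc_type` argument. The data below are synthetic samples of sin(x) chosen only for illustration; they are not the wavelength and absorbance data used above.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Synthetic sample data (an assumption for illustration): y = sin(x) at 6 nodes
x_nodes = np.linspace(0.0, 5.0, 6)
y_nodes = np.sin(x_nodes)

# Same data, two different boundary conditions
spline_natural = CubicSpline(x_nodes, y_nodes, bc_type='natural')        # second derivative is 0 at both ends
spline_not_a_knot = CubicSpline(x_nodes, y_nodes, bc_type='not-a-knot')  # SciPy's default

# Both splines pass through every data point
print(np.allclose(spline_natural(x_nodes), y_nodes))
print(np.allclose(spline_not_a_knot(x_nodes), y_nodes))

# Between the data points the two curves differ
x_dense = np.linspace(0.0, 5.0, 101)
max_gap = np.max(np.abs(spline_natural(x_dense) - spline_not_a_knot(x_dense)))
print(max_gap)
```

The two splines agree at the nodes and differ in between, mostly near the ends of the interval, so the boundary condition is a modelling choice rather than a detail.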

Least squares interpolation

import numpy as np
from scipy.optimize import least_squares


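# wavelength and absorbance are aligned lists of data, assumed to be defined earlier
wavelength_arr = np.array(wavelength)  # convert to a NumPy array
absorbance_arr = np.array(absorbance)  # convert to a NumPy array

# Choose the model function, here a polynomial
def polynomial_func(x, coeffs):
    return np.polyval(coeffs, x)  # evaluate the polynomial; arguments: coefficients (highest degree first), independent variable

# Residual function to be minimized
def fit_func(coeffs, x, y):
    return polynomial_func(x, coeffs) - y

# Initial polynomial coefficients (three coefficients, i.e. a quadratic)
initial_coeffs = np.zeros(3)

# Run the least squares optimization
result = least_squares(fit_func, initial_coeffs, args=(wavelength_arr, absorbance_arr))

# Get the fitted parameters
fit_coeffs = result.x

# Independent-variable values at which to evaluate the fitted function
interpolated_wavelength_arr = np.linspace(470, 645, 175)

# Evaluate the fitted polynomial to get the absorbance data
interpolated_absorbance_arr = polynomial_func(interpolated_wavelength_arr, fit_coeffs)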
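For a polynomial model the least squares problem is linear in the coefficients, so it can also be solved directly without an initial guess. The sketch below compares `np.polyfit` with the iterative `least_squares` approach used above. The data are synthetic (a noisy quadratic generated with a fixed seed), and the names `residuals`, `coeffs_polyfit`, and `coeffs_iterative` are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import least_squares

# Synthetic data (an assumption for illustration): noisy samples of a quadratic
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 20)
y = 0.5 * x**2 - 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)

# Direct solution of the linear least squares problem for a degree-2 polynomial
coeffs_polyfit = np.polyfit(x, y, deg=2)


def residuals(coeffs, x, y):
    # Difference between the polynomial model and the data
    return np.polyval(coeffs, x) - y


# Iterative solution starting from all-zero coefficients
coeffs_iterative = least_squares(residuals, np.zeros(3), args=(x, y)).x

print(coeffs_polyfit)
print(coeffs_iterative)
```

The two coefficient sets should agree to within the optimizer's tolerance. The iterative route is mainly worth its cost when the model is nonlinear in its parameters.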

Linear interpolation (custom function version)

import numpy as np


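def linear_interpolation(x, y, x_new):
    # Ensure x and y are 1-D arrays of equal length; x must be sorted in ascending order
    x_arr = np.asarray(x, dtype=float)
    y_arr = np.asarray(y, dtype=float)
    x_new_arr = np.asarray(x_new, dtype=float)
    assert x_arr.ndim == 1 and y_arr.ndim == 1 and len(x_arr) == len(y_arr)

    # Slope of each segment between adjacent data points
    slopes = np.diff(y_arr) / np.diff(x_arr)

    # Index of the segment that contains each new point
    # (points outside the data range use the first or last segment)
    idx = np.clip(np.searchsorted(x_arr, x_new_arr, side='right') - 1, 0, len(slopes) - 1)

    # Evaluate the straight line of the matching segment at each new point
    y_new = y_arr[idx] + slopes[idx] * (x_new_arr - x_arr[idx])

    return y_new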


x_list = [1, 2, 3, 4, 5]
y_list = [2, 4, 6, 8, 10]
x_new_list = [1.5, 2.5, 3.5, 4.5]

y_new = linear_interpolation(x_list, y_list, x_new_list)

print(y_new)
---------
[3. 5. 7. 9.]
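As a cross-check, NumPy's built-in `np.interp` performs the same one-dimensional piecewise linear interpolation. The sketch below reuses the sample values from the example above; the variable names `x_known`, `y_known`, and `x_query` are illustrative.

```python
import numpy as np

x_known = [1, 2, 3, 4, 5]
y_known = [2, 4, 6, 8, 10]
x_query = [1.5, 2.5, 3.5, 4.5]

# np.interp(x, xp, fp): xp must be increasing
y_query = np.interp(x_query, x_known, y_known)
print(y_query)  # [3. 5. 7. 9.]

# Outside the data range np.interp does not extrapolate:
# by default it returns the first or last known value (here 2 and 10)
y_outside = np.interp([0.0, 6.0], x_known, y_known)
print(y_outside)
```

Holding the end values constant outside the range is one of several possible choices and differs from extending the end segments linearly.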

Fitting concepts and methods

Fitting means finding a curve or surface from known data points such that its deviation from those points is as small as possible. The goal is a simple function or model that approximately describes the overall trend of the data. A fitted model can be used to predict or estimate in areas outside the known data points, although those results carry some error.

Common fitting methods are as follows:

Linear fitting: Builds a linear model of the data by minimizing the sum of squared differences between the observed values and the model.
Nonlinear fitting: Fits nonlinear functions to data, as in curve fitting and surface fitting. These methods usually rely on iterative optimization algorithms that repeatedly adjust the model parameters to minimize the error between the fitted curve and the data points.
Least squares fitting: Finds the optimal curve by minimizing the sum of squared errors between the data points and the curve. It can be used with many kinds of functions, such as linear, polynomial, and exponential functions.
Interpolation method: Fits the data by interpolating between known data points. Common methods include linear interpolation, Lagrange interpolation, and spline interpolation. Interpolation assumes a functional relationship between the known data points and uses it to fill in values at unknown points, so it can be regarded as a special case of fitting in which the fitted function passes exactly through the known data points.
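Nonlinear fitting with an iterative optimizer can be sketched with `scipy.optimize.curve_fit`. The exponential model `exp_model`, its true parameters, and the noise level below are assumptions chosen for illustration; they do not come from the original article.

```python
import numpy as np
from scipy.optimize import curve_fit


def exp_model(x, a, b):
    # Hypothetical model for this sketch: y = a * exp(b * x)
    return a * np.exp(b * x)


# Synthetic data: the model with a = 2.0, b = -0.7 plus a little noise
rng = np.random.default_rng(1)
x_data = np.linspace(0.0, 4.0, 30)
y_data = exp_model(x_data, 2.0, -0.7) + rng.normal(scale=0.02, size=x_data.size)

# curve_fit adjusts (a, b) iteratively, starting from the initial guess p0
params, covariance = curve_fit(exp_model, x_data, y_data, p0=(1.0, -1.0))
a_fit, b_fit = params
print(a_fit, b_fit)
```

The recovered parameters should be close to the values used to generate the data. With real data, a reasonable initial guess `p0` often matters more than it does here.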

The advantage of fitting is that it gives an approximate description of the data as a whole and allows predictions or estimates in areas outside the known data points. However, the fitted curve is only an approximate description and does not necessarily pass through all known data points, so its values at those points may differ from the data.
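The difference at the known data points can be made concrete with a small sketch that applies both approaches to the same noisy data. The data are synthetic (a straight line plus noise with a fixed seed), and the choice of a cubic spline and a straight-line fit is an assumption made for illustration.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Synthetic data (an assumption for illustration): a straight line plus noise
rng = np.random.default_rng(2)
x = np.linspace(0.0, 10.0, 11)
y = 1.5 * x + 2.0 + rng.normal(scale=1.0, size=x.size)

# Interpolation: the spline passes through every data point
spline = CubicSpline(x, y)

# Fitting: a straight line chosen by least squares
slope, intercept = np.polyfit(x, y, deg=1)

spline_residuals = spline(x) - y
line_residuals = slope * x + intercept - y

print(np.max(np.abs(spline_residuals)))  # essentially zero
print(np.max(np.abs(line_residuals)))    # clearly nonzero
```

The spline reproduces the noise along with the trend, while the fitted line ignores individual points in favor of the overall trend. Which behavior is preferable depends on how much the data points can be trusted.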

Summary

The difference between interpolation and fitting is this: an interpolation method guarantees that the curve passes through the known data points, so the result is exact at those points but may be inaccurate elsewhere, while a fitting method looks for an approximate description of the data as a whole, which allows predictions or estimates in areas outside the known data points, although the results carry some error. Interpolation can be thought of as a special case of fitting.
