RANSAC principle and quadratic/cubic polynomial curve fitting

RANSAC principle

RANSAC (RANdom SAmple Consensus) is a classic model-fitting algorithm for finding the best model in data contaminated by noise and outliers. The basic idea is to randomly select a small number of data points, fit a model to them, and then evaluate every data point against that model, counting how many agree with it to within a tolerance. Points that agree with the model are called inliers; the rest are outliers. If the number of inliers exceeds a threshold, the model is considered a candidate. This process is repeated, and after many iterations the model with the largest number of inliers is taken as the best fit.

The algorithm flow of RANSAC is as follows:

  1. Randomly select a small number (for example, n) of sample points, treat them as inliers, and compute the model parameters from them;
  2. Iterate over all data points and calculate their distance to the model;
  3. Classify the data points whose distance is less than the threshold as inliers, and the remaining points as outliers;
  4. If the number of inliers is greater than a certain proportion (for example, half) and larger than the best inlier count recorded so far, recompute the model parameters from these inliers and record the new inlier count;
  5. Repeat the above steps until the specified number of iterations is reached or the proportion of inliers exceeds a chosen value.
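
Steps 2 and 3 can be sketched in a few lines of Python. This is only an illustration: the helper name split_inliers is hypothetical, and it assumes that "distance to the model" means the vertical residual |f(x) - y|, which is also what the examples in this article use.

```python
import numpy as np

def split_inliers(coef, x, y, threshold):
    """Classify points by their vertical distance to the polynomial `coef`.

    `coef` is given highest power first, as np.polyval expects.
    Returns two boolean masks: (inliers, outliers).
    """
    residuals = np.abs(np.polyval(coef, x) - y)
    inlier_mask = residuals < threshold
    return inlier_mask, ~inlier_mask

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.1, 4.9, 20.0, 9.0])   # the point at x = 3 is a gross outlier

# Candidate model y = 2x + 1: only the point at x = 3 is rejected
inliers, outliers = split_inliers([2.0, 1.0], x, y, threshold=0.5)
print("inlier count:", inliers.sum())
print("outlier x values:", x[outliers])
```

The inlier count produced this way is the score that step 4 compares between candidate models.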

RANSAC is typically used for fitting problems in which the data contain noise or outliers, such as point cloud registration and image matching. Its advantage is that it can reject outliers and thereby improve the fitting accuracy. Its disadvantage is that parameters such as the threshold and the number of iterations must be set by hand, and tuning them can be laborious.
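
For the number of iterations there is a commonly used rule of thumb: if a fraction w of the data are inliers and each sample uses n points, then the probability that a sample contains only inliers is about w^n, and k iterations give at least one such sample with probability 1 - (1 - w^n)^k. The sketch below solves this for k. The function name ransac_iterations and the default confidence of 0.99 are assumptions made for illustration, and the inlier ratio usually has to be guessed in practice.

```python
import math

def ransac_iterations(inlier_ratio, sample_size, confidence=0.99):
    """Iterations needed so that, with probability `confidence`,
    at least one random sample consists only of inliers."""
    if not 0.0 < inlier_ratio <= 1.0:
        raise ValueError("inlier_ratio must be in (0, 1]")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be in (0, 1)")
    p_good_sample = inlier_ratio ** sample_size
    if p_good_sample >= 1.0:
        return 1  # every sample is outlier-free
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_good_sample))

# Half of the data are inliers
print(ransac_iterations(0.5, 3))   # quadratic, 3-point samples → 35
print(ransac_iterations(0.5, 4))   # cubic, 4-point samples → 72
```

The count grows quickly as the sample size increases or the inlier ratio drops, which is one reason to keep each sample as small as the model allows.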

Fitting a quadratic polynomial curve

The following is a simple code example of RANSAC fitting a quadratic polynomial for reference:

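import numpy as np
import random

def ransac2d(x, y, n, k, t, d):
    """
    Fit a quadratic polynomial with RANSAC.
    :param x: x coordinates (NumPy array)
    :param y: y coordinates (NumPy array)
    :param n: number of points sampled for each trial fit (at least 3)
    :param k: number of iterations
    :param t: inlier threshold on the absolute residual
    :param d: minimum number of inliers required to accept a model
    :return: coefficients of the best quadratic (highest power first),
             or None if no model was accepted
    """
    bestfit = None
    best_count = d  # a model must have more than d inliers to be accepted
    for i in range(k):
        # Randomly select n points
        indices = random.sample(range(len(x)), n)
        # Fit a quadratic to these n points
        p = np.polyfit(x[indices], y[indices], 2)
        # Count the points whose residual is below the threshold t
        count = np.sum(np.abs(np.polyval(p, x) - y) < t)
        # Keep the model if it has more inliers than the best one so far
        if count > best_count:
            bestfit = p
            best_count = count
    return bestfit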

# Generate test data
x = np.linspace(-10, 10, 100)
y = 2 * x ** 2 - 3 * x + 1 + np.random.randn(x.shape[0]) * 10

# Fit a quadratic polynomial with RANSAC
p = ransac2d(x, y, 10, 100, 3, 20)

# Plot the data and, if a model was accepted, the fitted curve
import matplotlib.pyplot as plt
plt.plot(x, y, 'b.')
if p is not None:
    xp = np.linspace(-10, 10, 100)
    plt.plot(xp, np.polyval(p, xp), 'r-')
plt.show()

In this code, we first define a function ransac2d that fits a quadratic polynomial with RANSAC. It takes the x and y coordinate values, the number of points n sampled for each trial fit, the number of iterations k, the threshold t, and the minimum inlier count d, and returns the coefficients of the fitted quadratic polynomial.

Inside the function, we loop k times. In each iteration we randomly select n points and fit a quadratic polynomial to them. We then count the points whose residual is below the threshold t and compare the count with d. A fit is accepted only if more than d points qualify, and among the accepted fits the one with the most inliers is the best.

Finally, we plot the raw data points and the fitted curve for visualization.
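
To see what the outlier rejection buys, the following sketch compares a RANSAC fit with an ordinary least-squares fit on data containing gross outliers. It is a variation on the example above, not the same code: the function name ransac_quadratic, the seeded NumPy generator (used instead of random.sample so that the run is reproducible), the 3-point samples, and the final refit on the consensus set are all choices made for this illustration.

```python
import numpy as np

def ransac_quadratic(x, y, n, k, t, rng):
    """Return (coefficients refit on the largest consensus set, inlier mask)."""
    best_mask = None
    best_count = 0
    for _ in range(k):
        # Sample n distinct points and fit a trial quadratic
        idx = rng.choice(len(x), size=n, replace=False)
        p = np.polyfit(x[idx], y[idx], 2)
        # Points whose vertical residual is below t form the consensus set
        mask = np.abs(np.polyval(p, x) - y) < t
        if mask.sum() > best_count:
            best_count = mask.sum()
            best_mask = mask
    # Refit using every point of the best consensus set
    return np.polyfit(x[best_mask], y[best_mask], 2), best_mask

rng = np.random.default_rng(0)
x = np.linspace(-10, 10, 100)
true_coef = np.array([2.0, -3.0, 1.0])
y = np.polyval(true_coef, x) + rng.normal(0.0, 0.2, x.size)

# Corrupt 20 of the 100 points with large positive offsets
outlier_idx = rng.choice(x.size, size=20, replace=False)
y[outlier_idx] += rng.uniform(50.0, 200.0, size=20)

ransac_coef, mask = ransac_quadratic(x, y, n=3, k=200, t=1.0, rng=rng)
ols_coef = np.polyfit(x, y, 2)

ransac_err = np.abs(ransac_coef - true_coef).max()
ols_err = np.abs(ols_coef - true_coef).max()
print("RANSAC coefficients:", ransac_coef, "max error:", ransac_err)
print("Least squares coefficients:", ols_coef, "max error:", ols_err)
```

The least-squares fit is pulled toward the corrupted points, while the coefficients refit on the consensus set should stay close to (2, -3, 1).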

Fitting a cubic polynomial curve

The following is a code example of implementing RANSAC to fit a cubic polynomial in Python:

import random
import numpy as np
import matplotlib.pyplot as plt

# Generate noisy sample data
x = np.arange(-5, 5, 0.1)
y = 5 * x ** 3 - 2 * x ** 2 + 3 * x + np.random.randn(len(x))

def fit_polynomial(x, y, degree):
    # Return the coefficients of the fitted polynomial
    return np.polyfit(x, y, degree)

def evaluate_polynomial(coef, x):
    # Evaluate the polynomial at x
    return np.polyval(coef, x)

def ransac_polynomial(x, y, degree, n_iter, threshold):
    best_inliers = None
    best_coef = None
    best_err = np.inf
    best_count = degree  # a model must have more than `degree` inliers to be accepted
    for i in range(n_iter):
        # Randomly select the minimal sample: degree + 1 points
        sample_indices = random.sample(range(len(x)), degree + 1)
        sample_x = x[sample_indices]
        sample_y = y[sample_indices]
        # Fit a polynomial to the sample
        coef = fit_polynomial(sample_x, sample_y, degree)
        # Absolute residual of every point with respect to the polynomial
        all_errors = np.abs(evaluate_polynomial(coef, x) - y)
        # Points within the threshold are inliers
        inliers = all_errors < threshold
        num_inliers = np.sum(inliers)
        # Update the best model if this one has more inliers than any before it
        if num_inliers > best_count:
            best_count = num_inliers
            best_inliers = inliers
            # Refit on all inliers and record the summed absolute residual
            best_coef = fit_polynomial(x[inliers], y[inliers], degree)
            best_err = np.sum(np.abs(evaluate_polynomial(best_coef, x[inliers]) - y[inliers]))
    return best_coef, best_err, best_inliers

# Run the RANSAC fit
degree = 3
n_iter = 100
threshold = 0.5
best_coef, best_err, best_inliers = ransac_polynomial(x, y, degree, n_iter, threshold)

# Plot the data points and the fitted curves
plt.plot(x, y, 'o')
plt.plot(x[best_inliers], y[best_inliers], 'ro', alpha=0.5)
plt.plot(x, evaluate_polynomial(best_coef, x), '-r', label='RANSAC', alpha=0.5)
plt.plot(x, evaluate_polynomial(fit_polynomial(x, y, degree), x), '-g', label='Ordinary Least Squares')
plt.legend()
plt.show()

In this code example, we use numpy.polyfit and numpy.polyval for polynomial fitting and polynomial evaluation respectively, wrapped in the two functions fit_polynomial and evaluate_polynomial. The function ransac_polynomial implements the RANSAC fit: it fits a polynomial to a few randomly selected sample points, computes the distance from every sample point to that polynomial, and refits using the points within the threshold. It returns the best coefficients, the corresponding error, and a boolean mask marking each point as an inlier or an outlier, that is, whether the point is within the threshold.

In addition, this code example uses the matplotlib library to draw the fitted curves and the data points.

The following is C++ code that implements RANSAC to fit a cubic polynomial curve:

#include <iostream>
#include <cmath>
#include <vector>
#include <random>
#include <algorithm>

using namespace std;

// Point structure
struct Point {
    double x;
    double y;
};

// Cubic polynomial a*x^3 + b*x^2 + c*x + d
double cubic_poly(double a, double b, double c, double d, double x) {
    return a * pow(x, 3) + b * pow(x, 2) + c * x + d;
}

// Squared error between the cubic and the point p
double calc_error(double a, double b, double c, double d, Point& p) {
    double x = p.x;
    double y = p.y;
    double error = cubic_poly(a, b, c, d, x) - y;
    return error * error;
}

// Solve the 4x4 linear system M * coef = rhs by Gaussian elimination with
// partial pivoting. Returns false if the system is numerically singular.
bool solve4(double M[4][4], double rhs[4], double coef[4]) {
    for (int col = 0; col < 4; ++col) {
        int pivot = col;
        for (int r = col + 1; r < 4; ++r) {
            if (fabs(M[r][col]) > fabs(M[pivot][col])) pivot = r;
        }
        if (fabs(M[pivot][col]) < 1e-12) return false;
        for (int k = 0; k < 4; ++k) swap(M[col][k], M[pivot][k]);
        swap(rhs[col], rhs[pivot]);
        for (int r = col + 1; r < 4; ++r) {
            double f = M[r][col] / M[col][col];
            for (int k = col; k < 4; ++k) M[r][k] -= f * M[col][k];
            rhs[r] -= f * rhs[col];
        }
    }
    for (int r = 3; r >= 0; --r) {
        double s = rhs[r];
        for (int k = r + 1; k < 4; ++k) s -= M[r][k] * coef[k];
        coef[r] = s / M[r][r];
    }
    return true;
}

// Least-squares cubic through the points selected by idx (normal equations).
// coef = {a, b, c, d} for a*x^3 + b*x^2 + c*x + d.
bool fit_cubic(const vector<Point>& points, const vector<int>& idx, double coef[4]) {
    double M[4][4] = {}, rhs[4] = {};
    for (int i : idx) {
        double x = points[i].x, y = points[i].y;
        double basis[4] = {x * x * x, x * x, x, 1.0};
        for (int r = 0; r < 4; ++r) {
            for (int c = 0; c < 4; ++c) M[r][c] += basis[r] * basis[c];
            rhs[r] += basis[r] * y;
        }
    }
    return solve4(M, rhs, coef);
}

// RANSAC function. A model is accepted only if it has more than inlier_count
// inliers; if no model is accepted, a, b, c and d are all set to 0.
void ransac(vector<Point>& points, double& a, double& b, double& c, double& d, int iterations, double threshold, int inlier_count) {
    a = b = c = d = 0.0;
    int best_inliers = inlier_count;
    int num_points = points.size();
    if (num_points < 4) return;  // a cubic needs at least four points
    random_device rd;
    mt19937 gen(rd());
    vector<int> order(num_points);
    for (int k = 0; k < num_points; ++k) order[k] = k;

    for (int i = 0; i < iterations; ++i) {
        // Draw four distinct points: a cubic has four coefficients
        shuffle(order.begin(), order.end(), gen);
        vector<int> sample(order.begin(), order.begin() + 4);
        double coef[4];
        if (!fit_cubic(points, sample, coef)) continue;  // degenerate sample

        // Collect the points whose squared error is below the threshold
        vector<int> inliers;
        for (int k = 0; k < num_points; ++k) {
            if (calc_error(coef[0], coef[1], coef[2], coef[3], points[k]) < threshold) {
                inliers.push_back(k);
            }
        }

        // If this is the largest consensus set so far, re-estimate the
        // coefficients from its inliers and keep them
        int cur_inliers = inliers.size();
        if (cur_inliers > best_inliers && fit_cubic(points, inliers, coef)) {
            a = coef[0]; b = coef[1]; c = coef[2]; d = coef[3];
            best_inliers = cur_inliers;
        }
    }
}

int main() {
    vector<Point> points = {
        {0.0, 1.0}, {1.0, 4.0}, {2.0, 7.0}, {3.0, 16.0}, {4.0, 19.0},
        {5.0, 28.0}, {6.0, 37.0}, {7.0, 46.0}, {8.0, 55.0}, {9.0, 64.0}};
    // Initialize the outputs so they are defined even if no model is accepted
    double a = 0.0, b = 0.0, c = 0.0, d = 0.0;
    ransac(points, a, b, c, d, 1000, 1.0, 6);
    cout << "a = " << a << ", b = " << b << ", c = " << c << ", d = " << d << endl;
    return 0;
}

Here we define a point structure Point, a cubic polynomial function cubic_poly, an error function calc_error, and a RANSAC function ransac. In main, we define a point set points and call ransac to compute the cubic polynomial coefficients a, b, c and d.

Note that each iteration starts by estimating the cubic coefficients from a small random sample of points; because a cubic has four coefficients, four points with distinct x values are the minimum needed to determine it. We then compute the error of every point with respect to the estimated curve and use the threshold to decide whether it is an inlier. If the number of inliers is greater than the specified count, we use those inliers to re-estimate the cubic polynomial coefficients.
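
The minimal-sample estimate can be illustrated in Python. The sketch below is not a translation of the C++ code: the function name cubic_through_points is hypothetical, and it simply solves the 4x4 Vandermonde system with NumPy to show that four points with distinct x values pin down all four coefficients.

```python
import numpy as np

def cubic_through_points(xs, ys):
    """Coefficients (a, b, c, d) of the cubic a*x^3 + b*x^2 + c*x + d
    that passes exactly through four points with distinct x values."""
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    if xs.size != 4 or ys.size != 4 or np.unique(xs).size != 4:
        raise ValueError("need four points with distinct x values")
    V = np.vander(xs, 4)          # columns: x^3, x^2, x, 1
    return np.linalg.solve(V, ys)

# Four samples of y = x^3 + 1
coef = cubic_through_points([0, 1, 2, 3], [1, 2, 9, 28])
print(np.round(coef, 6))          # close to a = 1, b = 0, c = 0, d = 1
```

With fewer than four points the system is underdetermined, so a three-point sample cannot fix a cubic on its own.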

This is simple sample code and needs to be adapted to your specific situation.

The following is a sample code for RANSAC fitting a cubic polynomial curve implemented in C++:

#include <iostream>
#include <vector>
#include <random>
#include <cmath>

using namespace std;

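// Return a random integer in [l, r]
int randomInt(int l, int r) {
    // Seed the engine once and reuse it instead of creating a new one on every call
    static mt19937 eng(random_device{}());
    uniform_int_distribution<int> dist(l, r);
    return dist(eng);
}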

// Least-squares fit of a cubic polynomial. Returns the coefficients
// {c0, c1, c2, c3} of c0 + c1*x + c2*x^2 + c3*x^3.
vector<double> polyfit(const vector<double>& x, const vector<double>& y) {
    int n = x.size();
    int m = 3;  // cubic polynomial

    // Build the normal equations A * c = B
    vector<vector<double>> A(m + 1, vector<double>(m + 1, 0));
    vector<double> B(m + 1, 0);

    for (int i = 0; i < n; i++) {
        double xi = x[i], yi = y[i];
        for (int j = 0; j <= m; j++) {
            for (int k = 0; k <= m; k++) {
                A[j][k] += pow(xi, j + k);  // pow(xi, 0) is 1, also for xi == 0
            }
            B[j] += pow(xi, j) * yi;
        }
    }

    // Gaussian elimination: forward elimination without pivoting
    for (int i = 0; i <= m; i++) {
        double d = A[i][i];
        for (int j = i; j <= m; j++) {
            A[i][j] /= d;
        }
        B[i] /= d;
        for (int j = i + 1; j <= m; j++) {
            double d2 = A[j][i];
            for (int k = i; k <= m; k++) {
                A[j][k] -= d2 * A[i][k];
            }
            B[j] -= d2 * B[i];
        }
    }

    // Back substitution
    vector<double> res(m + 1);
    for (int i = m; i >= 0; i--) {
        for (int j = i + 1; j <= m; j++) {
            B[i] -= A[i][j] * res[j];
        }
        res[i] = B[i];
    }

    return res;
}

// Evaluate the polynomial c[0] + c[1]*x + c[2]*x^2 + ... with Horner's rule
double polyval(vector<double> c, double x) {
    double res = 0;
    for (int i = c.size() - 1; i >= 0; i--) {
        res = res * x + c[i];
    }
    return res;
}

// Fit a cubic polynomial curve with RANSAC
vector<double> ransacPolyfit(vector<double> x, vector<double> y, int nIter, double inlierThreshold, int minInliers) {
    int n = x.size();
    int m = 3;  // cubic polynomial
    int bestNInliers = 0;
    vector<double> bestModel;
    vector<int> bestInlierIndices;

    for (int i = 0; i < nIter && n > m; i++) {
        // Randomly select m + 1 distinct points, the minimum that determines a cubic
        vector<int> indices;
        while ((int)indices.size() < m + 1) {
            int idx = randomInt(0, n - 1);
            bool seen = false;
            for (int s : indices) {
                if (s == idx) seen = true;
            }
            if (!seen) indices.push_back(idx);
        }

        // Fit a cubic polynomial to the sample
        vector<double> xSample(m + 1), ySample(m + 1);
        for (int j = 0; j <= m; j++) {
            xSample[j] = x[indices[j]];
            ySample[j] = y[indices[j]];
        }
        vector<double> model = polyfit(xSample, ySample);

        // Compute the fitting error of every point and collect the inliers
        vector<int> inlierIndices;
        for (int j = 0; j < n; j++) {
            double dist = abs(polyval(model, x[j]) - y[j]);
            if (dist < inlierThreshold) {
                inlierIndices.push_back(j);
            }
        }
        int nInliers = inlierIndices.size();

        // Update the best model and its inliers
        if (nInliers > bestNInliers && nInliers >= minInliers) {
            bestNInliers = nInliers;
            bestModel = model;
            bestInlierIndices = inlierIndices;
        }
    }

    // No consensus set reached minInliers: fall back to an ordinary least-squares
    // fit on all points so that four coefficients are still returned
    if (bestInlierIndices.empty()) {
        return polyfit(x, y);
    }

    // Refit the model using all inliers
    int nInliers = bestInlierIndices.size();
    vector<double> xInliers(nInliers), yInliers(nInliers);
    for (int i = 0; i < nInliers; i++) {
        xInliers[i] = x[bestInlierIndices[i]];
        yInliers[i] = y[bestInlierIndices[i]];
    }
    return polyfit(xInliers, yInliers);
}

int main() {
    // Generate noisy data: a cubic plus uniform integer noise in [-50000, 50000].
    // Note that this noise is much larger than inlierThreshold below, so only
    // a few points fall within the threshold of any given model.
    const int n = 100;
    vector<double> x(n), y(n);
    for (int i = 0; i < n; i++) {
        x[i] = i;
        y[i] = 3 * pow(i - n / 2, 3) - 1000 * pow(i - n / 2, 2) + 50000 * (i - n / 2) + randomInt(-50000, 50000);
    }

    // RANSAC fit
    int nIter = 1000;
    double inlierThreshold = 1000;
    int minInliers = 50;
    vector<double> c = ransacPolyfit(x, y, nIter, inlierThreshold, minInliers);

    // Print the fitting result
    cout << "Fit result: y = " << c[0] << " + " << c[1] << "x + " << c[2] << "x^2 + " << c[3] << "x^3" << endl;

    return 0;
}

In the above code, the function polyfit fits a set of polynomial coefficients by least squares, and the function polyval evaluates the polynomial at a given x.
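
The same two building blocks can be sketched in Python to make the structure easier to see. This is an illustrative re-implementation under stated assumptions, not the article's code: the names polyfit_normal_equations and polyval_horner are hypothetical, coefficients are ordered lowest power first as in the C++ version, and np.linalg.solve stands in for the hand-written elimination.

```python
import numpy as np

def polyfit_normal_equations(x, y, degree):
    """Least-squares coefficients c[0] + c[1]*x + ... via the normal equations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    powers = np.vander(x, degree + 1, increasing=True)  # columns: 1, x, x^2, ...
    A = powers.T @ powers   # A[j][k] = sum over i of x_i^(j+k)
    B = powers.T @ y        # B[j]    = sum over i of x_i^j * y_i
    return np.linalg.solve(A, B)

def polyval_horner(c, x):
    """Evaluate c[0] + c[1]*x + ... with Horner's rule."""
    res = 0.0
    for ci in reversed(c):
        res = res * x + ci
    return res

x = np.linspace(-2.0, 2.0, 9)
y = 1.0 - 2.0 * x + 0.5 * x**3            # exact cubic, no noise
c = polyfit_normal_equations(x, y, 3)
print(np.round(c, 6))                      # close to 1, -2, 0, 0.5
print(polyval_horner(c, 1.5))              # close to -0.3125
```

Note the coefficient order: np.polyfit returns the highest power first, so its result is the reverse of c here. The normal equations are simple to write down but become ill-conditioned when x spans a wide range, which is worth keeping in mind for data such as x = 0 to 99.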

Then the function ransacPolyfit is defined to fit a cubic polynomial curve with RANSAC. In each iteration it randomly selects points from the data set, uses polyfit to fit a cubic polynomial to them, and computes the fitting error and the number of inliers over all points. The model with the most inliers across all iterations is kept. Finally, the model is refit using all of its inliers, and the resulting polynomial coefficients are returned.

In main, we first generate a set of noisy data, then specify the number of RANSAC iterations, the inlier threshold, and the minimum number of inliers. Finally, we call ransacPolyfit to fit the cubic polynomial curve and print the result.

Note: RANSAC is not guaranteed to find a model. If no candidate reaches the minimum number of inliers, there is no consensus set to refit, and practical applications should detect and handle this case explicitly.
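
One way to make the failure case explicit is to return a sentinel when no consensus set is large enough, and let the caller decide what to do. The Python sketch below shows the idea; the function name ransac_cubic_or_none, the use of None as the sentinel, and the seeded generator are assumptions made for this illustration.

```python
import numpy as np

def ransac_cubic_or_none(x, y, n_iter, threshold, min_inliers, rng):
    """Cubic RANSAC that returns None when no model reaches min_inliers."""
    best_mask = None
    best_count = min_inliers - 1
    for _ in range(n_iter):
        idx = rng.choice(len(x), size=4, replace=False)   # minimal sample
        coef = np.polyfit(x[idx], y[idx], 3)
        mask = np.abs(np.polyval(coef, x) - y) < threshold
        if mask.sum() > best_count:
            best_count, best_mask = mask.sum(), mask
    if best_mask is None:
        return None   # no consensus set: let the caller handle it
    return np.polyfit(x[best_mask], y[best_mask], 3)

rng = np.random.default_rng(1)
x = np.linspace(-5.0, 5.0, 50)
y_clean = x**3                                   # every point lies on a cubic
y_noise = rng.uniform(-1000.0, 1000.0, x.size)   # no polynomial structure

fit_clean = ransac_cubic_or_none(x, y_clean, 200, 1.0, 25, rng)
fit_noise = ransac_cubic_or_none(x, y_noise, 200, 1.0, 25, rng)
print("clean data:", None if fit_clean is None else np.round(fit_clean, 6))
print("pure noise:", fit_noise)
```

The caller can then check for None and, for example, relax the threshold, run more iterations, or report that the data do not support the model.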

Origin blog.csdn.net/qq_39506862/article/details/130899912