Point cloud filtering, downsampling, clustering and segmentation: a summary

1. Point cloud filtering and noise reduction algorithms

Statistical filtering:

Concept: Remove outliers that are obviously sparse in the distribution. Based on the computed mean and standard deviation of neighbor distances, points that fall outside a threshold range are removed.

Steps: For each point, compute the distances to its K nearest neighbors and take their average, so that every point in the cloud has a mean neighbor distance. Then compute the global mean and standard deviation of these per-point averages, and filter out outliers using a threshold of, for example, 1, 2 or 3 standard deviations.

A point is judged an outlier when the mean distance of its k nearest neighbors is greater than the global mean distance plus one (or more) times the global standard deviation of those mean distances.
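
For illustration, a minimal brute-force C++ sketch of this statistical filter, assuming parameters k (neighbor count) and stddevMul (standard-deviation multiplier); a KD-tree would normally replace the O(N^2) neighbor search:

#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

struct Point3 { double x, y, z; };

double dist(const Point3& a, const Point3& b) {
  return std::sqrt((a.x - b.x) * (a.x - b.x) +
                   (a.y - b.y) * (a.y - b.y) +
                   (a.z - b.z) * (a.z - b.z));
}

// Keep points whose mean k-nearest-neighbor distance is below
// global mean + stddevMul * global standard deviation.
std::vector<Point3> statisticalFilter(const std::vector<Point3>& cloud,
                                      std::size_t k, double stddevMul) {
  const std::size_t n = cloud.size();
  std::vector<double> meanKnn(n, 0.0);

  // 1. Mean distance to the k nearest neighbors of every point (brute force).
  for (std::size_t i = 0; i < n; ++i) {
    std::vector<double> d;
    for (std::size_t j = 0; j < n; ++j)
      if (j != i) d.push_back(dist(cloud[i], cloud[j]));
    const std::size_t kk = std::min(k, d.size());
    std::partial_sort(d.begin(), d.begin() + kk, d.end());
    double sum = 0.0;
    for (std::size_t m = 0; m < kk; ++m) sum += d[m];
    meanKnn[i] = sum / kk;
  }

  // 2. Global mean and standard deviation of those per-point averages.
  double mean = 0.0;
  for (double v : meanKnn) mean += v;
  mean /= n;
  double var = 0.0;
  for (double v : meanKnn) var += (v - mean) * (v - mean);
  const double stddev = std::sqrt(var / n);

  // 3. Points above the threshold are treated as outliers and dropped.
  const double threshold = mean + stddevMul * stddev;
  std::vector<Point3> kept;
  for (std::size_t i = 0; i < n; ++i)
    if (meanKnn[i] <= threshold) kept.push_back(cloud[i]);
  return kept;
}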

Pass-through filter:

Filter according to an attribute of the point cloud by setting a value range, for example the common x, y or z coordinate range.
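
A minimal sketch of a pass-through filter on the z coordinate, assuming user-chosen bounds zMin and zMax; the same idea works for any attribute:

#include <vector>

struct Point3 { double x, y, z; };

// Keep only points whose z coordinate lies inside [zMin, zMax].
std::vector<Point3> passThroughZ(const std::vector<Point3>& cloud,
                                 double zMin, double zMax) {
  std::vector<Point3> out;
  for (const Point3& p : cloud)
    if (p.z >= zMin && p.z <= zMax) out.push_back(p);
  return out;
}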

Radius filter:

Set a radius R and, for each point, count the number of neighboring points within R. If the count is less than a threshold, the point is an outlier.
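
A brute-force sketch of the radius filter, with assumed parameters R and minNeighbors; a spatial index would be used for large clouds:

#include <cstddef>
#include <vector>

struct Point3 { double x, y, z; };

double sqDist(const Point3& a, const Point3& b) {
  return (a.x - b.x) * (a.x - b.x) + (a.y - b.y) * (a.y - b.y) +
         (a.z - b.z) * (a.z - b.z);
}

// A point is kept only if at least minNeighbors other points lie within radius R.
std::vector<Point3> radiusFilter(const std::vector<Point3>& cloud,
                                 double R, std::size_t minNeighbors) {
  std::vector<Point3> out;
  for (std::size_t i = 0; i < cloud.size(); ++i) {
    std::size_t count = 0;
    for (std::size_t j = 0; j < cloud.size(); ++j)
      if (i != j && sqDist(cloud[i], cloud[j]) <= R * R) ++count;
    if (count >= minNeighbors) out.push_back(cloud[i]);
  }
  return out;
}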

Median filter:

Used to remove noise from images or signals. The implementation replaces each pixel (or sample) value with the median of the values inside a sliding window.

Because it uses the median as the representative value, it is robust to outliers.
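
A sketch of a 1D median filter, assuming an odd window size; samples near the borders are simply copied unchanged:

#include <algorithm>
#include <vector>

// Replace each sample by the median of a sliding window of odd size `window`.
std::vector<double> medianFilter1D(const std::vector<double>& signal, int window) {
  std::vector<double> out = signal;
  const int half = window / 2;
  for (int i = half; i + half < static_cast<int>(signal.size()); ++i) {
    std::vector<double> win(signal.begin() + i - half, signal.begin() + i + half + 1);
    std::nth_element(win.begin(), win.begin() + half, win.end());
    out[i] = win[half];  // the median is not pulled away by single extreme outliers
  }
  return out;
}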

Mean filter:

Replacing each pixel value with the average of the values inside its neighborhood window achieves denoising and smooths images and signals, but it destroys image detail and makes the image blurry.
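
For comparison, the same sliding window with an average instead of the median (an illustrative sketch with the same assumptions):

#include <vector>

// Replace each sample by the mean of a sliding window of odd size `window`.
// Smooths noise but also blurs sharp detail, as noted above.
std::vector<double> meanFilter1D(const std::vector<double>& signal, int window) {
  std::vector<double> out = signal;
  const int half = window / 2;
  for (int i = half; i + half < static_cast<int>(signal.size()); ++i) {
    double sum = 0.0;
    for (int j = i - half; j <= i + half; ++j) sum += signal[j];
    out[i] = sum / window;
  }
  return out;
}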

Gaussian filter:

Used to smooth images and reduce noise. First determine the size and standard deviation of the Gaussian kernel; the standard deviation determines the shape and smoothness of the kernel. Generate a two-dimensional Gaussian weight matrix from these two parameters and convolve it with the image. The larger the standard deviation, the blurrier the result; likewise, the larger the kernel, the blurrier the result.
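
A sketch of generating the normalized 2D Gaussian weight matrix from an assumed odd kernel size and sigma; convolving the image with it produces the smoothing described above:

#include <cmath>
#include <vector>

// Build a normalized (size x size) Gaussian kernel for a given sigma.
// `size` is assumed to be odd. Larger sigma or larger size gives a stronger blur.
std::vector<std::vector<double>> gaussianKernel(int size, double sigma) {
  std::vector<std::vector<double>> k(size, std::vector<double>(size, 0.0));
  const int half = size / 2;
  double sum = 0.0;
  for (int y = -half; y <= half; ++y)
    for (int x = -half; x <= half; ++x) {
      double w = std::exp(-(x * x + y * y) / (2.0 * sigma * sigma));
      k[y + half][x + half] = w;
      sum += w;
    }
  for (auto& row : k)            // normalize so the weights sum to 1
    for (double& w : row) w /= sum;
  return k;
}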

2. Point cloud downsampling algorithms

1. Random downsampling

2. Voxel downsampling

There are two variants. After dividing the point cloud into voxels, you can either randomly select one point in each voxel, which is faster, or take the average (centroid) of the points in each voxel.
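
A sketch of the averaging (centroid) variant with an assumed voxelSize; picking one random point per voxel instead of averaging gives the faster variant:

#include <cmath>
#include <map>
#include <tuple>
#include <vector>

struct Point3 { double x, y, z; };

// Voxel downsampling: points falling into the same voxel of edge length
// `voxelSize` are replaced by their centroid.
std::vector<Point3> voxelDownsample(const std::vector<Point3>& cloud, double voxelSize) {
  struct Acc { double x = 0, y = 0, z = 0; int n = 0; };
  std::map<std::tuple<int, int, int>, Acc> voxels;
  for (const Point3& p : cloud) {
    auto key = std::make_tuple(static_cast<int>(std::floor(p.x / voxelSize)),
                               static_cast<int>(std::floor(p.y / voxelSize)),
                               static_cast<int>(std::floor(p.z / voxelSize)));
    Acc& a = voxels[key];
    a.x += p.x; a.y += p.y; a.z += p.z; ++a.n;
  }
  std::vector<Point3> out;
  for (const auto& kv : voxels)
    out.push_back({kv.second.x / kv.second.n,
                   kv.second.y / kv.second.n,
                   kv.second.z / kv.second.n});
  return out;
}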

3. Farthest point sampling

Select a starting point, then repeatedly pick the remaining point that is farthest from the already selected set, and keep iterating. It is easily affected by noise points, but it can remove redundant points in dense regions.
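
A sketch of farthest point sampling; the seed is simply the first point here, although it could be chosen at random:

#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

struct Point3 { double x, y, z; };

double sqDist(const Point3& a, const Point3& b) {
  return (a.x - b.x) * (a.x - b.x) + (a.y - b.y) * (a.y - b.y) +
         (a.z - b.z) * (a.z - b.z);
}

// Farthest point sampling: start from a seed point, then repeatedly pick the
// point whose distance to the already selected set is largest.
std::vector<Point3> farthestPointSampling(const std::vector<Point3>& cloud,
                                          std::size_t numSamples) {
  std::vector<Point3> samples;
  if (cloud.empty() || numSamples == 0) return samples;

  std::vector<double> minDist(cloud.size(), std::numeric_limits<double>::max());
  std::size_t current = 0;                 // seed: the first point
  samples.push_back(cloud[current]);

  while (samples.size() < numSamples && samples.size() < cloud.size()) {
    // Update each point's distance to the nearest selected sample,
    // then take the point with the largest such distance.
    std::size_t farthest = 0;
    double best = -1.0;
    for (std::size_t i = 0; i < cloud.size(); ++i) {
      minDist[i] = std::min(minDist[i], sqDist(cloud[i], cloud[current]));
      if (minDist[i] > best) { best = minDist[i]; farthest = i; }
    }
    current = farthest;
    samples.push_back(cloud[current]);
  }
  return samples;
}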

3. Point cloud clustering algorithms

1. K-means clustering algorithm

K is unknown and must be set manually. You can use the elbow method: sweep over values of K and choose the one at the "elbow", where increasing K no longer brings a large drop in the loss.

It is sensitive to the initial centers. You can try several initializations and keep the one with the smallest loss.

To handle noise points, you can use the point with the smallest total distance to the other points in the cluster as the center, instead of taking the mean.

Algorithm process: First randomly select K center points; then, for each point, find the closest center and assign the point to that cluster. Next, compute the mean of each cluster as its new center, and iterate until a maximum number of iterations is reached or the centers barely change.

The time complexity is O(t * k * N * d), where t is the number of iterations, k the number of center points, N the total number of points, and d the dimension of each point.
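
A minimal sketch of this process; initialization simply takes the first k points (an assumption, the cloud must have at least k points), and the loop runs for a fixed maxIter instead of checking how much the centers moved:

#include <cstddef>
#include <vector>

struct Point3 { double x, y, z; };

double sqDist(const Point3& a, const Point3& b) {
  return (a.x - b.x) * (a.x - b.x) + (a.y - b.y) * (a.y - b.y) +
         (a.z - b.z) * (a.z - b.z);
}

// Plain k-means: assign every point to its nearest center, recompute the
// centers as the cluster means, and repeat.
std::vector<int> kMeans(const std::vector<Point3>& cloud, int k, int maxIter,
                        std::vector<Point3>& centers) {
  std::vector<int> label(cloud.size(), 0);
  centers.assign(cloud.begin(), cloud.begin() + k);   // naive initialization

  for (int iter = 0; iter < maxIter; ++iter) {
    // Assignment step: nearest center wins.
    for (std::size_t i = 0; i < cloud.size(); ++i) {
      double best = sqDist(cloud[i], centers[0]);
      label[i] = 0;
      for (int c = 1; c < k; ++c) {
        double d = sqDist(cloud[i], centers[c]);
        if (d < best) { best = d; label[i] = c; }
      }
    }
    // Update step: the mean of each cluster becomes its new center.
    std::vector<Point3> sum(k, Point3{0, 0, 0});
    std::vector<int> count(k, 0);
    for (std::size_t i = 0; i < cloud.size(); ++i) {
      sum[label[i]].x += cloud[i].x;
      sum[label[i]].y += cloud[i].y;
      sum[label[i]].z += cloud[i].z;
      ++count[label[i]];
    }
    for (int c = 0; c < k; ++c)
      if (count[c] > 0)
        centers[c] = {sum[c].x / count[c], sum[c].y / count[c], sum[c].z / count[c]};
  }
  return label;
}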

2. Mean-shift clustering algorithm

Randomly select a point as the center of a sphere with radius R, then compute the mean of the points inside the sphere as the new center, and keep iterating until the center no longer changes or changes very little. All points visited during the process fall into this cluster. Then check whether the distance between the converged center and the existing centers is greater than a threshold, and merge the clusters if it is not.

Because mean-shift has to compute the distance between the current center and all other points at every iteration, it is very time-consuming. The idea of the simplified version is to assign every point directly to an existing center, which takes very little time.

Simplified version: Traverse the data; for each point, compute its distance to the existing centers. If the smallest distance is less than the threshold, assign the point to that cluster and update the cluster's center value; otherwise start a new cluster.
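
A sketch of this simplified single-pass variant with an assumed distance threshold; a point that is not close to any existing center starts a new cluster, and joining a cluster shifts its center toward the point (running mean):

#include <cmath>
#include <cstddef>
#include <vector>

struct Point3 { double x, y, z; };

double dist(const Point3& a, const Point3& b) {
  return std::sqrt((a.x - b.x) * (a.x - b.x) + (a.y - b.y) * (a.y - b.y) +
                   (a.z - b.z) * (a.z - b.z));
}

// `centers` should be empty on input; it holds the cluster centers on output.
std::vector<int> simpleMeanShift(const std::vector<Point3>& cloud, double threshold,
                                 std::vector<Point3>& centers) {
  std::vector<int> label(cloud.size(), -1);
  std::vector<int> counts;

  for (std::size_t i = 0; i < cloud.size(); ++i) {
    int best = -1;
    double bestDist = threshold;
    for (std::size_t c = 0; c < centers.size(); ++c) {
      double d = dist(cloud[i], centers[c]);
      if (d < bestDist) { bestDist = d; best = static_cast<int>(c); }
    }
    if (best < 0) {                      // no center close enough: new cluster
      centers.push_back(cloud[i]);
      counts.push_back(1);
      label[i] = static_cast<int>(centers.size()) - 1;
    } else {                             // update the running mean of that cluster
      ++counts[best];
      centers[best].x += (cloud[i].x - centers[best].x) / counts[best];
      centers[best].y += (cloud[i].y - centers[best].y) / counts[best];
      centers[best].z += (cloud[i].z - centers[best].z) / counts[best];
      label[i] = best;
    }
  }
  return label;
}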

3. DBSCAN clustering algorithm

A clustering algorithm with built-in filtering. First set the radius R: points with fewer than a certain number of neighbors within R are noise points, and the rest are core points. Since the algorithm is density based, core points lying within R of a core point belong to the same cluster, and the cluster grows by visiting the other core points within R of each of its members, until no new core points can be reached. These points are then removed from the remaining set and the iteration continues from the next unvisited point.

Tips for choosing R: generally, pick a point, compute its distances to all other points, sort them, and look for a place where the sorted distances jump sharply; choose R at that jump. If R is too large there will be fewer clusters; if it is too small there will be more clusters, so adjust it appropriately.

The value of min_point is generally kept small, and you can try several values.
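
A brute-force DBSCAN sketch with assumed parameters eps (the radius R) and minPts; a label of -1 marks noise points:

#include <cstddef>
#include <queue>
#include <vector>

struct Point3 { double x, y, z; };

double sqDist(const Point3& a, const Point3& b) {
  return (a.x - b.x) * (a.x - b.x) + (a.y - b.y) * (a.y - b.y) +
         (a.z - b.z) * (a.z - b.z);
}

// Returns a cluster id (0, 1, ...) per point, or -1 for noise.
std::vector<int> dbscan(const std::vector<Point3>& cloud, double eps, std::size_t minPts) {
  const int UNVISITED = -2, NOISE = -1;
  std::vector<int> label(cloud.size(), UNVISITED);

  auto neighbors = [&](std::size_t i) {
    std::vector<std::size_t> out;
    for (std::size_t j = 0; j < cloud.size(); ++j)
      if (j != i && sqDist(cloud[i], cloud[j]) <= eps * eps) out.push_back(j);
    return out;
  };

  int cluster = 0;
  for (std::size_t i = 0; i < cloud.size(); ++i) {
    if (label[i] != UNVISITED) continue;
    std::vector<std::size_t> nb = neighbors(i);
    if (nb.size() + 1 < minPts) { label[i] = NOISE; continue; }  // not a core point

    label[i] = cluster;
    std::queue<std::size_t> frontier;
    for (std::size_t j : nb) frontier.push(j);

    // Expand the cluster through density-reachable points.
    while (!frontier.empty()) {
      std::size_t j = frontier.front();
      frontier.pop();
      if (label[j] == NOISE) label[j] = cluster;    // border point joins the cluster
      if (label[j] != UNVISITED) continue;
      label[j] = cluster;
      std::vector<std::size_t> nb2 = neighbors(j);
      if (nb2.size() + 1 >= minPts)                 // j is also a core point
        for (std::size_t q : nb2) frontier.push(q);
    }
    ++cluster;
  }
  return label;
}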

4. Point cloud segmentation algorithm

1. RANSAC

RANSAC is a random sampling algorithm: it repeatedly draws a minimal random sample from the data, fits a model to it, and counts how many points agree with that model (the inliers). As soon as the proportion of inliers of the sampled model exceeds a threshold, it exits early.

#include <iostream>
#include <vector>
#include <cmath>
#include <random>

struct Point {
  double x;
  double y;
};

// Euclidean distance between two points (helper; not used by ransacLineFitting)
double distance(const Point& p1, const Point& p2) {
  return std::sqrt(std::pow(p2.x - p1.x, 2) + std::pow(p2.y - p1.y, 2));
}

// Fit a 2D line to the point set with RANSAC
void ransacLineFitting(const std::vector<Point>& points,
                      int maxIterations,
                      double distanceThreshold,
                      double inlierRatioThreshold,
                      std::vector<Point>& inliers,
                      double& slope,
                      double& intercept) {
  std::random_device rd;
  std::mt19937 rng(rd());
  std::uniform_int_distribution<int> uni(0, points.size() - 1);

  std::size_t bestInlierCount = 0;
  slope = 0.0;       // fall-back values in case no valid model is found
  intercept = 0.0;

  for (int iteration = 0; iteration < maxIterations; ++iteration) {
    // Randomly pick two distinct points
    int index1 = uni(rng);
    int index2 = uni(rng);
    if (index1 == index2) continue;   // need two different points

    const Point& p1 = points[index1];
    const Point& p2 = points[index2];
    if (p1.x == p2.x) continue;       // vertical line: slope undefined, skip

    // Slope and intercept of the candidate line
    double candSlope = (p2.y - p1.y) / (p2.x - p1.x);
    double candIntercept = p1.y - candSlope * p1.x;

    // Collect the inliers of the candidate line
    std::vector<Point> currentInliers;

    for (const Point& point : points) {
      // Perpendicular distance from the point to the candidate line
      double d = std::abs(point.y - candSlope * point.x - candIntercept) /
                 std::sqrt(candSlope * candSlope + 1.0);

      // Count the point as an inlier if it is close enough
      if (d < distanceThreshold) {
        currentInliers.push_back(point);
      }
    }

    // Keep the model with the most inliers found so far
    if (currentInliers.size() > bestInlierCount) {
      bestInlierCount = currentInliers.size();
      inliers = std::move(currentInliers);
      slope = candSlope;
      intercept = candIntercept;

      // Inlier ratio of the best model
      double inlierRatio = static_cast<double>(bestInlierCount) / points.size();

      // Exit the loop early once the inlier ratio reaches the threshold
      if (inlierRatio > inlierRatioThreshold) {
        break;
      }
    }
  }
}

int main() {
  // Example 2D point set
  std::vector<Point> points {
    {0.0, 0.5},
    {1.0, 2.0},
    {2.0, 3.5},
    {3.0, 4.5},
    {4.0, 6.0},
    {5.0, 7.5},
    {6.0, 9.0},
    {7.0, 10.5}
  };

  // RANSAC parameters
  int maxIterations = 1000;          // maximum number of iterations
  double distanceThreshold = 0.5;    // distance threshold for inliers
  double inlierRatioThreshold = 0.8; // inlier ratio that triggers early exit

  // Variables receiving the result
  std::vector<Point> inliers;
  double slope, intercept;

  // Fit the line with RANSAC
  ransacLineFitting(points, maxIterations, distanceThreshold, inlierRatioThreshold,
                    inliers, slope, intercept);

  // Print the fitted model
  std::cout << "Fitted line: y = " << slope << "x + " << intercept << std::endl;
  std::cout << "Number of inliers: " << inliers.size() << std::endl;

  return 0;
}

Source: blog.csdn.net/slamer111/article/details/131750126