Machine Learning Notes - Mathematical Expression of the Curse of Dimensionality

1. The distance between points

        The kNN classifier assumes that similar points share similar labels. In high-dimensional spaces, however, points drawn from a probability distribution are rarely close to one another. We can illustrate this with a simple example: draw n points uniformly at random within a unit cube (as shown) and study how much of that cube the k nearest neighbors of a test point occupy.
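A small simulation makes this concrete. The sketch below (not part of the original notes; n = 200 and the choice of dimensions are illustrative assumptions) samples uniform points in $[0,1]^d$ and measures the mean distance to each point's nearest neighbor, which grows sharply with d:

```python
import numpy as np

# Illustration (assumed parameters, not from the original notes):
# sample n points uniformly in [0, 1]^d and measure the mean distance
# from each point to its nearest neighbor. As d grows, even the
# *nearest* neighbor drifts far away.
rng = np.random.default_rng(0)
n = 200
mean_nn_dist = {}

for d in (2, 10, 100):
    X = rng.random((n, d))                    # n uniform points in [0, 1]^d
    diff = X[:, None, :] - X[None, :, :]      # pairwise differences
    dist = np.sqrt((diff ** 2).sum(axis=-1))  # Euclidean distance matrix
    np.fill_diagonal(dist, np.inf)            # exclude self-distances
    mean_nn_dist[d] = dist.min(axis=1).mean()
    print(f"d = {d:3d}: mean nearest-neighbor distance = {mean_nn_dist[d]:.3f}")
```

For d = 2 the nearest neighbor is typically a small fraction of the cube's width away; by d = 100 it is farther than the cube's side length, so "nearest" no longer means "nearby".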

        Consider the unit cube [0, 1]^d. All training data are sampled uniformly within this cube, i.e. \forall i, x_i \in [0, 1]^d, and we consider the k = 10 nearest neighbors of a test point.

        Let \ell be the side length of the smallest hypercube containing all k nearest neighbors of the test point. Since the n training points are uniformly distributed, the expected fraction of them falling inside this hypercube equals its volume, so \ell^d \approx \frac{k}{n}, which gives \ell \approx \left(\frac{k}{n}\right)^{1/d}. As d grows, \ell approaches 1: the "nearest" neighbors occupy almost the entire cube and are no longer local at all.
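Evaluating \ell = (k/n)^{1/d} for a few dimensions shows how quickly the neighborhood fills the cube. A minimal sketch, assuming n = 1000 training points (an illustrative value; the notes fix only k = 10):

```python
# Evaluate ell = (k / n)^(1/d) for k = 10 neighbors and an assumed
# n = 1000 training points. The k-NN hypercube holds only 1% of the
# points, yet its side length approaches the full cube width as d grows.
k, n = 10, 1000
ell = {d: (k / n) ** (1 / d) for d in (2, 10, 100, 1000)}

for d, side in ell.items():
    print(f"d = {d:4d}: ell ~ {side:.3f}")
```

Already at d = 100 the side length exceeds 0.95, i.e. the 10 "nearest" neighbors span essentially the whole unit cube.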

Origin blog.csdn.net/bashendixie5/article/details/133200339