1. The distance between points
The kNN classifier assumes that points which are close together are likely to share the same label. However, in high-dimensional spaces, points drawn from a probability distribution tend not to be close together at all. We can illustrate this with a simple example: we draw points uniformly at random within a unit cube and study how much space the k nearest neighbors of a test point within that cube will take up.
Consider the unit cube [0,1]^d. All training data is sampled uniformly within this cube, i.e. x_i ∈ [0,1]^d for all i, and we consider the k = 10 nearest neighbors of such a test point.
Let ℓ be the side length of the smallest hypercube that contains all k nearest neighbors of the test point. Because the n training points are uniformly distributed, the expected fraction of points falling inside this hypercube is approximately its volume ℓ^d, so ℓ^d ≈ k/n and therefore ℓ ≈ (k/n)^(1/d).
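As a quick sanity check, the formula ℓ ≈ (k/n)^(1/d) can be evaluated numerically for a few dimensions. The sketch below assumes n = 1000 training points and k = 10 neighbors (the k used above; the choice n = 1000 is an illustrative assumption, not from the text):

```python
# Edge length of the hypercube expected to contain the k nearest neighbors,
# from the volume argument ℓ^d ≈ k/n (n = 1000 is an assumed example size).
n, k = 1000, 10

for d in (2, 10, 100, 1000):
    ell = (k / n) ** (1.0 / d)
    print(f"d = {d:4d}   ℓ ≈ {ell:.3f}")
# d =    2   ℓ ≈ 0.100
# d =   10   ℓ ≈ 0.631
# d =  100   ℓ ≈ 0.955
# d = 1000   ℓ ≈ 0.995
```

Already at d = 100 the "local" neighborhood spans almost the entire cube along every axis, which is exactly the phenomenon the argument above describes: in high dimensions, the k nearest neighbors stop being meaningfully near.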