NumPy's broadcasting mechanism efficiently computes pairwise distances between matrices

With NumPy it is easy to compute the pairwise distances between the rows of two two-dimensional arrays. The distance between two-dimensional arrays is defined as follows: if X has shape (a, c) and Y has shape (b, c), then the distance array Z from X to Y has shape (a, b), and Z[m, n] is the distance from X[m] to Y[n]. For instance, Z[0, 0] is the distance from X[0] to Y[0].
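As a minimal sketch of this definition, the snippet below builds Z for two small arrays by broadcasting X against Y directly. The function name `pairwise_euclidean` and the sample values are illustrative assumptions, not part of the original post.

```python
import numpy as np

def pairwise_euclidean(X, Y):
    """Return Z with Z[m, n] = Euclidean distance from X[m] to Y[n]."""
    # X[:, None, :] has shape (a, 1, c); Y[None, :, :] has shape (1, b, c).
    # Broadcasting the subtraction yields an (a, b, c) array of differences.
    diff = X[:, None, :] - Y[None, :, :]
    # Summing the squares over the last axis collapses c, leaving shape (a, b).
    return np.sqrt(np.sum(diff ** 2, axis=-1))

X = np.array([[0.0, 0.0], [3.0, 4.0]])              # shape (2, 2)
Y = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]])  # shape (3, 2)
Z = pairwise_euclidean(X, Y)
print(Z.shape)  # (2, 3)
print(Z)        # rows: [0, 3, 4] and [5, 4, 3]
```

This form is easy to read, but it materializes an intermediate array of shape (a, b, c), so it may use much more memory than the final (a, b) result when the inputs are large.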

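For example: compute the Euclidean distance between each row of an m*2 matrix and each row of an n*2 matrix. Note that the code below works with the squared distance (the square root is omitted); apply np.sqrt to the result to obtain the Euclidean distance itself.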

# compute the squared Euclidean distance between each test point in X and each training point in X_train
import numpy as np
X = np.random.random((3,2))
X_train = np.random.random((5,2))
print('X:')
print(X)
print('X_train:')
print(X_train)
 
dist = np.zeros((X.shape[0],X_train.shape[0]))
print('--------------------')
# way 1: use two loops
for i in range(X.shape[0]):
    for j in range(X_train.shape[0]):
        dist[i,j] = np.sum((X[i,:]-X_train[j,:])**2)
print('way 1 result:')
print(dist)
 
# way 2: use one loop; each row of X is broadcast against all rows of X_train
for i in range(X.shape[0]):
    dist[i,:] = np.sum((X_train-X[i,:])**2,axis=1)
print('--------------------')
print('way 2 result:')
print(dist)
 
# way 3: use no loops
# expand ||x - y||^2 = ||x||^2 + ||y||^2 - 2 * x.y; broadcasting adds the (3,1) column to the (5,) row
dist = (np.sum(X**2, axis=1).reshape(-1, 1)
        + np.sum(X_train**2, axis=1)
        - 2 * X.dot(X_train.T))
print('--------------------')
print('way 3 result:')
print(dist)
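The loop-free approach can also be wrapped in a function and cross-checked against a direct computation. The following is a sketch under stated assumptions: the helper name `squared_distances`, the fixed random seed, and the clamping step are additions for illustration and do not appear in the original code. The clamp is there because floating-point round-off in the expanded formula can produce tiny negative values, which would turn into NaN under a square root.

```python
import numpy as np

def squared_distances(X, Y):
    """Squared Euclidean distances via ||x||^2 + ||y||^2 - 2 * x.y."""
    x_sq = np.sum(X ** 2, axis=1)[:, None]   # shape (a, 1)
    y_sq = np.sum(Y ** 2, axis=1)[None, :]   # shape (1, b)
    cross = X @ Y.T                          # shape (a, b)
    d2 = x_sq + y_sq - 2.0 * cross           # broadcasts to shape (a, b)
    # Round-off can leave tiny negative entries; clamp them to zero.
    return np.maximum(d2, 0.0)

rng = np.random.default_rng(0)               # seed chosen arbitrarily
X = rng.random((3, 2))
X_train = rng.random((5, 2))

d2 = squared_distances(X, X_train)
dist = np.sqrt(d2)                           # Euclidean distances, shape (3, 5)

# Cross-check against a plain double loop using np.linalg.norm.
expected = np.array([[np.linalg.norm(x - y) for y in X_train] for x in X])
print(dist.shape)
print(np.allclose(dist, expected))
```

Compared with broadcasting the full difference array, this version only allocates (a, b)-shaped intermediates and delegates most of the work to a single matrix product.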

Origin blog.csdn.net/ZauberC/article/details/129325182