CS231n--assignment 1--KNN

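A note up front: I spent a full day today and finally finished the kNN (k-Nearest Neighbor) code for assignment 1 of Stanford's CS231n course, consulting some online resources along the way. The algorithm itself is not very hard; the main obstacle was that I was not familiar enough with NumPy's syntax. When I get the chance, I will write a separate post introducing NumPy syntax.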

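Introduction to the KNN algorithm:
Inputs: X_train (5000*3072: 5000 images, one per row; each row holds the 32*32*3 = 3072 pixel values of a flattened 32*32 color image)
X_test (500*3072)
y_train (5000): the labels of the 5000 training images
y_test (500)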
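To make the shapes above concrete, here is a minimal sketch of how image arrays can be flattened into rows. The data is random and stands in for real images, the sample counts are reduced (50 and 5 instead of 5000 and 500), and the names `train_images` and `test_images` as well as the 10 label classes are assumptions made for illustration.

```python
import numpy as np

# Synthetic stand-in for the image data (random values, not real pixels).
# Smaller counts than 5000/500 keep the example light.
rng = np.random.default_rng(0)
train_images = rng.integers(0, 256, size=(50, 32, 32, 3)).astype(np.float64)
test_images = rng.integers(0, 256, size=(5, 32, 32, 3)).astype(np.float64)

# Flatten each image into one row: 32 * 32 * 3 = 3072 values per row.
X_train = train_images.reshape(train_images.shape[0], -1)
X_test = test_images.reshape(test_images.shape[0], -1)

# One integer label per training image (10 classes assumed here).
y_train = rng.integers(0, 10, size=50)

print(X_train.shape, X_test.shape, y_train.shape)
# (50, 3072) (5, 3072) (50,)
```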

The first step is to compute the L2 (Euclidean) distance between each test image and every training image.
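Written out, with x_i denoting the i-th test row and t_j the j-th training row (symbols introduced here for illustration), the distance and its expanded square are shown below. The expansion matters because each of its three terms can be computed with matrix operations instead of explicit loops.

```latex
d(x_i, t_j) = \lVert x_i - t_j \rVert_2 = \sqrt{\sum_{p=1}^{3072} (x_{ip} - t_{jp})^2}

\lVert x_i - t_j \rVert_2^2 = \lVert x_i \rVert_2^2 - 2\, x_i \cdot t_j + \lVert t_j \rVert_2^2
```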

This can be implemented as follows:

def compute_distances_no_loops(self, X):
    """
    Compute the distance between each test point in X and each training point
    in self.X_train using no explicit loops.
    Input / Output: Same as compute_distances_two_loops
    """
    # Assumes `import numpy as np` at the top of the module.
    # Uses ||x - t||^2 = ||x||^2 - 2 x.t + ||t||^2 together with broadcasting.
    dists = np.multiply(np.dot(X, self.X_train.T), -2)  # cross term, (num_test, num_train)
    sq1 = np.sum(np.square(X), axis=1, keepdims=True)   # (num_test, 1), one value per test row
    sq2 = np.sum(np.square(self.X_train), axis=1)       # (num_train,), one value per training row
    dists = np.add(dists, sq1)  # column vector broadcasts across every column
    dists = np.add(dists, sq2)  # 1-D vector broadcasts across every row
    dists = np.sqrt(dists)
    return dists
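One way to gain confidence in the vectorized version is to compare it against a plain two-loop computation on small random inputs. The sketch below uses standalone functions rather than the class method; the names `distances_two_loops` and `distances_no_loops` are hypothetical, and the clipping with `np.maximum` is an extra safeguard against tiny negative values from rounding, not part of the code above.

```python
import numpy as np

def distances_two_loops(X, X_train):
    """Reference version: compute one L2 distance at a time."""
    dists = np.zeros((X.shape[0], X_train.shape[0]))
    for i in range(X.shape[0]):
        for j in range(X_train.shape[0]):
            dists[i, j] = np.sqrt(np.sum((X[i] - X_train[j]) ** 2))
    return dists

def distances_no_loops(X, X_train):
    """Vectorized version based on the expanded squared distance."""
    cross = -2.0 * X.dot(X_train.T)                   # (num_test, num_train)
    sq_test = np.sum(X ** 2, axis=1, keepdims=True)   # (num_test, 1)
    sq_train = np.sum(X_train ** 2, axis=1)           # (num_train,)
    sq_dists = cross + sq_test + sq_train             # broadcasting fills the matrix
    # Added safeguard: rounding can leave values slightly below zero.
    return np.sqrt(np.maximum(sq_dists, 0.0))

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 6))
X_train = rng.standard_normal((7, 6))

slow = distances_two_loops(X, X_train)
fast = distances_no_loops(X, X_train)
print(fast.shape, np.allclose(slow, fast))
```

If the two versions agree, the script reports the shape (4, 7) and True. The vectorized version avoids Python-level loops entirely, which is where the speedup on the full 500*5000 problem comes from.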

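The returned dists has shape 500*5000.
For each test image, take the k training images with the smallest distances, tally their labels (a histogram over labels), and use the most frequent label as the prediction for that image.

The code is as follows: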

def predict_labels(self, dists, k=1):
    """
    Given a matrix of distances between test points and training points,
    predict a label for each test point.

    Inputs:
    - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
      gives the distance between the ith test point and the jth training point.

    Returns:
    - y: A numpy array of shape (num_test,) containing predicted labels for the
      test data, where y[i] is the predicted label for the test point X[i].
    """
    num_test = dists.shape[0]
    y_pred = np.zeros(num_test)
    for i in range(num_test):  # range works in both Python 2 and 3 (was xrange)
        # Labels of the k nearest neighbors to the ith test point:
        # argsort orders training indices by increasing distance.
        closest_y = self.y_train[np.argsort(dists[i, :])[:k]]
        # bincount tallies the votes per label; argmax picks the most common one.
        y_pred[i] = np.argmax(np.bincount(closest_y))

    return y_pred
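The select-and-vote step can be traced by hand on a toy example. The sketch below handles a single test point; the helper name `predict_one` and the distance and label values are made up for illustration.

```python
import numpy as np

def predict_one(dists_row, y_train, k):
    """Predict the label for one test point from its row of distances."""
    nearest = np.argsort(dists_row)[:k]   # indices of the k smallest distances
    closest_y = y_train[nearest]          # labels of those neighbors
    votes = np.bincount(closest_y)        # votes[c] = number of neighbors with label c
    return int(np.argmax(votes))          # most frequent label

# Toy data: distances from one test point to six training points.
dists_row = np.array([0.9, 0.2, 0.5, 0.1, 0.7, 0.3])
y_train = np.array([2, 1, 2, 1, 0, 2])

print(predict_one(dists_row, y_train, k=3))  # neighbors have labels [1, 1, 2] -> 1
print(predict_one(dists_row, y_train, k=4))  # labels [1, 1, 2, 2], a tie -> 1
print(predict_one(dists_row, y_train, k=6))  # all six labels, label 2 has 3 votes -> 2
```

Note the tie at k=4: `np.argmax` returns the first maximum, so ties are broken in favor of the smaller label. Also, `np.bincount` requires non-negative integer labels, which holds for integer class indices.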

Reposted from blog.csdn.net/archervin/article/details/52988276