看了很多tensorflow 的KNN代码,基本就是下面原版代码的变种。很多人认为Xtr, Ytr = mnist.train.next_batch(5000) #5000 for training (nn candidates)
这行代码就是超参数K=5000. 但我仍有疑问以下是我的见解,不正之处请指出。
**
不知道是不是该这样理解。它这里选了5000个训练样本,200个测试样本。然后遍历200个测试样本,每次循环选出一个测试样本然后在5000个样本中找它离得的最近的—>见这个代码
**
# 获得距离最小的index
pred = tf.argmin(distance, 0)
**
所以本质上应该是K=1的KNN。 如果是K=5000的话,那么按照KNN原本的定义,应该是离这个测试样本中最近5000个训练样本中通过投票表决最多的那个类
**
下面是原版代码
'''
A nearest neighbor learning algorithm example using TensorFlow library.
This example is using the MNIST database of handwritten digits
(http://yann.lecun.com/exdb/mnist/)
Author: Aymeric Damien
Project: https://github.com/aymericdamien/TensorFlow-Examples/
'''
from __future__ import print_function
import numpy as np
import tensorflow as tf
# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)
# In this example, we limit mnist data
Xtr, Ytr = mnist.train.next_batch(5000) #5000 for training (nn candidates)
Xte, Yte = mnist.test.next_batch(200) #200 for testing
# tf Graph Input
xtr = tf.placeholder("float", [None, 784])
xte = tf.placeholder("float", [784])
# Nearest Neighbor calculation using L1 Distance
# Calculate L1 Distance
distance = tf.reduce_sum(tf.abs(tf.add(xtr, tf.negative(xte))), reduction_indices=1)
# Prediction: Get min distance index (Nearest neighbor)
pred = tf.arg_min(distance, 0)
accuracy = 0.
# Initialize the variables (i.e. assign their default value)
init = tf.global_variables_initializer()
# Start training
with tf.Session() as sess:
# Run the initializer
sess.run(init)
# loop over test data
for i in range(len(Xte)):
# Get nearest neighbor
nn_index = sess.run(pred, feed_dict={xtr: Xtr, xte: Xte[i, :]})
# Get nearest neighbor class label and compare it to its true label
print("Test", i, "Prediction:", np.argmax(Ytr[nn_index]), \
"True Class:", np.argmax(Yte[i]))
# Calculate accuracy
if np.argmax(Ytr[nn_index]) == np.argmax(Yte[i]):
accuracy += 1./len(Xte)
print("Done!")
print("Accuracy:", accuracy)