TensorFlow: KNN vs. NN

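I have read a lot of TensorFlow KNN code, and most of it is a variant of the original code below. Many people take the line `Xtr, Ytr = mnist.train.next_batch(5000) #5000 for training (nn candidates)` to mean that the hyperparameter K is 5000. I am not convinced. Here is my reading; corrections are welcome.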

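As I understand it, the code takes 5000 training samples and 200 test samples. It then loops over the 200 test samples, and on each iteration it takes one test sample and finds the single closest sample among the 5000 training samples, as this line shows: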

# Get the index of the smallest distance
pred = tf.argmin(distance, 0)

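So this is essentially KNN with K=1. If K really were 5000, then by the standard definition of KNN the prediction would be the class that wins a majority vote among the 5000 training samples nearest to the test sample.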
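To make the difference concrete, here is a minimal sketch of KNN with an explicit `k`, written in plain Python rather than TensorFlow. The helper names (`l1_distance`, `knn_predict`) and the toy 2-D data are my own and are not part of the original example. The labels here are plain integers; with the one-hot `Ytr` from the original code you would take `np.argmax` of each label first.

```python
from collections import Counter


def l1_distance(a, b):
    """L1 (Manhattan) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_x, train_y, query, k):
    """Predict a label by majority vote among the k nearest training samples."""
    # Sort training indices by distance to the query (stable, so ties keep index order).
    order = sorted(range(len(train_x)),
                   key=lambda i: l1_distance(train_x[i], query))
    # Count the labels of the k closest samples and return the most frequent one.
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Toy data, made up for illustration: 2-D points with integer class labels.
train_x = [[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]]
train_y = [0, 0, 1, 1, 1]
query = [0.1, 0.1]

# k = 1 gives the same answer as taking the argmin of the distances,
# which is what the TensorFlow example does.
distances = [l1_distance(x, query) for x in train_x]
nearest = min(range(len(distances)), key=distances.__getitem__)
assert knn_predict(train_x, train_y, query, k=1) == train_y[nearest]

print(knn_predict(train_x, train_y, query, k=1))             # 0
# k = number of training samples: every sample votes, so the result is
# simply the most common class in the training set, whatever the query is.
print(knn_predict(train_x, train_y, query, k=len(train_x)))  # 1
```

In this sketch, the number of candidate samples (`len(train_x)`) and the number of voters (`k`) are separate quantities. The 5000 in the original code plays the first role, not the second.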

The original code is below.

'''
A nearest neighbor learning algorithm example using TensorFlow library.
This example is using the MNIST database of handwritten digits
(http://yann.lecun.com/exdb/mnist/)

Author: Aymeric Damien
Project: https://github.com/aymericdamien/TensorFlow-Examples/
'''

from __future__ import print_function

import numpy as np
import tensorflow as tf

# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

# In this example, we limit mnist data
Xtr, Ytr = mnist.train.next_batch(5000) #5000 for training (nn candidates)
Xte, Yte = mnist.test.next_batch(200) #200 for testing

# tf Graph Input
xtr = tf.placeholder("float", [None, 784])
xte = tf.placeholder("float", [784])

# Nearest Neighbor calculation using L1 Distance
# Calculate L1 Distance
distance = tf.reduce_sum(tf.abs(tf.add(xtr, tf.negative(xte))), reduction_indices=1)
# Prediction: Get min distance index (Nearest neighbor)
pred = tf.arg_min(distance, 0)

accuracy = 0.

# Initialize the variables (i.e. assign their default value)
init = tf.global_variables_initializer()

# Start training
with tf.Session() as sess:

    # Run the initializer
    sess.run(init)

    # loop over test data
    for i in range(len(Xte)):
        # Get nearest neighbor
        nn_index = sess.run(pred, feed_dict={xtr: Xtr, xte: Xte[i, :]})
        # Get nearest neighbor class label and compare it to its true label
        print("Test", i, "Prediction:", np.argmax(Ytr[nn_index]), \
            "True Class:", np.argmax(Yte[i]))
        # Calculate accuracy
        if np.argmax(Ytr[nn_index]) == np.argmax(Yte[i]):
            accuracy += 1./len(Xte)
    print("Done!")
    print("Accuracy:", accuracy)
