Euclidean distance and cosine distance (cosine similarity)

  1. Why look at cosine similarity?
    During our ImageNet experiments, we found that Euclidean distance led to poor performance when computing margins for the EVM. This is consistent with the previous finding that Euclidean distance does not work well when comparing deep features of individual samples.

Reference:
C. C. Aggarwal, A. Hinneburg, and D. A. Keim, “On the surprising behavior of distance metrics in high dimensional space,” in Proc. Int. Conf. Database Theory, 2001, pp. 420–434.
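
The high-dimensional behaviour that the cited paper discusses can be seen in a rough sketch like the one below. It is only an illustration under loud assumptions: the data are uniform random points rather than deep features, and the helper name `relative_contrast`, the sample size, and the seed are all made up for this example. It measures how much farther the farthest point is than the nearest one, relative to the nearest, as the dimension grows.

```python
import numpy as np

def relative_contrast(dim, n_points=1000, seed=0):
    """(d_max - d_min) / d_min for Euclidean distances from one random
    query to n_points uniform random points in [0, 1]^dim.
    Hypothetical helper for illustration only."""
    rng = np.random.default_rng(seed)
    points = rng.random((n_points, dim))
    query = rng.random(dim)
    dists = np.linalg.norm(points - query, axis=1)
    return (dists.max() - dists.min()) / dists.min()

# The contrast shrinks as the dimension grows: nearest and farthest
# neighbours end up at almost the same Euclidean distance.
for dim in (2, 10, 100, 1000):
    print(dim, relative_contrast(dim))
```

When the contrast is small, "nearest" carries little information, which is one plausible reason a margin computed from raw Euclidean distances could be unreliable.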

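  2. Cosine similarity formula:
    For two n-dimensional vectors x and y,
    $$\cos\theta = \frac{x \cdot y}{\lVert x \rVert \, \lVert y \rVert} = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\,\sqrt{\sum_{i=1}^{n} y_i^2}}$$
    The corresponding cosine distance is $1 - \cos\theta$.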
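
A small hand-checkable instance of cosine similarity may help here. The vectors x = (3, 4) and y = (4, 3) are illustrative choices, not from the original post, and `cosine_similarity` is a hypothetical helper name. By hand: the dot product is 3·4 + 4·3 = 24, both norms are 5, so the similarity is 24 / 25 = 0.96 and the cosine distance is about 0.04.

```python
import numpy as np

def cosine_similarity(x, y):
    """Cosine of the angle between x and y (hypothetical helper).
    Raises for zero vectors, where the angle is undefined."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    denom = np.linalg.norm(x) * np.linalg.norm(y)
    if denom == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return float(np.dot(x, y) / denom)

x = [3, 4]
y = [4, 3]
sim = cosine_similarity(x, y)   # 24 / (5 * 5)
print(round(sim, 4))            # similarity
print(round(1 - sim, 4))        # cosine distance
```

The zero-vector check is a design choice for this sketch; without it the division would silently produce `nan`.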
  3. As a first reference, see Mr_EvanChen's blog, which gives a good introduction and also includes a Python implementation:
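# Computing cosine distance (1 - cosine similarity)
import numpy as np
from scipy.spatial.distance import pdist
# Build two 10-dimensional vectors: x, y
x = np.random.random(10)
y = np.random.random(10)
# Solution 1: directly from the definition
dist1 = 1 - np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
# Solution 2: SciPy's pairwise distance (returns a length-1 array for two rows)
dist2 = pdist(np.vstack([x, y]), 'cosine')
print('x', x)
print('y', y)
print('dist1', dist1)
print('dist2', dist2)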
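
To connect the two distances in the title, here is a minimal sketch (the seed and the scale factor 5 are arbitrary choices for illustration). It checks two standard facts: cosine distance does not change when a vector is rescaled while Euclidean distance does, and for L2-normalised vectors the squared Euclidean distance equals twice the cosine distance.

```python
import numpy as np
from scipy.spatial.distance import cosine, euclidean

rng = np.random.default_rng(0)
x = rng.random(10)
y = rng.random(10)

# Rescaling y leaves the cosine distance unchanged but not the Euclidean one.
print('cosine   ', cosine(x, y), cosine(x, 5 * y))
print('euclidean', euclidean(x, y), euclidean(x, 5 * y))

# After L2 normalisation: ||x_hat - y_hat||^2 = 2 * (1 - cos(theta)).
x_hat = x / np.linalg.norm(x)
y_hat = y / np.linalg.norm(y)
lhs = euclidean(x_hat, y_hat) ** 2
rhs = 2 * cosine(x, y)
print('identity holds:', np.isclose(lhs, rhs))
```

One consequence is that normalising features to unit length before using Euclidean distance gives the same neighbour ranking as cosine distance.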
  4. Cosine similarity illustrated:

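Figure taken from:
刘建学, 李守军. 基于余弦相似度的因子分析在食品成分检测中的应用 (Application of cosine-similarity-based factor analysis to food composition testing) [J]. 食品科学 (Food Science), 2005, 26(6).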

[Figure: illustration of cosine similarity]

Reposted from blog.csdn.net/weixin_39393430/article/details/89786020