Deep Clustering Algorithms

Author: Kailugaji - cnblogs  http://www.cnblogs.com/kailugaji/

    This article follows the route: Deep Autoencoder -> Deep Embedded Clustering (DEC) -> Improved Deep Embedded Clustering (IDEC) -> Deep Fuzzy K-means (DFKM). The Deep Autoencoder was already covered in the post "Deep Autoencoder MATLAB Interpretation", and there are many improved variants of it, so it is not explained in detail here; the focus is on deep clustering algorithms. If anything is wrong, corrections are welcome.

    Network architecture of the deep clustering algorithms

    Loss functions of the deep clustering algorithms

1. Deep Embedded Clustering

1.1 Stochastic Neighbor Embedding (SNE)

    SNE is a nonlinear dimensionality reduction strategy, suited to features that are nonlinearly correlated, and is mainly used for data visualization. PCA (principal component analysis), by contrast, is a linear dimensionality reduction strategy, suited to features that are linearly correlated. In the original (high-dimensional) space, SNE uses a Gaussian distribution to convert the distances between data points into conditional probabilities. In the mapped (low-dimensional) space, it likewise uses a Gaussian distribution to convert the distances between mapped points into conditional probabilities. It then minimizes the KL divergence between the high-dimensional and the low-dimensional conditional probabilities.
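
    As a rough illustration of the two conditional probabilities and the KL cost, here is a minimal NumPy sketch. It assumes a single fixed bandwidth `sigma` for every point (SNE as described in [1] picks a separate bandwidth per point from a perplexity setting, which is omitted here), and the names `conditional_probs` and `sne_cost` are made up for this example.

```python
import numpy as np

def conditional_probs(points, sigma=1.0):
    """Row i holds p_{j|i}: Gaussian similarity of point j as seen from point i."""
    # Squared Euclidean distances between all pairs of points.
    diff = points[:, None, :] - points[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    logits = -sq_dist / (2.0 * sigma ** 2)
    np.fill_diagonal(logits, -np.inf)            # a point is not its own neighbor
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)

def sne_cost(P, Q, eps=1e-12):
    """Sum over i of KL(P_i || Q_i), skipping the zero diagonal."""
    mask = ~np.eye(P.shape[0], dtype=bool)
    return float(np.sum(P[mask] * np.log((P[mask] + eps) / (Q[mask] + eps))))

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 5))  # toy high-dimensional data (assumed)
Y = rng.normal(size=(6, 2))  # toy low-dimensional map points (assumed)

P = conditional_probs(X, sigma=1.0)
# sigma = 1/sqrt(2) gives the kernel exp(-||y_i - y_j||^2) in the map space.
Q = conditional_probs(Y, sigma=1.0 / np.sqrt(2.0))
cost = sne_cost(P, Q)
```

    In practice the map points `Y` would be updated by gradient descent on `cost`; the sketch only evaluates the cost once.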

    SNE faces two problems: (1) the KL divergence is an asymmetric measure, and (2) crowding. For the asymmetry problem, pij is defined, which converts the asymmetric measure into a symmetric one. The symmetric measure still suffers from crowding, however: after mapping into the low-dimensional space, the mapped points cannot be separated well according to the characteristics of the data itself.
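
    The symmetrization can be sketched as follows, using the joint probability p_ij = (p_{j|i} + p_{i|j}) / (2n) from [1]. The function name `symmetrize` and the toy matrix are assumptions made for illustration.

```python
import numpy as np

def symmetrize(P_cond):
    """Turn conditional p_{j|i} (rows sum to 1) into a symmetric joint p_ij."""
    n = P_cond.shape[0]
    return (P_cond + P_cond.T) / (2.0 * n)

# Toy conditional probabilities for three points: rows sum to 1, zero diagonal.
P_cond = np.array([[0.0, 0.9, 0.1],
                   [0.5, 0.0, 0.5],
                   [0.2, 0.8, 0.0]])
P_joint = symmetrize(P_cond)
```

    The result satisfies p_ij = p_ji and sums to 1 over all pairs, so it can be treated as a single joint distribution instead of n separate conditional ones.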

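    To solve the crowding problem (The Crowding Problem), t-SNE was proposed. It is a nonlinear dimensionality reduction strategy mainly used for data visualization. It introduces the heavy-tailed Student t-distribution and converts the distances between mapped points in the low-dimensional space into a t-distribution-based probability distribution qij, so that points from different clusters separate well.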
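
    A minimal sketch of the low-dimensional similarity qij, assuming the Student t kernel with one degree of freedom used in [1]. The function name `student_t_joint` and the toy map points are hypothetical. The last lines compare the two kernels at the same distance to show what "heavy-tailed" means here.

```python
import numpy as np

def student_t_joint(Y):
    """Joint q_ij from a Student t kernel with one degree of freedom."""
    diff = Y[:, None, :] - Y[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    kernel = 1.0 / (1.0 + sq_dist)
    np.fill_diagonal(kernel, 0.0)   # q_ii is defined as zero
    return kernel / kernel.sum()    # normalize over all pairs

Y = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]])  # toy map points (assumed)
Q = student_t_joint(Y)

# Kernel values at distance 3: the t kernel decays far more slowly.
d = 3.0
gauss_tail = np.exp(-d ** 2)
t_tail = 1.0 / (1.0 + d ** 2)
```

    Because the t kernel still assigns noticeable similarity at moderate distances, moderately dissimilar points can sit further apart in the map, which relieves the crowding.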

1.2 t-SNE

1.3 Deep Embedded Clustering(DEC)

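    Inspired by t-SNE, the DEC algorithm was proposed. It redefines the measure pij for the original (high-dimensional) space. In the fine-tuning stage, it discards the decoder layers, uses the KL divergence as the loss function to be minimized, and updates the parameters iteratively.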
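
    The quantities involved can be sketched in NumPy as follows, based on the soft assignment and target distribution described in [3]. The embedded points `Z` and the cluster centers are toy values, and the function names are assumptions for this example; the encoder and the gradient updates are left out.

```python
import numpy as np

def soft_assign(Z, centers, alpha=1.0):
    """q_ij: Student t similarity between embedded point i and cluster center j."""
    diff = Z[:, None, :] - centers[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    kernel = (1.0 + sq_dist / alpha) ** (-(alpha + 1.0) / 2.0)
    return kernel / kernel.sum(axis=1, keepdims=True)

def target_distribution(Q):
    """p_ij: squared q_ij divided by the soft cluster frequency, then row-normalized."""
    weight = Q ** 2 / Q.sum(axis=0)
    return weight / weight.sum(axis=1, keepdims=True)

def clustering_loss(P, Q, eps=1e-12):
    """KL(P || Q) summed over all points and clusters."""
    return float(np.sum(P * np.log((P + eps) / (Q + eps))))

# Toy embedded points and cluster centers (assumed values).
Z = np.array([[0.0, 0.1], [0.2, 0.0], [3.0, 3.1], [2.9, 3.0]])
centers = np.array([[0.0, 0.0], [3.0, 3.0]])

Q = soft_assign(Z, centers)
P = target_distribution(Q)
loss = clustering_loss(P, Q)
```

    The target P is a sharpened version of Q, so minimizing the KL divergence pushes each point toward the cluster it is already most confident about.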

2. Improved Deep Embedded Clustering(IDEC)

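    DEC discards the decoder layers and fine-tunes the encoder with the clustering loss Lc. The authors of IDEC argue that this fine-tuning distorts the embedding space and weakens the representativeness of the embedded features, which hurts clustering performance. They therefore propose keeping the decoder layers unchanged and attaching the clustering loss directly to the embedding space.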
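
    The resulting objective combines a reconstruction loss Lr with the clustering loss Lc as L = Lr + γ·Lc. A minimal sketch, assuming a mean squared reconstruction error and a hypothetical default `gamma=0.1`; all inputs are toy values standing in for the autoencoder output and the DEC-style distributions P and Q.

```python
import numpy as np

def reconstruction_loss(X, X_rec):
    """Lr: mean over samples of the squared reconstruction error."""
    return float(np.mean(np.sum((X - X_rec) ** 2, axis=1)))

def kl_clustering_loss(P, Q, eps=1e-12):
    """Lc: KL(P || Q) between target and soft assignments."""
    return float(np.sum(P * np.log((P + eps) / (Q + eps))))

def idec_loss(X, X_rec, P, Q, gamma=0.1):
    """Joint loss: reconstruction term plus gamma-weighted clustering term."""
    return reconstruction_loss(X, X_rec) + gamma * kl_clustering_loss(P, Q)

# Toy inputs and decoder outputs (assumed values).
X = np.array([[1.0, 2.0], [3.0, 4.0]])
X_rec = np.array([[1.0, 2.5], [2.5, 4.0]])
# Toy target distribution P and soft assignment Q (assumed values).
P = np.array([[0.9, 0.1], [0.2, 0.8]])
Q = np.array([[0.8, 0.2], [0.3, 0.7]])

total = idec_loss(X, X_rec, P, Q, gamma=0.1)
```

    With `gamma=0` the loss reduces to plain autoencoder training; larger values let the clustering term shape the embedding more strongly while the decoder still constrains it.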

3. Deep Fuzzy K-means

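    Deep Fuzzy K-means likewise adds the clustering process to the low-dimensional mapped space, so that feature extraction and clustering are carried out simultaneously. It introduces entropy-weighted fuzzy K-means, replaces the original Euclidean distance with its own redefined metric, and adds regularization terms on the weights and biases to prevent overfitting and improve generalization.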
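
    The entropy-weighted part can be sketched as follows. This is a generic entropy-regularized fuzzy K-means step, not the DFKM implementation: the squared Euclidean distance is only a stand-in for the redefined metric of [5], the network and the weight regularization are omitted, and all names and values are assumptions. Minimizing Σ u_ij·d_ij + λ·Σ u_ij·log u_ij subject to each row of u summing to 1 gives memberships of the form u_ij ∝ exp(-d_ij / λ).

```python
import numpy as np

def sq_euclidean(Z, centers):
    """Stand-in distance; DFKM defines its own metric instead."""
    diff = Z[:, None, :] - centers[None, :, :]
    return np.sum(diff ** 2, axis=-1)

def entropy_memberships(dist, lam=1.0):
    """u_ij proportional to exp(-d_ij / lam), normalized over clusters j."""
    logits = -dist / lam
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)

def update_centers(Z, U):
    """Membership-weighted mean of the embedded points for each cluster."""
    return (U.T @ Z) / U.sum(axis=0)[:, None]

# Toy embedded features and initial centers (assumed values).
Z = np.array([[0.0, 0.1], [0.2, 0.0], [3.0, 3.1], [2.9, 3.0]])
centers = np.array([[0.5, 0.5], [2.5, 2.5]])

for _ in range(20):  # alternate membership and center updates
    U = entropy_memberships(sq_euclidean(Z, centers), lam=0.5)
    centers = update_centers(Z, U)
```

    The parameter `lam` controls fuzziness: a small value gives nearly hard assignments, while a very large value drives all memberships toward uniform.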

4. References

[1] Maaten L, Hinton G. Visualizing data using t-SNE[J]. Journal of machine learning research, 2008, 9(Nov): 2579-2605.

[2] Vincent P, Larochelle H, Lajoie I, et al. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion[J]. Journal of machine learning research, 2010, 11(Dec): 3371-3408.

[3] Xie J, Girshick R, Farhadi A. Unsupervised deep embedding for clustering analysis[C]//International conference on machine learning. 2016: 478-487.

[4] Guo X, Gao L, Liu X, et al. Improved deep embedded clustering with local structure preservation[C]//IJCAI. 2017: 1753-1759.

[5] Zhang R, Li X, Zhang H, et al. Deep Fuzzy K-Means with Adaptive Loss and Entropy Regularization[J]. IEEE Transactions on Fuzzy Systems, 2019.

[6] t-SNE resources: Complete t-SNE Notes; An illustrated introduction to the t-SNE algorithm; From SNE to t-SNE to LargeVis; The t-SNE Algorithm - CSDN

[7] Python code for DEC and IDEC - Github

[8] Python code for DFKM - Github

Origin www.cnblogs.com/kailugaji/p/12105939.html