Getting Started with the "SOM Neural Network": Essence and Principles

Original article. When reprinting, please indicate that it comes from "Old Cake Explains Neural Networks": bp.bbbdata.com

About "Old Cake Explains Neural Network":

This website explains the knowledge, principles and codes of neural networks in a structured way.

Reproduce the algorithm of matlab neural network toolbox, it is a good assistant for learning neural network. 


Table of contents

1. Explanation of the Introductory Principles

01. Clustering Algorithm Based on the Kohonen Rule

02. The Idea of SOM Clustering

03. Topology Diagram of the SOM Neural Network

04. Model Expression of SOM

Afterword

2. SOM Code Rewrite (Single-Sample Training)

01. Code Structure Description

02. Interpretation of the Code's Running Results

03. Full Code



The SOM neural network (Self-Organizing Feature Map) is a neural network for clustering proposed by Kohonen in 1981. It is a classic, important, and widely used member of the neural network family.

The first section of this article focuses on clarifying what SOM is, what problem it solves, what its idea is, and what features it has.
The second section reproduces the behavior of the matlab toolbox with our own code logic.

  Author's Note  


SOM is not a difficult algorithm, but explaining SOM clearly is a difficult problem.

The author once tried to cover SOM in a single article, reworking it this way and that, but eventually found that the more one rushes, the worse the explanation becomes.


Why must SOM be told slowly? Mainly because the idea of SOM went through three stages:

  Kohonen rule  -->  single-sample training  -->  batch-sample training

Trying to talk about batch-sample training directly simply cannot be done.

With this in mind, I hope readers will not rush, and will take it step by step.


 

1. Explanation of the Introductory Principles


  01. Clustering Algorithm Based on the Kohonen Rule  


  The Clustering Problem  


Informal description: assuming the data falls into clusters, we hope to find the center point of each cluster (the cluster center); whichever cluster center a sample is closest to, the sample is assigned to that cluster.
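
To make the assignment step concrete, here is a minimal MATLAB sketch (the centers and the sample are made-up values for illustration):

% Assign a sample to its nearest cluster center (made-up values)
C = [2.5 2.5; 7.5 7.5];                      % two cluster centers
x = [3 2];                                   % one sample
[~,k] = min(sum((C - repmat(x,2,1)).^2,2));  % index of the nearest center
disp(['x belongs to cluster ',num2str(k)])   % prints: x belongs to cluster 1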

   Clustering Method Based on the Kohonen Rule  


Clustering with the Kohonen rule is very simple:

first, randomly initialize k cluster center points;

then, each time, pick a sample and move the cluster center nearest to it toward the sample, so that this center gets closer to it; repeat m times.


The update rule is as follows:

w_k \leftarrow w_k + \text{lr} \cdot (x - w_k)

where
w_k : the cluster center point closest to the sample,
\text{lr} : the learning rate.
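
As a quick worked example with made-up numbers: if w_k = (0, 0), x = (1, 2), and \text{lr} = 0.1, the update gives w_k = (0,0) + 0.1 \cdot ((1,2) - (0,0)) = (0.1, 0.2), i.e. the cluster center moves one tenth of the way toward the sample.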

   Validity of the Kohonen Rule  


The Kohonen rule is simple, yet it works.

Let's look at a demo:


There are four clusters of data in the plane.
We first randomly initialize 5 cluster center points,
and then use the Kohonen rule to adjust their positions.

 
It can be seen that after a certain number of steps, the cluster center points have moved to the centers of the four clusters of data.

Demo code:


% Kohonen clustering rule
rand('seed',70);
%------------ Generate the sample data -------------
dataC = [2.5,2.5;7.5,2.5;2.5,7.5;7.5,7.5]; % four sample centers
sn = 40;  % number of samples
X = rand(sn,2)+dataC(mod(1:sn,4)+1,:); % randomly generate sample points around the centers


% ----------- Initialize the cluster centers --------------
kn = 5;              % number of cluster centers
C  = rand(kn,2)*10;  % randomly generate the cluster centers
C0 = C;              % back up the initial cluster centers


% ----------- Train the cluster centers with the samples -----------
lr = 0.1;   % learning rate
for t = 1:50
    for i = 1:sn
        cur_x    = X(i,:);                             % pick one sample
        dist     = sum((repmat(cur_x,kn,1) - C).^2,2); % squared distances from the sample to each cluster center
        [~,idx]  = min(dist);                          % find the nearest cluster center
        C(idx,:) = C(idx,:)  + lr*(cur_x - C(idx,:));  % move that cluster center toward the sample
    end
end


% ---------- Plot ------------------------
subplot(1,2,1)
plot(X(:,1),X(:,2),'*');
hold on 
plot(C0(:,1),C0(:,2),'or','MarkerFaceColor','g');   % initial cluster centers


subplot(1,2,2)
plot(X(:,1),X(:,2),'*');
hold on 
plot(C(:,1),C(:,2),'or','MarkerFaceColor','g');     % trained cluster centers

  02. The Idea of SOM Clustering  


SOM is a modification of the Kohonen rule:

when it updates the cluster center point P closest to the sample, it also updates the cluster center points adjacent to P.

Please note: beginners easily assume that the adjacent cluster points SOM refers to are the cluster points spatially near the target cluster point. In fact, they are not. SOM has its own definition of "adjacent cluster points".

  Distance Between SOM Cluster Points and Adjacent Cluster Points  


SOM first introduces a topological structure that connects all the cluster points together, and then uses this structure to define distance.

Topology


The topology can be one-dimensional, two-dimensional, three-dimensional, etc., and the most commonly used is two-dimensional.
For example, the commonly used two-dimensional hexagonal topology:




Definition of distance



In SOM, the distance between two points refers to the minimum number of edges connecting them in the introduced topology.


Neighboring cluster points


The adjacent cluster points of a point P are the cluster points whose minimum number of connecting edges to P is no more than a certain threshold.
For example,
when the neighborhood distance threshold is 1, the adjacent cluster points of P are the points directly connected to P;
when the neighborhood distance threshold is 2, they are the cluster points that can reach P through no more than 2 edges;
when the neighborhood distance threshold is k, they are the cluster points that can reach P through m (m <= k) edges.
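
To make these definitions concrete, here is a minimal sketch using the toolbox functions hextop and linkdist (the same functions the code in section 2 uses); the 4x3 grid is just an illustrative choice:

% Build a 4x3 hexagonal topology and list the neighbors of node 1
pos = hextop([4 3]);            % positions of the 12 cluster points in the topology
d   = linkdist(pos);            % d(i,j) = minimum number of edges between nodes i and j
nd  = 2;                        % neighborhood distance threshold
neighbors = find(d(1,:) <= nd)  % all nodes reachable from node 1 within 2 edges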

  How SOM Is Updated  


SOM's update method is the same as the Kohonen rule described above.
The difference is that when SOM updates the cluster center point P closest to the sample, it also updates the cluster center points adjacent to P.

    

In more detail, there are three points:


1. Updating the adjacent cluster points:
while updating the point P nearest to the sample, the cluster points adjacent to P are updated along with it (P's learning rate is larger than that of its adjacent cluster points).

2. A learning-rate shrinkage mechanism:
as the number of update steps grows, the learning rate becomes smaller and smaller.

3. A neighborhood-distance shrinkage mechanism:
as the number of update steps grows, the neighborhood distance threshold becomes smaller and smaller, until gradually only the target point and its directly adjacent cluster points remain. Both shrinkage schedules are sketched after this list.
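
Here is a minimal sketch of the two shrinkage schedules, mirroring the ordering/tuning phases of the trainSomNet code in section 2 (the constants are the ones used there; nd_max is fixed at 5 here just for illustration, while the real code derives it from the distance matrix):

% Two-phase shrinkage of learning rate (lr) and neighborhood threshold (nd)
order_steps = 1000; order_lr = 0.9; tune_lr = 0.02; nd_max = 5; tune_nd = 1;
for step = [0 500 1000 2000]
    if step < order_steps                        % ordering phase: linear shrinkage
        percent = 1 - step/order_steps;
        nd = 1.00001 + (nd_max-1)*percent;
        lr = tune_lr + (order_lr-tune_lr)*percent;
    else                                         % tuning phase: lr keeps decaying, nd stays fixed
        nd = tune_nd + 0.00001;
        lr = tune_lr * order_steps/step;
    end
    fprintf('step=%4d  lr=%.4f  nd=%.4f\n',step,lr,nd);
end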

Compared with the pure Kohonen rule, these changes are small, yet the code becomes much more complicated.
The complexity comes from initializing the topology and computing the distance matrix between points (the distance here is the number of edges mentioned above), so that the adjacent cluster points can be found during updates.

   Notes  


● The update method above comes from the single-sample training algorithm (learnsom) of old versions of matlab.
● New versions of matlab have adopted the batch update algorithm (learnsomb).
We will explain the details of the two methods in a separate article, digging into the source code to reproduce matlab's implementation logic.

      

 

  03. Topology Diagram of the SOM Neural Network  


  Network Topology  


The SOM neural network is a typical three-layer neural network;
its topology diagram is as follows:
 


The first layer is the input layer, and
the second layer is the hidden layer.

The number of hidden nodes represents the number of cluster centers (the positions of the cluster centers are the connection weights between the hidden nodes and the inputs).
The third layer is the output layer.
The output layer is in one-hot format (that is, a format like [0 0 0 1]);
it has the same number of nodes as the hidden layer, and
its values are obtained through competition among the hidden nodes: the output node corresponding to the hidden node with the largest value is 1, and the rest are 0.

  Network Topology Diagram with the Hidden-Layer Topology  


The topological structure between the hidden nodes is often drawn into the diagram as well;
 
the network topology diagram of SOM then looks as follows:
 


P.S.: The topology between the hidden nodes has no effect on the application of the final model; it is only needed during the training process.

  04. Model Expression of SOM  


The mathematical expression of the SOM model is:


\text{y} = \textbf{compet}(-\textbf{dist}(x,W))

where:

● dist computes the Euclidean distance between x and each row of W


For example, with 2 inputs and 3 hidden nodes, x=[x_1,x_2] , W = \begin{bmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \\ w_{31} & w_{32} \end{bmatrix}

then:

\displaystyle \textbf{dist}(x,W) = \begin{bmatrix} \sqrt{(x_1- w_{11})^2+(x_2- w_{12})^2} \\ \\ \sqrt{(x_1- w_{21})^2+(x_2- w_{22})^2}\\ \\ \sqrt{(x_1- w_{31})^2+(x_2- w_{32})^2} \end{bmatrix}

● compet is the competition function.

It sets the maximum entry of a vector to 1 and the rest to 0.
For example, compet([ 2 5 3 ]) = [ 0 1 0 ]  

In simple terms, the SOM model's output sets the entry corresponding to the row of W nearest to x to 1, and the rest to 0.

The meaning behind this is that the sample is judged to belong to whichever cluster center it is closest to.
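
As a minimal sketch, the model expression can be evaluated directly with the toolbox functions dist and compet (the weights and the sample below are made-up values):

% Evaluate y = compet(-dist(x,W)) for 2 inputs and 3 hidden nodes
W = [0 0; 5 5; 10 0];        % 3x2 weight matrix: one row per hidden node
x = [6; 4];                  % one sample, as a column vector
y = compet(-dist(W,x))       % y = [0;1;0]: row 2 of W is nearest to x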


  Afterword  

In this article, we first got a rough picture of what the SOM neural network is.
Its idea is not complicated: it is simply the Kohonen rule, with a topological structure introduced on the hidden nodes to define neighborhoods.
It is easy to misread the network topology diagram that includes the hidden-node topology, thinking that the hidden layer nodes are connected to each other.
In fact, the topology between hidden nodes is only used to obtain neighborhood nodes during the training phase, and has nothing to do with the final model.
In the next section, after we implement the SOM code following matlab's internal logic, the specific details and flow of the SOM algorithm will become much clearer.

2. SOM Code Rewrite (Single-Sample Training)


In this section, the author digs into the source code of newsom in the matlab2009b neural network toolbox,

removes the redundant code, and reproduces a simplified version of newsom whose results are exactly the same as newsom's.
By studying this code, you can fully understand the implementation logic of SOM single-sample training.

  01. Code Structure Description  

The code mainly contains three functions: testSomNet, trainSomNet, predictSomNet.

testSomNet: the main test function, executed when the file is run directly. It does four things:


1. Data generation: randomly generate a set of training data.
2. Train a SOM network with the self-written functions and predict the results.
3. Train a SOM network with the toolbox.
4. Compare whether the self-written functions match the toolbox's training results (comparison of weights and predictions).

trainSomNet: the main network-training function, used to train a SOM neural network.


It trains the SOM neural network with the single-sample training method.

predictSomNet: uses the trained network to make predictions.


Pass in the X to be predicted and the network's weight matrix to get the prediction results.
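
A minimal usage sketch of the two core functions (mirroring what testSomNet below does; X and test_x are assumed to hold column-wise samples):

% Assumed usage of the two core functions
w  = trainSomNet(X,[4 3],10);   % train a SOM with a 4x3 topology for 10 epochs
y  = predictSomNet(w,test_x);   % one-hot prediction for each test sample
py = find(y)                    % linear indices of the winning output nodes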

02. Interpretation of the Code's Running Results

After running the code, the prediction results and comparison results are obtained, as follows:


 

 

It can be seen from this that the self-written code is consistent with the logic of the toolbox.

03. Full Code

Personally tested in matlab2009b; the code runs through:


%------------ Test demo function ------------------
function testSomNet()
% This code comes from bp.bbbdata.com.
% It mimics the newsom neural network of the matlab neural network toolbox, for training an SOM neural network.
% It is dug out of matlab2009b and uses the old newsom single-sample training algorithm, which is no longer used in newer versions of matlab.
% The code is meant for teaching, to help readers understand the principles of the newsom neural network.
% --------- Data generation and parameter presets -------------
% Generate data
rand('seed',70);
X = [rand(1,400)*2; rand(1,400)];    % generate samples
test_x = [0.5 0.6;0.5 0.6];          % test samples
epochs = 10;                         % number of training epochs
dimensions = [4 3];                  % topology dimensions of the hidden nodes

%--------- Train with the self-written functions --------------
rand('seed',70);
w = trainSomNet(X,dimensions,epochs);
py = find(predictSomNet(w,test_x))

% ----- Call the toolbox and compare against its results ------
% train with the toolbox
rand('seed',70);
Xr = [min(X,[],2),max(X,[],2)];
net = newsom(Xr,dimensions);
net.trainParam.epochs = epochs;
net = train(net,X);

% toolbox results
pyByTool = find(sim(net,test_x))
w_tools  = net.IW{1};

% difference from the toolbox
maxECompareNet = max([max(abs(w(:)-w_tools(:))),max(abs(pyByTool(:)-py(:)))]);
disp(['Maximum difference between the self-written code and the toolbox: ',num2str(maxECompareNet)])

end

% ----------- SOM training function ----------------------
function w = trainSomNet(X,dimensions,epochs)
[xn,sn] = size(X);                % number of inputs, number of samples
hn      = prod(dimensions);       % number of hidden nodes

% ---- Build the hidden-node topology and compute its distance matrix ----
pos     = hextop(dimensions);     % positions of the hidden nodes in the topology
d       = linkdist(pos);          % link-distance matrix of the topology

% -------- Parameter settings --------------
order_steps = 1000;               % step threshold of the ordering phase
order_lr    = 0.9;                % initial learning rate
tune_lr     = 0.02;               % learning rate after shrinkage (tuning phase)
nd_max      = max(max(d));        % initial neighborhood distance
tune_nd     = 1 ;                 % neighborhood distance after shrinkage (tuning phase)

%----- Initialize w: use the midpoint of each input -------------------
x_mid  = (min(X,[],2)+max(X,[],2))/2;         % midpoint of each input dimension
w      = repmat(x_mid',hn,1);                 % initialize w

% --------- Training -----------------------------
step   = 0;
for epoch=1:epochs
    for i=1:sn
        idx   = fix(rand*sn) + 1;         % randomly pick a sample
        cur_x = X(:,idx);                 % the currently selected sample

        if (step < order_steps)             % before order_steps: shrink learning rate and neighborhood linearly
            percent = 1 - step/order_steps;
            nd      = 1.00001 + (nd_max-1) * percent;
            lr      = tune_lr + (order_lr-tune_lr) * percent;
        else                                % from order_steps on: learning rate decays as a power, neighborhood no longer changes
            nd      = tune_nd + 0.00001;
            lr      = tune_lr * order_steps/step;
        end
        
        a     = predictSomNet(w,cur_x);          % network prediction (one-hot winner)
        lr_a = lr * 0.5*(a + (d < nd)*a);        % learning rates of the winner and its neighborhood
       
        % compute dw
        dw   = zeros(hn,xn);                 
        dw   = dw +repmat(lr_a,1,xn) .* (repmat(cur_x',hn,1)-w);
        
        % update w
        w    = w + dw;  
        step = step + 1;
    end
end

end

% -------- SOM prediction function ---------------
function y = predictSomNet(w,X)

% compute the hidden-node activations (negative Euclidean distances)
z = zeros(size(w,1),size(X,2));
for i= 1: size(X,2)
    cur_x = X(:,i);
    z(:,i) = -sum((repmat(cur_x',size(w,1),1)-w).^ 2,2) .^ 0.5;
end
% obtain the output through competition among the hidden nodes
[~,idx] = max(z); % find the maximum of each column
y = z*0;
y(idx+ size(y,1)*(0:size(y,2)-1)) = 1;   % set the winner of each column to 1

end


Note: this code follows the som logic of the old matlab neural network toolbox; its results will not match newsom in newer versions of matlab.


 

Related articles

"BP Neural Network Gradient Derivation"

"Mathematical Expressions Extracted by BP Neural Network"

"A Complete Modeling Process of BP"
