Original address : http://blog.csdn.net/hjimce/article/details/45244603
The Gaussian mixture model (GMM) is a typical application of the EM algorithm. I will not derive the EM algorithm here and will go straight to the implementation of GMM. When I worked on the GrabCut image segmentation algorithm, I simply copied the Gaussian mixture model code from OpenCV and wrapped it in a class, so my understanding stayed shallow. Within a few days I had almost forgotten how the algorithm worked, so I wrote it myself in MATLAB and finally understood what the model means. My understanding is that the Gaussian mixture model is essentially an extension of the k-means algorithm, so before learning GMM it is worth writing k-means once yourself. The essential difference between the two lies in the weights. K-means assigns each point entirely to its nearest cluster, so every point in a cluster counts equally, whereas a Gaussian mixture weights each point by the probability that it belongs to each Gaussian component.
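To make the difference in weights concrete, here is a minimal sketch for a single 1-D point. The centers, variances, and component weights are made-up values for illustration and are not taken from the data used later.

```matlab
% Sketch: hard (k-means) vs. soft (GMM) assignment of one 1-D point.
x  = 1.0;          % the point to assign
mu = [0, 3];       % two cluster centers (hypothetical values)
s2 = [1, 1];       % component variances, used by the GMM only
w  = [0.5, 0.5];   % component weights, used by the GMM only

% k-means: the nearest center receives the whole point
[~, nearest] = min(abs(x - mu));
hard = zeros(1, 2);
hard(nearest) = 1;

% GMM: each component receives a share proportional to w_j * N(x | mu_j, s2_j)
dens = w .* exp(-0.5 * (x - mu).^2 ./ s2) ./ sqrt(2*pi*s2);
soft = dens / sum(dens);

disp(hard);
disp(soft);
```

With these values k-means gives [1 0], while the soft assignment comes out at roughly [0.82 0.18]: the second cluster still gets a small share of the point.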
Before studying the Gaussian mixture model, briefly review parameter estimation for a single Gaussian model. Describing a Gaussian model means computing its mean and covariance matrix (in one dimension this is the variance; in two or more dimensions it is the covariance matrix):
Given a data set X = {x1, x2, x3, ..., xn}, the parameters of the single Gaussian model are estimated from the data as:
$$\mu = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \Sigma = \frac{1}{n}\sum_{i=1}^{n} (x_i - \mu)(x_i - \mu)^T$$
Before writing the mixture code, test this estimate on generated data. Use MATLAB to generate a data set X from a Gaussian model:
mu = [2 3];
SIGMA = [1 0; 0 2];
r1 = mvnrnd(mu,SIGMA,1000);
plot(r1(:,1),r1(:,2),'r+');
Then apply the estimation formulas above and check whether the estimated mean is close to [2 3] and the estimated covariance is close to [1 0; 0 2]. The test code is as follows; center and covmat hold the estimated mean and covariance, and r2 holds the centered data:
[m n]=size(r1);
center=sum(r1)./m;
r2(:,1)=r1(:,1)-center(1);
r2(:,2)=r1(:,2)-center(2);
covmat=1/m*r2'*r2;
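As a cross-check, the same estimate can be compared with MATLAB's built-in mean and cov. This is only a sketch: it draws its own sample with randn so that it does not depend on mvnrnd, and the seed and the variable names x, d, errMean, and errCov are assumptions for illustration. Note that cov(x,1) divides by m, matching the formula used here, while the default cov(x) divides by m-1.

```matlab
% Sketch: compare the hand-written estimate with MATLAB's built-ins.
rng(0);                                   % fixed seed (assumption, for repeatability)
m = 1000;
% sample with mean [2 3] and covariance diag([1 2]), built from randn
x = bsxfun(@plus, bsxfun(@times, randn(m, 2), sqrt([1 2])), [2 3]);

center = sum(x) ./ m;                     % sample mean, as in the text
d = bsxfun(@minus, x, center);            % centered data
covmat = (d' * d) / m;                    % covariance estimate that divides by m

errMean = max(abs(center - mean(x)));
errCov  = max(max(abs(covmat - cov(x, 1))));   % cov(x,1) also divides by m
fprintf('mean diff %.2e, cov diff %.2e\n', errMean, errCov);
```

Both differences should be at the level of floating-point rounding.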
Write the single Gaussian estimation as a function first: the Gaussian mixture model builds on it, and the mixture algorithm calls the single-Gaussian parameter estimation along the way. With that in place, the later code stays tidy. Before starting on the Gaussian mixture model, I used MATLAB to generate a test data set, data, shown in the figure below, and then ran the algorithm on it.
The code to generate the dataset is as follows:
% Generate test data
mu = [2 3];            % test data 1
SIGMA = [1 0; 0 2];
r1 = mvnrnd(mu,SIGMA,100);
plot(r1(:,1),r1(:,2),'.');
hold on;
mu = [10 10];          % test data 2
SIGMA = [1 0; 0 2];
r2 = mvnrnd(mu,SIGMA,100);
plot(r2(:,1),r2(:,2),'.');
mu = [5 8];            % test data 3
SIGMA = [1 0; 0 2];
r3 = mvnrnd(mu,SIGMA,100);
plot(r3(:,1),r3(:,2),'.');
data = [r1;r2;r3];
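With the data generated, we can start on the Gaussian mixture algorithm itself. First, look at the parameters of the Gaussian mixture model. With k Gaussian components, the density of a point x is:
$$p(x) = \sum_{j=1}^{k} W_j \, N(x \mid \mu_j, \Sigma_j), \qquad \sum_{j=1}^{k} W_j = 1$$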
Fitting a Gaussian mixture model means solving for the mean and covariance of each Gaussian component, together with its weight. Here we want to divide the data above into three classes, so we need three means and their three covariance matrices. First the overall procedure, which consists of an initialization step followed by three iterated steps:
a. Initialize the parameters of each Gaussian model and the weight of each Gaussian model;
b. Using the current parameters and weight of each Gaussian model, compute the weight with which each point belongs to each Gaussian model. The formula is:
$$p_{ij} = \frac{W_j \, N(x_i \mid \mu_j, \Sigma_j)}{\sum_{l=1}^{k} W_l \, N(x_i \mid \mu_l, \Sigma_l)}$$
where W_j is the weight of the j-th Gaussian model in the mixture and N(x | μ_j, Σ_j) is its Gaussian density. Put simply, the numerator is the weight of a Gaussian model multiplied by its probability density at the point, so p_ij is the share of data point x_i attributed to Gaussian model j.
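c. Update the mean and covariance of each Gaussian model, weighting each point x_i by its weight p_ij from step b:
$$\mu_j = \frac{\sum_{i=1}^{n} p_{ij}\, x_i}{\sum_{i=1}^{n} p_{ij}}, \qquad \Sigma_j = \frac{\sum_{i=1}^{n} p_{ij}\,(x_i - \mu_j)(x_i - \mu_j)^T}{\sum_{i=1}^{n} p_{ij}}$$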
d. Update the total weight of each Gaussian model as the average, over all n points, of the weights p_ij from step b:
$$W_j = \frac{1}{n}\sum_{i=1}^{n} p_{ij}$$
The order of steps c and d does not matter, because both depend only on the weights computed in step b; you can update the total weights before the model parameters. The iteration repeats steps b, c, and d. Now let's write the code following the formulas above.
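Before the 2-D implementation, the loop over steps b, c, and d can be sketched on a 1-D toy problem. Everything here is an assumption made for illustration: the toy data, the starting values, the variable names, and in particular the log-likelihood stopping rule, which stands in for a fixed iteration count.

```matlab
% Sketch: EM for a two-component 1-D Gaussian mixture (steps a to d).
rng(1);                                    % fixed seed (assumption)
x = [randn(200,1); 6 + randn(200,1)];      % toy data: clusters near 0 and 6
n = numel(x);
k = 2;

% a. initialization (hypothetical starting values)
mu = [min(x), max(x)];
s2 = [1, 1];
w  = [0.5, 0.5];

prevLL = -Inf;
for it = 1:1000
    % b. weight of each point in each component
    dens = zeros(n, k);
    for j = 1:k
        dens(:,j) = w(j) * exp(-0.5*(x - mu(j)).^2 / s2(j)) / sqrt(2*pi*s2(j));
    end
    total = sum(dens, 2);                  % mixture density at each point
    p = bsxfun(@rdivide, dens, total);     % each row sums to 1

    % c. weighted mean and variance of each component
    Nj = sum(p, 1);
    for j = 1:k
        mu(j) = sum(p(:,j) .* x) / Nj(j);
        s2(j) = sum(p(:,j) .* (x - mu(j)).^2) / Nj(j);
    end

    % d. total weight of each component
    w = Nj / n;

    % stopping rule (an assumption): stop once the log-likelihood stalls
    LL = sum(log(total));
    if LL - prevLL < 1e-8
        break;
    end
    prevLL = LL;
end
fprintf('stopped after %d iterations, mu = [%.2f %.2f]\n', it, mu(1), mu(2));
```

The log-likelihood is evaluated with the parameters from before the current update, which is enough for a stopping test and costs nothing extra because the mixture density is already computed in step b.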
(1) Initialize Gaussian model parameters
In practical applications, this initialization is usually done by running k-means for an initial clustering and computing the initial parameters from the clustering result. Here, for testing, we use a crude initialization instead, so that we can see whether the GMM algorithm achieves clustering on its own.
The initial mean (center) of each Gaussian model is chosen the same way as in k-means: k data points are picked as the initial means of the k Gaussian models (the code below simply takes rows 10, 20, and 30 of data). Each covariance matrix is initialized to the identity matrix and each weight to 1/k. The code is as follows:
[m n]=size(data);
kn=3;
countflag=zeros(1,kn);
tdata=cell(1,kn);   % 1-by-kn cell array, one cell per class
mu=cell(1,kn);      % one mean per class
sigma=cell(1,kn);   % one covariance matrix per class
% Scheme 2: random initialization of the parameters
for i=1:kn
    mu{1,i}=data(i*10,:);
    sigma{1,i}=eye(2,2);
    weightp(i)=1/kn;
end
(2) Calculate the weight value of each model at each point
This step computes the probability that each data point belongs to each Gaussian component, that is, the weights of step b:
pro_ij=zeros(m,kn);   % store the probability that each point belongs to each class
for i=1:m
    sumpk=0;
    for j=1:kn
        pk(j)=weightp(j)*GSMPro(mu{1,j},sigma{1,j},data(i,:));
        sumpk=sumpk+pk(j);
    end
    for j=1:kn
        pro_ij(i,j)=pk(j)/sumpk;
    end
end
(3) Step c: update the parameters
for j=1:kn
    [mu{1,j},sigma{1,j}]=WeightGSM(data,pro_ij(:,j));
end
(4) Step d: update the total weight of each model
for j=1:kn
    weightp(j)=sum(pro_ij(:,j))/m;
end
Then put the code of steps 2, 3, and 4 inside a loop and iterate. Finally, here is the complete code:
1. Script file:
close all;
clear;
clc;
% Generate test data
mu = [2 3];            % test data 1
SIGMA = [1 0; 0 2];
r1 = mvnrnd(mu,SIGMA,100);
plot(r1(:,1),r1(:,2),'.');
hold on;
mu = [10 10];          % test data 2
SIGMA = [1 0; 0 2];
r2 = mvnrnd(mu,SIGMA,100);
plot(r2(:,1),r2(:,2),'.');
mu = [5 8];            % test data 3
SIGMA = [1 0; 0 2];
r3 = mvnrnd(mu,SIGMA,100);
plot(r3(:,1),r3(:,2),'.');
data=[r1;r2;r3];
[m n]=size(data);
kn=3;
countflag=zeros(1,kn);
tdata=cell(1,kn);   % 1-by-kn cell array, one cell per class
mu=cell(1,kn);      % one mean per class
sigma=cell(1,kn);   % one covariance matrix per class
% Scheme 1: initialize with kmeans to get a first estimate of the parameters
% Idx=kmeans(data,kn);
% figure(2);% plot the initialization result
% hold on;
% for i=1:m
%     if Idx(i)==1
%         plot(data(i,1),data(i,2),'.y');
%     elseif Idx(i)==2
%         plot(data(i,1),data(i,2),'.b');
%     end
% end
% for i=1:m
%     tdata{1,Idx(i)}=[tdata{1,Idx(i)};data(i,:)];
% end
% for i=1:kn
%     [mu{1,i},sigma{1,i}]=GSMData(tdata{1,i});
% end
% for i=1:kn
%     [trow,tcol]=size(tdata{1,i});
%     weightp(i)=trow/m;
% end
% Scheme 2: random initialization
for i=1:kn
    mu{1,i}=data(i*10,:);
    sigma{1,i}=eye(2,2);
    weightp(i)=1/kn;
end
it=1;
while it<1000
    % E-step: compute the probability that each point belongs to each class
    pro_ij=zeros(m,kn);   % store the probability that each point belongs to each class
    for i=1:m
        sumpk=0;
        for j=1:kn
            pk(j)=weightp(j)*GSMPro(mu{1,j},sigma{1,j},data(i,:));
            sumpk=sumpk+pk(j);
        end
        for j=1:kn
            pro_ij(i,j)=pk(j)/sumpk;
        end
    end
    % M-step
    for j=1:kn
        [mu{1,j},sigma{1,j}]=WeightGSM(data,pro_ij(:,j));
    end
    % update the weights
    for j=1:kn
        weightp(j)=sum(pro_ij(:,j))/m;
    end
    sumw=sum(weightp);
    it=it+1;
end
% assign each point to the class with the largest probability
for i=1:m
    [value index]=max(pro_ij(i,:));
    Idx(i)=index;
end
figure(2);
hold on;
for i=1:m
    if Idx(i)==1
        plot(data(i,1),data(i,2),'.y');
    elseif Idx(i)==2
        plot(data(i,1),data(i,2),'.b');
    elseif Idx(i)==3
        plot(data(i,1),data(i,2),'.r');
    end
end
% figure(3);
% %px=gmmstd(data,3);
% for i=1:m
%     [value index]=max(px(i,:));
%     Idx(i)=index;
% end
% hold on;
% for i=1:m
%     if Idx(i)==1
%         plot(data(i,1),data(i,2),'.y');
%     elseif Idx(i)==2
%         plot(data(i,1),data(i,2),'.b');
%     elseif Idx(i)==3
%         plot(data(i,1),data(i,2),'.r');
%     end
% end
% single Gaussian model parameter estimation
% [m n]=size(r1);
% center=sum(r1)./m;
% r2(:,1)=r1(:,1)-center(1);
% r2(:,2)=r1(:,2)-center(2);
% covmat=1/m*r2'*r2;
2. Helper functions:
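function [ mu ,sigma ] = WeightGSM(data,weight)
% WeightGSM.m: weighted mean and covariance of one Gaussian model
% compute the weighted mean
[m n]=size(data);
sumweight=sum(weight);
weightdata=[];
for i=1:m
    weightdata(i,:)=weight(i)*data(i,:);
end
center=sum(weightdata)/sumweight;
% compute the weighted covariance
for i=1:n
    r2(:,i)=data(:,i)-center(i);
end
for i=1:m
    r1(i,:)=weight(i)*r2(i,:);
end
sigma=1/sumweight*r1'*r2;
mu=center;
end

function [pro] = GSMPro(mu ,sigma,x)
% GSMPro.m: Gaussian density at the row vector x
pro=exp(-0.5*(x-mu)*inv(sigma)*(x-mu)');
% normalization constant 1/sqrt((2*pi)^d*det(sigma)), with d = numel(x)
pro=1/sqrt((2*pi)^numel(x)*det(sigma))*pro;
end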
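A quick way to sanity-check a density function of this kind is to confirm that its peak value matches the closed form and that it integrates to roughly 1. The sketch below assumes the standard multivariate normal density with the (2*pi)^d normalization; the anonymous function gauss, the grid spacing h, and the grid limits are illustrative choices and not part of the code above.

```matlab
% Sketch: sanity check of a 2-D Gaussian density (standard normalization assumed).
gauss = @(mu, sigma, x) exp(-0.5 * ((x - mu) / sigma) * (x - mu)') ...
        / sqrt((2*pi)^numel(x) * det(sigma));

mu = [2 3];
sigma = [1 0; 0 2];

% peak value at the mean: 1 / (2*pi*sqrt(det(sigma))) in two dimensions
peak = gauss(mu, sigma, mu);

% Riemann sum over a grid wide enough to cover almost all of the mass
h = 0.1;
[gx, gy] = meshgrid(-8:h:8, -10:h:10);     % offsets from the mean
total = 0;
for i = 1:numel(gx)
    total = total + gauss(mu, sigma, mu + [gx(i) gy(i)]) * h^2;
end
fprintf('peak = %.4f, integral = %.4f\n', peak, total);
```

A constant factor in the normalization cancels when the weights of step b are normalized, so a mistake there does not change the clustering; it only matters if the density values themselves are used, for example to compute a log-likelihood.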
Here is the final test result: