Machine Learning (4) Gaussian Mixture Model

Original address : http://blog.csdn.net/hjimce/article/details/45244603

The Gaussian mixture model (GMM) is a classic application of the EM algorithm. This post does not derive the EM algorithm in detail; instead it goes straight to implementing GMM. When I worked on the GrabCut image-segmentation algorithm before, I simply copied the Gaussian mixture model code from OpenCV and wrapped it in a class, which was a fairly shallow way to use it. Within a few days I found I had almost forgotten how the algorithm works, so I wrote it myself in MATLAB and finally understood what the Gaussian mixture model means. My understanding is that the Gaussian mixture model is an evolved version of the k-means algorithm, so before learning GMM it is best to implement k-means once more. The essential difference between the two lies in the weights: k-means assigns each point to a single cluster with uniform weight, while GMM weights each point by its probability under each Gaussian component.

To learn the Gaussian mixture model, first briefly review parameter estimation for a single Gaussian model. Describing a Gaussian model amounts to computing its mean and covariance matrix (in one dimension this is just the variance; in two or more dimensions it is called the covariance matrix):

Suppose we have a data set X = {x_1, x_2, ..., x_n}. The formulas for estimating the parameters of a single Gaussian model from these data are:

$$\mu = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \Sigma = \frac{1}{n}\sum_{i=1}^{n}(x_i - \mu)(x_i - \mu)^T$$

Before writing any code, use MATLAB to generate a data set and then cluster it. Generate a Gaussian data set X with MATLAB:

mu = [2 3];
SIGMA = [1 0; 0 2];
r1 = mvnrnd(mu,SIGMA,1000);
plot(r1(:,1),r1(:,2),'r+');

Then use the estimation formulas above to compute the mean and covariance, and check whether they match the true mean [2 3] and covariance [1 0; 0 2]. The test code is as follows; center and covmat hold the estimated mean and covariance (r2 is the centered data):

[m n]=size(r1);
center=sum(r1)./m;           % sample mean
r2(:,1)=r1(:,1)-center(1);   % center the data
r2(:,2)=r1(:,2)-center(2);
covmat=1/m*r2'*r2;           % sample covariance with 1/m normalization
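
As a quick sanity check, the same estimates can be obtained with MATLAB's built-in mean and cov functions; a minimal sketch (the second argument of cov(r1,1) selects the 1/m normalization used above):

center2 = mean(r1);    % column means; should match center
covmat2 = cov(r1,1);   % normalizes by m instead of m-1; should match covmat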

Write the single Gaussian model function first, because the Gaussian mixture model builds on it: single Gaussian parameter estimation is called repeatedly while fitting the mixture, and keeping it as a separate function keeps the code from getting tangled. Before starting on the mixture model, I used MATLAB to generate a test data set data, shown in the figure below, and then ran the algorithm on it.

The code to generate the dataset is as follows:

% Generate test data
mu = [2 3];% test data 1
SIGMA = [1 0; 0 2];
r1 = mvnrnd(mu,SIGMA,100);
plot(r1(:,1),r1(:,2),'.');
hold on;
mu = [10 10];% test data 2
SIGMA = [ 1 0; 0 2];
r2 = mvnrnd(mu,SIGMA,100);
plot(r2(:,1),r2(:,2),'.');
mu = [5 8];% test data 3
SIGMA = [ 1 0; 0 2];
r3= mvnrnd(mu,SIGMA,100);
plot(r3(:,1),r3(:,2),'.');
data=[r1;r2;r3];
[Figure: scatter plot of the three generated Gaussian clusters]

With the data generated, we can formally begin the Gaussian mixture algorithm. First, consider the parameters the Gaussian mixture model has to estimate:

Solving a Gaussian mixture model essentially means solving for the mean and covariance of each Gaussian component. To divide the data above into three classes, we need to solve for three means and their three corresponding covariance matrices. Here is the overall procedure first: fitting a Gaussian mixture model consists of an initialization step followed by three iterated steps:

a. Initialize the parameters of each Gaussian component and its weight in the mixture;

b. From the current parameters of each Gaussian component and its weight, compute the weight (responsibility) with which each point belongs to each component. The formula is:

$$p_{ij} = \frac{w_j\, N(x_i \mid \mu_j, \Sigma_j)}{\sum_{k=1}^{K} w_k\, N(x_i \mid \mu_k, \Sigma_k)}$$

where:

$$N(x \mid \mu, \Sigma) = \frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}}\exp\!\left(-\frac{1}{2}(x - \mu)^T \Sigma^{-1}(x - \mu)\right)$$

and w_j is the weight that component j occupies in the mixture. Put simply, the numerator is the product of each component's weight and its density at the point, so p_ij measures the proportion each Gaussian component contributes at each data point.

c. Update the mean and covariance of each Gaussian component; the formulas are:

$$\mu_j = \frac{\sum_{i=1}^{m} p_{ij}\, x_i}{\sum_{i=1}^{m} p_{ij}}$$

$$\Sigma_j = \frac{\sum_{i=1}^{m} p_{ij}\,(x_i - \mu_j)(x_i - \mu_j)^T}{\sum_{i=1}^{m} p_{ij}}$$

d. Update the total weight of each Gaussian component; the formula is:

$$w_j = \frac{1}{m}\sum_{i=1}^{m} p_{ij}$$

In fact, the order of steps c and d does not matter; you can update the total weights before the per-component parameters. Iteration then consists of repeating steps b, c, and d. OK, now let's write the code following the formulas above.

(1) Initialize Gaussian model parameters

For the initialization step, practical implementations generally run the k-means algorithm first and compute the initial parameters from its clustering result. Here, for testing purposes, we choose random initialization instead, so we can see whether the GMM algorithm achieves clustering on its own.

The initial mean (center) of each Gaussian component is chosen the same way as in k-means: pick k data points as the initial means of the k components. For the covariance matrices I choose the identity matrix. The code is as follows:

[m n]=size(data);
kn=3;                        % number of Gaussian components
countflag=zeros(1,kn);
tdata=cell(1,kn);            % cell array holding the points of each class
mu=cell(1,kn);               % cell array of component means
sigma=cell(1,kn);            % cell array of component covariances
% Scheme 2: random initialization of the parameters
for i=1:kn
    mu{1,i}=data(i*10,:);    % pick the 10th, 20th, 30th data points as initial means
    sigma{1,i}=eye(2,2);     % identity covariance
    weightp(i)=1/kn;         % equal initial mixing weights
end

(2) Calculate the weight value of each model at each point

This step computes the probability (responsibility) with which each data point belongs to each Gaussian component, i.e. the weights:

pro_ij=zeros(m,kn);   % probability of each point belonging to each class
for i=1:m
    sumpk=0;
    for j=1:kn
        pk(j)=weightp(j)*GSMPro(mu{1,j},sigma{1,j},data(i,:));
        sumpk=sumpk+pk(j);
    end
    for j=1:kn
        pro_ij(i,j)=pk(j)/sumpk;
    end
end
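
For reference, the same E-step can be written without the inner loops using the Statistics Toolbox function mvnpdf; a sketch under the same mu, sigma, weightp conventions as above (the row-wise normalization relies on implicit expansion, MATLAB R2016b+):

for j=1:kn
    pro_ij(:,j)=weightp(j)*mvnpdf(data,mu{1,j},sigma{1,j});
end
pro_ij=pro_ij./sum(pro_ij,2);   % normalize each row so it sums to 1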

(3) Step c: update the parameters of each model

for j=1:kn
    [mu{1,j},sigma{1,j}]=WeightGSM(data,pro_ij(:,j)); 
end

(4) Step d: update the total weight of each model

for j=1:kn
    weightp(j)=sum(pro_ij(:,j))/m;
end
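
For reference, steps c and d also have a compact vectorized form equivalent to the loops above (a sketch assuming the same variables and MATLAB R2016b+ implicit expansion):

for j=1:kn
    w=pro_ij(:,j);
    mu{1,j}=(w'*data)/sum(w);       % weighted mean
    d=data-mu{1,j};                 % center the data at the new mean
    sigma{1,j}=(d.*w)'*d/sum(w);    % weighted covariance
end
weightp=sum(pro_ij,1)/m;            % updated mixing weights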

Then put the code for steps 2, 3, and 4 inside a loop and iterate, and that's it. Finally, here is the complete code:

1. The script file:

close all;
clear;
clc;
% Generate test data
mu = [2 3];          % test data 1
SIGMA = [1 0; 0 2];
r1 = mvnrnd(mu,SIGMA,100);
plot(r1(:,1),r1(:,2),'.');
hold on;
mu = [10 10];        % test data 2
SIGMA = [1 0; 0 2];
r2 = mvnrnd(mu,SIGMA,100);
plot(r2(:,1),r2(:,2),'.');

mu = [5 8];          % test data 3
SIGMA = [1 0; 0 2];
r3= mvnrnd(mu,SIGMA,100);
plot(r3(:,1),r3(:,2),'.');


data=[r1;r2;r3];

[m n]=size(data);
kn=3;
countflag=zeros(1,kn);
tdata=cell(1,kn);   % cell array holding the points of each class
mu=cell(1,kn);      % cell array of component means
sigma=cell(1,kn);   % cell array of component covariances
% Scheme 1: initialize with kmeans for a rough parameter estimate
% Idx=kmeans(data,kn);
% figure(2); % plot the initialization result
% hold on;
% for i=1:m
%     if Idx(i)==1
%         plot(data(i,1),data(i,2),'.y');
%     elseif Idx(i)==2
%          plot(data(i,1),data(i,2),'.b');
%     end
% end
% for i=1:m
%    tdata{1,Idx(i)}=[tdata{1,Idx(i)};data(i,:)];
% end
% for i=1:kn
%     [mu{1,i},sigma{1,i}]=GSMData(tdata{1,i});
% end
% for i=1:kn
%     [trow,tcol]=size(tdata{1,i});
%     weightp(i)=trow/m;
% end
% Scheme 2: random initialization
for i=1:kn
    mu{1,i}=data(i*10,:);
    sigma{1,i}=eye(2,2);
    weightp(i)=1/kn;
end



it=1;

while it<1000   % fixed number of EM iterations (no convergence test)
    % E-step: compute the probability of each point under each class
    pro_ij=zeros(m,kn);   % probability of each point belonging to each class
    for i=1:m
        sumpk=0;
        for j=1:kn
            pk(j)=weightp(j)*GSMPro(mu{1,j},sigma{1,j},data(i,:));
            sumpk=sumpk+pk(j);
        end
        for j=1:kn
            pro_ij(i,j)=pk(j)/sumpk;
        end
    end 
    % M-step: update the parameters of each component
    for j=1:kn
        [mu{1,j},sigma{1,j}]=WeightGSM(data,pro_ij(:,j)); 
    end
    % update the mixing weights
    for j=1:kn
        weightp(j)=sum(pro_ij(:,j))/m;
    end
    sumw=sum(weightp);
    it=it+1;
end
for i=1:m
    [value index]=max(pro_ij(i,:));
    Idx(i)=index;
end
figure(2);
hold on;
for i=1:m
    if Idx(i)==1
        plot(data(i,1),data(i,2),'.y');
    elseif Idx(i)==2
         plot(data(i,1),data(i,2),'.b');
    elseif Idx(i)==3
         plot(data(i,1),data(i,2),'.r');
    end
end


% figure(3);
% %px=gmmstd(data,3);
% for i=1:m
%     [value index]=max(px(i,:));
%     Idx(i)=index;
% end
% hold on;
% for i=1:m
%     if Idx(i)==1
%         plot(data(i,1),data(i,2),'.y');
%     elseif Idx(i)==2
%          plot(data(i,1),data(i,2),'.b');
%     elseif Idx(i)==3
%          plot(data(i,1),data(i,2),'.r');
%     end
% end
% Single Gaussian model parameter estimation
% [m n]=size(r1);
% center=sum(r1)./m;
% r2(:,1)=r1(:,1)-center(1);
% r2(:,2)=r1(:,2)-center(2);
% covmat=1/m*r2'*r2;

2. Helper functions:

function [ mu ,sigma ] = WeightGSM(data,weight)
    % compute the weighted mean
    [m n]=size(data);
    sumweight=sum(weight);
    weightdata=[];
    for i=1:m
        weightdata(i,:)=weight(i)*data(i,:);
    end
    center=sum(weightdata)/sumweight;
    % compute the weighted covariance
    for i=1:n
       r2(:,i)=data(:,i)-center(i);
    end
    for i=1:m
        r1(i,:)=weight(i)*r2(i,:);
    end
   
    sigma=1/sumweight*r1'*r2;
    mu=center;
end
function [pro] = GSMPro(mu ,sigma,x)
  % Gaussian density at the row vector x; for d=2 the normalizing
  % constant is 1/((2*pi)^(d/2)*sqrt(det(sigma))) = 1/(2*pi*sqrt(det(sigma)))
  pro=exp(-0.5*(x-mu)*inv(sigma)*(x-mu)');
  pro=1/(2*pi*sqrt(det(sigma)))*pro;
end
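
The script above simply runs a fixed 1000 EM iterations. To stop on convergence instead, one option is to monitor the log-likelihood between iterations; the following helper is a sketch of that idea (GMMLogLike is not part of the original code, it only reuses GSMPro and the same cell conventions as the script):

function [ll] = GMMLogLike(data,mu,sigma,weightp)
    % Log-likelihood of the data under the current mixture;
    % stop iterating once the increase falls below a tolerance
    [m n]=size(data);
    kn=numel(weightp);
    ll=0;
    for i=1:m
        p=0;
        for j=1:kn
            p=p+weightp(j)*GSMPro(mu{1,j},sigma{1,j},data(i,:));
        end
        ll=ll+log(p);
    end
end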

Take a look at the final test result:

[Figure: final clustering result, with the three recovered clusters plotted in different colors]
