Data classification is a core problem in data mining and machine learning. A classification algorithm assigns data points to categories to support data analysis, decision making, and pattern recognition. This article describes a data classification algorithm that combines a GRNN (General Regression Neural Network) with FCM (Fuzzy C-Means) clustering, covering the mathematical principles, the implementation process, and application fields.
1. GRNN network
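GRNN is a neural network model for estimating the relationship between inputs and outputs. It is used in pattern recognition, regression analysis, and classification. Mathematically, GRNN is a form of kernel regression: its prediction for an input is a weighted average of the training targets, where each weight is a Gaussian function of the distance between the input and the corresponding training sample. A single smoothing parameter controls the width of the Gaussian.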
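In formula form, the standard GRNN estimate can be written as below. The symbols are not taken from the original text and are assumptions for illustration: $x_i$ and $y_i$ are the $n$ training inputs and targets, $\sigma$ is the smoothing parameter, and $D_i^2$ is the squared Euclidean distance from the query $x$ to $x_i$.

$$
\hat{y}(x) = \frac{\sum_{i=1}^{n} y_i \exp\left(-\dfrac{D_i^2}{2\sigma^2}\right)}{\sum_{i=1}^{n} \exp\left(-\dfrac{D_i^2}{2\sigma^2}\right)},
\qquad D_i^2 = (x - x_i)^{\top}(x - x_i)
$$

A small $\sigma$ makes the prediction follow the nearest training samples closely, while a large $\sigma$ pulls it toward the mean of all targets.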
2. FCM algorithm
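FCM is a fuzzy clustering algorithm. Instead of assigning each data point to exactly one cluster, it gives each point a degree of membership between 0 and 1 in every cluster, with the memberships of a point summing to 1. It is widely used in cluster analysis, pattern recognition, and image processing. Mathematically, FCM minimizes a membership-weighted sum of squared distances between data points and cluster centers by alternately updating the memberships and the centers.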
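Written out, the standard FCM objective and its two update rules are shown below. The notation follows the membership $u_{ij}$ (point $i$, cluster $j$) and cluster center $v_j$ used later in this article. The number of points $N$, the number of clusters $C$, and the fuzzifier $m > 1$ are symbols introduced here for illustration.

$$
J_m = \sum_{i=1}^{N} \sum_{j=1}^{C} u_{ij}^{m} \, \lVert x_i - v_j \rVert^2,
\qquad \sum_{j=1}^{C} u_{ij} = 1 \ \text{for every } i
$$

$$
u_{ij} = \frac{1}{\sum_{k=1}^{C} \left( \dfrac{\lVert x_i - v_j \rVert}{\lVert x_i - v_k \rVert} \right)^{2/(m-1)}},
\qquad
v_j = \frac{\sum_{i=1}^{N} u_{ij}^{m} \, x_i}{\sum_{i=1}^{N} u_{ij}^{m}}
$$

The two updates are repeated until the centers or the memberships stop changing by more than a chosen tolerance.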
3. Implementation process
The data classification algorithm based on GRNN and FCM is implemented in four steps: data preprocessing, GRNN network training, FCM clustering, and data classification.
3.1. Data preprocessing
The raw data must be preprocessed before training. Typical steps are data cleaning, feature selection, and feature scaling, so that all features are on comparable ranges for the distance computations used in model training and classification.
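As a minimal sketch of the cleaning and scaling steps, assuming the samples are stored one per row in a numeric matrix, min-max scaling maps each feature to [0, 1]. The variable names and the toy values are illustrative and do not come from the original program.

```matlab
% Hypothetical toy data: 4 samples (rows) x 2 features (columns)
X = [1 200; 2 400; 3 600; 5 1000];

% Data cleaning: drop rows that contain NaN (none in this toy set)
X = X(~any(isnan(X), 2), :);

% Min-max scaling per feature; constant columns get a range of 1
% so that the division below never divides by zero
xmin = min(X, [], 1);
xrng = max(X, [], 1) - xmin;
xrng(xrng == 0) = 1;
Xs = bsxfun(@rdivide, bsxfun(@minus, X, xmin), xrng);

disp(Xs)
```

Scaling matters here because both GRNN and FCM rely on Euclidean distances, so a feature with a large numeric range would otherwise dominate the result.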
3.2. GRNN network training
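The preprocessed data is fed into the GRNN network for training. GRNN training is a single pass: the network stores the training samples and their targets. To make a prediction, it computes the similarity between the input and every stored sample and outputs the similarity-weighted average of the stored targets.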
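The prediction step described above can be sketched in plain MATLAB without the toolbox. This is an illustrative sketch, not the toolbox implementation: the names `P`, `T`, `Xq`, and `sigma` and all values are assumptions, and the `spread` argument of `newgrnn` parameterizes the kernel width differently, so `sigma` here is not interchangeable with `spread`.

```matlab
% Hypothetical toy training set, one sample per column
P = [0 1 2 3 4];        % 1 x 5 inputs
T = [1 1 2 3 3];        % 1 x 5 targets (class labels used as regression targets)
sigma = 0.5;            % smoothing parameter (assumed value)

Xq = [0.1 2.0 3.9];     % query points, one per column

% Squared Euclidean distance between every query and every training sample
D2 = zeros(size(Xq, 2), size(P, 2));
for i = 1:size(Xq, 2)
    diffs = bsxfun(@minus, P, Xq(:, i));
    D2(i, :) = sum(diffs .^ 2, 1);
end

% Gaussian similarity to each stored sample, then a weighted mean of targets
W = exp(-D2 ./ (2 * sigma^2));
yhat = (W * T') ./ sum(W, 2);

disp(yhat')
```

Because the output is a weighted average, every prediction lies between the smallest and largest training target, which is why the core program can later round the output to the nearest class label.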
3.3. FCM clustering
The output of the trained GRNN network is used as the input to the FCM algorithm, which clusters the data. FCM determines the degree of membership $u_{ij}$ of each data point $i$ in each cluster $j$ and the cluster centers $v_j$, and thereby divides the data points into fuzzy categories.
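The alternating membership and center updates can be sketched in plain MATLAB as follows. This is a minimal illustration on made-up one-dimensional data standing in for GRNN outputs. The cluster count `c`, fuzzifier `m`, initialization, and tolerance are all assumptions. Note that `U` is stored as clusters by points, so `U(j, i)` corresponds to $u_{ij}$.

```matlab
% Hypothetical 1-D data with two obvious groups
x = [1.0 1.1 0.9 3.0 3.1 2.9];   % 1 x N
c = 2;                            % number of clusters (assumed)
m = 2;                            % fuzzifier, m > 1 (assumed)

% Deterministic initial centers at the data extremes (valid for c = 2)
v = [min(x) max(x)];              % 1 x c

for it = 1:100
    % Distance from each center (rows) to each point (columns)
    d = abs(bsxfun(@minus, v', x));          % c x N
    d = max(d, eps);                          % avoid division by zero
    % Membership update: proportional to d^(-2/(m-1)), columns sum to 1
    w = d .^ (-2 / (m - 1));
    U = bsxfun(@rdivide, w, sum(w, 1));      % c x N
    % Center update: mean of the points weighted by membership^m
    Um = U .^ m;
    vnew = (Um * x')' ./ sum(Um, 2)';        % 1 x c
    converged = max(abs(vnew - v)) < 1e-6;
    v = vnew;
    if converged, break; end
end

disp(v)
```

If the Fuzzy Logic Toolbox is available, its `fcm` function performs the same kind of clustering and returns the centers and the membership matrix directly.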
3.4. Data classification
The data points to be classified are fed into the trained GRNN network to obtain their output values. The category of each data point is then determined from the FCM clustering result together with the GRNN output.
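One common way to turn fuzzy memberships into a final category is to pick the cluster with the highest membership. The sketch below assumes that rule. The membership values and the ambiguity threshold `thr` are hypothetical and do not come from the original program.

```matlab
% Hypothetical memberships for 4 points over 3 clusters (columns sum to 1)
U = [0.8 0.1 0.2 0.3;
     0.1 0.7 0.3 0.3;
     0.1 0.2 0.5 0.4];

% Hard label = cluster with the highest membership for each point
[umax, label] = max(U, [], 1);

% Optionally flag points whose best membership is below a threshold
thr = 0.5;                 % assumed threshold
ambiguous = umax < thr;

disp(label)
disp(ambiguous)
```

Flagged points lie between clusters, so they may deserve manual review or a second classifier rather than a forced label.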
4. Core program
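%% GRNN clustering
% P1 (samples, one per row), T1 (their class labels), P2/T2 (current
% training subset), R1 (number of samples) and csum (samples kept per
% class) are defined earlier in the full program.
net = newgrnn(P2',T2,50);   % train the GRNN (spread = 50)
a2 = sim(net,P1');          % predict an output for every sample
% Quantize the output into class labels 1 to 5
a2(a2<=1.5) = 1;
a2(a2>1.5 & a2<=2.5) = 2;
a2(a2>2.5 & a2<=3.5) = 3;
a2(a2>3.5 & a2<=4.5) = 4;
a2(a2>4.5) = 5;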
%% Re-select the network training data
% Class centers: column-wise mean of the samples assigned to each class
% (mean(...,1) keeps a row vector even when a class has a single sample)
cent1 = mean(P1(a2==1,:),1);
cent2 = mean(P1(a2==2,:),1);
cent3 = mean(P1(a2==3,:),1);
cent4 = mean(P1(a2==4,:),1);
cent5 = mean(P1(a2==5,:),1);
for n = 1:R1   % distance from every sample to each class center
    ecent1(n) = norm(P1(n,:)-cent1);
    ecent2(n) = norm(P1(n,:)-cent2);
    ecent3(n) = norm(P1(n,:)-cent3);
    ecent4(n) = norm(P1(n,:)-cent4);
    ecent5(n) = norm(P1(n,:)-cent5);
end
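% Select the csum samples closest to each class center
for n = 1:csum
    [~, me1] = min(ecent1);
    [~, me2] = min(ecent2);
    [~, me3] = min(ecent3);
    [~, me4] = min(ecent4);
    [~, me5] = min(ecent5);
    % Mark each chosen sample with Inf instead of deleting its entry:
    % deleting shifts the remaining indices so they no longer match rows of P1
    ecnt1(n,:) = P1(me1,:); ecent1(me1) = Inf; tc1(n) = 1;
    ecnt2(n,:) = P1(me2,:); ecent2(me2) = Inf; tc2(n) = 2;
    ecnt3(n,:) = P1(me3,:); ecent3(me3) = Inf; tc3(n) = 3;
    ecnt4(n,:) = P1(me4,:); ecent4(me4) = Inf; tc4(n) = 4;
    ecnt5(n,:) = P1(me5,:); ecent5(me5) = Inf; tc5(n) = 5;
end
% New training set for the next call to newgrnn (P2, matching the name used there)
P2 = [ecnt1;ecnt2;ecnt3;ecnt4;ecnt5]; T2 = [tc1,tc2,tc3,tc4,tc5];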
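% Tally the classification results
% Row 1 and column 1 hold the class labels; the rest is the confusion matrix
Confusion_Matrix_GRNN = zeros(6,6);
Confusion_Matrix_GRNN(1,:) = 0:5;
Confusion_Matrix_GRNN(:,1) = (0:5)';
for nf = 1:5       % true class
    for nc = 1:5   % predicted class
        Confusion_Matrix_GRNN(nf+1,nc+1) = sum(a2(T1==nf)==nc);
    end
end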
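Once a confusion matrix is available, overall accuracy and per-class recall follow directly from it. The sketch below uses made-up labels for three classes and builds the matrix without the header row and column. With the matrix from the core program, the same formulas would apply to `Confusion_Matrix_GRNN(2:end,2:end)`. The names `K`, `C`, `accuracy`, and `recall` are illustrative.

```matlab
% Hypothetical labels for 8 samples in 3 classes
T1 = [1 1 2 2 3 3 3 1];   % true classes
a2 = [1 2 2 2 3 3 1 1];   % predicted classes
K = 3;

% Rows: true class, columns: predicted class
C = accumarray([T1(:) a2(:)], 1, [K K]);

% Fraction of samples on the diagonal, and per-class recall
accuracy = sum(diag(C)) / sum(C(:));
recall = diag(C) ./ sum(C, 2);

disp(C)
disp(accuracy)
```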