Regression analysis and visualization with Caffe and the MATLAB interface

-----------------Changelog-----------------

20:33 2017/1/28  Added a method for generating the loss curve in MATLAB

-----------------Changelog-----------------

If you run into problems, check here first for a possible solution. This article is divided into two parts: the first introduces regression analysis with caffe, including data preparation and configuration files; the second introduces visualization in MATLAB. (By the way, I am currently working on a project that involves classification; if you are interested, feel free to discuss it with me.)

Preparation

Caffe installed on Windows, with the MATLAB interface successfully compiled.

Regression analysis with caffe

Regression analysis with caffe consists mainly of HDF5 data preparation, network design, training, and testing. This experiment has been done before; see: http://www.cnblogs.com/frombeijingwithlove/p/5314042.html

Or see the reproduced copy (http://blog.csdn.net/xiaoy_h/article/details/54731122). The difference here is that some of the necessary steps are implemented in MATLAB instead of Python. Only the parts that differ are described below.

Prepare HDF5 data via MATLAB

Data preparation here is ordinary MATLAB work. Suppose your data contains 35 input variables and 10,000 training samples. First create a variable in MATLAB and copy the data into it (don't forget to scale it along the way), which gives you a 10000 x 35 matrix, with a corresponding 10000 x 1 label variable. In the end you should have four variables, train_x (10000 x 35), train_y (10000 x 1), test_x (600 x 35), and test_y (600 x 1), and save them to a mat file, e.g. data.mat, as sketched below.
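As a minimal sketch, assuming min-max scaling and the 10000/600 split described above (raw_x and raw_y are placeholders for however you load your real data):

% Sketch: assemble, scale, and save the dataset.
% raw_x (10600 x 35) and raw_y (10600 x 1) stand in for your real data.
raw_x   =   rand(10600, 35);
raw_y   =   rand(10600, 1);
% min-max scale each input column to [0,1] (one reasonable choice, not the only one)
raw_x   =   bsxfun(@rdivide, bsxfun(@minus, raw_x, min(raw_x)), max(raw_x)-min(raw_x));
train_x =   raw_x(1:10000, :);      train_y =   raw_y(1:10000);
test_x  =   raw_x(10001:end, :);    test_y  =   raw_y(10001:end);
save('data.mat', 'train_x', 'train_y', 'test_x', 'test_y');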

Then convert the data to HDF5 with the following script (adapted from demo.m in the CAFFE_ROOT\matlab\hdf5creation\ directory):

%% WRITING TO HDF5
filename           =   'train.h5';
load 'data.mat';
num_total_samples  =   size(train_x,1);            % total number of samples
data_disk          =   train_x';
label_disk         =   train_y';
chunksz            =   100;                        % chunk size
created_flag       =   false;
totalct            =   0;
for batchno=1:num_total_samples/chunksz
   fprintf('batch no. %d\n', batchno);
   last_read       =   (batchno-1)*chunksz;
   % simulate the maximum amount of data held in memory before dumping to the hdf5 file
   batchdata       =   data_disk(:,last_read+1:last_read+chunksz);
   batchlabs       =   label_disk(:,last_read+1:last_read+chunksz);
   % store to hdf5
   startloc        =   struct('dat',[1,totalct+1], 'lab',[1,totalct+1]);
   curr_dat_sz     =   store2hdf5(filename, batchdata, batchlabs, ~created_flag, startloc, chunksz);
   created_flag    =   true;                       % set flag so the file is created only once
   totalct         =   curr_dat_sz(end);           % update dataset size (measured in samples)
end
% display the structure of the stored HDF5 file
h5disp(filename);
% create the list file used by the HDF5 data layer
FILE               =   fopen('train.txt', 'w');
fprintf(FILE, '%s', filename);
fclose(FILE);
fprintf('HDF5 filename listed in %s \n','train.txt');

The above generates the HDF5 training set; the test set is handled with a similar script. At this point you should have four files: train.h5, test.h5, train.txt, and test.txt. The first two are the dataset files; the last two are the path list files for the dataset files. If a phase uses multiple datasets, simply list them all in one path list file, as shown below.
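For example, a minimal sketch of writing a list file that names two dataset files (the split file names here are hypothetical):

% One list file may name several HDF5 files, one per line.
FILE = fopen('train.txt', 'w');
fprintf(FILE, '%s\n', 'train_part1.h5');
fprintf(FILE, '%s\n', 'train_part2.h5');
fclose(FILE);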

Network design

Network design is one of the keys to obtaining a useful model; consult relevant materials and design the network around the characteristics of your own data. This article uses a simple network as the running example (figure omitted).
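Purely as a hedged sketch, not the author's actual network, a minimal regression net of this kind (HDF5 input, two fully connected layers, Euclidean loss; all names and sizes are assumptions) might be written as:

name: "RegressionNet"
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  hdf5_data_param { source: "train.txt" batch_size: 100 }
  include { phase: TRAIN }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "data"
  top: "ip1"
  inner_product_param { num_output: 64 weight_filler { type: "xavier" } }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  inner_product_param { num_output: 1 weight_filler { type: "xavier" } }
}
layer {
  name: "loss"
  type: "EuclideanLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}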

 

Training and testing

The main steps are described in the blog mentioned earlier; here is one trick: how to record the loss during training so a loss curve can be plotted.

First, find the common.cpp file in the CAFFE_ROOT\src\caffe folder and modify it as follows:

1. Add the header file #include <direct.h>

2. Locate the void GlobalInit function and add the following code on the line after ::google::InitGoogleLogging(*(pargv)[0]);:

_mkdir("./log/");
FLAGS_colorlogtostderr =true;                               //设置输出到屏幕的日志显示相应颜色
//如下为设置各类日志信息对应日志文件的文件名前缀
google::SetLogDestination(google::GLOG_FATAL,"./log/log_error_");
google::SetLogDestination(google::GLOG_ERROR,"./log/log_error_");
google::SetLogDestination(google::GLOG_WARNING,"./log/log_error_");
google::SetLogDestination(google::GLOG_INFO,"./log/log_info_");
FLAGS_max_log_size = 1024;                                  //最大日志大小为 1024MB
FLAGS_stop_logging_if_full_disk =true;                      //磁盘写满时,停止日志输出

Don't forget to save and recompile caffe.exe after the modification. The training log can then be found in the log folder under the working directory (e.g. CAFFE_ROOT\log) in subsequent training runs.

Once you have a training log, say log_info_20170116-163216.9556, there are two ways to get the loss curve: one is supported by caffe itself but requires a Python environment; the other uses regular expressions in MATLAB (a blessing for those without a Python environment). They are as follows:

(1) Copy the log to the CAFFE_ROOT\tools\extra directory (you will see the .py programs there), then open a command prompt in that directory and run:

md logconv
set logfile=log_info_20170116-163216.9556
python parse_log.py %logfile% ./logconv
ren logconv\%logfile%.train %logfile%.csv
pause

The Python script converts the log to a csv file, from which the loss curve can be plotted in Excel.

(2) Extract the data with the following code, which can be saved as parse_log.m. MATLAB's regular expression syntax is essentially the same as Python's, so only slight modifications were needed. The function is somewhat stripped down compared with the one caffe provides, but the missing parts are not needed for now.

function [] = parse_log(filename)
% Parse a caffe training log, export iter/accuracy/loss to csv, and plot the curves.
regex_iteration_str    = 'Iteration (\d+)';
regex_train_output_str = 'Train net output #(\d+): (\S+) = ([.\deE+-]+)';
FILE	=   fopen(filename);
FOUT	=   fopen(sprintf('%s.csv',filename),'w');
fprintf(FOUT,'iter,accuracy,loss\n');
plotdata=[];
while ~feof(FILE)
    aline   =   fgetl(FILE);
    % current iteration number
    rescell     =   regexpi(aline,regex_iteration_str,'tokens');
    if ~isempty(rescell)
        iteration   =   int32(str2num(rescell{1,1}{1,1}));
    end
    % training outputs: in this network output #0 is accuracy, output #1 is loss
    rescell     =   regexpi(aline,regex_train_output_str,'tokens');
    if ~isempty(rescell)
        if str2num(rescell{1,1}{1,1})==0
            accuracy    =   str2num(rescell{1,1}{1,3});
        end
        if str2num(rescell{1,1}{1,1})==1
            loss    =   str2num(rescell{1,1}{1,3});
            plotdata(size(plotdata,1)+1,1)   =	iteration;
            plotdata(size(plotdata,1),2:3)   =	[accuracy loss];
            fprintf(FOUT,'%d,%f,%f\n',iteration,accuracy,loss);
        end
    end
end
fclose(FILE);
fclose(FOUT);
figure, plot(plotdata(:,1),plotdata(:,2));
xlabel('iter'), ylabel('accuracy');
figure, plot(plotdata(:,1),plotdata(:,3));
xlabel('iter'), ylabel('loss');
end
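Usage is simply, for the log mentioned above:

parse_log('log_info_20170116-163216.9556');

This writes log_info_20170116-163216.9556.csv next to the log and opens the two figures.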

Finally, the program directly generates the curve figures; the generated csv file can of course also be reused for further plotting in Excel.

MATLAB-based model visualization

Visualization of a network model can be divided into weight visualization (some people consider the bias unimportant, but it sometimes matters a great deal) and feature map visualization. Different visualization styles can be designed to suit your own application.

In this experiment, visualization involves writing the deploy file, reading the trained model file through the MATLAB interface, outputting weight visualization maps as needed, and feeding in input samples to generate feature maps.

Writing the deploy file

There are scripts for generating deploy files automatically ( http://blog.csdn.net/lanxuecc/article/details/52474476 ), but they seem rather troublesome to use; it is often simpler and cruder to write the deploy file by hand, starting from the train.prototxt file: http://blog.csdn.net/sunshine_in_moon/article/details/49472901

From that blog we learn that writing a deploy file differs from writing train_test.prototxt in the following ways:

1. The *_deploy.prototxt file contains neither the training nor the testing phase blocks of the train/test network.

For example, a description such as the following must not appear:

include { phase: TRAIN }    # or: include { phase: TEST }

2. The data layer is written far more concisely:

input: "data"
input_dim: 1
input_dim: 3
input_dim: 32
input_dim: 32

Note the first line, input: "data": that is the name of the data layer. Without it, the first convolutional layer cannot find its data; the next four parameters are the dimensions of the input samples. (An alternative Input-layer form is shown after this list.)

3. The weight_filler{} and bias_filler{} parameters of the convolutional and fully connected layers need not be filled in, because their values are supplied by the trained caffemodel file.

4. The output layer changes, and there is no test-accuracy module. The specific output layer differences are as follows:

*_train_test.prototxt file:
layer {
  name: "loss"
  type: "SoftmaxWithLoss" # note: differs from the type below
  bottom: "ip2"
  bottom: "label" # note: the label bottom is gone below, because the deploy net predicts the label and therefore cannot be given one
  top: "loss"
}
*_deploy.prototxt file:
layer {
  name: "prob"
  type: "Softmax"
  bottom: "ip2"
  top: "prob"
}

The output layer types differ between the two files: one is SoftmaxWithLoss, the other Softmax. In addition, to distinguish training output from deployment output, the top is called loss during training and prob during deployment (any other name would work too).
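As a side note, newer caffe releases also accept an explicit Input layer in place of the concise four-line form shown in point 2 above (whether your build supports it depends on its age):

layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 1 dim: 3 dim: 32 dim: 32 } }
}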

If MATLAB crashes outright when executing net = caffe.Net(model,'test');, the layer definitions in the deploy file are written incorrectly. If it crashes at net.copy_from(weights);, the input/output dimension parameters in the deploy file are probably wrong; for example, writing input_dim: 1 where the data layer should say input_dim: 3 will cause an error.

Finally, you can refer to a layer-by-layer comparison of a network before and after this conversion (deploy file on the right; figure omitted).

Note that I omitted the prob layer (a softmax layer) here. This layer is sometimes a burden when analyzing the network model, so it can be removed as the situation requires; and since the softmax layer has no weights or other parameters, removing it does not crash the program when the model is imported into MATLAB.

Read the trained model file through the MATLAB interface

The following code imports the test samples, the network, and the trained weights:

load 'data.mat';                                    % load sample data
caffe.set_mode_cpu();                               % use CPU mode
model   =   'deploy.prototxt';                      % model - network definition
weights =   'snapshot/_iter_10000.caffemodel';      % model - weights
net     =   caffe.Net(model,'test');
net.copy_from(weights);                             % load the trained weights
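A quick sanity check after loading never hurts; layer_names and blob_names are standard matcaffe properties:

disp(net.layer_names);                              % list all layer names
disp(net.blob_names);                               % list all blob names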

 

Output weight visualization maps according to your own needs

The weights and other parameters have now been imported. Next we obtain the layer names and then visualize the weights layer by layer. Without further ado, here is the code:

layernames  =   net.layer_names;                    % get all layer names
magnifys    =   3;                                  % kernel magnification factor for visualization
for li = 1:size(layernames,1)
%   disp(layernames{li});
    if strcmp(layernames{li}(1:2),'ip') || strcmp(layernames{li}(1:2),'fc')     % fully connected layer
        % not yet decided how to plot these
    elseif strcmp(layernames{li}(1:4),'data')       % data layer: skip
    elseif strcmp(layernames{li}(1:4),'conv')       % convolutional layer
        weight  =   net.layers(layernames{li}).params(1).get_data();
        bias    =   net.layers(layernames{li}).params(2).get_data();
        % normalize to [0,1]
        weight  =   weight-min(weight(:));
        weight  =   weight./max(weight(:));
        [ker_h,ker_w,prev_outnum,this_outnum]  =   size(weight);
        [grid_h,grid_w]     =   getgridsize(prev_outnum);
        for thisi=1:this_outnum
            weight_map  =   zeros(magnifys*ker_h*grid_h+grid_h,magnifys*ker_w*grid_w+grid_w);
            for gridhi=1:grid_h
                for gridwi=1:grid_w
                    weight_map((gridhi-1)*(magnifys*ker_h+1)+1:gridhi*(magnifys*ker_h+1)-1,(gridwi-1)*(magnifys*ker_w+1)+1:gridwi*(magnifys*ker_w+1)-1)=imresize(weight(:,:,(gridhi-1)*grid_w+gridwi,thisi),magnifys,'nearest');
                end
            end
            imwrite(uint8(weight_map(1:end-1,1:end-1)*255),sprintf('layer_%s_this_%d.bmp',layernames{li},thisi));
        end
    elseif strcmp(layernames{li}(1:4),'pool')       % pooling layer: no parameters, nothing to plot
    end
end

In the code above, getgridsize determines the number of rows and columns of convolution kernels in the assembled image for each convolutional layer. You can implement this function yourself; a possible sketch follows.
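A minimal sketch of one possible getgridsize (my assumption of its contract: factor n kernels into grid_h x grid_w = n, as close to square as possible, so the loop above indexes every kernel exactly once):

function [grid_h, grid_w] = getgridsize(n)
% Factor n into grid_h * grid_w == n with the grid as square as possible.
grid_h = floor(sqrt(n));
while mod(n, grid_h) ~= 0
    grid_h = grid_h - 1;
end
grid_w = n / grid_h;
end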

Finally, since interpreting the weight maps of a regression network requires tying them to the specific input variables, only the weight visualization results of the lenet model are shown here (conv layers only; the fc layers would need the specific variables).

Feed in samples and generate feature maps

1. First pass in an arbitrary sample:

res = net.forward({test_x(1,:)});

Note that a cell array is passed in, not an ordinary matrix (depending on the input_dim declaration in your deploy file, you may also need to transpose or reshape the sample to match). After the forward computation, feature maps exist at every layer of the network.

2. As shown above, the weight data is obtained with:

weight = net.layers(layernames{li}).params(1).get_data();

Feature maps are obtained similarly (note that blobs are indexed by blob name, which works here because each layer's top blob shares the layer's name):

featmap = net.blobs(layernames{li}).get_data();

The rest of the code is essentially the same as the weight visualization; feel free to adapt it, as in the sketch below.
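As a hedged sketch under the same assumptions as above ('conv1' is a hypothetical blob name, getgridsize is the function defined earlier, and the forward pass has already run), one way to tile a conv layer's feature maps into a single image:

% Sketch: tile the feature maps of one conv layer into a single image.
featmap =   net.blobs('conv1').get_data();          % width x height x channels x num (matcaffe order)
featmap =   featmap - min(featmap(:));              % normalize to [0,1]
featmap =   featmap ./ max(featmap(:));
[fm_h, fm_w, fm_c, ~]   =   size(featmap);
[grid_h, grid_w]        =   getgridsize(fm_c);
feat_map_img = zeros(fm_h*grid_h+grid_h, fm_w*grid_w+grid_w);
for gridhi = 1:grid_h
    for gridwi = 1:grid_w
        feat_map_img((gridhi-1)*(fm_h+1)+1:gridhi*(fm_h+1)-1, ...
                     (gridwi-1)*(fm_w+1)+1:gridwi*(fm_w+1)-1) = ...
            featmap(:,:,(gridhi-1)*grid_w+gridwi,1);
    end
end
imwrite(uint8(feat_map_img(1:end-1,1:end-1)*255), 'featmap_conv1.bmp');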

Finally, since interpreting the feature maps of the regression network likewise requires the specific variables, only the lenet model's feature map visualizations are shown.

 
