SRCNN_小结

SRCNN Super-Resolution Convolutional Neural Network (超分辨率重建卷积神经网络)，网络模型如图所示：
这里写图片描述
该方法对于一个低分辨率图像，先使用双三次（bicubic）插值将其放大到目标大小，再通过三层卷积网络做非线性映射，得到的结果作为高分辨率图像输出。作者将三层卷积的结构解释成与传统SR方法对应的三个步骤：图像块的提取和特征表示，特征非线性映射和最终的重建。

三个卷积层使用的卷积核的大小分为为9x9, 1x1和5x5，前两个的输出特征个数分别为64和32. 该文章分别用Timofte数据集（包含91幅图像）和ImageNet大数据集进行训练。相比于双三次插值和传统的稀疏编码方法，SRCNN得到的高分辨率图像更加清晰。

对SR的质量进行定量评价常用的两个指标是PSNR(Peak Signal-to-Noise Ratio)和SSIM（Structure Similarity Index）。这两个值越高代表重建结果的像素值和原标准越接近。后面我们会知道：卷积神经网络的代价函数就是PSNR的取值。

实验环境：caffe , Ubuntu 14.04.5 LTS，实验代码。我们要先在官网上下载caffe的源代码。通过cmake进行编译。生成可执行文件。这里涉及很多c语言的知识，因为caffe是用c语言写的，我在这里还留下很多不知道的坑，嗯，希望我以后可以慢慢全部搞懂。
训练看了很多资料感觉caffe是一个非常友好的框架。打个比方：如果说机器学习是炼丹，那么只需要知道大的过程就可以用caffe了。我们在caffe的根目录下输入：该命令就可以训练我们的模型了。

./build/tools/caffe train --solver=examples/SRCNN/SRCNN_solver.prototxt

SRCNN文件夹的内容是我们从github上下载的实验源代码。（装好git,一个命令就可以了。）我是从window上的cmd命令去理解这个命令的。./build/tools/caffe = 相对路径 + 可执行文件名（其实我们花那么大功夫去编译caffe也就是得到这个caffe）.
train solver=examples/SRCNN/SRCNN_solver.prototxt ，相当于参数赋值。
这里写图片描述
上图是：训练过程的结果显示。解释一些参数：
Iteration 0 迭代次数为0次
lr = 0.0001 学习率。（具体作用：反向传播，梯度下降时的迭代因子）
loss = 8.25829 代价函数的取值。
注意到当迭代次数达到500时，会显示

I0618 21:42:54.186817 34055 solver.cpp:466] Snapshotting to binary proto file examples/SRCNN/SRCNN_iter_500.caffemodel
I0618 21:42:54.197224 34055 sgd_solver.cpp:273] Snapshotting solver state to binary proto file examples/SRCNN/SRCNN_iter_500.solverstate

这些内容是提示我们，训练500次的模型结果保存的路径。同样的这个参数我们也可以设置，下面会说。你看其实到这里caffe的训练过程就完成。当然还有很多细节之处我们不知道。首先我们传给caffe的参数 SRCNN_solver.prototxt 文件里写了什么？
SRCNN_solver.prototxt

# The train/test net protocol buffer definition
net: "examples/SRCNN/SRCNN_net.prototxt"    # 网络结构的描述文件
test_iter: 556 
# Carry out testing every 500 training iterations.
test_interval: 500
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.0001
momentum: 0.9
weight_decay: 0
# The learning rate policy
lr_policy: "fixed"
# Display every 100 iterations
display: 100
# The maximum number of iterations
max_iter: 15000000
# snapshot intermediate results
snapshot: 500
snapshot_prefix: "examples/SRCNN/SRCNN"
# solver mode: CPU or GPU
solver_mode: GPU

我们可以发现：有些参数在我们训练的过程已经有所体现。base_lr，snapshot…..

这里第一行的参数非常重要。net: “examples/SRCNN/SRCNN_net.prototxt”，这个是我们的核心文件，里边描述了具体的网络层次。

name: "SRCNN"
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "examples/SRCNN/train.txt"
    batch_size: 128
  }
  include: { phase: TRAIN }
}
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "examples/SRCNN/test.txt"
    batch_size: 2
  }
  include: { phase: TEST }
}

layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 0.1
  }
  convolution_param {
    num_output: 64
    kernel_size: 9
    stride: 1
    pad: 0
    weight_filler {
      type: "gaussian"
      std: 0.001
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}

layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}

layer {
  name: "conv2"
  type: "Convolution"
  bottom: "conv1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 0.1
  }
  convolution_param {
    num_output: 32
    kernel_size: 1
    stride: 1
    pad: 0
    weight_filler {
      type: "gaussian"
      std: 0.001
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}

layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}

layer {
  name: "conv3"
  type: "Convolution"
  bottom: "conv2"
  top: "conv3"
  param {
    lr_mult: 0.1
  }
  param {
    lr_mult: 0.1
  }
  convolution_param {
    num_output: 1
    kernel_size: 5
    stride: 1
    pad: 0
    weight_filler {
      type: "gaussian"
      std: 0.001
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}

layer {
  name: "loss"
  type: "EuclideanLoss"
  bottom: "conv3"
  bottom: "label"
  top: "loss"
}

这里我们可以看到刚开始说的三层卷积神经网络：conv1，conv2，conv3。

demo_SR.m

close all;
clear all;

%% read ground truth image
im  = imread('Set5/butterfly_GT.bmp');
%im  = imread('Set14\zebra.bmp');

%% set parameters
up_scale = 4;
model = 'model/9-5-5(ImageNet)/x4.mat';
% up_scale = 3;
% model = 'model\9-3-5(ImageNet)\x3.mat';
% up_scale = 3;
% model = 'model\9-1-5(91 images)\x3.mat';
% up_scale = 2;
% model = 'model\9-5-5(ImageNet)\x2.mat'; 
% up_scale = 4;
% model = 'model\9-5-5(ImageNet)\x4.mat';

%% work on illuminance only
if size(im,3)>1
    im = rgb2ycbcr(im);
    im = im(:, :, 1);
end
im_gnd = modcrop(im, up_scale);
im_gnd = single(im_gnd)/255;

%% bicubic interpolation
im_l = imresize(im_gnd, 1/up_scale, 'bicubic');
im_b = imresize(im_l, up_scale, 'bicubic');

%% SRCNN
im_h = SRCNN(model, im_b);

%% remove border
im_h = shave(uint8(im_h * 255), [up_scale, up_scale]);
im_gnd = shave(uint8(im_gnd * 255), [up_scale, up_scale]);
im_b = shave(uint8(im_b * 255), [up_scale, up_scale]);

%% compute PSNR
psnr_bic = compute_psnr(im_gnd,im_b);
psnr_srcnn = compute_psnr(im_gnd,im_h);

%% show results
fprintf('PSNR for Bicubic Interpolation: %f dB\n', psnr_bic);
fprintf('PSNR for SRCNN Reconstruction: %f dB\n', psnr_srcnn);

figure, imshow(im_b); title('Bicubic Interpolation');
figure, imshow(im_h); title('SRCNN Reconstruction');

%imwrite(im_b, ['Bicubic Interpolation' '.bmp']);
%imwrite(im_h, ['SRCNN Reconstruction' '.bmp']);


im  = imread('Set5/butterfly_GT.bmp');
figure, imshow(im); title('origin image');
im_h = cat(3,im_h,im_gnd,im_b);
figure, imshow(im_h); title('SRCNN');

demo_SR.m是用我们训练好的model进行超分辨率重建。嗯，代码里大多数都是matlab的基础命令，耐心读就可以了。
im = rgb2ycbcr(im);的作用是：…..这个应该是超分辨率重建套路吧。
im_h = SRCNN(model, im_b)， SRCNN（）也是我们需要写的函数。里边进行的是通过model的三次卷积运算。
psnr_bic = compute_psnr(im_gnd,im_b) 是计算图像的PSNR。

数据转换
有些时候，我们的输入不是标准的图像，而是其它一些格式，比如：频谱图、特征向量等等，这种情况下LMDB、Leveldb以及ImageData layer等就不好使了，这时候我们就需要一个新的输入接口——HDF5Data.

其实我们可以这样理解，就是caffe训练的数据类型不是.bmp, .jpg(图片格式) 这里我们需要将图片转换成 “HDF5Data” 类型。这里用到了两个函数
generate_test.m ， store2hdf5.m。
generate_test.m

clear;close all;
% settings
folder = 'Train';              
savepath = 'train.h5';         
size_input = 33;
size_label = 21;
scale = 3;
stride = 14;

%% initialization
data = zeros(size_input, size_input, 1, 1);
label = zeros(size_label, size_label, 1, 1);
padding = abs(size_input - size_label)/2;
count = 0;

%% generate data
filepaths = dir(fullfile(folder,'*.bmp'));

for i = 1 : length(filepaths)

    image = imread(fullfile(folder,filepaths(i).name));
    image = rgb2ycbcr(image);
    image = im2double(image(:, :, 1));

    im_label = modcrop(image, scale);
    [hei,wid] = size(im_label);
    im_input = imresize(imresize(im_label,1/scale,'bicubic'),[hei,wid],'bicubic');

    for x = 1 : stride : hei-size_input+1
        for y = 1 :stride : wid-size_input+1

            subim_input = im_input(x : x+size_input-1, y : y+size_input-1);
            subim_label = im_label(x+padding : x+padding+size_label-1, y+padding : y+padding+size_label-1);

            count=count+1;
            data(:, :, 1, count) = subim_input;
            label(:, :, 1, count) = subim_label;
        end
    end
end

order = randperm(count);
data = data(:, :, 1, order);
label = label(:, :, 1, order); 

%% writing to HDF5
chunksz = 128;
created_flag = false;
totalct = 0;

for batchno = 1:floor(count/chunksz)
    last_read=(batchno-1)*chunksz;
    batchdata = data(:,:,1,last_read+1:last_read+chunksz); 
    batchlabs = label(:,:,1,last_read+1:last_read+chunksz);

    startloc = struct('dat',[1,1,1,totalct+1], 'lab', [1,1,1,totalct+1]);
    curr_dat_sz = store2hdf5(savepath, batchdata, batchlabs, ~created_flag, startloc, chunksz); 
    created_flag = true;
    totalct = curr_dat_sz(end);
end
h5disp(savepath);

store2hdf5.m

function [curr_dat_sz, curr_lab_sz] = store2hdf5(filename, data, labels, create, startloc, chunksz)  
  % *data* is W*H*C*N matrix of images should be normalized (e.g. to lie between 0 and 1) beforehand
  % *label* is D*N matrix of labels (D labels per sample) 
  % *create* [0/1] specifies whether to create file newly or to append to previously created file, useful to store information in batches when a dataset is too big to be held in memory  (default: 1)
  % *startloc* (point at which to start writing data). By default, 
  % if create=1 (create mode), startloc.data=[1 1 1 1], and startloc.lab=[1 1]; 
  % if create=0 (append mode), startloc.data=[1 1 1 K+1], and startloc.lab = [1 K+1]; where K is the current number of samples stored in the HDF
  % chunksz (used only in create mode), specifies number of samples to be stored per chunk (see HDF5 documentation on chunking) for creating HDF5 files with unbounded maximum size - TLDR; higher chunk sizes allow faster read-write operations 

  % verify that format is right
  dat_dims=size(data);
  lab_dims=size(labels);
  num_samples=dat_dims(end);

  assert(lab_dims(end)==num_samples, 'Number of samples should be matched between data and labels');

  if ~exist('create','var')
    create=true;
  end


  if create
    %fprintf('Creating dataset with %d samples\n', num_samples);
    if ~exist('chunksz', 'var')
      chunksz=1000;
    end
    if exist(filename, 'file')
      fprintf('Warning: replacing existing file %s \n', filename);
      delete(filename);
    end      
    h5create(filename, '/data', [dat_dims(1:end-1) Inf], 'Datatype', 'single', 'ChunkSize', [dat_dims(1:end-1) chunksz]); % width, height, channels, number 
    h5create(filename, '/label', [lab_dims(1:end-1) Inf], 'Datatype', 'single', 'ChunkSize', [lab_dims(1:end-1) chunksz]); % width, height, channels, number 
    if ~exist('startloc','var') 
      startloc.dat=[ones(1,length(dat_dims)-1), 1];
      startloc.lab=[ones(1,length(lab_dims)-1), 1];
    end 
  else  % append mode
    if ~exist('startloc','var')
      info=h5info(filename);
      prev_dat_sz=info.Datasets(1).Dataspace.Size;
      prev_lab_sz=info.Datasets(2).Dataspace.Size;
      assert(prev_dat_sz(1:end-1)==dat_dims(1:end-1), 'Data dimensions must match existing dimensions in dataset');
      assert(prev_lab_sz(1:end-1)==lab_dims(1:end-1), 'Label dimensions must match existing dimensions in dataset');
      startloc.dat=[ones(1,length(dat_dims)-1), prev_dat_sz(end)+1];
      startloc.lab=[ones(1,length(lab_dims)-1), prev_lab_sz(end)+1];
    end
  end

  if ~isempty(data)
    h5write(filename, '/data', single(data), startloc.dat, size(data));
    h5write(filename, '/label', single(labels), startloc.lab, size(labels));  
  end

  if nargout
    info=h5info(filename);
    curr_dat_sz=info.Datasets(1).Dataspace.Size;
    curr_lab_sz=info.Datasets(2).Dataspace.Size;
  end
end

未完待续

猜你喜欢