The assignment: Andrew Ng's CNN course, building a convolutional neural network model and applying it (parts 1 & 2). Reference blog used while finishing the assignment: https://blog.csdn.net/u013733326/article/details/79827273
Today's goal: recognize the digits expressed by hand gestures.
My Git repository: https://github.com/VVV-LHY/deeplearning.ai/tree/master/CNN/RecognizeGestureNum
Day 17. Earlier I used plain numpy to hand-build a convolution layer and a pooling layer, and back-propagation through them was even more trouble. At this point Andrew Ng's course switches to TensorFlow and Keras. I wanted to complete the CNN programming assignment in PyTorch instead, but because Andrew's example code and all of its data loading are written for TensorFlow, I decided to port them to PyTorch myself. (A giant prehistoric pit: I had never used PyTorch before, and between yesterday and today I fell into a lot of sinkholes.)
1. PyTorch's default image dimension order
PyTorch expects image batches as N x C x H x W, with the channel dimension C in front of the spatial dimensions. Numpy and most image-reading code put the channel last by default, so the data has to be reordered as follows:
Numpy arrays have a transpose method, for example:
The original array has shape a.shape = (N, H, W, C)
Transform it with a = a.transpose(0, 3, 1, 2): N stays where it is, C moves from the last position to the second, and H and W each shift back by one.
The new array has shape a.shape = (N, C, H, W)
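The transpose described above can be checked on a small array. This is a minimal sketch; the batch size and image size are made-up values for illustration, not the dimensions of the course dataset.

```python
import numpy as np

# Hypothetical batch: 2 images of 4x5 pixels with 3 channels, channels last (N, H, W, C).
a = np.arange(2 * 4 * 5 * 3).reshape(2, 4, 5, 3)

# Reorder the axes to (N, C, H, W): new axis 1 is old axis 3, new axis 2 is old axis 1, ...
a_nchw = a.transpose(0, 3, 1, 2)

print(a.shape)       # (2, 4, 5, 3)
print(a_nchw.shape)  # (2, 3, 4, 5)

# The same pixel is now addressed as [n, c, h, w] instead of [n, h, w, c].
same_pixel = a_nchw[1, 2, 3, 4] == a[1, 3, 4, 2]
print(same_pixel)    # True
```

Note that transpose only changes how the axes are indexed; no pixel values are modified.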
2. Dataset and DataLoader (both live in torch.utils.data)
A Dataset pairs the data with its labels and can report its own length. Inside it you can also convert the data to a Tensor and scale pixel values from (0, 255) to (0, 1) with torchvision.transforms; the official documentation lists many more transforms.
A DataLoader loads a Dataset and handles shuffling, mini-batching, and so on.
However, a custom Dataset class has to override the required methods itself (__len__ and __getitem__); search for the details if you need them.
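As an illustration of the two classes, here is a minimal sketch of a custom Dataset wrapped in a DataLoader. The class name GestureDataset, the random stand-in data, and the batch size are all assumptions made for this example, not the code from the repository; the scaling is done by hand here instead of with torchvision.transforms.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader


class GestureDataset(Dataset):
    """Hypothetical dataset that wraps in-memory image and label arrays."""

    def __init__(self, images, labels):
        # images: (N, H, W, C) uint8 array; labels: (N,) integer array.
        # Move channels first, then scale pixel values from [0, 255] to [0, 1].
        nchw = images.transpose(0, 3, 1, 2).copy()
        self.images = torch.from_numpy(nchw).float() / 255.0
        self.labels = torch.from_numpy(labels).long()

    def __len__(self):
        # Lets len(dataset) and the DataLoader know how many samples exist.
        return len(self.labels)

    def __getitem__(self, idx):
        # Returns one (image, label) pair.
        return self.images[idx], self.labels[idx]


# Random stand-in data: 10 RGB images of 64x64 pixels, 6 classes.
images = np.random.randint(0, 256, size=(10, 64, 64, 3), dtype=np.uint8)
labels = np.random.randint(0, 6, size=(10,))

dataset = GestureDataset(images, labels)
loader = DataLoader(dataset, batch_size=4, shuffle=True)

for batch_images, batch_labels in loader:
    print(batch_images.shape, batch_labels.shape)
```

With 10 samples and a batch size of 4, the loop yields two batches of 4 and a final batch of 2.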
3. A loss function error
multi-target not supported at /opt/conda/conda-bld/pytorch_1556653114079/wor
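The cause is the shape of the label tensor passed to loss_func: it must be a one-dimensional tensor of class indices, not an (N, 1) column. Remove the extra dimension with:
yourlabel.squeeze(1)
Note that the argument must be 1; with 0 the tensor stays a column vector.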
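The shape fix can be sketched as follows. This assumes loss_func is nn.CrossEntropyLoss, which the post does not state explicitly; the logits and labels are made-up values.

```python
import torch
import torch.nn as nn

# Assumption: the loss in question is cross-entropy over class indices.
loss_func = nn.CrossEntropyLoss()

# Hypothetical batch: 5 samples, 6 classes.
logits = torch.randn(5, 6)
labels = torch.tensor([[0], [3], [5], [1], [2]])  # shape (5, 1), a column vector

# squeeze(1) drops the size-1 second dimension, giving shape (5,).
targets = labels.squeeze(1)
loss = loss_func(logits, targets)

print(labels.shape, targets.shape)

# squeeze(0) removes nothing here, because dimension 0 has size 5, not 1.
unchanged = labels.squeeze(0)
print(unchanged.shape)
```

Passing the unsqueezed (5, 1) labels to the loss is what triggers the error; the exact wording of the message varies between PyTorch versions.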
P.S. Almost all beginner errors come from the data shapes described above. The network architecture itself is hard to get wrong: as long as each layer's output size is worked out with floor((n + 2p - kernel_size) / s) + 1, and you take the trouble to draw the whole network with its parameters written on the diagram, it will be correct.
But a beginner who is not used to moving directly from numpy to tensors will often hit shape errors because the methods differ. So, starting from data loading, be clear about how every operation affects the shape or size of the data.
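The output-size formula can be turned into a small helper for checking a network layer by layer before writing it. This is a sketch; the helper name conv_output_size and the layer parameters below are invented for illustration and are not the architecture from the assignment.

```python
def conv_output_size(n, kernel_size, padding=0, stride=1):
    """Output width (or height) of a convolution or pooling layer.

    Implements floor((n + 2p - kernel_size) / s) + 1.
    """
    return (n + 2 * padding - kernel_size) // stride + 1


# Hypothetical stack applied to a 64x64 input.
size = 64
size = conv_output_size(size, kernel_size=5, padding=2, stride=1)  # conv
after_conv1 = size
size = conv_output_size(size, kernel_size=2, stride=2)             # max pool
after_pool1 = size
size = conv_output_size(size, kernel_size=3, padding=0, stride=1)  # conv
after_conv2 = size
size = conv_output_size(size, kernel_size=2, stride=2)             # max pool
after_pool2 = size

print(after_conv1, after_pool1, after_conv2, after_pool2)  # 64 32 30 15
```

Multiplying the final size squared by the number of output channels gives the input width of the first fully connected layer, which is a common place for shape mismatches.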