1. The classic convolutional neural network LeNet
2. Code implementation
LeNet's accuracy is about the same as the MLP's: a little over 80%.
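A minimal sketch of what the model could look like, assuming PyTorch and the common 28x28 single-channel variant of LeNet (padding of 2 on the first convolution, sigmoid activations, average pooling). The layer sizes, the `lenet` name, and the `layer_shapes` helper are assumptions for illustration, not necessarily the code used in the lecture.

```python
import torch
from torch import nn

# Assumed layout: LeNet adapted to 28x28 grayscale images, 10 output classes.
lenet = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=5, padding=2), nn.Sigmoid(),
    nn.AvgPool2d(kernel_size=2, stride=2),
    nn.Conv2d(6, 16, kernel_size=5), nn.Sigmoid(),
    nn.AvgPool2d(kernel_size=2, stride=2),
    nn.Flatten(),
    nn.Linear(16 * 5 * 5, 120), nn.Sigmoid(),
    nn.Linear(120, 84), nn.Sigmoid(),
    nn.Linear(84, 10),
)


def layer_shapes(net, x):
    """Return (layer class name, output shape) for each layer, in order."""
    shapes = []
    for layer in net:
        x = layer(x)
        shapes.append((layer.__class__.__name__, tuple(x.shape)))
    return shapes


# Push one random 28x28 image through the network and record each shape.
shapes = layer_shapes(lenet, torch.rand(1, 1, 28, 28))
for name, shape in shapes:
    print(f"{name:10s} -> {shape}")
```

Printing the shapes layer by layer is a quick way to check that the flattened feature size (16 * 5 * 5 = 400 here) matches the input size of the first linear layer.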
3. Q&A
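- Why use view instead of reshape? In practice there is little difference between the two. view never copies data but requires a compatible memory layout, while reshape is more flexible: it returns a view when it can and makes a copy otherwise.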
- When the input is large, an MLP needs too many parameters to be practical, so a CNN is used instead. When an MLP is feasible, it will run faster.
- Increasing the number of output channels lets a layer recognize more kinds of texture patterns.
- Pooling layers generally use max or average. For recognizing object categories, such as cats versus dogs, max pooling may work better; when a smoothing filter is wanted, average pooling is the better choice.
- At the time, LeNet was implemented in Lua.
- A neural network's accuracy generally only needs to exceed the threshold that users find acceptable. Speech-to-text, for example, is good enough once users are willing to accept its output.
- Textures learned by each layer of a CNN:
https://poloclub.github.io/cnn-explainer/
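On the view versus reshape question above, a small sketch (assuming PyTorch; the tensor values and variable names are made up for illustration) shows where the two behave the same and where they differ: both share storage with a contiguous tensor, but only reshape works on a non-contiguous one, and it does so by copying.

```python
import torch

x = torch.arange(6).reshape(2, 3)  # contiguous 2x3 tensor: [[0, 1, 2], [3, 4, 5]]

# On a contiguous tensor, view returns an alias of the same storage,
# so writing through the view changes x as well.
v = x.view(3, 2)
v[0, 0] = 100

# Transposing makes the tensor non-contiguous; view can no longer flatten it.
t = x.t()
try:
    t.view(6)
    view_failed = False
except RuntimeError:
    view_failed = True

# reshape still works here, but it has to copy into new storage.
r = t.reshape(6)
shares_storage = r.data_ptr() == x.data_ptr()

print(x[0, 0].item(), view_failed, shares_storage)
```

When no copy is needed the two calls do the same work, so neither is meaningfully faster; the copy only happens in the cases where view would raise an error.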
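The point about MLPs becoming impractical on large inputs can be made concrete by counting parameters. The sketch below uses only the standard library; the 224x224 RGB input, the 1000 hidden units, and the 5x5 convolution with 64 output channels are illustrative assumptions, not figures from the lecture.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


def conv_params(c_in, c_out, kernel):
    """Weights plus biases of one 2D convolution with a square kernel."""
    return c_in * c_out * kernel * kernel + c_out


# Assumed example: a 224x224 RGB image fed to the first layer of each model.
height, width, channels = 224, 224, 3

mlp_params = dense_params(height * width * channels, 1000)
cnn_params = conv_params(channels, 64, 5)

print(f"first MLP layer:  {mlp_params:,} parameters")
print(f"first conv layer: {cnn_params:,} parameters")
```

The convolution's parameter count does not depend on the image size at all, which is why it scales to large inputs where the fully connected layer does not.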
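The difference between max and average pooling mentioned above can be seen on a tiny input. This is a plain-Python sketch with a hypothetical `pool2d` helper and a made-up 4x4 image, using non-overlapping 2x2 windows: max pooling keeps the strongest response in each window, while average pooling smooths it out.

```python
def pool2d(x, size, reduce):
    """Apply `reduce` to each non-overlapping size x size window of a 2D list."""
    out = []
    for i in range(0, len(x) - size + 1, size):
        row = []
        for j in range(0, len(x[0]) - size + 1, size):
            window = [x[i + di][j + dj] for di in range(size) for dj in range(size)]
            row.append(reduce(window))
        out.append(row)
    return out


def mean(values):
    return sum(values) / len(values)


# Made-up image with two isolated strong activations (8 and 4).
image = [
    [0, 0, 1, 1],
    [0, 8, 1, 1],
    [2, 2, 0, 0],
    [2, 2, 0, 4],
]

max_pooled = pool2d(image, 2, max)   # keeps the peaks: [[8, 1], [2, 4]]
avg_pooled = pool2d(image, 2, mean)  # smooths them:    [[2.0, 1.0], [2.0, 1.0]]

print(max_pooled)
print(avg_pooled)
```

After average pooling, the window with a single strong activation is indistinguishable from a uniformly weak one, which is the smoothing behaviour the note refers to.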
4. Reference
https://www.bilibili.com/video/BV1t44y1r7ct?p=1