Deep Learning Personal Notes


deep learning

concept

DL (Deep Learning): a branch of machine learning

Machine Learning Classification

  • supervised learning

    • Features: learns from labeled (known-category) data

    • Split by whether the dependent variable is continuous

      • Regression

        • Continuous dependent variable
        • e.g. house price, weight, weather
      • Classification

        • Discrete dependent variable
        • e.g. whether a patient has cancer
    • algorithm

      • k-Nearest Neighbor Algorithm KNN

      • decision tree

      • Support Vector Machine SVM

      • Neural Networks

      • linear model

        • Linear regression
        • Logistic regression
      • Naive Bayes

      • random forest

  • unsupervised learning

    • Features: learns from unlabeled (unknown-category) data

    • algorithm

      • k-means clustering
      • Hierarchical Clustering Algorithm
      • Expectation-Maximization (EM) algorithm
      • Principal Component Analysis PCA
  • semi-supervised learning

    • Learns from a mix of labeled and unlabeled data

Neural Networks

  • Linear Model of Neurons

    • y = wx + b
  • How a BP (backpropagation) neural network works

    • forward propagation

      • Compute the network's output from its input, then measure the difference between the predicted and actual values

      • process

        • Neural network node

          • The neuron model is a linear transformation; one layer of the network applies one linear transformation
        • Non-linearization

          • Apply a nonlinear function (the activation function); its input is the output of a neural network node

          • Common activation functions

            • ReLU

              • Function expression

relu(x)=\begin{cases} x, & x>0 \\ 0, & \text{otherwise} \end{cases}

              • Derivative

\frac{d}{dx}relu(x)=\begin{cases} 1, & x>0 \\ 0, & \text{otherwise} \end{cases}

              • Problem

                • Gradients vanish when x is negative

            • Sigmoid

              • Function expression

sigmoid(x)=\frac{1}{1+e^{-x}}

              • Derivative

\frac{d}{dx}\sigma(x)=\sigma(x)(1-\sigma(x)), \quad \sigma(x)=sigmoid(x)

              • Problem

                • Gradients tend to vanish when the input is very large or very small

              • Range

                • The range is [0,1]; commonly used as the output-layer activation for binary classification

            • Tanh

              • Function expression

tanh(x)=\frac{e^x-e^{-x}}{e^x+e^{-x}}

              • Derivative

\frac{d}{dx}tanh(x)=1-tanh^2(x)

              • Range

                • The range is [-1,1]

            • Softmax

y_j=\frac{e^{z_j}}{\sum_{j=1}^{k}e^{z_j}} \quad (j=1,2,\ldots,k)

              • The outputs lie in [0,1] and sum to 1; commonly used as the output-layer activation for multi-class classification

            • Leaky ReLU

              • Function expression

leakyrelu(x)=\begin{cases} x, & x \geq 0 \\ px, & x<0 \end{cases}

              • Derivative

\frac{d}{dx}leakyrelu(x)=\begin{cases} 1, & x \geq 0 \\ p, & x<0 \end{cases}

        • General expression (see the sketch below)

h=relu(\pmb{W}x+\pmb{b})
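
To make the formulas above concrete, here is a minimal NumPy sketch of the common activation functions and of the general layer expression h = relu(Wx + b); all array sizes are illustrative assumptions, not values from the original notes.

```python
import numpy as np

def relu(x):
    # relu(x) = x for x > 0, else 0
    return np.maximum(x, 0.0)

def sigmoid(x):
    # sigmoid(x) = 1 / (1 + e^{-x}); range (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def softmax(z):
    # y_j = e^{z_j} / sum_k e^{z_k}; subtracting the max is a standard stability trick
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def leaky_relu(x, p=0.01):
    # x for x >= 0, p*x otherwise (p is the small leak slope); tanh is np.tanh
    return np.where(x >= 0, x, p * x)

# General expression for one layer: h = relu(Wx + b)
rng = np.random.default_rng(0)
x = rng.normal(size=4)           # input vector
W = rng.normal(size=(3, 4))      # weight matrix
b = np.zeros(3)                  # bias
h = relu(W @ x + b)
print(h.shape)                   # (3,)
```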

      • Loss function

        • Mean squared error (MSE)

MSE=\frac{1}{2N}\sum_{i=1}^{N}(y_i-\hat{y_i})^2

        • Cross-entropy loss

    • Backpropagation

      • Model optimization

        • Reduce the error and update the weights
        • Algorithms

          • Stochastic gradient descent (SGD)

            • Core idea: differentiate the loss function
            • Hyperparameter: the learning rate

      • Weight update

        • Apply the chain rule to compute the partial derivatives of the loss with respect to the weights (see the sketch below)
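
A minimal sketch of the whole loop described above for the single-neuron model y = wx + b with the MSE loss: forward propagation, chain-rule gradients, and an SGD weight update with a learning-rate hyperparameter. The data and learning rate are illustrative.

```python
import numpy as np

# Toy data generated from y = 2x + 1 (illustrative)
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2 * x + 1

w, b = 0.0, 0.0
lr = 0.05  # hyperparameter: learning rate

for _ in range(1000):
    y_hat = w * x + b                      # forward propagation
    loss = ((y - y_hat) ** 2).mean() / 2   # MSE = 1/(2N) * sum (y_i - y_hat_i)^2
    # Backpropagation: chain rule for the partial derivatives of the loss
    grad_out = (y_hat - y) / len(x)        # dLoss/dy_hat
    grad_w = (grad_out * x).sum()          # dLoss/dw = dLoss/dy_hat * dy_hat/dw
    grad_b = grad_out.sum()                # dLoss/db
    w -= lr * grad_w                       # SGD weight update
    b -= lr * grad_b

print(round(w, 2), round(b, 2))  # approaches 2.0 and 1.0
```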
  • fully connected network

    • features

      • Every output node is connected to all input nodes
    • input matrix X

      • The shape of X is [b, din], where b is the number of samples and din is the number of input nodes
    • Weight matrix W

      • The shape of W is [din,dout], and dout is the number of output nodes
    • output matrix O

      • The shape of O is [b,dout]
    • bias matrix b

      • The shape of b is [dout]
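
A quick NumPy check of those shapes, with illustrative sizes b=2, din=4, dout=3:

```python
import numpy as np

b, din, dout = 2, 4, 3          # illustrative sizes
X = np.random.randn(b, din)     # input matrix, shape [b, din]
W = np.random.randn(din, dout)  # weight matrix, shape [din, dout]
bias = np.zeros(dout)           # bias, shape [dout]

O = X @ W + bias                # bias broadcasts across the b samples
print(O.shape)                  # (2, 3), i.e. [b, dout]
```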
  • convolutional neural network

    • convolutional layer

      • convolution operation

        • local correlation

          • Each pixel is more correlated with surrounding pixels
        • Weight sharing

          • use the same weight matrix
        • receptive field

          • Neurons in the visual cortex are not connected to every neuron in the previous layer; each perceives and responds to visual stimuli only within a local region
        • Multiply and accumulate weights

          • The value of the corresponding position is obtained by multiplying and accumulating the receptive field and the weight matrix
        • The convolution operation can obtain the feature correlation between image pixels

      • Convolution kernel

        • weight matrix
      • stride

        • The distance the receptive-field window moves at each step

          • For 2D convolution, the window shifts by this length in the X (rightward) and Y (downward) directions, respectively
        • A larger stride reduces the density of receptive fields

      • padding

        • Pads the borders so the output keeps the same size as the input
      • Output size calculation (see the helper below)

h_{out}=\frac{h+2p_h-k}{s}+1

w_{out}=\frac{w+2p_w-k}{s}+1
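
A small helper that evaluates the output-size formula above (the example numbers are illustrative):

```python
def conv_out_size(n, k, p, s):
    # n: input height/width, k: kernel size, p: padding, s: stride
    return (n + 2 * p - k) // s + 1

print(conv_out_size(224, k=3, p=1, s=1))  # 224: "same" padding keeps the size
print(conv_out_size(224, k=3, p=1, s=2))  # 112: stride 2 halves it
```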

      • A single convolution kernel can extract only one kind of feature

    • As a rule of thumb, deep CNNs gradually reduce the feature maps' height and width while increasing the number of channels
    • Pooling layer

      • Purpose

        • Keeps the main features while reducing parameters and computation, preventing overfitting
        • Downsampling

      • Types

        • Global average / max pooling

          • Captures global context
          • Rather than averaging over a window, it averages over each entire feature map, i.e. one output value per feature map

        • Average pooling

          • Preserves background information
          • Takes the mean of each window as the result

        • Max pooling

          • Extracts feature texture
          • In the forward pass, max pooling passes each patch's maximum value to the next layer; the other values are discarded

    • Unpooling / upsampling layer

      • Upsampling converts an image to a higher resolution, i.e. a technique for producing a high-resolution image from a low-resolution one
      • Methods

        • Nearest-neighbor interpolation
        • Bilinear interpolation
        • Bicubic interpolation

    • BatchNorm layer

      • Batch normalization (BN)

        • Mitigates vanishing and exploding gradients

      • Purpose

        • Normalizes the data, reducing differences between samples
        • Keeps the activation function's input in a high-gradient region, where even a small input difference yields a sizable gradient difference; this effectively avoids vanishing gradients and speeds up convergence
        • Reduces the coupling between layers: without BN a layer consumes the previous layer's raw output, whereas with BN it receives normalized data, making the model's parameters easier to train

      • Learnable parameters of the BN layer (see the sketch below)

        • scale (γ): multiplies the input to rescale it
        • offset (β): is added to the input to shift it
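
A minimal NumPy sketch of training-time batch normalization with the learnable scale γ and offset β described above; eps and the per-feature batch statistics follow the standard BN formulation.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    # Normalize each feature over the batch dimension to zero mean, unit variance,
    # then apply the learnable rescale (gamma) and shift (beta).
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

x = np.random.randn(8, 4) * 10 + 5  # batch of 8 samples, 4 features (illustrative)
out = batch_norm(x, gamma=np.ones(4), beta=np.zeros(4))
print(out.mean(axis=0).round(3), out.std(axis=0).round(3))  # ~0 and ~1
```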

    • Classic convolutional networks

      • LeNet-5

        • Innovations

          • Uses convolutions in place of full connections

      • AlexNet

        • Innovations

          • Depth increased to 8 layers
          • Uses ReLU in place of Sigmoid
          • Introduces Dropout layers to prevent overfitting

      • VGG series

        • Innovations

          • Depth increased to 19 layers
          • Uses smaller 3×3 convolution kernels
          • Uses smaller 2×2 pooling windows with stride s=2

      • MLP convolution layer

        • A network inside the network

          • Adds a small network after the convolution
          • The MLP is commonly a three-layer fully connected network, equivalent to following an ordinary convolution layer with a 1×1 convolution and a ReLU activation

      • GoogLeNet

        • Innovations

          • Inception block

            • Predecessor: the MLP convolution layer
            • Network-in-network structure
            • Increases width

          • Uses 1×1 convolutions

            • Leaves the image height and width unchanged; only reduces the number of channels

      • ResNet

        • Innovations

          • Residual network

            • Uses skip connections (shortcuts), giving a deep network a mechanism to fall back to a shallower one
            • Performs element-wise addition along the channel axis c
            • Increases depth
            • f(x) = H(x) + x (see the sketch below)
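
A minimal PyTorch sketch of a residual block following f(x) = H(x) + x: the skip connection adds the input to the block's output element-wise (the channel count and input size are illustrative).

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        h = torch.relu(self.bn1(self.conv1(x)))  # H(x): conv-BN-ReLU, conv-BN
        h = self.bn2(self.conv2(h))
        return torch.relu(h + x)                 # shortcut: element-wise addition

x = torch.randn(1, 64, 32, 32)
print(ResidualBlock(64)(x).shape)  # torch.Size([1, 64, 32, 32])
```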

      • DenseNet

        • Uses the skip-connection idea
        • Concatenates along the channel axis c to aggregate feature information
        • Dense connection block (DenseBlock)

    • Convolution layer variants

      • Dilated (atrous) convolution

        • Enlarges the receptive field

      • Deconvolution (transposed convolution)

        • Upsampling
        • o = (i-1)*s + k - 2p (see the helper below)
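
The corresponding helper for the transposed-convolution formula o = (i-1)s + k - 2p (illustrative numbers):

```python
def deconv_out_size(i, k, p, s):
    # i: input size, k: kernel size, p: padding, s: stride
    return (i - 1) * s + k - 2 * p

print(deconv_out_size(112, k=4, p=1, s=2))  # 224: doubles the spatial size
```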

      • Separable convolution

    • [The development of CNNs: a summary of classic architectures](https://blog.csdn.net/qq_23981335/article/details/122538921)

      • FCN

        • Fully convolutional network

      • FPN

        • Multi-scale prediction

      • ResNet

        • Residual network

          • Problem it solves

            • Network degradation

        • ResNet aggregates with earlier features by element-wise addition

      • [InceptionNet](https://zhuanlan.zhihu.com/p/480384320)

        • InceptionBlock

      • [SPPNet](https://zhuanlan.zhihu.com/p/79888509)

        • Spatial Pyramid Pooling Network; multi-scale fusion

      • [DenseNet](https://zhuanlan.zhihu.com/p/37189203)

        • Densely connected network
        • DenseNet aggregates by concatenation

      • [VoVNet](https://zhuanlan.zhihu.com/p/139517885)

        • One-Shot Aggregation (OSA)

      • [CSPNet](https://zhuanlan.zhihu.com/p/116611721)

        • Cross Stage Partial Network

      • [PANet](https://zhuanlan.zhihu.com/p/373907181)

        • Path Aggregation Network

      • ELAN

        • Shortest and longest gradient paths
        • Efficient Layer Aggregation Network

      • PRN
      • Focus

      • [Rep](https://zhuanlan.zhihu.com/p/344324470)

        • Structural re-parameterization

          • Convert one structure's set of parameters into another set, and use the converted parameters to parameterize a different structure. As long as the parameter conversion is equivalent, replacing one structure with the other is equivalent (a common instance, fusing Conv and BN, is sketched below)
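
As one concrete (and common) instance of this idea, a convolution followed by BatchNorm can be re-parameterized at inference time into a single equivalent convolution; a hedged PyTorch sketch, with illustrative layer sizes:

```python
import torch
import torch.nn as nn

@torch.no_grad()
def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    # Fold BN's normalization into new conv weights: an equivalent parameterization.
    fused = nn.Conv2d(conv.in_channels, conv.out_channels, conv.kernel_size,
                      conv.stride, conv.padding, bias=True)
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)  # gamma / sqrt(var + eps)
    fused.weight.copy_(conv.weight * scale.reshape(-1, 1, 1, 1))
    bias = conv.bias if conv.bias is not None else torch.zeros(conv.out_channels)
    fused.bias.copy_(bn.bias + (bias - bn.running_mean) * scale)
    return fused

conv, bn = nn.Conv2d(3, 8, 3, padding=1), nn.BatchNorm2d(8)
bn.eval()  # use running statistics, as at inference time
x = torch.randn(1, 3, 16, 16)
print(torch.allclose(bn(conv(x)), fuse_conv_bn(conv, bn)(x), atol=1e-5))  # True
```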

      • ACNet
      • [Diverse Branch Block](https://zhuanlan.zhihu.com/p/360939086)

      • [SENet](https://zhuanlan.zhihu.com/p/32702350)

      • [EfficientNet](https://blog.csdn.net/qq128252/article/details/110953858)

      • [Ghost Convolution](https://zhuanlan.zhihu.com/p/368832202)

      • Lightweight networks

        • [MobileNet](https://zhuanlan.zhihu.com/p/394975928)

        • [ShuffleNet](https://zhuanlan.zhihu.com/p/32304419)

        • [SqueezeNet](https://zhuanlan.zhihu.com/p/49465950)

  • Generative adversarial network (GAN)

loss function

Focal loss

  • Addresses the problem of imbalanced sample distributions

Smooth L1 loss

smooth_{L1}(x)=\begin{cases} 0.5x^2, & |x|<1 \\ |x|-0.5, & \text{otherwise} \end{cases}
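
A direct NumPy translation of the piecewise definition above:

```python
import numpy as np

def smooth_l1(x):
    # 0.5 * x^2 where |x| < 1, |x| - 0.5 elsewhere
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) < 1, 0.5 * x ** 2, np.abs(x) - 0.5)

print(smooth_l1([-2.0, -0.5, 0.0, 0.5, 2.0]))  # [1.5 0.125 0. 0.125 1.5]
```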

MSE (mean squared error)

MSE=\frac{1}{2N}\sum_{i=1}^{N}(y_i-\hat{y_i})^2

MAE mean absolute error

Cross-entropy loss

Triplet loss

  • A loss function defined over sample triples

Ranking loss function: Metric learning

IoU loss (see the sketch below)

  • IoU
  • DIoU
  • GIoU
  • CIoU
  • SIoU
  • Wise-IoU
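
A minimal IoU computation for axis-aligned boxes in (x1, y1, x2, y2) form; the variants listed above (GIoU, DIoU, CIoU, ...) each add penalty terms on top of this ratio. Box coordinates are illustrative.

```python
def iou(a, b):
    # a, b: boxes as (x1, y1, x2, y2); IoU = intersection area / union area
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1]) +
             (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1/7 ≈ 0.1429
```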

optimizer

Batch Gradient Descent (BGD)

Stochastic Gradient Descent (SGD)

Mini-Batch Gradient Descent (MBGD)

Momentum

Adam

Adadelta

AdaGrad

Learning-rate optimization tricks

Learning-rate warm-up

cosine annealing

Data augmentation tricks

Label smoothing

  • Label smoothing is a modification of the loss function that can improve image-classification accuracy. Put simply, it changes the neural network's training target from "1" to "1 − smoothing adjustment", so the network is trained to be less confident in its own answers.
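
A sketch of one common formulation: the one-hot target 1 becomes 1 − ε, with ε spread evenly over the remaining classes (ε = 0.1 is an illustrative value).

```python
import numpy as np

def smooth_labels(one_hot, eps=0.1):
    # True class gets 1 - eps; the other K-1 classes share eps equally.
    k = one_hot.shape[-1]
    return one_hot * (1.0 - eps) + (1.0 - one_hot) * eps / (k - 1)

y = np.array([0.0, 0.0, 1.0, 0.0])
print(smooth_labels(y))  # [0.0333 0.0333 0.9 0.0333]
```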

Image Processing

  • single image operation

    • Cutout

      • Use a cutout mask for the input to the first layer of the CNN
    • Random Erasing

      • Replaces regions of the image with random values or the training set's average pixel values
    • Hide-and-Seek

      • The image is divided into an S×S grid of patches, and patches are randomly hidden with a set probability, so the model learns the appearance of the whole object rather than a single part (e.g., not relying solely on an animal's face for recognition)
    • GridMask

    • FenceMask(2020)

  • multi-image operations

    • Mixup (see the sketch below)
    • CutMix
    • KeepAugment(2020)
    • Mosaic data augmentation
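
A minimal sketch of Mixup, which blends two samples and their one-hot labels with a Beta-distributed coefficient (alpha = 0.2 is an illustrative choice):

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2):
    # x = lam*x1 + (1-lam)*x2, and the labels are blended the same way
    lam = np.random.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2

img_a, img_b = np.zeros((32, 32, 3)), np.ones((32, 32, 3))  # illustrative images
lab_a, lab_b = np.array([1.0, 0.0]), np.array([0.0, 1.0])   # one-hot labels
x, y = mixup(img_a, lab_a, img_b, lab_b)
print(x.mean().round(3), y.round(3))  # blended pixels and a soft label
```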

Trick

multi-scale training

head decoupling

Deep Learning Framework

tensorflow

  • data_format

    • Used to set the channel position
    • 'channels_first' corresponds to BCHW
    • 'channels_last' corresponds to BHWC
  • The batch_size argument of the input layer fixes the batch size (see the sketch below)
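
A hedged tf.keras sketch of both settings (shapes illustrative): data_format selects the channel position, and the Input layer's batch_size argument fixes the batch dimension.

```python
import tensorflow as tf

# 'channels_last' expects BHWC input; 'channels_first' would expect BCHW.
inp = tf.keras.Input(shape=(224, 224, 3), batch_size=8)  # fixes batch size to 8
out = tf.keras.layers.Conv2D(16, 3, data_format='channels_last')(inp)
model = tf.keras.Model(inp, out)
print(model.output_shape)  # (8, 222, 222, 16)
```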

pytorch

Object Detection

One-stage

  • anchor-base
  • anchor-free

Bounding Box Description

  • bounding box

    • Top-left and bottom-right corners (x1, y1, x2, y2)
    • Center point plus width and height (cx, cy, w, h)
  • Anchor box

    • anchor

      • k-means

YOLO series notes

  • anchor-base

    • yolov3

      • backbone

        • Darknet-53

          • resnet

            • CBR(conv+bn+relu)

        • downsampling layer

          • No pooling, downsampling using convolutional layers with stride 2
      • neck

        • FPN
      • head

        • 3×3 convolutional layer and 1×1 convolutional layer
      • loss

        • Positive and negative sample allocation strategy
    • yolov4

      • backbone

        • CSPDarknet53

          • csp

            • CBM(conv+bn+mish)
      • neck

        • SPP
        • PANet
      • head

      • loss

    • yolov5

    • yolov7

  • anchor-free

    • yolox
    • yolov6
    • yolov8
  • Object detection (common components)

    • backbone

      • resnet
      • csp
    • Data augmentation

      • mixup
      • Mosaic
      • brightness
      • contrast
      • color-space conversion
      • scaling
      • rotation
      • mirroring
      • hole filling
    • neck

      • FPN
      • PANet
      • BiFPN
    • head

      • head decoupling
    • loss

      • Positive and negative sample distribution

Two-stage

  • R-CNN
  • SPP-Net
  • Fast R-CNN
  • Faster R-CNN

semantic segmentation

FCN

  • Replace the fully connected layer of CNN with a convolutional layer
  • Add upsampling operation (deconvolution)
  • skip connection

UNet

  • UNet

    • The model structure is completely symmetric
    • Uses an encoder-decoder structure
    • U-Net-style skip connections
  • UNet++

    • Adds deep supervision
    • Multi-scale skip connections
  • UNet3+

    • Full-scale skip connections

deeplab

  • deeplabv1

    • dilated convolution
    • CRF
  • deeplabv2

    • ASPP
  • deeplabv3

    • Multi-Grid
  • deeplabv3+

SegNet

face recognition

facenet

SiameseNet

model deployment

onnx

OpenVino

tensorRT

ncnn
