Deep Learning Workflow: Notes on Andrew Ng's Deep Learning Course

Step 0: Data division (training, dev (target distribution), test, and train-dev sets)
- Data mismatch
- Artificial data synthesis
Single real-number evaluation metric (optimizing and satisficing metrics)
Approaches to hyperparameter tuning (panda: babysit one model; caviar: train many models in parallel)
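
A minimal sketch of how an optimizing metric and a satisficing metric can be combined into one selection rule: maximize accuracy among the models that stay under a latency limit. The candidate names, the numbers, and the `pick_model` helper are hypothetical and only illustrate the idea.

```python
def pick_model(candidates, max_latency_ms):
    """Pick the most accurate model among those meeting the latency limit.

    Accuracy is the optimizing metric (as high as possible); latency is the
    satisficing metric (it only has to be good enough).
    """
    feasible = [c for c in candidates if c["latency_ms"] <= max_latency_ms]
    if not feasible:
        return None  # no model satisfies the constraint
    return max(feasible, key=lambda c: c["accuracy"])


# Hypothetical candidates, for illustration only.
candidates = [
    {"name": "A", "accuracy": 0.90, "latency_ms": 80},
    {"name": "B", "accuracy": 0.92, "latency_ms": 95},
    {"name": "C", "accuracy": 0.95, "latency_ms": 1500},
]

best = pick_model(candidates, max_latency_ms=100)
print(best["name"])
```

Model C has the highest accuracy but violates the latency limit, so it is never considered; the choice is made only among A and B.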

Step 1: Model selection (including hyperparameter tuning) and parameter initialization
- Softmax regression
- Transfer learning (usable with either a small or a large amount of target data)
Note: start from an open-source architecture and apply transfer learning. Freeze the earlier layers (how many depends on the data size) and save to disk.
- Multitask learning
- CNN
  - convolution
  - pooling (reduces the size of the representation to speed up computation)
  - fully connected
Parameter initialization: zeros, random, He, Xavier
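
The four initialization schemes differ only in how the random weights are scaled. Below is a rough sketch in plain Python, assuming Gaussian weights, a weight matrix of shape (n_out, n_in), and the scaling variants sqrt(2/n_in) for He and sqrt(1/n_in) for Xavier; other references define Xavier with n_in + n_out in the denominator. The helper `init_weights` and the constant 0.01 for "random" are illustrative choices.

```python
import math
import random


def init_weights(n_in, n_out, method="he", seed=0):
    """Return an n_out x n_in weight matrix as a list of lists."""
    rng = random.Random(seed)
    if method == "zeros":
        scale = 0.0                      # every unit computes the same thing
    elif method == "random":
        scale = 0.01                     # small fixed scale (assumed value)
    elif method == "he":
        scale = math.sqrt(2.0 / n_in)    # commonly paired with ReLU
    elif method == "xavier":
        scale = math.sqrt(1.0 / n_in)    # commonly paired with tanh
    else:
        raise ValueError(f"unknown method: {method}")
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n_in)]
            for _ in range(n_out)]


W1 = init_weights(n_in=4, n_out=3, method="he")
print(len(W1), len(W1[0]))
```

With zeros, all units in a layer stay identical during training, which is why a random scheme is normally used for the weights.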

Step 2: Optimization algorithm (mini-batch gradient descent, momentum, RMSprop, Adam), including hyperparameter tuning
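
Adam combines the momentum term (a moving average of gradients) with the RMSprop term (a moving average of squared gradients). Here is a small sketch on a one-dimensional toy problem; the function `adam_minimize`, the quadratic objective, and the hyperparameter values are assumptions chosen for illustration, not a recommended setup.

```python
import math


def adam_minimize(grad, w0, lr=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=500):
    """Minimize a scalar function with Adam, given its gradient."""
    w, v, s = w0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        v = beta1 * v + (1 - beta1) * g        # momentum-style average
        s = beta2 * s + (1 - beta2) * g * g    # RMSprop-style average
        v_hat = v / (1 - beta1 ** t)           # bias correction
        s_hat = s / (1 - beta2 ** t)
        w -= lr * v_hat / (math.sqrt(s_hat) + eps)
    return w


# Toy objective f(w) = (w - 3)^2 with gradient 2 * (w - 3).
w_final = adam_minimize(lambda w: 2 * (w - 3), w0=0.0)
print(w_final)
```

Setting `beta1=0` in this sketch leaves a bias-corrected RMSprop-like update, which shows how the listed methods relate to each other.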

Step 3: Speed up training - input normalization, vectorization, learning rate decay (slowly reduce the learning rate over time; a lower-priority option), a different optimization algorithm, batch normalization
Gradient checking (run it only occasionally, e.g. once after several iterations, because it is slow; turn it off once the gradients are verified)
Avoid exploding or vanishing gradients (random weight initialization, intermediate normalization, skip connections (ResNets) for deep networks, auxiliary classifiers)
Keep iterating to improve dev set performance and run gradient descent longer
Ensure convergence
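
The gradient check mentioned in Step 3 compares the analytic gradient with a two-sided finite-difference estimate. A minimal sketch, assuming the parameters are flattened into one list; the helper `grad_check`, the toy function, and the step size `eps` are illustrative assumptions.

```python
import math


def grad_check(f, grad_f, theta, eps=1e-7):
    """Return the relative difference between analytic and numeric gradients."""
    approx = []
    for i in range(len(theta)):
        plus = list(theta)
        minus = list(theta)
        plus[i] += eps
        minus[i] -= eps
        # Two-sided (centered) difference for parameter i.
        approx.append((f(plus) - f(minus)) / (2 * eps))
    analytic = grad_f(theta)
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(approx, analytic)))
    den = (math.sqrt(sum(a * a for a in approx))
           + math.sqrt(sum(b * b for b in analytic)))
    return num / den if den > 0 else 0.0


# Toy function f = t0^2 + 3*t0*t1 and its gradient.
f = lambda t: t[0] ** 2 + 3 * t[0] * t[1]
good_grad = lambda t: [2 * t[0] + 3 * t[1], 3 * t[0]]
bad_grad = lambda t: [2 * t[0], 3 * t[0]]   # deliberately missing a term

diff_good = grad_check(f, good_grad, [1.0, 2.0])
diff_bad = grad_check(f, bad_grad, [1.0, 2.0])
print(diff_good, diff_bad)
```

A very small relative difference suggests the backward pass is correct, while a large one points to a bug. Each check costs two forward passes per parameter, which is why it is run rarely and then switched off.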

Step 4: High bias (poor fit on the training data)
- train longer
- bigger network
- hyperparameter search (NN architecture search is not recommended because its effect on performance is hard to predict)
High variance (good fit on the training data that does not carry over to the dev set)
- more training data
- regularization (L1, L2 (too large a lambda can over-smooth and add bias), dropout (a new random mask every iteration, training only), maxout, data augmentation, early stopping)
- hyperparameter search (NN architecture search is not recommended)
Poor fit on the test set (overfitting to the dev set) - bigger dev set
Poor performance in the real world - change either the dev set or the cost function
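
One way to decide which of the Step 4 cases applies is to compare the error rates across the data splits from Step 0 and look at the largest gap. The sketch below assumes a human-level error is available as a baseline, which these notes do not state explicitly; the `diagnose` helper and the numbers are hypothetical.

```python
def diagnose(human_err, train_err, train_dev_err, dev_err, test_err):
    """Name the largest gap between consecutive error rates."""
    gaps = {
        "avoidable bias": train_err - human_err,
        "variance": train_dev_err - train_err,
        "data mismatch": dev_err - train_dev_err,
        "overfitting to dev set": test_err - dev_err,
    }
    return max(gaps, key=gaps.get), gaps


# Hypothetical error rates, for illustration only.
label, gaps = diagnose(human_err=0.01, train_err=0.02,
                       train_dev_err=0.09, dev_err=0.10, test_err=0.105)
print(label)
```

Here the jump from training error to train-dev error dominates, so the variance remedies (more data, regularization) would be tried first.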

Step 5: Manual error analysis to prioritize work
- decide which direction is the most promising
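
Manual error analysis usually means tagging a sample of misclassified dev examples and counting the tags: the share of each tag is an upper bound on how much fixing that category could help. A small sketch follows; the tags, the sample, and the `prioritize` helper are made up for illustration.

```python
from collections import Counter


def prioritize(tags_per_example):
    """Return (tag, fraction of errors carrying that tag), most common first."""
    total = len(tags_per_example)
    counts = Counter(tag for tags in tags_per_example for tag in set(tags))
    return [(tag, n / total) for tag, n in counts.most_common()]


# Hypothetical tags assigned by hand to 10 misclassified dev examples.
errors = [
    ["blurry"], ["dog"], ["blurry", "filter"], ["blurry"], ["great cat"],
    ["blurry"], ["dog"], ["blurry"], ["great cat", "blurry"], ["dog"],
]

ranking = prioritize(errors)
for tag, share in ranking:
    print(f"{tag}: {share:.0%}")
```

An example can carry several tags, so the shares may sum to more than 100%.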

Reposted from blog.csdn.net/weixin_42388228/article/details/86129481