Machine Learning Yearning (Andrew Ng)

https://gallery.mailchimp.com/dc3a7ef4d750c0abfc19202a3/files/5dd91615-3b3f-4f5d-bbfb-4ebd8608d330/Ng_MLY01_13.pdf (the book)

Deep learning optimization strategies / lessons learned

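Strategy 1: Choose dev and test sets to reflect data you expect to get in the future and want to do well on. (For example, build both the dev set and the test set from the kind of images you actually expect to run on; for faces, use faces photographed in real-life conditions.)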

A problem you may run into with Strategy 1: There is a chance that your team will build something that works well on the dev set, only to find that it does poorly on the test set. Strategy 2 addresses this.

Strategy 2.1: Suppose your team develops a system that works well on the dev set but not the test set. If your dev and test sets had come from the same distribution, then you would have a very clear diagnosis of what went wrong: You have overfit the dev set. The obvious cure is to get more dev set data. (I take "the same distribution" to mean that the dev and test data come from similar scenarios.)

Strategy 2.2: It is an important research problem to develop learning algorithms that are trained on one distribution and generalize well to another. But if your goal is to make progress on a specific machine learning application rather than make research progress, I recommend trying to choose dev and test sets that are drawn from the same distribution. This will make your team more efficient. (The author's point: either treat cross-distribution generalization as a research problem in its own right, or simply rebuild the dev and test sets so that both are drawn from the same distribution.)
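One simple way to get dev and test sets from the same distribution is to collect a single pool of target-domain examples, shuffle it, and cut it in two. The sketch below illustrates that idea only; the function name `split_dev_test`, the 50/50 split, and the file names are assumptions made for the example, not something the book prescribes.

```python
import random


def split_dev_test(examples, dev_fraction=0.5, seed=0):
    """Shuffle one pool of examples and cut it into a dev set and a test set.

    Because both sets come from the same shuffled pool, they follow the
    same distribution (up to sampling noise).
    """
    pool = list(examples)
    random.Random(seed).shuffle(pool)  # seeded so the split is reproducible
    cut = int(len(pool) * dev_fraction)
    return pool[:cut], pool[cut:]


# Hypothetical pool of images taken in the conditions we care about.
target_examples = [f"phone_photo_{i}.jpg" for i in range(1000)]
dev_set, test_set = split_dev_test(target_examples)
print(len(dev_set), len(test_set))
```

The key design choice is that the split happens after pooling: splitting by source (for example, one country for dev and another for test) would put the two sets on different distributions again.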

The above is what I found useful in the first 6 chapters.

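Strategy 3: The dev set should be large enough to detect differences between algorithms that you are trying out. (For example, a dev set of 100 images cannot separate two classifiers whose accuracy differs by only 0.1%. Dev sets of roughly 1,000 to 10,000 examples are typically needed to detect differences between algorithms; if even a 0.01% difference matters, use a dev set with more than 10,000 examples.)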
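A rough way to see why dev set size limits what you can detect: accuracy on n examples can only move in steps of 1/n, and its sampling noise shrinks like 1/sqrt(n). The sketch below computes both quantities. The helper names are made up for illustration, and the standard error uses a binomial approximation that assumes the dev examples are independent.

```python
import math


def smallest_accuracy_step(dev_size):
    """Smallest accuracy change that is even representable: one example."""
    return 1.0 / dev_size


def accuracy_standard_error(accuracy, dev_size):
    """Binomial standard error of a measured accuracy (assumes independent examples)."""
    return math.sqrt(accuracy * (1.0 - accuracy) / dev_size)


for n in (100, 1_000, 10_000, 100_000):
    step = smallest_accuracy_step(n)
    noise = accuracy_standard_error(0.90, n)
    print(f"dev size {n:>7}: step = {step:.4%}, standard error at 90% accuracy = {noise:.4%}")
```

With 100 images a single example is worth a full percentage point, so a 0.1% gap cannot show up at all; with 10,000 images one example is worth 0.01%.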

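Strategy 4: One popular heuristic had been to use 30% of your data for your test set. (30% is the usual figure; when the dataset is very large, e.g. on the order of a billion examples, the test set does not need to be that large a fraction.)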

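Strategy 5: This gives four metrics. By taking an average or weighted average of these four numbers, you end up with a single number metric. (When there are several evaluation metrics, take their average or weighted average to get one number. Precision and recall, for example, can be combined into a single number called the F1 score, which is their harmonic mean.)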
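Both ways of collapsing several numbers into one metric are short enough to write out. The sketch below shows a weighted average and the F1 score, which is the harmonic mean of precision and recall. The function names and the sample numbers are illustrative assumptions.

```python
def weighted_average(values, weights):
    """Combine several metrics into one number using the given weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall; defined as 0.0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical accuracies on four user segments, weighted equally.
segment_accuracy = [0.90, 0.80, 0.70, 0.60]
print(weighted_average(segment_accuracy, [1, 1, 1, 1]))

# Hypothetical classifier with 95% precision and 90% recall.
print(round(f1_score(0.95, 0.90), 4))
```

Unlike a plain average, the harmonic mean is pulled toward the smaller of the two values, so a classifier cannot score well on F1 by maximizing precision while ignoring recall, or the reverse.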

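The above is what I found useful in the first 12 chapters (mainly about choosing the dev set and test set, and about how to measure an algorithm's performance).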

Strategy 6: Build and train a basic system quickly. (In other words, get a simple system up and running first, and worry about the details afterwards.)

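Strategy 7: 1) Gather a sample of 100 dev set examples that your system misclassified, i.e., examples that your system made an error on. 2) Look at these examples manually, and count what fraction of them fall into the error category you are considering (dog images, in the book's example). (Inspecting the misclassified images by hand tells you whether a given category of error is worth more of your time: if it accounts for 50% of the errors, working on it is worthwhile; if it accounts for only 5%, it does not deserve much effort.)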
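The counting step of this error analysis, and the best-case payoff it implies, can be sketched as follows. The category labels, the 10% starting error rate, and the helper names are hypothetical; the labels stand in for what you would write down while looking at the 100 misclassified examples by hand.

```python
from collections import Counter


def error_category_fractions(labels):
    """Fraction of the inspected misclassified examples in each category."""
    counts = Counter(labels)
    total = len(labels)
    return {category: n / total for category, n in counts.items()}


def best_case_error(current_error, category_fraction):
    """Error rate left if every mistake in one category were fixed (an upper bound on the gain)."""
    return current_error * (1.0 - category_fraction)


# Hypothetical manual labels for 100 misclassified dev examples.
inspected = ["dog"] * 5 + ["blurry"] * 50 + ["other"] * 45
fractions = error_category_fractions(inspected)

current_error = 0.10  # assumed overall dev error rate
for category, fraction in fractions.items():
    remaining = best_case_error(current_error, fraction)
    print(f"{category}: {fraction:.0%} of errors, best-case error after fixing = {remaining:.2%}")
```

Under these assumed numbers, fixing the 5% category can at best move the error from 10% to 9.5%, while fixing the 50% category could halve it, which is the reason to count before investing time.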
