Ablation Studies

Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/Julialove102123/article/details/88996478

See the explanation of ablation studies on Quora.

Top-voted answer:

An ablation study typically refers to removing some “feature” of the model or algorithm, and seeing how that affects performance.

Examples:

  • An LSTM has three gates (input, output, forget) plus a candidate cell update. We might ask: are all of these necessary? What if I remove one? Indeed, lots of experimentation has gone into LSTM variants, the GRU being a notable example (which is simpler).
  • If certain tricks are used to get an algorithm to work, it’s useful to know whether the algorithm is robust to removing these tricks. For example, DeepMind’s original DQN paper reports using (1) only periodically updating the reference network and (2) using a replay buffer rather than updating online. It’s very useful for the research community to know that both these tricks are necessary, in order to build on top of these results.
  • If an algorithm is a modification of a previous work, and has multiple differences, researchers want to know what the key difference is.
  • Simpler is better (inductive prior towards simpler model classes). If you can get the same performance with two models, prefer the simpler one.

A simple explanation from Zhihu:

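Model simplification testing.
Check whether performance changes after some modules are removed.
According to Occam's razor, if a simple method and a complex method achieve the same result, the simple one is more reliable.
In practice, an ablation study is an experiment designed to test whether the structures proposed in a model are actually effective.
For example, if you propose some structure and want to determine whether it helps the final result, you compare the results of the network without that structure against the results of the network with it. That comparison is an ablation study.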
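
The with/without comparison described above can be sketched on a toy problem. This is only an illustration under stated assumptions: the "model" is a least-squares line, the "proposed structure" being ablated is its intercept term, and the data generator, function names, and seed choices are all hypothetical, not taken from any paper.

```python
import random
import statistics


def make_data(n, seed):
    """Synthetic data (an assumption for illustration): y = 2x + 5 + noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2.0 * x + 5.0 + rng.gauss(0, 1.0) for x in xs]
    return xs, ys


def fit(xs, ys, use_intercept):
    """Least-squares line; use_intercept=False is the ablated variant."""
    if use_intercept:
        mx, my = statistics.fmean(xs), statistics.fmean(ys)
        slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
                 / sum((x - mx) ** 2 for x in xs))
        return slope, my - slope * mx
    # Ablated model: the line is forced through the origin.
    slope = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return slope, 0.0


def mse(params, xs, ys):
    """Mean squared error of the fitted line on held-out data."""
    slope, intercept = params
    return statistics.fmean((slope * x + intercept - y) ** 2
                            for x, y in zip(xs, ys))


def ablate(seeds=range(5)):
    """Average test error of the full and ablated models over several seeds."""
    errors = {"full": [], "ablated": []}
    for seed in seeds:
        train = make_data(200, seed)
        test = make_data(200, seed + 1000)  # separate held-out sample
        errors["full"].append(mse(fit(*train, use_intercept=True), *test))
        errors["ablated"].append(mse(fit(*train, use_intercept=False), *test))
    return {name: statistics.fmean(vals) for name, vals in errors.items()}


if __name__ == "__main__":
    for name, err in ablate().items():
        print(f"{name:8s} mean test MSE = {err:.3f}")
```

Everything except the ablated component is held fixed (same data, same seeds, same metric), so any gap in test error can be attributed to the removed piece. Averaging over several seeds guards against reading noise as an effect.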

Here is how people on Zhihu explain it:
Link: https://www.zhihu.com/question/263837982/answer/273653126

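First explanation: Suppose you build an object detection pipeline that uses A, B, and C, and it works well, but you do not know how much A, B, and C each contribute. B might be slow yet very accurate, or A and B might reinforce each other.
An ablation experiment tells you, or the reader, how much each key part of the pipeline actually contributes. For example, Ross replaced RPN with SS (Selective Search) in a comparison experiment, and also compared against a variant that does not share the backbone network, to give readers more direct data demonstrating the method's effectiveness.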
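
A minimal sketch of how such a pipeline with components A, B, and C might be ablated. The `evaluate` function here is a hypothetical stand-in with made-up numbers (not experimental results); in a real study it would train and score the pipeline with only the enabled components. The sketch shows two common layouts: leave-one-out, and the full table over every subset, which is what exposes interactions such as A and B reinforcing each other.

```python
from itertools import combinations

COMPONENTS = ("A", "B", "C")


def evaluate(enabled):
    """Hypothetical stand-in for training and scoring the pipeline.

    The numbers are invented purely so the example runs; they include an
    extra bonus when A and B are both enabled to mimic an interaction.
    """
    gains = {"A": 0.05, "B": 0.10, "C": 0.02}
    score = 0.50 + sum(gains[c] for c in enabled)
    if "A" in enabled and "B" in enabled:
        score += 0.04  # assumed interaction between A and B
    return score


def leave_one_out(components=COMPONENTS):
    """Score drop when each component is removed from the full pipeline."""
    everything = frozenset(components)
    full = evaluate(everything)
    return {c: full - evaluate(everything - {c}) for c in components}


def all_subsets(components=COMPONENTS):
    """Score for every combination of components, from none to all."""
    table = {}
    for k in range(len(components) + 1):
        for subset in combinations(components, k):
            table[subset] = evaluate(frozenset(subset))
    return table


if __name__ == "__main__":
    for name, drop in leave_one_out().items():
        print(f"without {name}: score drops by {drop:.2f}")
    for subset, score in all_subsets().items():
        print(f"{'+'.join(subset) or 'baseline':8s} {score:.2f}")
```

Leave-one-out needs only one extra run per component, while the full subset table grows as 2^n. The subset table is the one that reveals whether A's contribution depends on B being present.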

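Second explanation: A friend says you look great today, and you want to know how much your hairstyle, top, and trousers each contribute. You try several different hairstyles, and your friend says you still look great. Then you change your top, and your friend says you no longer look great. It seems the top matters quite a lot.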

My explanation: it is a controlled experiment that examines the effect of each factor on the result.
 
