Practice Anomaly Detection
Question 4: Which of the following statements are true?
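- When choosing features for an anomaly detection system, it is a good idea to look for features that take on unusually large or small values for (mainly the) anomalous examples.
(When choosing features for an anomaly detection system, looking for features that take on especially large or especially small values on the anomalous examples is a workable approach.)
- If you are developing an anomaly detection system, there is no way to make use of labeled data to improve your system.
(My take: once labels are available, you can fit the model p(x) on the training set and then pick the ϵ that maximizes F1 on the cross-validation set, which improves the system.)
- If you have a large labeled training set with many positive examples and many negative examples, the anomaly detection algorithm will likely perform just as well as a supervised learning algorithm such as an SVM.
(If the training set has comparable numbers of positive and negative examples, just use supervised learning; anomaly detection is suited to the case where there are very few positive examples.)
- If you do not have any labeled data (or if all your data has label y=0), then it is still possible to learn p(x), but it may be harder to evaluate the system or choose a good value of ϵ.
(My take: without labels it is of course harder to choose ϵ, because you do not know which examples are anomalies.)
I chose statements 1 and 4.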
The answer was correct!
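The idea in the note on the second statement (fit p(x) on the training set, then pick the ϵ that maximizes F1 on the cross-validation set) can be sketched roughly as follows. This is only a minimal sketch under assumptions: p(x) is taken to be a product of independent per-feature Gaussians, the function names (`fit_gaussian`, `density`, `f1_score`, `select_epsilon`) are made up for illustration, and the toy data is invented.

```python
import math

def fit_gaussian(X):
    """Estimate per-feature mean and variance from training examples.
    Assumes every feature has non-zero variance."""
    m, n = len(X), len(X[0])
    mu = [sum(row[j] for row in X) / m for j in range(n)]
    var = [sum((row[j] - mu[j]) ** 2 for row in X) / m for j in range(n)]
    return mu, var

def density(x, mu, var):
    """p(x) as a product of independent univariate Gaussian densities."""
    p = 1.0
    for xj, mj, vj in zip(x, mu, var):
        p *= math.exp(-((xj - mj) ** 2) / (2 * vj)) / math.sqrt(2 * math.pi * vj)
    return p

def f1_score(y_true, y_pred):
    """F1 with y = 1 meaning 'anomaly'; returns 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def select_epsilon(p_cv, y_cv, steps=1000):
    """Scan candidate thresholds between min and max p(x) on the CV set
    and keep the first one that gives the highest F1."""
    lo, hi = min(p_cv), max(p_cv)
    best_eps, best_f1 = 0.0, 0.0
    for k in range(1, steps):
        eps = lo + (hi - lo) * k / steps
        y_pred = [1 if p < eps else 0 for p in p_cv]  # flag low-density points
        f1 = f1_score(y_cv, y_pred)
        if f1 > best_f1:
            best_eps, best_f1 = eps, f1
    return best_eps, best_f1

# Invented toy data: one feature, training set assumed to be normal examples.
X_train = [[1.0], [1.2], [0.8], [1.1], [0.9]]
X_cv = [[1.0], [0.95], [3.0], [1.05], [-1.0]]
y_cv = [0, 0, 1, 0, 1]  # labels are used only to choose epsilon

mu, var = fit_gaussian(X_train)
p_cv = [density(x, mu, var) for x in X_cv]
best_eps, best_f1 = select_epsilon(p_cv, y_cv)
flags = [1 if p < best_eps else 0 for p in p_cv]
```

The labels never enter the fit of p(x); they are used only to score candidate thresholds, which is why a small number of labeled anomalies can still help.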
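On how to read sigma off the plot
From the plot, the distance from μ to the lowest point of the curve on the right is roughly 3*sigma. This is probably only an approximation.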
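As a rough check on this rule of thumb, the sketch below computes how high a Gaussian curve still is at 3*sigma from the mean, relative to its peak, and backs out sigma from the point where the curve visually flattens. The helper names and the numbers are hypothetical; no values are taken from the actual plot.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Univariate Gaussian density."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def sigma_from_plot(mu, x_flat):
    """Rough estimate of sigma, where x_flat is the point to the right of mu
    at which the curve appears to reach the baseline."""
    return (x_flat - mu) / 3.0

# Height of the curve at mu + 3*sigma relative to the peak at mu.
mu, sigma = 0.0, 2.0
ratio = gaussian_pdf(mu + 3 * sigma, mu, sigma) / gaussian_pdf(mu, mu, sigma)
# ratio equals exp(-4.5), a little over 1% of the peak height

# Hypothetical reading: peak at -3, curve looks flat by about x = 3.
sigma_est = sigma_from_plot(-3.0, 3.0)
```

Since the curve is down to about 1% of its peak at 3*sigma, it looks flat there on most plots, which is why the estimate is only approximate.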