Deep learning: a summary of evaluation metrics for each scenario

The following table summarizes the machine learning evaluation metrics used in common scenarios:

(PS: the original document, which contains screenshots, has been uploaded to my personal resources)

Columns: Category, Scenario, Scenario description, Use cases, Metrics and metric descriptions
Category: Image
Scenario: Image classification
Description: Identifies whether an image shows a given class of object, state, or scene. Suitable when the image content is simple and the whole image needs to be classified.
Use cases:
1. Image content retrieval: train a custom model to recognize various objects, and combine the output with business information to show richer recognition results
2. Image review: customize image review rules, for example training a model to recognize violations such as smoking in live streams
3. Sorting or quality inspection in manufacturing: customize the recognition of the products on a production line, so that sorting or quality inspection can be automated
4. Medical diagnosis: customize the recognition of medical images to visually assist doctors in diagnosis
Metrics:
Accuracy (the number of correctly classified samples divided by the total number of samples)
Precision (for a given class, the number of samples correctly predicted as that class divided by the total number of samples predicted as that class; the value reported here is the average precision over all classes)
Recall (for a given class, the number of samples correctly predicted as that class divided by the total number of samples that actually belong to that class; the value reported here is the average recall over all classes)
F1-score (for a given class, the harmonic mean of precision and recall; the value reported here is the average F1-score over all classes)
Per-class F1-score (if the F1-scores of different classes differ greatly, the overall result is probably being dragged down by the classes on which the model performs poorly. Check how many training samples each label has; the sample counts of the different classes should be as balanced as possible.)
top1-top5 accuracy (for each image in the evaluation set, the model returns its top1 to top5 recognition results in order of confidence, top1 having the highest confidence and top5 the lowest. top1 accuracy counts a sample as correct only if the top1 result is correct. top2 accuracy counts a sample as correct if either the top1 or the top2 result is correct. And so on.)
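The definitions above can be made concrete with a small sketch in plain Python. The function name `classification_report`, the sample labels, and the choice to score a class as 0.0 when its denominator is empty are all assumptions made for illustration; they are not taken from the original table.

```python
from collections import Counter


def classification_report(y_true, y_pred):
    """Accuracy plus per-class and class-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    correct = Counter()    # samples correctly predicted as each class
    predicted = Counter()  # samples predicted as each class
    actual = Counter()     # samples that really belong to each class
    for truth, guess in zip(y_true, y_pred):
        actual[truth] += 1
        predicted[guess] += 1
        if truth == guess:
            correct[truth] += 1

    per_class = {}
    for label in labels:
        # Assumption: a class with an empty denominator scores 0.0.
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / actual[label] if actual[label] else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        per_class[label] = {"precision": p, "recall": r, "f1": f1}

    def average(key):
        # Unweighted mean over classes, as in "the average over all classes".
        return sum(m[key] for m in per_class.values()) / len(per_class)

    return {
        "accuracy": sum(correct.values()) / len(y_true),
        "precision": average("precision"),
        "recall": average("recall"),
        "f1": average("f1"),
        "per_class": per_class,
    }


# Hypothetical evaluation set with three classes.
y_true = ["cat", "cat", "dog", "dog", "bird", "bird"]
y_pred = ["cat", "dog", "dog", "dog", "bird", "cat"]
report = classification_report(y_true, y_pred)
print(round(report["accuracy"], 3), report["per_class"]["dog"])
```

In this toy data four of the six samples are classified correctly, so accuracy is 4/6. The class "dog" is predicted three times and is right twice (precision 2/3), and both real dogs are found (recall 1), which gives an F1-score of 0.8.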
Scenario: Object detection
Description: Detects the name and location of every target object in an image. Suitable for scenes where one image contains several objects that need to be identified or counted.
Use cases:
1. Video surveillance: detect whether prohibited objects or behaviors appear
2. Industrial quality inspection: for example, detecting the number and position of minor defects in an image
3. Medical diagnosis: cell counting in medical images, herb recognition
Metrics:
mAP (Mean Average Precision is a metric for measuring how well an object detection algorithm performs. In an object detection task, precision and recall can be computed for each class of object. Computing them repeatedly at different thresholds gives a P-R curve for each class, and the area under that curve is the average precision of the class; mAP is the mean of the average precision over all classes.)
Precision (the result at the threshold with the highest F1-score, 0.2 here. Precision is the number of correctly predicted objects at that threshold divided by the total number of predicted objects.)
Recall (the result at the threshold with the highest F1-score, 0.2 here. Recall is the number of correctly predicted objects at that threshold divided by the number of real objects.)
F1-score at different thresholds
Average precision per label (looking at the average precision of each label is an effective way to compare precision across labels. If the precision differs greatly between labels, the overall result is probably being dragged down by the labels with low precision. Check how many targets each label has in the training data; the target counts of the different labels should be as balanced as possible.)
mAP per label
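As a rough illustration of the mAP description above, the sketch below computes the area under each class's P-R curve and then averages over classes. It assumes that every detection has already been matched against the ground truth (the `is_tp` flags), that each class has at least one real object, and that the area is summed step by step without interpolation, which is only one of several conventions in use. The function names and the sample data are hypothetical.

```python
def average_precision(scores, is_tp, num_gt):
    """Area under the precision-recall curve of one class.

    scores: confidence of each detection
    is_tp:  whether each detection matches a real object (assumed given)
    num_gt: number of real objects of this class (assumed > 0)
    """
    # Visit detections from most to least confident, which is the same
    # as lowering the confidence threshold one detection at a time.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    area = 0.0
    previous_recall = 0.0
    for i in order:
        if is_tp[i]:
            tp += 1
        else:
            fp += 1
        precision = tp / (tp + fp)
        recall = tp / num_gt
        # Rectangle of height `precision` over the recall gained at this step.
        area += precision * (recall - previous_recall)
        previous_recall = recall
    return area


def mean_average_precision(per_class):
    """per_class maps a label to (scores, is_tp, num_gt)."""
    aps = {label: average_precision(*data) for label, data in per_class.items()}
    return sum(aps.values()) / len(aps), aps


# Hypothetical detections for two defect labels.
detections = {
    "scratch": ([0.9, 0.8, 0.6, 0.3], [True, False, True, True], 4),
    "dent": ([0.95, 0.4], [True, True], 2),
}
mAP, aps = mean_average_precision(detections)
print(round(mAP, 3), {label: round(ap, 3) for label, ap in aps.items()})
```

The per-label values in `aps` correspond to the "average precision per label" row, so a large gap between labels can be spotted directly from that dictionary.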
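Scenario: Image segmentation
Description: When an image contains several targets, identifies the name and pixel-level location of each target and counts the targets by name. Suitable for scenes where an image contains several targets that need polygon annotation or whose contours need to be recognized.
Use cases:
1. Professional inspection: image analysis in professional settings, such as identifying buildings, roads, and forests in satellite images, or locating lesions and measuring their area in medical images
2. Intelligent transportation: recognizing road information, including lane markings, traffic signs, and so on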
Metrics: mAP, precision, recall, F1-score at different thresholds, average precision per label, mAP per label
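The "F1-score at different thresholds" metric, and the practice of reporting precision and recall at the threshold with the highest F1-score, can be sketched as a simple threshold sweep. As an assumption, each prediction is taken to be already marked as correct or incorrect against the ground truth; the function names, scores, and threshold grid below are hypothetical.

```python
def metrics_at_threshold(scores, is_tp, num_gt, threshold):
    """Precision, recall, and F1 when only predictions with a
    confidence of at least `threshold` are kept."""
    kept = [hit for score, hit in zip(scores, is_tp) if score >= threshold]
    tp = sum(kept)
    precision = tp / len(kept) if kept else 0.0
    recall = tp / num_gt if num_gt else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1


def sweep_thresholds(scores, is_tp, num_gt, thresholds):
    """Return the row with the highest F1 and the full table of rows.

    Each row is (threshold, precision, recall, f1). On a tie the
    lowest threshold wins, which is an arbitrary choice.
    """
    rows = [(t, *metrics_at_threshold(scores, is_tp, num_gt, t))
            for t in thresholds]
    return max(rows, key=lambda row: row[3]), rows


# Hypothetical predictions: confidence and whether each one is correct.
scores = [0.95, 0.9, 0.7, 0.5, 0.25, 0.15, 0.1]
is_tp = [True, True, True, False, True, False, False]
best, rows = sweep_thresholds(scores, is_tp, num_gt=5,
                              thresholds=[0.1, 0.2, 0.3, 0.5, 0.7, 0.9])
for threshold, precision, recall, f1 in rows:
    print(threshold, round(precision, 3), round(recall, 3), round(f1, 3))
print("best threshold:", best[0])
```

With this toy data the sweep selects the threshold 0.2, where precision and recall are both 4/5.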
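Category: Text
Scenario: Text classification
Description: A machine learning method based on a self-built classification system that classifies text automatically.
Use cases:
1. Complaint classification: train a model to classify customer service complaints automatically, so that the content of each user complaint is managed by category and a great deal of customer service labor is saved
2. Media article classification: train a model to classify online media articles automatically, so that articles of every kind are sorted without manual work
3. Text review: train a custom text review model, for example to detect whether a text contains prohibited or extremist descriptions
4. Other: let your imagination run free and train whatever text classification model you want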
Metrics: accuracy, precision, recall, F1-score, per-class F1-score, top1-top5 accuracy
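The top1-top5 accuracy in this list follows the rule described for image classification: a sample counts as correct at top-k if the true label appears among the k most confident predictions. A minimal sketch, assuming the model output is already available as a list of labels sorted by descending confidence (the function name and the sample labels are hypothetical):

```python
def top_k_accuracy(ranked_predictions, y_true, k):
    """Share of samples whose true label is among the first k predictions.

    ranked_predictions: for each sample, labels sorted from the highest
    to the lowest confidence (assumed to be provided by the model).
    """
    hits = sum(1 for ranked, truth in zip(ranked_predictions, y_true)
               if truth in ranked[:k])
    return hits / len(y_true)


# Hypothetical ranked outputs of a complaint classifier.
ranked = [
    ["complaint", "refund", "praise"],
    ["refund", "complaint", "praise"],
    ["praise", "refund", "complaint"],
    ["refund", "praise", "complaint"],
]
y_true = ["complaint", "complaint", "complaint", "refund"]

for k in (1, 2, 3):
    print(f"top{k} accuracy:", top_k_accuracy(ranked, y_true, k))
```

Because each larger k only adds candidates, the value can never decrease as k grows; here it goes from 0.5 at top1 to 0.75 at top2 and 1.0 at top3.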
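Scenario: Short text matching
Description: Determines how similar two short texts are.
Use cases: In customer service Q&A, a trained short text matching model can quickly find the questions in the knowledge base that are similar to the user's question and recommend the corresponding answers, which quickly improves the efficiency of customer service Q&A.
Metrics: accuracy, precision, recall, F1-score, per-class F1-score / per-class precision / per-class recall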
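Scenario: Sequence labeling
Description: With custom labels, classifies and labels every sequence unit in a string of text. It supports applications such as word segmentation, part-of-speech tagging, named entity recognition, keyword extraction, and semantic role labeling.
Use cases: Extracting key information in financial, medical, and similar scenarios, or recognizing the key slots in a dialogue.
Metrics: precision, recall, F1-score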
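The table does not say how precision, recall, and F1-score are counted for sequence labeling. One common convention, used here purely as an assumption, is to score whole labeled spans: a predicted span counts as correct only if its label and both boundaries match the reference exactly. The sketch also assumes BIO-style tags (`B-X` opens a span, `I-X` continues it, `O` is outside), and the helper names and tag names are hypothetical.

```python
def extract_spans(tags):
    """Turn a BIO tag sequence into a set of (label, start, end) spans."""
    spans = set()
    start = label = None
    # A trailing "O" acts as a sentinel that closes the last open span.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and tag[2:] == label
        if label is not None and not continues:
            spans.add((label, start, i))
            start = label = None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
        # An "I-" tag that does not continue an open span is ignored.
    return spans


def span_prf(true_tags, pred_tags):
    """Span-level precision, recall, and F1 with exact matching."""
    gold = extract_spans(true_tags)
    pred = extract_spans(pred_tags)
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1


# Hypothetical tags: the last span is predicted one token too short.
true_tags = ["B-PER", "I-PER", "O", "B-ORG", "O", "B-LOC", "I-LOC"]
pred_tags = ["B-PER", "I-PER", "O", "B-ORG", "O", "B-LOC", "O"]
precision, recall, f1 = span_prf(true_tags, pred_tags)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```

Under exact matching, the truncated `LOC` span counts both as a wrong prediction and as a missed reference span, so two of three spans are correct on each side.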
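Category: Video
Scenario: Video classification
Description: Analyzes the content of a video and recognizes what action a person in the video is performing, or how an object or the environment has changed.
Use cases:
1. Human action monitoring: customize the monitoring of specific human actions, such as particular gestures or the behavior of staff on construction sites and in kitchens
2. Environmental change monitoring: customize the monitoring of environmental changes, such as landslides and mudslides
3. Video content analysis: analyze video content quickly, for use in short-video apps and live streaming platforms
4. Object state change monitoring: customize the recognition of the direction of movement, change of shape, and so on of specific objects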
Metrics: accuracy, precision, recall, F1-score, per-class F1-score, top1-top5 accuracy
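The per-class F1-score is listed so that weak classes can be spotted and their training sample counts checked, as described for image classification. That check might be sketched as follows; the function name, the gap of 0.2 used to decide what "differs greatly" means, and the sample numbers are all assumptions chosen for illustration.

```python
from collections import Counter


def flag_weak_classes(per_class_f1, train_labels, max_gap=0.2):
    """List classes whose F1 trails the best class by more than max_gap,
    together with how many training samples each of them has.

    max_gap is an arbitrary illustrative cut-off, not a recommended value.
    """
    counts = Counter(train_labels)
    best = max(per_class_f1.values())
    return [(label, f1, counts[label])
            for label, f1 in sorted(per_class_f1.items())
            if best - f1 > max_gap]


# Hypothetical evaluation result and training set of a video classifier.
per_class_f1 = {"wave": 0.92, "run": 0.88, "fall": 0.55}
train_labels = ["wave"] * 500 + ["run"] * 450 + ["fall"] * 40
weak = flag_weak_classes(per_class_f1, train_labels)
print(weak)
```

In this made-up case the only flagged class is also the one with far fewer training samples, which is the situation in which balancing the data is the suggested remedy.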
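Category: Sound
Scenario: Sound classification
Description: Recognizes, with a custom model, what kind of sound the current audio is, or what state or scene it comes from.
Use cases:
1. Security monitoring: customize the recognition of different abnormal or normal sounds for early warning of emergencies. For example, monitoring an industrial production setting for abnormal noise, to help manual testers judge whether a fault has occurred.
2. Scientific research: customize the recognition of the sounds of different individuals of one species, or of different species, to assist field research. For example, an animal research institute can use the EasyDL sound classification model to determine which species a sound recorded in the field belongs to.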
Metrics: accuracy, precision, recall, F1-score, per-class F1-score, top1-top5 accuracy

Origin blog.csdn.net/wiborgite/article/details/104927944