Original link: http://tecdat.cn/?p=11159
Precision and recall originate from information retrieval, but they are also used in machine learning. In some settings, however, these measures are problematic. In this article, I discuss the shortcomings of recall and precision and explain why sensitivity and specificity are usually more useful.
Definitions
For a binary classification problem with classes 1 and 0, the resulting confusion matrix has the following structure:
Prediction / Reference | 1 | 0 |
---|---|---|
1 | TP | FP |
0 | FN | TN |
Here, TP denotes the number of true positives (the model correctly predicts the positive class), FP the number of false positives (the model incorrectly predicts the positive class), FN the number of false negatives (the model incorrectly predicts the negative class), and TN the number of true negatives (the model correctly predicts the negative class). Sensitivity (recall), precision (positive predictive value, PPV), and specificity (true negative rate, TNR) are defined as follows:

sensitivity = TP / (TP + FN)
precision = TP / (TP + FP)
specificity = TN / (TN + FP)
Sensitivity measures the rate of correct predictions among observations from the positive class, while precision indicates the rate at which positive predictions are correct. Specificity, on the other hand, is based on the false positives: it measures the rate of correct predictions among observations from the negative class.
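As a minimal sketch, these three definitions can be computed directly from the four confusion-matrix cells (the function names are mine, not from the article):

```python
def sensitivity(tp, fn):
    """Recall / true positive rate: fraction of actual positives that are found."""
    return tp / (tp + fn)

def precision(tp, fp):
    """Positive predictive value: fraction of positive predictions that are correct."""
    return tp / (tp + fp)

def specificity(tn, fp):
    """True negative rate: fraction of actual negatives that are correctly rejected."""
    return tn / (tn + fp)

# Example confusion matrix with TP = 80, FP = 10, FN = 10, TN = 0:
print(round(sensitivity(80, 10), 3))  # 0.889
print(round(precision(80, 10), 3))    # 0.889
print(specificity(0, 10))             # 0.0
```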
Advantages of sensitivity and specificity
Evaluating a model based on sensitivity and specificity is appropriate for most data sets because these measures take all cells of the confusion matrix into account. Sensitivity deals with true positives and false negatives, while specificity deals with true negatives and false positives. Together, sensitivity and specificity therefore form a holistic evaluation that considers both the positive and the negative class.
Sensitivity and specificity can be summarized by a single quantity, the balanced accuracy, which is defined as the mean of both measures:

balanced accuracy = (sensitivity + specificity) / 2
The balanced accuracy lies in the range [0, 1], where the values 0 and 1 indicate the worst and the best possible classifier, respectively.
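A quick sketch of balanced accuracy (the helper name is mine), showing that a perfect classifier reaches 1 and an always-wrong classifier reaches 0:

```python
def balanced_accuracy(tp, fp, fn, tn):
    """Mean of sensitivity and specificity, in [0, 1]."""
    sens = tp / (tp + fn)  # true positive rate
    spec = tn / (tn + fp)  # true negative rate
    return (sens + spec) / 2

print(balanced_accuracy(tp=50, fp=0, fn=0, tn=50))  # 1.0  (all predictions correct)
print(balanced_accuracy(tp=0, fp=50, fn=50, tn=0))  # 0.0  (all predictions wrong)
```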
Disadvantages of recall and precision
Evaluating a model using recall and precision does not use all cells of the confusion matrix. Recall deals with true positives and false negatives, while precision deals with true positives and false positives. Performance evaluation with this pair of measures therefore never takes the true negatives into account. Thus, precision and recall should only be used in settings where the correct identification of the negative class does not play a role. Recall and precision are defined, as above, by recall = TP / (TP + FN) and precision = TP / (TP + FP).
Precision and recall are usually summarized as a single quantity, the F1 score, which is the harmonic mean of both measures:

F1 = 2 · (precision · recall) / (precision + recall)
The F1 score lies in the range [0, 1]; a classifier that maximizes both precision and recall achieves a value of 1. Since the F1 score is based on the harmonic mean, it is sensitive to discrepancies between precision and recall. Assume, for example, a classifier with a sensitivity of 90% and a precision of 30%. The arithmetic mean would be 60%, while the harmonic mean underlying the F1 score yields only 2 · (0.9 · 0.3) / (0.9 + 0.3) = 45%.
Examples
Here, I provide two examples. The first example investigates the problems that can arise when precision is used as a performance measure.
What problems can arise when using precision?
Precision is a particularly bad measure when there are few observations belonging to the negative class. Assume a clinical data set in which 90% of persons are sick (positive class) and only 10% are healthy (negative class). Let us further assume that we have developed two tests that classify patients as sick or healthy. Both tests have an accuracy of 80%, but they make different types of errors.
Confusion matrix of the first test
Prediction / Reference | sick | healthy |
---|---|---|
sick | TP = 80 | FP = 10 |
healthy | FN = 10 | TN = 0 |
Confusion matrix of the second test
Prediction / Reference | sick | healthy |
---|---|---|
sick | TP = 70 | FP = 0 |
healthy | FN = 20 | TN = 10 |
Comparison of the two tests
Let us compare the performance of the two tests:
Measure | Test 1 | Test 2 |
---|---|---|
Sensitivity (recall) | 88.9% | 77.8% |
Specificity | 0% | 100% |
Precision | 88.9% | 100% |
Considering sensitivity and specificity, we would not select the first test, because its balanced accuracy is merely (88.9% + 0%) / 2 = 44.4%, while that of the second test is (77.8% + 100%) / 2 = 88.9%.
Using precision and recall, however, the first test has an F1 score of 2 · (0.889 · 0.889) / (0.889 + 0.889) ≈ 88.9%, while the second test scores only 2 · (1.0 · 0.778) / (1.0 + 0.778) ≈ 87.5%. The F1 score would thus prefer the first test, even though it does not correctly identify a single healthy person.
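The numbers for both tests can be reproduced with a small helper built from the definitions above (a sketch; the helper name is mine, not from the article):

```python
def evaluate(tp, fp, fn, tn):
    """Return (balanced accuracy, F1 score) for one confusion matrix."""
    sens = tp / (tp + fn)              # sensitivity / recall
    spec = tn / (tn + fp)              # specificity
    prec = tp / (tp + fp)              # precision
    balanced = (sens + spec) / 2
    f1 = 2 * prec * sens / (prec + sens)
    return balanced, f1

# Test 1: misses every healthy person (TN = 0)
bal1, f1_1 = evaluate(tp=80, fp=10, fn=10, tn=0)
# Test 2: never misclassifies a healthy person (FP = 0)
bal2, f1_2 = evaluate(tp=70, fp=0, fn=20, tn=10)

print(round(bal1, 3), round(bal2, 3))  # 0.444 0.889 -> balanced accuracy prefers Test 2
print(round(f1_1, 3), round(f1_2, 3))  # 0.889 0.875 -> F1 prefers Test 1
```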
To illustrate when precision is a useful criterion, let us consider an example from information retrieval. Suppose we want to compare two document-retrieval algorithms, both of which have an accuracy of 80%.
Confusion matrix of the first algorithm
Prediction / Reference | relevant | irrelevant |
---|---|---|
relevant | TP = 25 | FP = 15 |
irrelevant | FN = 5 | TN = 55 |
Confusion matrix of the second algorithm
Prediction / Reference | relevant | irrelevant |
---|---|---|
relevant | TP = 20 | FP = 10 |
irrelevant | FN = 10 | TN = 60 |
Comparison of the two algorithms
Let us compute the performance measures of the two algorithms based on their confusion matrices:
Measure | Algorithm 1 | Algorithm 2 |
---|---|---|
Sensitivity (recall) | 83.3% | 66.7% |
Specificity | 78.6% | 85.7% |
Precision | 62.5% | 66.7% |
Balanced accuracy | 80.95% | 76.2% |
F1 score | 71.4% | 66.7% |
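The retrieval comparison can be reproduced the same way (again a sketch; the helper is mine, not from the article):

```python
def evaluate(tp, fp, fn, tn):
    """Return (balanced accuracy, F1 score) for one confusion matrix."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    prec = tp / (tp + fp)
    return (sens + spec) / 2, 2 * prec * sens / (prec + sens)

bal1, f1_1 = evaluate(tp=25, fp=15, fn=5, tn=55)   # Algorithm 1
bal2, f1_2 = evaluate(tp=20, fp=10, fn=10, tn=60)  # Algorithm 2

# Both measures rank Algorithm 1 first; balanced accuracy exceeds F1
# because it credits the many correctly discarded irrelevant documents.
print(round(bal1, 3), round(f1_1, 3))  # 0.81 0.714
print(round(bal2, 3), round(f1_2, 3))  # 0.762 0.667
```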
In this example, both balanced accuracy and the F1 score would lead us to prefer the first algorithm over the second. Note that the reported balanced accuracies are considerably larger than the F1 scores. This is because both algorithms have a high specificity, owing to the large number of irrelevant documents that are correctly discarded. Since the F1 score does not consider the rate of true negatives, precision and recall are more appropriate than sensitivity and specificity for this task.
Summary
In this article, we have seen that performance measures should be selected with care. While sensitivity and specificity generally perform well, precision and recall should only be used in settings where the true negative rate does not play a role.
If you have any questions, please leave a comment below.
Big Data Tribe: a professional Chinese third-party data service provider offering customized one-stop data mining and statistical analysis consulting services
Statistical analysis and data mining consulting services: y0.cn/teradat (for consulting services, please contact the official website customer service)
[Service scenarios]
Research; corporate outsourcing; online and offline training; web crawling and data collection; academic research; report writing; market research.
[Big Data Tribe] provides customized one-stop data mining and statistical analysis consulting
Welcome to enroll in our R language data analysis and mining course!