What is Precision in Data Science?
Definitions
Assuming we have a model that distinguishes between positive $P$ and negative $N$ in a classification problem, let’s define the number of correctly identified positives as true positives $TP$, correctly identified negatives as true negatives $TN$, positives incorrectly identified as negative as false negatives $FN$, and negatives incorrectly identified as positive as false positives $FP$.
Mathematical Definition
The following value is referred to as the model’s Precision. $$ \textrm{Precision} := {{ TP } \over { TP + FP }} $$
Explanation
Precision measures how trustworthy the model is when it identifies a case as positive. Although it is similar to, and often confused with, accuracy, it helps to keep the two English terms, accuracy and precision, clearly separated to avoid mixing them up.
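To make the definition concrete, the following Python sketch counts $TP$ and $FP$ from a pair of label lists and computes precision; the toy labels are made up purely for illustration (scikit-learn’s `precision_score` computes the same quantity).

```python
# Toy ground-truth and predicted labels (1 = positive, 0 = negative); values are made up for illustration.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 1, 0, 1, 0, 1]

# TP: predicted positive and actually positive; FP: predicted positive but actually negative.
tp = sum(1 for t, p in zip(y_true, y_pred) if p == 1 and t == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if p == 1 and t == 0)

precision = tp / (tp + fp)  # Precision = TP / (TP + FP)
print(f"TP = {tp}, FP = {fp}, Precision = {precision:.2f}")  # TP = 4, FP = 1, Precision = 0.80
```

Here four of the model’s five positive calls are correct, so its precision is $0.80$.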
Situations Where Precision is Important
- Start by reading this post: Situations Where Accuracy is Overestimated in Data Science
From the above considerations, it is clear that accuracy alone is not enough to evaluate a classification model. A model claiming “I have high precision” essentially asserts, “If I label something as positive, it really is positive.” Precision is therefore treated as the key performance metric in scenarios where Type I errors (false positives) are the more critical mistake:
- Spam emails: An often-cited example where precision is critical. If the model flags an email as spam, it must do so with considerable certainty. A little spam slipping into the inbox is not a huge problem, but failing to deliver crucial information to the user because a legitimate email was misclassified as spam can cause serious issues.
- Legal judgments: Although judges are not about to be replaced by machine models, a decision of guilty (positive) or innocent (negative) can reasonably be viewed as a binary classification problem. If judge A’s precision is 100%, it means that, whatever the number of guilty people who walked free, no innocent person has been wrongfully convicted. In legal terms, ‘getting most decisions right’ is valued less than ‘never sacrificing an innocent person’.
Situations Where Precision is Overestimated
Like accuracy, precision is not a panacea. Its denominator, $(TP + FP)$, is simply ‘the number of cases the model labeled positive,’ whether those labels are right or wrong. This means that, regardless of how many actual positives exist, the denominator can be kept small, so the figure can be inflated by labeling a case positive only when the model is extremely sure.
For instance, suppose that out of 100 positive samples, model A flags only 10 cases as positive, 9 of them correctly and 1 incorrectly. Its precision is $$ {{ 9 } \over { 9 + 1 }} = 90 \% $$ Even though $90 \%$ looks respectable, the remaining 91 positive samples are ignored entirely. On the other hand, let another model B flag 100 cases as positive, 90 of them correctly and 10 incorrectly, resulting in a precision of $$ {{ 90 } \over { 90 + 10 }} = 90 \% $$ In terms of precision, models A and B are judged to perform equally, which may seem unfair to B. Strictly speaking, though, their precision is not misrepresented: the reliability of their positive identifications really is the same, and there is nothing incorrect about that interpretation itself.
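The comparison can be checked numerically. Below is a minimal sketch using the counts assumed above (100 actual positives; A makes 10 positive calls with 1 mistake, B makes 100 positive calls with 10 mistakes); recall, $TP / (TP + FN)$, is printed alongside to show where the two models actually differ.

```python
def precision_and_recall(tp, fp, fn):
    """Return (precision, recall) computed from raw confusion counts."""
    return tp / (tp + fp), tp / (tp + fn)

# Model A: only 10 positive calls, 9 of them correct; 91 actual positives are missed.
prec_a, rec_a = precision_and_recall(tp=9, fp=1, fn=91)

# Model B: 100 positive calls, 90 of them correct; only 10 actual positives are missed.
prec_b, rec_b = precision_and_recall(tp=90, fp=10, fn=10)

print(f"A: precision = {prec_a:.2f}, recall = {rec_a:.2f}")  # A: precision = 0.90, recall = 0.09
print(f"B: precision = {prec_b:.2f}, recall = {rec_b:.2f}")  # B: precision = 0.90, recall = 0.90
```

Both models report a precision of $90 \%$, but the recall column makes A’s excessive caution plainly visible.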
Yet if one feels that “B is better than A”, that feeling is precisely an understanding of how precision can be overestimated. In simple terms, A is overly cautious: prudent, but frustrating. In a sense, the metric on the opposite side of this trade-off is recall, and considering the drawbacks of both at once gives rise to the $F_{1}$ score.
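For reference, the $F_{1}$ score mentioned above is the harmonic mean of precision and recall, so a model scores well only when neither is sacrificed: $$ F_{1} := 2 \cdot {{ \textrm{Precision} \cdot \textrm{Recall} } \over { \textrm{Precision} + \textrm{Recall} }} $$ With the counts used above, model A’s $F_{1}$ comes out to about $0.16$ while model B’s is $0.90$, matching the intuition that B is the better model.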