Precision#
Precision measures the proportion of positive inferences from a model that are correct, ranging from 0 to 1 (where 1 is best).
In other words, it is the fraction of positive inferences that are correct:

\[
\text{Precision} = \frac{\text{TP}}{\text{TP} + \text{FP}}
\]

In the above formula, \(\text{TP}\) is the number of true positive inferences and \(\text{FP}\) is the number of false positive inferences.
Guide: True Positive / False Positive
Read the TP / FP / FN / TN guide if you're not familiar with "TP" and "FP" terminology.

API Reference: precision
Implementation Details#
Precision is used across a wide range of workflows, including classification, object detection, instance segmentation, semantic segmentation, and information retrieval. It is especially useful when the objective is to measure and reduce false positive inferences.
For most workflows, precision is the ratio of the number of correct positive inferences to the total number of positive inferences:

\[
\text{Precision} = \frac{\text{TP}}{\text{TP} + \text{FP}}
\]
For workflows with a localization component, such as object detection and instance segmentation, see the Geometry Matching guide to learn how to compute true positive and false positive counts.
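As a minimal sketch, precision can be computed directly from TP and FP counts once they are known. The function below is illustrative: its name and its convention of returning 0.0 when there are no positive inferences are assumptions, not the behavior of the precision API referenced above.

```python
def compute_precision(tp: int, fp: int) -> float:
    """Compute precision = TP / (TP + FP).

    Returns 0.0 when there are no positive inferences (TP + FP == 0);
    this zero-division convention is an illustrative choice.
    """
    total_positive_inferences = tp + fp
    if total_positive_inferences == 0:
        return 0.0
    return tp / total_positive_inferences


# Example: 90 correct positive inferences and 10 incorrect ones
print(compute_precision(tp=90, fp=10))  # 0.9
```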
Examples#
Perfect inferences:
| Metric | Value |
| --- | --- |
| TP | 20 |
| FP | 0 |
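Applying the precision formula to these counts:

\[
\text{Precision} = \frac{\text{TP}}{\text{TP} + \text{FP}} = \frac{20}{20 + 0} = 1.0
\]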
Partially correct inferences, where some inferences are correct (TP) and others are incorrect (FP):
| Metric | Value |
| --- | --- |
| TP | 90 |
| FP | 10 |
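Applying the formula:

\[
\text{Precision} = \frac{90}{90 + 10} = 0.9
\]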
Zero correct inferences — all positive predictions are incorrect:
| Metric | Value |
| --- | --- |
| TP | 0 |
| FP | 20 |
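Applying the formula, with no correct positive inferences the score is the lowest possible:

\[
\text{Precision} = \frac{0}{0 + 20} = 0.0
\]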
Multiple Classes#
So far, we have only looked at binary classification and object detection cases. In multiclass or multi-label cases, precision is computed per class. In the TP / FP / FN / TN guide, we went over how these four metrics are computed when multiple classes are involved. Once you have them computed per class, you can compute precision for each class by treating it as a single-class problem.
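As a sketch of this per-class computation, assume per-class TP and FP counts are already available; the class names and counts below are purely illustrative:

```python
# Illustrative per-class TP / FP counts; in practice these come from the
# matching process described in the TP / FP / FN / TN guide.
counts = {
    "cat": {"tp": 90, "fp": 10},
    "dog": {"tp": 40, "fp": 40},
    "bird": {"tp": 0, "fp": 5},
}

# Treat each class as its own single-class problem: precision = TP / (TP + FP)
per_class_precision = {
    label: c["tp"] / (c["tp"] + c["fp"]) if c["tp"] + c["fp"] > 0 else 0.0
    for label, c in counts.items()
}
print(per_class_precision)  # {'cat': 0.9, 'dog': 0.5, 'bird': 0.0}
```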
Aggregating Per-class Metrics#
If you are looking for a single precision score that summarizes model performance across all classes, there are different ways to aggregate per-class precision scores: macro, micro, and weighted. Read more about these methods in the Averaging Methods guide.
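As a rough illustration of how these aggregation methods differ (the Averaging Methods guide is the reference for the exact definitions), the sketch below assumes per-class TP / FP counts, plus FN counts to estimate per-class support for the weighted average; all numbers are illustrative:

```python
# Illustrative per-class counts; FN is included only to compute per-class
# support (TP + FN, the number of ground-truth positives) for the weighted average.
counts = {
    "cat": {"tp": 90, "fp": 10, "fn": 10},
    "dog": {"tp": 40, "fp": 40, "fn": 20},
    "bird": {"tp": 0, "fp": 5, "fn": 10},
}

def precision(tp: int, fp: int) -> float:
    return tp / (tp + fp) if tp + fp > 0 else 0.0

per_class = {label: precision(c["tp"], c["fp"]) for label, c in counts.items()}

# Macro: unweighted mean of the per-class precision scores
macro = sum(per_class.values()) / len(per_class)

# Micro: pool TP and FP across all classes, then compute precision once
micro = precision(
    sum(c["tp"] for c in counts.values()),
    sum(c["fp"] for c in counts.values()),
)

# Weighted: per-class precision weighted by per-class support (TP + FN)
supports = {label: c["tp"] + c["fn"] for label, c in counts.items()}
weighted = sum(per_class[lbl] * supports[lbl] for lbl in counts) / sum(supports.values())

print(macro, micro, weighted)  # approximately 0.467, 0.703, 0.706
```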
Limitations and Biases#
As seen in its formula, precision only takes positive inferences (TP and FP) into account; negative inferences (TN and FN) are not considered. Thus, precision only provides one half of the picture, and should always be used in tandem with recall: recall penalizes false negatives (FN), whereas precision does not.
For a single metric that takes both precision and recall into account, use F1-score, which is the harmonic mean between precision and recall.
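For reference, the harmonic mean of precision and recall takes the form:

\[
F_1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
\]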