# Shortcomings of Accuracy in Machine Learning

The references below argue that **accuracy**, being a discontinuous metric, is inherently *flawed* as a measure of a machine-learned model's success. The problem is even more pronounced when classes are imbalanced. Harrell writes:

> The use of a discontinuous improper accuracy score such as proportion “classified” “correctly” has led to countless misleading findings in bioinformatics, machine learning, and data science. In some extreme cases the machine learning expert failed to note that their claimed predictive accuracy was less than that achieved by ignoring the data, e.g., by just predicting Y=1 when the observed prevalence of Y=1 was 0.98 whereas their extensive data analysis yielded an accuracy of 0.97. As discussed [here](https://www.fharrell.com/post/classification/), fans of “classifiers” sometimes subsample from observations in the most frequent outcome category (here Y=1) to get an artificial 50/50 balance of Y=0 and Y=1 when developing their classifier. Fans of such deficient notions of accuracy fail to realize that their classifier will not apply to a population with a much different prevalence of Y=1 than 0.5.

Further, Kolassa writes:

> FH and I would argue that no, [we should not aim for a precision/recall tradeoff](https://stats.stackexchange.com/a/312787/1352). Instead, we should be aiming for well-calibrated probabilistic predictions, which can then be used in a decision, along with, and I am repeating myself, the consequences of misclassification and other misdecisions. And as a matter of fact, this is exactly what logistic regression does. It does not care at all about precision or recall. What it cares about is the likelihood. Which is just another way of looking at a probabilistic model. And no, bias is not a desirable trait in this context.

---

Date: 20220127
Links to:
Tags:
References:

* [Classification vs. Prediction | Statistical Thinking](https://www.fharrell.com/post/classification/)
* [Damage Caused by Classification Accuracy and Other Discontinuous Improper Accuracy Scoring Rules | Statistical Thinking](https://www.fharrell.com/post/class-damage/)
* [machine learning - Why is accuracy not the best measure for assessing classification models? - Cross Validated](https://stats.stackexchange.com/questions/312780/why-is-accuracy-not-the-best-measure-for-assessing-classification-models)
* [classification - How does logistic regression "elegantly" handle unbalanced classes? - Cross Validated](https://stats.stackexchange.com/questions/403239/how-does-logistic-regression-elegantly-handle-unbalanced-classes)
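
A minimal sketch of the two points quoted above, using only the Python standard library. It first reproduces Harrell's arithmetic (always predicting Y=1 at a prevalence of 0.98 already beats a reported accuracy of 0.97), then shows that two probabilistic models can share the same accuracy while differing sharply under the Brier score and the log loss (the negative log-likelihood). The 100-case population, the helper functions, and the models named `sharp` and `vague` are hypothetical and chosen only for illustration.

```python
import math


def accuracy(y_true, y_pred):
    """Proportion of hard labels that match the outcomes."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def brier_score(y_true, probs):
    """Mean squared difference between predicted P(Y=1) and the 0/1 outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, probs)) / len(y_true)


def log_loss(y_true, probs):
    """Mean negative log-likelihood of the observed outcomes."""
    return -sum(
        math.log(p if t == 1 else 1.0 - p) for t, p in zip(y_true, probs)
    ) / len(y_true)


# Hypothetical population: 100 cases, prevalence of Y=1 is 0.98.
y_true = [1] * 98 + [0] * 2

# "Ignoring the data": always predict Y=1.
baseline_accuracy = accuracy(y_true, [1] * len(y_true))  # 0.98
print("always predict Y=1:", baseline_accuracy)
print("beats a reported accuracy of 0.97:", baseline_accuracy > 0.97)

# Two hypothetical models that output P(Y=1) for each case.
sharp = [0.95] * 98 + [0.10] * 2  # confident and close to the outcomes
vague = [0.51] * 98 + [0.49] * 2  # barely on the right side of 0.5

# Thresholding at 0.5 discards everything except which side of 0.5 p falls on.
labels_sharp = [int(p >= 0.5) for p in sharp]
labels_vague = [int(p >= 0.5) for p in vague]

for name, probs, labels in [
    ("sharp", sharp, labels_sharp),
    ("vague", vague, labels_vague),
]:
    print(
        name,
        "accuracy:", accuracy(y_true, labels),
        "brier:", round(brier_score(y_true, probs), 4),
        "log loss:", round(log_loss(y_true, probs), 4),
    )
```

Both models receive the same accuracy because the threshold throws away how far each probability is from 0.5. The Brier score and the log loss keep that information, so they separate the two models, which is the sense in which the quotes prefer likelihood-based evaluation of probabilistic predictions over a count of "correct" classifications.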