Conformal Prediction for Machine Learning Classification – From the Ground Up
How to quantify classification uncertainty with prediction sets, and how to balance coverage (recall) across classes
This blog post is inspired by Christoph Molnar's book – Introduction to Conformal Prediction with Python. Chris is brilliant at making new machine learning techniques accessible to others. I'd also particularly recommend his books on explainable machine learning.
A GitHub repository with the full code (and a link to running the code online) may be found here: Conformal Prediction.
What is Conformal Prediction?
Conformal prediction is both a method of uncertainty quantification and a method of classifying instances (which may be fine-tuned for classes or subgroups). Uncertainty is conveyed by classifying instances into sets of potential classes rather than making single predictions.
Conformal prediction specifies a coverage, the probability that the true outcome is covered by the prediction region. The interpretation of prediction regions depends on the task: for classification we get prediction sets, while for regression we get prediction intervals.
Below is an example of the difference between ‘traditional' classification (choosing the single most likely class) and conformal prediction (prediction sets).

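To make that difference concrete, here is a toy sketch. The probabilities and the threshold `qhat` are made-up values for illustration, not the outputs of a real, calibrated model:

```python
import numpy as np

# Made-up predicted probabilities for one instance across three classes
classes = np.array(["cat", "dog", "rabbit"])
probs = np.array([0.48, 0.45, 0.07])

# 'Traditional' classification: take the single most likely class
single_prediction = classes[np.argmax(probs)]      # -> "cat"

# Conformal prediction: keep every class whose probability clears a
# threshold derived from a calibration set (qhat is a made-up value here)
qhat = 0.90
prediction_set = classes[probs >= 1 - qhat]        # -> ["cat", "dog"]

print(single_prediction, prediction_set)
```

Where the model cannot clearly separate two classes, the prediction set keeps both, so the uncertainty is visible in the output itself.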
The advantages of this method are:
- Guaranteed coverage: Prediction sets generated by conformal prediction come with coverage guarantees of the true outcome – that is, they will contain the true value at least as often as the minimum target coverage you set. Conformal prediction does not depend on a well-calibrated model – the only requirement is that, as with all machine learning, the new samples being classified come from a similar data distribution to the training and calibration data. Coverage can also be guaranteed across classes or subgroups, though this takes an extra step in the method which we will cover.
- Easy to use: Conformal prediction approaches can be implemented from scratch with just a few lines of code, as we will do here (see the sketch after this list).
- Model-agnostic: Conformal prediction works with any machine learning model. It uses the normal outputs of whatever your preferred model is.
- Distribution-free: Conformal prediction makes no assumptions about underlying distributions of data; it is a non-parametric method.
- No retraining required: Conformal prediction can be used without retraining your model. It is another way of looking at, and using, model outputs.
- Broad application: Conformal prediction works for tabular data classification, image or time-series classification, regression, and many other tasks, though we will demonstrate just classification here.
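As a sketch of how little code this needs, here is a minimal split-conformal classifier. It assumes scikit-learn is available, and the toy dataset, logistic regression model, and alpha value are arbitrary choices for illustration:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Toy data split into train / calibration / test sets
X, y = make_classification(n_samples=2000, n_classes=3, n_informative=5, random_state=0)
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.5, random_state=0)
X_cal, X_test, y_cal, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

# Any probabilistic classifier will do - conformal prediction is model-agnostic
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Nonconformity score on the calibration set: 1 - predicted probability of the true class
cal_probs = model.predict_proba(X_cal)
scores = 1 - cal_probs[np.arange(len(y_cal)), y_cal]

# Threshold: the (1 - alpha) quantile of the scores, with a finite-sample correction
alpha = 0.05
n = len(scores)
q_level = np.ceil((n + 1) * (1 - alpha)) / n
qhat = np.quantile(scores, q_level, method="higher")

# Prediction sets: keep every class whose predicted probability clears the threshold
test_probs = model.predict_proba(X_test)
prediction_sets = test_probs >= 1 - qhat   # boolean matrix: rows = instances, columns = classes

# Empirical coverage: how often the true class falls inside its prediction set
coverage = prediction_sets[np.arange(len(y_test)), y_test].mean()
print(f"Empirical coverage: {coverage:.3f}")
```

With alpha set to 0.05, the aim is that at least 95% of test instances have their true class inside the prediction set, provided the test data comes from the same distribution as the calibration data.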
Why should we care about uncertainty quantification?
Uncertainty quantification is essential in many situations:
- When we use model predictions to make decisions. How sure are we of those predictions? Is using just ‘most likely class' good enough for the task we have?
- When we want to communicate the uncertainty associated with our predictions to stakeholders, without talking about probabilities or odds, or even log-odds!
Alpha in conformal prediction – describes coverage
Coverage is key to conformal prediction. In classification, a prediction region is the set of classes an instance could plausibly belong to, and coverage is the proportion of observed true values that are captured by those prediction sets. Coverage is equivalent to sensitivity or recall. We can tighten or loosen coverage by adjusting alpha.
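As a tiny illustration (the alpha values below are arbitrary), alpha is the acceptable error rate, so the target coverage is simply 1 − alpha:

```python
# alpha is the acceptable error rate; target coverage = 1 - alpha
for alpha in (0.01, 0.05, 0.20):
    print(f"alpha = {alpha:.2f} -> target coverage = {1 - alpha:.0%}")

# Lowering alpha raises coverage (larger prediction sets);
# raising alpha lowers coverage (smaller prediction sets).
```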