Perception with Confidence: A Conformal Prediction Perspective


Deep learning has witnessed great success in computer vision, and large-scale data-driven models are becoming prevalent in modern machine perception systems. However, a critical drawback of learned perception modules is that they offer few performance guarantees and are often brittle when deployed in complex environments. Although a growing body of work seeks to endow deep learning with robustness guarantees, most methods either face scalability limitations or require additional assumptions about the structure of the problem.

In this talk, I advocate conformal prediction as a general and scalable framework to equip black-box perception models with statistical guarantees.
I will first review the statistical machinery of inductive conformal prediction (ICP), which, given a calibration set and a nonconformity function, converts any learned prediction function into prediction sets that provably cover the true label with a user-specified probability (e.g., 90%), under the mild condition of exchangeability.
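To make the calibration step concrete, here is a minimal sketch of inductive (split) conformal prediction on a toy regression problem. The predictor, noise model, and data are all illustrative stand-ins, not from the talk; only the finite-sample quantile rule is the standard ICP recipe.

```python
import numpy as np

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration
    nonconformity scores: the ceil((n+1)(1-alpha))/n empirical quantile."""
    n = len(scores)
    level = np.ceil((n + 1) * (1 - alpha)) / n
    return np.quantile(scores, min(level, 1.0), method="higher")

# Toy setup (illustrative): pretend the identity map is a pre-trained predictor.
rng = np.random.default_rng(0)
x_cal = rng.normal(size=500)
y_cal = x_cal + rng.normal(scale=0.1, size=500)
pred = lambda x: x

# Nonconformity score: absolute residual on the held-out calibration set.
scores = np.abs(y_cal - pred(x_cal))
qhat = conformal_quantile(scores, alpha=0.1)

# Prediction set for a new input: an interval around the point prediction
# that covers the true label with probability >= 90% under exchangeability.
x_new = 0.3
interval = (pred(x_new) - qhat, pred(x_new) + qhat)
```

Note that the guarantee is marginal over the randomness of the calibration set and the new point; nothing about the predictor itself needs to be known, which is what makes the framework black-box.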
I will then present a detailed case study showing how ICP applies to image-based object detection and pose estimation. In the first stage, with properly designed nonconformity functions, we use ICP to conformalize state-of-the-art keypoint detectors, obtaining circular or elliptical prediction sets with guaranteed probabilistic coverage of the true keypoints. In the second stage, we use geometric techniques to propagate the keypoint uncertainty into the pose estimate, yielding a pose uncertainty set (PURSE) that probabilistically contains the ground-truth pose. We demonstrate correct pose estimation and uncertainty quantification on the LineMOD Occlusion dataset. I conclude by remarking on how this opens up new opportunities for designing perception-based control systems with performance guarantees.
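The first stage above can be sketched in a simplified form: calibrate a single radius from keypoint prediction errors so that disks of that radius around the detector's outputs cover the true keypoints at the desired rate. The detector, image size, noise model, and per-keypoint pooling here are all illustrative assumptions; the actual work uses more carefully designed nonconformity functions and also produces elliptical sets.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical calibration data: a detector's 2D keypoint predictions
# alongside ground truth (synthetic stand-in, not from the paper).
n_cal, n_kpts = 400, 8
true_kpts = rng.uniform(0, 640, size=(n_cal, n_kpts, 2))
pred_kpts = true_kpts + rng.normal(scale=3.0, size=(n_cal, n_kpts, 2))

def calibrate_radius(pred, truth, alpha):
    """Radius such that disks of this radius around predicted keypoints
    cover the true keypoints with probability >= 1 - alpha."""
    # Nonconformity score: Euclidean prediction error, pooled over keypoints
    # (a simplification; exchangeability should hold for the pooled scores).
    scores = np.linalg.norm(pred - truth, axis=-1).ravel()
    n = scores.size
    level = np.ceil((n + 1) * (1 - alpha)) / n
    return np.quantile(scores, min(level, 1.0), method="higher")

radius = calibrate_radius(pred_kpts, true_kpts, alpha=0.1)
# Prediction set for each new detected keypoint: the disk of this radius
# centered at the detector's output.
```

Propagating these disk constraints through the camera geometry to bound the object pose is what produces the PURSE in the second stage; that step requires geometric machinery beyond this sketch.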
Joint work with Marco Pavone from Stanford/NVIDIA.