In low-level criminal cases such as misdemeanor theft, non-violent drug possession, or failure to appear, defendants with nearly identical charges and criminal profiles often receive drastically different bail decisions. These decisions are increasingly influenced by AI-aided risk-assessment tools. Given the historical evidence of racial and other biases in bail decisions, it is crucial to ensure that the growing use of AI tools makes outcomes fairer rather than more unjust.

We propose an evaluation framework that benchmarks fairness in AI bail predictions using two metrics: calibration for the socio-economic status of defendants, and evaluation against biased outcomes based on historical estimates. Predicting the bail amount is the first step in a bail decision. Our benchmark takes an AI model’s predicted bail amounts and calibrates them based on the defendant’s income, occupational status, home ownership, and other factors, adjusting them toward fairer outcomes. Next, it compares the calibrated predictions to a baseline model that uses race, gender, and other predictors historically associated with biased decisions, and measures the difference between the two sets of predictions. We expect that a good model will calibrate well to defendants’ socio-economic status and differ significantly from the biased baseline predictions.
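The two steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the project's implementation: the `Defendant` fields, the `ability_to_pay_factor` weights, the reference income, and the use of mean absolute difference as the divergence measure are all assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Defendant:
    # Hypothetical socio-economic fields; the real feature set is not specified.
    monthly_income: float  # in dollars
    employed: bool
    owns_home: bool


def ability_to_pay_factor(d: Defendant, reference_income: float = 4000.0) -> float:
    """Assumed scaling factor, clamped to [0.25, 1.0].

    The weights (0.8, 0.9) and the reference income are placeholders
    chosen for illustration, not values from the project.
    """
    factor = d.monthly_income / reference_income
    if not d.employed:
        factor *= 0.8
    if not d.owns_home:
        factor *= 0.9
    return max(0.25, min(1.0, factor))


def calibrate(predicted_amount: float, d: Defendant) -> float:
    """Step 1: adjust a model's predicted bail amount for socio-economic status."""
    return predicted_amount * ability_to_pay_factor(d)


def mean_abs_divergence(calibrated: list[float], baseline: list[float]) -> float:
    """Step 2: average absolute gap between calibrated and biased-baseline amounts."""
    if len(calibrated) != len(baseline):
        raise ValueError("prediction lists must have the same length")
    return mean(abs(c - b) for c, b in zip(calibrated, baseline))


if __name__ == "__main__":
    defendants = [Defendant(4000, True, True), Defendant(2000, False, False)]
    model_predictions = [1000.0, 1000.0]      # made-up model outputs
    baseline_predictions = [1200.0, 900.0]    # made-up biased-baseline outputs
    adjusted = [calibrate(p, d) for p, d in zip(model_predictions, defendants)]
    print(adjusted, mean_abs_divergence(adjusted, baseline_predictions))
```

Under these assumptions, a larger divergence means the calibrated predictions sit further from the biased baseline. A real benchmark would also need a statistical test to decide whether that difference is significant.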

This analysis is limited to data from 2000 to 2025. It predicts fair and accurate bail amount ranges, with considerable confidence, that are intended as suggestions to judges. Because justice is a complex moral and legal issue, the framework leaves room for human judgment and relies on human evaluation for timely maintenance and updates.

Slide deck: https://docs.google.com/presentation/d/1whMK3cmLfV_5L4NmVlGsaCGXLkGFT6K3Txxw1EfxhVo/edit?slide=id.g3c65ca890d6_17_61#slide=id.g3c65ca890d6_17_61

Built With

  • ppt