Inspiration
I built CONFIRM to solve a validation problem in geophysical consulting: my father's company uses machine learning models to analyze well logs and seismic data, and we needed a way to validate whether the models themselves were statistically reliable, not just whether their predictions looked good. When I saw this hackathon's focus on transparent lending, I realized the same problem exists in financial services: banks need to know if their lending models are fundamentally sound or just getting lucky.
What it does
CONFIRM validates machine learning models using chi-square and Cramér's V statistical tests. It grades model reliability from A to F, testing the predictor itself rather than just the predictions. While originally built for geophysics, the statistical validation framework works for any ML model, including those used for loan approval, credit risk assessment, and default prediction.
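To illustrate the kind of test involved, here is a minimal pure-Python sketch of grading a model by the association between its predictions and actual outcomes. The Pearson chi-square statistic and Cramér's V are standard formulas; the A-F cutoffs are illustrative assumptions, not CONFIRM's actual thresholds.

```python
import math

def chi_square(table):
    """Pearson chi-square statistic for a 2-D contingency table.

    Returns (chi2, n), where n is the total observation count.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n

def cramers_v(table):
    """Cramér's V: strength of association between predictions
    and outcomes, on a 0 (none) to 1 (perfect) scale."""
    chi2, n = chi_square(table)
    k = min(len(table), len(table[0]))  # smaller table dimension
    return math.sqrt(chi2 / (n * (k - 1)))

def grade(v):
    """Map association strength to a letter grade.
    Cutoffs here are illustrative, not CONFIRM's real thresholds."""
    if v >= 0.5: return "A"
    if v >= 0.3: return "B"
    if v >= 0.2: return "C"
    if v >= 0.1: return "D"
    return "F"

# Rows = model's predicted class, columns = actual outcome
# (e.g. predicted default vs. observed default).
table = [[90, 10],
         [15, 85]]
v = cramers_v(table)
print(f"Cramér's V = {v:.3f}, grade {grade(v)}")  # strong association -> A
```

The key point is that the test asks whether the predictor's outputs are statistically associated with reality at all, which is a different question from how good any individual prediction looks.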
How we built it
CONFIRM is a desktop application written in Python, built on statistical analysis frameworks. It was originally designed for validating models in oil & gas consulting, and is now being demonstrated for lending use cases.
Challenges we ran into
Translating complex statistical validation into something non-technical stakeholders can act on. The A-F grading system makes it accessible to risk managers and compliance officers who aren't statisticians. Also, I do not actually know how to code. I built CONFIRM by playing telephone between different LLMs, which checked and verified each other's work, and my father, who told me what he needed. This is only the second project I have done in that manner, and I am still learning how to optimize the workflow.
Accomplishments that we're proud of
Built a cross-industry validation tool that works whether you're analyzing seismic data or loan applications. The core principle is the same: validate the model, not just the predictions. Again, the fact that I have never taken a computer science course or coding bootcamp, yet managed to take CONFIRM from idea to working desktop program, is something I am particularly proud of.
What we learned
Model validation is a universal problem. Every industry using ML faces the same question: is my model actually reliable, or did it just get lucky on this dataset? We also learned that the math to prove a model actually learned what it was supposed to is not particularly difficult. What is difficult is getting people to realize that knowing what decisions were made to reach a prediction is not validation, or at least not the kind of validation that would put a third party at ease.
What's next for CONFIRM
Expanding into financial services as a commercial application. The statistical foundation is proven in geophysics; now we are adapting the interface and workflows for banking use cases. I also want to find niche or unlikely uses for CONFIRM, a metaphorical tongue-twister of sorts.
Built With
- Python
- Statistical analysis (chi-square, Cramer's V)
- Cross-platform desktop deployment