Inspiration

Over 40 million Americans, or 12% of the population, are living with diabetes. Of those 40.1 million people, 29.1 million have been diagnosed and 11 million remain undiagnosed. Diabetes among college students is a growing health crisis, driven by sedentary lifestyles, poor diet, and rising obesity, with nearly 30% of teens now considered pre-diabetic. A startling 39% of high-risk college students underestimate their risk of developing diabetes, while roughly 19% of those with diabetes report severe depression. Every 5 seconds, someone is diagnosed with diabetes. Not warned, diagnosed. By then, it has often been progressing for a decade.

We built an app that changes that. But here's what makes it different: it tells you how much to trust the answer. Every prediction comes with a 90% confidence interval and a volatility flag, built from three independent uncertainty methods: bootstrap resampling, Bayesian posterior distributions, and quantile regression. If the models disagree, you know. If the data is thin, you know.
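To make the interval idea concrete, here is a minimal sketch of one of the three methods, bootstrap resampling, using only the Python standard library. It is not the app's code: the one-feature line fit, the toy BMI/score numbers, and the `is_volatile` threshold of 0.10 are all assumptions chosen for illustration, and the Bayesian and quantile-regression methods are not shown.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x on a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def bootstrap_interval(xs, ys, x_new, n_resamples=200, level=0.90, seed=0):
    """Percentile bootstrap interval for the fitted value at x_new."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    while len(preds) < n_resamples:
        # Resample training rows with replacement, then refit the model.
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # degenerate resample: slope is undefined
        a, b = fit_line(bx, by)
        preds.append(a + b * x_new)
    preds.sort()
    tail = (1.0 - level) / 2.0
    low = preds[int(tail * (n_resamples - 1))]
    high = preds[int((1.0 - tail) * (n_resamples - 1))]
    return low, high


def is_volatile(low, high, point, max_rel_width=0.10):
    """Hypothetical flag: interval is wide relative to the point estimate."""
    return (high - low) / abs(point) > max_rel_width


# Toy data (made up): BMI values and a disease-progression score.
bmi = [20, 22, 24, 26, 28, 30, 32, 34, 36, 38]
score = [90, 101, 118, 125, 140, 160, 158, 181, 190, 210]

a, b = fit_line(bmi, score)
point = a + b * 29
low, high = bootstrap_interval(bmi, score, x_new=29)
volatile = is_volatile(low, high, point)
```

The interval narrows as the training sample grows and widens when resampled fits disagree, which is the behavior the "if the data is thin, you know" promise relies on.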

What it does

It takes up to 10 biometric inputs (some are optional), runs nine machine learning models in parallel, averages their outputs into an ensemble prediction, and returns a risk score from 25 to 346 in seconds.
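The flow from partial inputs to a bounded score can be sketched as follows. How the app actually handles omitted fields is not described here, so filling them with a default value (0.0, the mean of a mean-centered feature) and clamping the average to the 25 to 346 range are both assumptions, and the three lambda "models" are hypothetical stand-ins for the fitted estimators.

```python
# Feature names of the scikit-learn diabetes dataset.
FEATURES = ["age", "sex", "bmi", "bp", "s1", "s2", "s3", "s4", "s5", "s6"]
SCORE_MIN, SCORE_MAX = 25.0, 346.0


def build_feature_row(user_inputs, defaults):
    """Return a full 10-value row, filling omitted fields with defaults."""
    row = []
    for name in FEATURES:
        value = user_inputs.get(name)
        row.append(defaults[name] if value is None else float(value))
    return row


def ensemble_score(row, models):
    """Average the model outputs and clamp the result to the score range."""
    preds = [model(row) for model in models]
    mean = sum(preds) / len(preds)
    return min(SCORE_MAX, max(SCORE_MIN, mean))


# Assumption: features are mean-centered, so 0.0 stands for "average".
defaults = {name: 0.0 for name in FEATURES}

# Hypothetical stand-ins for fitted models (indices follow FEATURES).
models = [
    lambda r: 152.0 + 950.0 * r[2],
    lambda r: 152.0 + 700.0 * r[2] + 400.0 * r[3],
    lambda r: 180.0,
]

row = build_feature_row({"bmi": 0.06, "bp": 0.02, "age": None}, defaults)
risk = ensemble_score(row, models)
```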

How we built it

The ML backend trains a 9-model ensemble (Linear, Ridge, Lasso, ElasticNet, BayesianRidge, DecisionTree, RandomForest, GradBoost, SVR) on the scikit-learn diabetes dataset (442 patients, 10 features) and averages their outputs for robust predictions. The frontend features a clean input form with real-time validation, a 3-column results dashboard showing the risk score with 90% bootstrap confidence intervals, personalized health guidance, and an ablation-based key risk factors chart. The app is deployed on Streamlit Cloud with secrets management for the API key, custom CSS styling, and privacy-aware UX including a state selector for data compliance.
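A rough sketch of what such a backend could look like with scikit-learn is shown below. The nine estimator classes match the list above, but every hyperparameter, the `StandardScaler` step, and the helper names `ensemble_predict` and `ablation_effects` are assumptions, not the project's actual settings. The ablation here resets one feature at a time to its training mean and records how far the ensemble score moves.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import (
    BayesianRidge, ElasticNet, Lasso, LinearRegression, Ridge,
)
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

data = load_diabetes()
X, y, feature_names = data.data, data.target, data.feature_names

# Hyperparameters below are illustrative guesses, not tuned values.
estimators = {
    "Linear": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "Lasso": Lasso(alpha=0.1),
    "ElasticNet": ElasticNet(alpha=0.01, l1_ratio=0.5),
    "BayesianRidge": BayesianRidge(),
    "DecisionTree": DecisionTreeRegressor(max_depth=4, random_state=0),
    "RandomForest": RandomForestRegressor(n_estimators=50, random_state=0),
    "GradBoost": GradientBoostingRegressor(random_state=0),
    "SVR": SVR(C=100.0),
}

# One scaler + estimator pipeline per model, each fitted on the full data.
pipelines = {
    name: make_pipeline(StandardScaler(), est).fit(X, y)
    for name, est in estimators.items()
}


def ensemble_predict(rows):
    """Unweighted mean of the nine model predictions for each row."""
    rows = np.atleast_2d(rows)
    per_model = np.array([p.predict(rows) for p in pipelines.values()])
    return per_model.mean(axis=0)


def ablation_effects(row):
    """Score change when each feature is reset to its training mean."""
    base = ensemble_predict(row)[0]
    means = X.mean(axis=0)
    effects = {}
    for i, name in enumerate(feature_names):
        ablated = np.array(row, dtype=float)
        ablated[i] = means[i]
        effects[name] = base - ensemble_predict(ablated)[0]
    return effects


patient = X[0]
risk_score = ensemble_predict(patient)[0]
effects = ablation_effects(patient)
top_factors = sorted(effects, key=lambda k: abs(effects[k]), reverse=True)[:3]
```

Sorting the effects by absolute size gives a ranking that could feed a "key risk factors" chart. A held-out split or cross-validation would be needed to judge accuracy; this sketch fits on all 442 rows for brevity.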

Challenges we ran into

As first-time ML developers, we faced a steep learning curve with sklearn pipelines, feature normalization, and ensemble methods under hackathon time pressure. Our biggest technical blocker was that training all 9 models on every cloud deploy took 10+ seconds, causing timeouts during demos. We solved this by pre-training models locally and pushing the serialized weights to the repo, so the app loads pre-fitted pipelines instantly instead of retraining. We also cut bootstrap iterations from 200 to 50 and parallelized computation across CPU cores, bringing per-prediction latency under 2 seconds and cold-start time under 3 seconds.
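The train-once, load-at-startup pattern described above can be sketched with the standard library alone. To stay dependency-free, a plain dict of linear weights stands in for the fitted pipelines; the function names, file name, and numbers are hypothetical. For real scikit-learn estimators, `joblib.dump` and `joblib.load` are the usual equivalents of the two pickle calls.

```python
import pickle
import tempfile
from pathlib import Path


def train_weights():
    """Stand-in for the slow training step (returns fixed, made-up weights)."""
    return {"intercept": 152.0, "coef": {"bmi": 520.0, "bp": 320.0}}


def save_weights(weights, path):
    """Serialize fitted parameters so they can be committed with the app."""
    with open(path, "wb") as fh:
        pickle.dump(weights, fh)


def load_weights(path):
    """Load pre-fitted parameters. Only unpickle files you created yourself."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


def predict(weights, features):
    """Linear prediction; features missing from the input count as 0.0."""
    total = weights["intercept"]
    for name, coef in weights["coef"].items():
        total += coef * features.get(name, 0.0)
    return total


# Offline, before deploying: train once and write the artifact.
artifact = Path(tempfile.mkdtemp()) / "model_weights.pkl"
save_weights(train_weights(), artifact)

# At app startup: load the artifact instead of retraining.
weights = load_weights(artifact)
score = predict(weights, {"bmi": 0.05, "bp": 0.02})
```

Because startup only reads a file, cold-start time no longer depends on how long training takes.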

Accomplishments that we're proud of

This was our first time working with machine learning, and we're proud that we went from zero ML experience to a fully deployed 9-model ensemble prediction app in a single hackathon. What makes it meaningful is that we built something with real societal impact: a tool that helps everyday people understand their diabetes risk using clinical data, get personalized health guidance, and ask follow-up questions through an AI assistant. We didn't just get a model working; we made it accessible, intuitive, and genuinely useful for someone who might not otherwise have easy access to health risk insights.

What we learned

Beyond the technical skills (sklearn pipelines and ensemble methods), the biggest takeaway was how to collaborate like a real engineering team. We learned to break a complex project into parallel tasks, communicate blockers early, and make fast decisions under a tight deadline. Working with fellow students across different skill levels taught us how to explain technical concepts clearly, give constructive feedback, and trust each other's contributions to ship something we're all proud of.

What's next for Diabete Prediction Web App

Fully develop a privacy policy feature that handles users' sensitive information according to the laws of their state, and finish building the AI chatbox feature.
