Mental Health Risk Predictor Using AI and MongoDB

Inspiration

Mental health affects nearly every college student, yet it’s often overlooked. Our team wanted to use AI to estimate the mental health risks students may face during college and to help them understand those risks.

What We Learned

We learned how to:

  • Clean and encode real-world mental health data from a Kaggle dataset
  • Apply machine learning models like Logistic Regression and Artificial Neural Networks (ANNs)
  • Use correlation analysis and permutation importance to interpret feature impact
  • Build an interactive dashboard where users receive AI-generated feedback based on their input
  • Store and search vectorized features in MongoDB for future recommendations or clustering analysis

How We Built It

  1. Dataset
    We used a public dataset from Kaggle containing student responses on mental health, stress, sleep quality, and social support.

  2. Data Cleaning

    • Encoded Yes/No values into 1 and 0
    • Converted school year to integer values
    • Dropped incomplete or noisy data entries
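
The cleaning steps above can be sketched in plain Python. This is a minimal illustration, not our actual pipeline: the column names (`year`, `depression`, `anxiety`) and the "Year N" label format are assumptions, and the real Kaggle columns may differ.

```python
# Hypothetical label formats; adjust to match the real dataset.
YES_NO = {"yes": 1, "no": 0}
YEAR = {"year 1": 1, "year 2": 2, "year 3": 3, "year 4": 4}


def clean_row(row):
    """Return an encoded copy of one survey row, or None if it is unusable."""
    cleaned = {}
    for key, value in row.items():
        text = (value or "").strip().lower()
        if text == "":
            return None  # drop incomplete entries
        if key == "year":
            if text not in YEAR:
                return None  # drop noisy school-year labels
            cleaned[key] = YEAR[text]
        elif text in YES_NO:
            cleaned[key] = YES_NO[text]  # encode Yes/No as 1/0
        else:
            cleaned[key] = value.strip()
    return cleaned


def clean_rows(rows):
    """Clean every row and keep only the usable ones."""
    return [c for c in (clean_row(r) for r in rows) if c is not None]


raw = [
    {"year": "Year 1", "depression": "Yes", "anxiety": "No"},
    {"year": "year 3", "depression": "No", "anxiety": ""},  # incomplete
    {"year": "Year 2 ", "depression": "no", "anxiety": "YES"},
]
cleaned = clean_rows(raw)
```

Here the second row is dropped because its `anxiety` field is empty, and the remaining rows come back fully numeric.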

  3. Modeling

    • Built a Logistic Regression model (87% accuracy) and an Artificial Neural Network (55% accuracy)
    • Used permutation importance to identify which features had the strongest influence on mental health risks
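
Permutation importance measures how much a model's score drops when one feature column is shuffled. The sketch below shows the idea in plain Python under stated assumptions: `toy_predict` is a hypothetical stand-in for a trained model, and the two columns (sleep hours and an unused flag) are made up for illustration.

```python
import random


def accuracy(predict, X, y):
    """Fraction of rows where the prediction matches the label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean accuracy drop per feature after shuffling that feature's column."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            # Rebuild the rows with only this one column permuted.
            X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_predict(row):
    """Hypothetical model: flags risk when sleep hours (column 0) are below 6."""
    return 1 if row[0] < 6 else 0


X = [[4, 1], [5, 0], [7, 1], [8, 0], [5, 1], [9, 0], [3, 1], [7, 0]]
y = [toy_predict(row) for row in X]
scores = permutation_importance(toy_predict, X, y)
```

Because the toy model ignores column 1, shuffling it never changes a prediction and its importance is zero, while shuffling column 0 lowers accuracy.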

  4. Visualization

    • Created a correlation heatmap to analyze feature relationships
    • Generated a confusion matrix to visualize model performance
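
The numbers behind both plots are simple to compute. The following sketch, using made-up labels rather than our real results, shows the counts that a confusion matrix displays and the Pearson coefficient that fills each cell of a correlation heatmap.

```python
from math import sqrt


def confusion_matrix(y_true, y_pred):
    """2x2 counts laid out as [[TN, FP], [FN, TP]] for binary labels 0/1."""
    matrix = [[0, 0], [0, 0]]
    for truth, pred in zip(y_true, y_pred):
        matrix[truth][pred] += 1
    return matrix


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (both must vary)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


# Illustrative labels only, not the project's actual predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
cm = confusion_matrix(y_true, y_pred)
acc = (cm[0][0] + cm[1][1]) / len(y_true)
```

Accuracy is the sum of the diagonal divided by the total count, which is why a confusion matrix gives more detail than a single accuracy figure: it separates false positives from false negatives.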

  5. Frontend + Dashboard

    • Users input personal features like age, GPA, and major
    • The system returns an estimated likelihood of mental health risk, based on the ML model’s prediction and the vectorized features stored in MongoDB
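
A logistic regression model turns user inputs into a likelihood by passing a weighted sum through the sigmoid function. The sketch below illustrates that step only. The feature names, weights, and bias are hypothetical placeholders, not values from our trained model.

```python
from math import exp

# Hypothetical coefficients; a trained model would supply the real ones.
WEIGHTS = {"sleep_hours": -0.6, "stress_level": 0.8, "social_support": -0.5}
BIAS = 1.0


def risk_likelihood(features):
    """Map numeric features to a probability in (0, 1) with the sigmoid."""
    z = BIAS + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    return 1.0 / (1.0 + exp(-z))


low_stress = risk_likelihood({"sleep_hours": 8, "stress_level": 1, "social_support": 3})
high_stress = risk_likelihood({"sleep_hours": 4, "stress_level": 5, "social_support": 0})
```

With these placeholder weights, less sleep and more stress push the score toward 1, while more social support pulls it toward 0.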

Challenges We Faced

  • Data Quality: The dataset had inconsistencies and missing entries, which required extensive cleaning and transformation
  • Model Accuracy: Our ANN performed poorly (55% accuracy versus 87% for Logistic Regression), which shows how small datasets can limit deep learning methods

What's Next?

  • Add support for real-time vector similarity search using MongoDB Atlas Vector Search
  • Expand the dataset with new sources and anonymized student inputs
  • Improve model accuracy and explore ensemble methods
  • Enable users to track mental health scores over time with secure data storage
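
To show what vector similarity search would compute, here is a brute-force, in-memory sketch that ranks stored feature vectors by cosine similarity. It is a stand-in for illustration and does not use MongoDB or the Atlas Vector Search API. The document shape (`_id`, `vector`) and the sample vectors are assumptions.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, documents, k=2):
    """Return the ids of the k documents most similar to the query vector."""
    scored = [(cosine_similarity(query, doc["vector"]), doc["_id"]) for doc in documents]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical stored documents; real ones would hold encoded student features.
docs = [
    {"_id": "a", "vector": [1.0, 0.0, 0.0]},
    {"_id": "b", "vector": [0.9, 0.1, 0.0]},
    {"_id": "c", "vector": [0.0, 1.0, 0.0]},
]
nearest = top_k([1.0, 0.0, 0.0], docs, k=2)
```

A linear scan like this is fine for a small dataset. A dedicated vector index becomes worthwhile once the collection is too large to compare against every stored vector on each query.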
