Inspiration

In many low-resource regions, pregnant women with diabetes have very limited access to experienced obstetricians. Yet diabetes is a major risk factor for complications during pregnancy. This project explores whether simple, routinely collected clinical measurements can be used to estimate obstetric risk and support earlier referral to medical care.

What we built

We analyzed a dataset of pregnant diabetic women described by six physiological features (age, systolic and diastolic blood pressure, post-prandial glycemia, temperature, and resting heart rate) and a three-level obstetric risk label (0 = low, 1 = medium, 2 = high).

The project has three main components:

  1. Exploratory data analysis to understand feature distributions and relationships with risk.
  2. A binary classifier separating low-risk (class 0) from at-risk (classes 1–2) pregnancies.
  3. A multi-class classifier distinguishing low, medium, and high obstetric risk, deployed in a Streamlit web app where users input the six measurements and receive a colour-coded risk level (green / orange / red).

How we built it

We started with exploratory data analysis: histograms by class, pairplots, PCA and a correlation matrix to visualise the structure of the data.
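The non-graphical parts of this EDA (correlation matrix and PCA explained variance) can be sketched roughly as follows, using synthetic stand-ins for the six features (column names and value ranges are illustrative, not the project's data):

```python
# Sketch of the EDA steps, assuming the data sits in a pandas DataFrame
# with the six feature columns (names here are illustrative).
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "age": rng.normal(30, 6, 200),
    "systolic_bp": rng.normal(120, 15, 200),
    "diastolic_bp": rng.normal(80, 10, 200),
    "glycemia": rng.normal(7.5, 1.5, 200),
    "temperature": rng.normal(36.8, 0.4, 200),
    "heart_rate": rng.normal(75, 10, 200),
})

# Correlation matrix between the six features
corr = df.corr()

# PCA on standardised features to inspect how variance spreads across components
X_std = StandardScaler().fit_transform(df)
pca = PCA().fit(X_std)
print(pca.explained_variance_ratio_)
```

The histograms and pairplots follow the same pattern, just routed through seaborn/matplotlib instead of printed arrays.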
Because the classes are imbalanced, we chose "balanced accuracy" (the average of per-class recalls) as our main metric.

For the binary task, we evaluated several models:

  • k-Nearest Neighbours with 10-fold cross-validation to tune K
  • Logistic regression with L2 regularisation and grid search over λ
  • A non-linear model (Random Forest) with grid search over the number of trees and other key hyperparameters
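The three candidate families can be sketched with scikit-learn's `GridSearchCV`, scored by balanced accuracy. The grids and data below are illustrative, not the project's exact settings; note that scikit-learn parameterises L2 strength as C = 1/λ:

```python
# Illustrative model-selection sketch on synthetic imbalanced data
# (grids and values are assumptions, not the project's exact settings).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=300, n_features=6,
                           weights=[0.7, 0.3], random_state=0)

candidates = {
    "knn": GridSearchCV(
        make_pipeline(StandardScaler(), KNeighborsClassifier()),
        {"kneighborsclassifier__n_neighbors": [3, 5, 7, 9]},
        cv=10, scoring="balanced_accuracy"),
    "logreg": GridSearchCV(
        make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        {"logisticregression__C": [0.01, 0.1, 1, 10]},  # C = 1/λ
        cv=10, scoring="balanced_accuracy"),
    "rf": GridSearchCV(
        RandomForestClassifier(random_state=0),
        {"n_estimators": [50, 100, 200], "max_depth": [None, 5, 10]},
        cv=10, scoring="balanced_accuracy"),
}

scores = {}
for name, search in candidates.items():
    search.fit(X, y)
    scores[name] = search.best_score_  # best cross-validated balanced accuracy
print(scores)
```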

Random Forests achieved the best balanced accuracy, so we selected them as our final model and then extended the approach to the full three-class problem. Finally, we exported the trained model and scaler with joblib and built a simple Streamlit interface around them.
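The export step might look like this minimal sketch (file names, training data, and the sample input are assumptions; in the real app the reload happens inside the Streamlit script):

```python
# Sketch of persisting the fitted scaler and model with joblib so the
# Streamlit app can reload them (file names are illustrative).
import joblib
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=200, n_features=6, n_classes=3,
                           n_informative=4, random_state=0)
scaler = StandardScaler().fit(X)
model = RandomForestClassifier(random_state=0).fit(scaler.transform(X), y)

joblib.dump(scaler, "scaler.joblib")
joblib.dump(model, "model.joblib")

# In the app: reload the artifacts and apply them to the user's six inputs
loaded_scaler = joblib.load("scaler.joblib")
loaded_model = joblib.load("model.joblib")
sample = np.array([[30, 120, 80, 7.5, 36.8, 75]])  # hypothetical measurements
prediction = loaded_model.predict(loaded_scaler.transform(sample))
```

The predicted class (0, 1, or 2) then maps directly to the green / orange / red colour coding in the interface.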

What we learned

  • Balanced accuracy is more informative than raw accuracy when classes are imbalanced and strongly penalises trivial majority-class models.
  • Standardising features is crucial for distance-based algorithms like k-NN.
  • PCA-based dimensionality reduction is not always beneficial: for Random Forests, which already perform feature subsampling, reducing dimensions actually hurt performance in our case.
  • Even relatively simple models, when carefully tuned and evaluated, can deliver robust predictions suitable for decision support.
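The standardisation lesson is easy to demonstrate: inflate one feature's scale and a raw k-NN degrades because distances are dominated by that feature, while a pipeline with `StandardScaler` is unaffected (synthetic data, illustrative only):

```python
# Why standardisation matters for k-NN: a single large-scale noise feature
# dominates Euclidean distances unless features are standardised.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# shuffle=False keeps the 2 informative features first; feature 5 is noise
X, y = make_classification(n_samples=300, n_features=6, n_informative=2,
                           n_redundant=0, shuffle=False, random_state=0)
X = X.copy()
X[:, 5] *= 1000.0  # blow up the scale of an uninformative feature

raw = cross_val_score(KNeighborsClassifier(), X, y, cv=5).mean()
scaled = cross_val_score(
    make_pipeline(StandardScaler(), KNeighborsClassifier()),
    X, y, cv=5).mean()
print(f"raw: {raw:.3f}, standardised: {scaled:.3f}")
```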

What challenges we ran into

One challenge was dealing with "class imbalance": a naive model that always predicts the majority class looks good in terms of accuracy but completely fails to detect high-risk pregnancies. Choosing the right metric and interpreting it correctly was essential.
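A tiny, hypothetical example makes the point concrete: a majority-class baseline scores high accuracy but only chance-level balanced accuracy.

```python
# Majority-class baseline: high accuracy, chance-level balanced accuracy.
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score, balanced_accuracy_score

# 90% low-risk (0), 10% high-risk (1) — illustrative imbalance
y = np.array([0] * 90 + [1] * 10)
X = np.zeros((100, 6))

dummy = DummyClassifier(strategy="most_frequent").fit(X, y)
pred = dummy.predict(X)

print(accuracy_score(y, pred))           # 0.9 — looks good
print(balanced_accuracy_score(y, pred))  # 0.5 — reveals the failure
```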

Another difficulty was "hyperparameter tuning". Different grids for the number of neighbours in k-NN or trees in the Random Forest sometimes produced similar scores but very different “optimal” values, which forced us to think about model stability, computation time, and practical interpretability rather than just chasing the absolute best score.

We also had to be careful with "dimensionality reduction". PCA initially seemed attractive, but experiments showed that reducing to too few components degraded performance. Understanding why (loss of information vs. built-in feature subsampling in Random Forests) required a bit of iteration and debugging.
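That comparison can be sketched as two cross-validated pipelines, with and without an aggressive PCA step (synthetic data; the project's exact figures differ):

```python
# Random Forest with vs. without an aggressive PCA step (illustrative only).
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=300, n_features=6, n_informative=4,
                           random_state=0)

rf_plain = cross_val_score(
    RandomForestClassifier(random_state=0),
    X, y, cv=5, scoring="balanced_accuracy").mean()

# Reducing 6 features to 2 components discards information the forest
# could otherwise exploit via its own feature subsampling
rf_pca = cross_val_score(
    make_pipeline(StandardScaler(), PCA(n_components=2),
                  RandomForestClassifier(random_state=0)),
    X, y, cv=5, scoring="balanced_accuracy").mean()

print(f"RF: {rf_plain:.3f}, PCA(2)+RF: {rf_pca:.3f}")
```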

Finally, there are "ethical and practical challenges": false negatives are particularly dangerous in a medical context, and even false positives can create stress and strain limited healthcare resources. This reminded us that a model like this must be used as a decision-support tool, not as an autonomous diagnostic system.

Next steps

Future work could include probability calibration and more systematic evaluation of different decision thresholds to better control the trade-off between sensitivity and specificity. We could also integrate additional clinical or socio-economic variables, collaborate with healthcare professionals to validate the model prospectively, and extend the Streamlit app with explanations (e.g. feature importance) to make the predictions more transparent for clinicians and patients.
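A minimal sketch of what the calibration and threshold-sweeping step could look like for the binary task (model, thresholds, and data are all illustrative assumptions):

```python
# Calibrate predicted probabilities, then sweep decision thresholds to
# trade sensitivity against specificity (synthetic data, illustrative only).
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=400, n_features=6,
                           weights=[0.7, 0.3], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Wrap the forest in isotonic calibration so its probabilities are usable
clf = CalibratedClassifierCV(RandomForestClassifier(random_state=0),
                             method="isotonic", cv=5).fit(X_tr, y_tr)
proba = clf.predict_proba(X_te)[:, 1]

# Lowering the threshold raises sensitivity at the cost of specificity
for t in (0.3, 0.5, 0.7):
    pred = (proba >= t).astype(int)
    sens = recall_score(y_te, pred)               # sensitivity (recall on 1)
    spec = recall_score(y_te, pred, pos_label=0)  # specificity (recall on 0)
    print(f"threshold={t}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```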
