This data set was the most interesting to analyze at the Rice Datathon 2020.

What it does

This project uses data science to predict whether a couple will get divorced, based on their answers to a 54-question survey.

How I built it

We used Principal Component Analysis (PCA) to reduce the dimensionality of the data set so that we could visualize the results. We then used a Support Vector Machine (SVM) to linearly separate the divorced respondents from the non-divorced ones.
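As a rough illustration of this kind of pipeline, here is a minimal sketch that assumes scikit-learn, which is not necessarily the library we used. The data is a synthetic stand-in rather than the real survey: the respondent count, the answer distributions, the two-component projection, and the train/test split are all assumptions made for the example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): 200 hypothetical respondents,
# each with 54 numeric survey answers. This is NOT the real data set.
rng = np.random.default_rng(0)
n_per_class, n_questions = 100, 54
not_divorced = rng.normal(loc=1.0, scale=1.0, size=(n_per_class, n_questions))
divorced = rng.normal(loc=3.0, scale=1.0, size=(n_per_class, n_questions))
X = np.vstack([not_divorced, divorced])
y = np.array([0] * n_per_class + [1] * n_per_class)  # 0 = not divorced, 1 = divorced

# Hold out a quarter of the respondents to measure accuracy on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# PCA projects the 54 answers onto 2 components (easy to plot),
# then a linear-kernel SVM separates the two groups in that 2-D space.
model = make_pipeline(PCA(n_components=2), SVC(kernel="linear"))
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
projected = model.named_steps["pca"].transform(X_test)  # 2-D points for plotting

print(f"Held-out accuracy on synthetic data: {accuracy:.3f}")
print(f"Projected test set shape: {projected.shape}")
```

The accuracy printed here only describes the synthetic data and says nothing about the real survey. Putting PCA and the SVM in one pipeline means the projection is fitted on the training rows only, so the held-out rows do not leak into the model.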

Challenges I ran into

We used tkinter for the first time to build a Python GUI for interacting with our machine learning framework.

Accomplishments that I'm proud of

Our results indicate that this survey is a very good indicator of whether a couple will get divorced: our model predicted the outcome with 98.1% accuracy.

What I learned

What's next for Principal Indicators of Divorce

The next step for Principal Indicators of Divorce would be to analyze a larger data set to confirm that our predictions hold up. We may also consider information beyond the survey (such as biographical data), so that we can predict based on lifestyle as well as survey answers. Finally, we could try to increase our accuracy by adding more survey questions modeled on the ones we’ve pinpointed as critical.

Built With