Inspiration
We wanted to explore adverse effects in the treatment of hypertension, pain, and depression, since these are largely non-physical conditions. Each condition also affects specific demographics, so we were interested in evaluating whether the demographics affected by adverse effects match the overall demographics affected by the conditions.
What it does
Our code parses reports from the FDA Adverse Event Reporting System (FAERS) into CSV files, then fits models to the data to evaluate correlations between demographic information, seriousness, and adverse effects.
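As a rough illustration of the parsing step, here is a minimal sketch that flattens a JSON response into CSV rows. The sample response is made up, and the field names (`safetyreportid`, `serious`, `patientonsetage`, `patientsex`, `reactionmeddrapt`) are our assumption about the openFDA drug event format rather than a copy of the project's code. The helper names `flatten_reports` and `to_csv` are hypothetical.

```python
import csv
import io

# Made-up sample shaped like an openFDA drug adverse event response.
# Field names are assumptions for illustration; check them against the API docs.
SAMPLE_RESPONSE = {
    "results": [
        {
            "safetyreportid": "1001",
            "serious": "1",
            "patient": {
                "patientonsetage": "54",
                "patientsex": "2",
                "reaction": [
                    {"reactionmeddrapt": "NAUSEA"},
                    {"reactionmeddrapt": "DIZZINESS"},
                ],
            },
        },
        {
            # A report with missing demographic fields, which is common.
            "safetyreportid": "1002",
            "serious": "2",
            "patient": {"reaction": [{"reactionmeddrapt": "HEADACHE"}]},
        },
    ]
}

FIELDS = ["report_id", "serious", "age", "sex", "reaction"]


def flatten_reports(response):
    """Return one flat row per (report, reaction) pair; missing fields become ''."""
    rows = []
    for report in response.get("results", []):
        patient = report.get("patient", {})
        for reaction in patient.get("reaction", []):
            rows.append({
                "report_id": report.get("safetyreportid", ""),
                "serious": report.get("serious", ""),
                "age": patient.get("patientonsetage", ""),
                "sex": patient.get("patientsex", ""),
                "reaction": reaction.get("reactionmeddrapt", ""),
            })
    return rows


def to_csv(rows):
    """Serialize the flat rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = flatten_reports(SAMPLE_RESPONSE)
csv_text = to_csv(rows)
print(csv_text)
```

Emitting one row per reaction keeps the CSV rectangular even though each report can list several reactions.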
How we built it
We wrote code in Google Colaboratory to extract the fields we wanted from the JSON returned by API calls to the FAERS database. We then used scikit-learn to fit KNN classifiers, support vector machines, linear regression, and logistic regression models to evaluate correlations. We also applied principal component analysis to further assess correlations between different variables.
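The modeling step could look roughly like the sketch below. It runs on synthetic data, not FAERS data, and the features (age, sex, weight), the label rule, and the hyperparameters are all assumptions chosen for illustration. Linear regression is left out to keep the example short.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the parsed reports (NOT real FAERS data).
rng = np.random.default_rng(0)
n = 200
age = rng.uniform(18, 90, n)
sex = rng.integers(1, 3, n)          # coded 1 or 2
weight = rng.normal(75, 15, n)
X = np.column_stack([age, sex, weight])

# Assumed label rule: older patients are more likely to be "serious".
y = (age + rng.normal(0, 15, n) > 55).astype(int)

models = {
    "knn": KNeighborsClassifier(n_neighbors=5),
    "svm": SVC(kernel="rbf"),
    "logistic": LogisticRegression(),
}

# Scale features inside the pipeline so each CV fold is scaled independently.
scores = {}
for name, model in models.items():
    pipeline = make_pipeline(StandardScaler(), model)
    scores[name] = cross_val_score(pipeline, X, y, cv=5).mean()

for name, score in scores.items():
    print(f"{name}: mean 5-fold accuracy = {score:.3f}")

# PCA on standardized features to see how much variance each component explains.
pca = PCA(n_components=2)
components = pca.fit_transform(StandardScaler().fit_transform(X))
print("explained variance ratio:", pca.explained_variance_ratio_)
```

Comparing cross-validated accuracy across models gives a quick sense of whether the demographic features carry any signal about seriousness at all.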
Challenges we ran into
The database we pulled from had several missing fields and offered no quantifiable metric for the impact of adverse effects.
Accomplishments that we're proud of
We overcame the challenge of filtering the database to account for rows with missing data, and we learned modeling methods for studying correlations.
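One simple way to handle rows with missing data is to drop any row that lacks a required field. The sketch below shows that approach on made-up rows. The column names and the helper `drop_incomplete` are hypothetical, and the project may have filtered differently.

```python
# Hypothetical columns that a row must have before it is used for modeling.
REQUIRED = ("age", "sex", "serious")


def is_present(value):
    """Treat None and empty or whitespace-only strings as missing."""
    return value is not None and str(value).strip() != ""


def drop_incomplete(rows, required=REQUIRED):
    """Keep only rows where every required field is present."""
    return [row for row in rows if all(is_present(row.get(f)) for f in required)]


# Made-up rows for illustration.
rows = [
    {"age": "54", "sex": "2", "serious": "1"},
    {"age": "", "sex": "1", "serious": "2"},     # empty age
    {"age": "37", "serious": "1"},               # sex column absent
    {"age": "61", "sex": "1", "serious": "2"},
]

complete = drop_incomplete(rows)
print(f"kept {len(complete)} of {len(rows)} rows")
```

Dropping rows is the bluntest option. It shrinks the data set and can skew the demographics if missingness is not random, so it is worth reporting how many rows were removed.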
What we learned
We learned a lot about applying data science methods in Python, using scikit-learn for advanced modeling and for finding correlations in a data set.
What's next for Demographic vs Seriousness: Depression, Pain, & Hypertension
We would like to use natural language processing on the reported resulting conditions to analyze how depression, pain, and hypertension relate to the severity of the adverse effects experienced.
Built With
- colaboratory
- openfda-api
- python
- sklearn