Inspiration

We wanted to explore the adverse effects of treatments for hypertension, pain, and depression, since these are largely non-physical conditions. Each condition also affects a specific demographic, so we were interested in whether the demographics affected by adverse effects match the overall demographics affected by the conditions themselves.

What it does

Our code parses data from the FDA Adverse Event Reporting System (FAERS) database into CSV files, then fits models to the data to evaluate correlations between demographic information, seriousness, and adverse effects.
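As a rough illustration of the parsing step, here is a minimal sketch that flattens an openFDA-style JSON response into CSV text. The sample response is hand-written for this example, not real FAERS data, and the helper names (`report_to_row`, `reports_to_csv`) and output column names are hypothetical. The field names `serious`, `patientsex`, `patientonsetage`, and `reactionmeddrapt` follow the openFDA drug event schema; which fields our notebook actually extracted may differ.

```python
import csv
import io
import json

# Hand-written sample shaped like an openFDA drug event response
# (illustrative only, not real FAERS data).
SAMPLE_RESPONSE = """
{"results": [
  {"serious": "1",
   "patient": {"patientsex": "2", "patientonsetage": "54",
               "reaction": [{"reactionmeddrapt": "Nausea"},
                            {"reactionmeddrapt": "Dizziness"}]}},
  {"serious": "2",
   "patient": {"patientsex": "1",
               "reaction": [{"reactionmeddrapt": "Headache"}]}}
]}
"""

# Hypothetical output columns chosen for this sketch.
FIELDS = ["sex", "age", "serious", "reactions"]


def report_to_row(report):
    """Flatten one nested report into a flat dict; missing fields become ''."""
    patient = report.get("patient", {})
    reactions = [r.get("reactionmeddrapt", "") for r in patient.get("reaction", [])]
    return {
        "sex": patient.get("patientsex", ""),
        "age": patient.get("patientonsetage", ""),
        "serious": report.get("serious", ""),
        "reactions": ";".join(reactions),
    }


def reports_to_csv(response_text):
    """Convert a JSON response string into CSV text with one row per report."""
    reports = json.loads(response_text).get("results", [])
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for report in reports:
        writer.writerow(report_to_row(report))
    return buffer.getvalue()


csv_text = reports_to_csv(SAMPLE_RESPONSE)
print(csv_text)
```

The second sample report has no age, so its age column is left empty rather than guessed.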

How we built it

We wrote code in Google Colaboratory to extract the fields we needed from the JSON returned by API calls to the FAERS database. We then used scikit-learn to fit KNN classifiers, a support vector machine, linear regression, and logistic regression models to evaluate correlations. We also applied principal component analysis to further assess correlations between variables.
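The modeling step can be sketched roughly as follows. This is not our notebook code: it uses synthetic data from `make_classification` as a stand-in for the demographic features and seriousness label, the hyperparameters are arbitrary assumptions, and it covers only the three classifiers and PCA (linear regression is left out for brevity).

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the real features and binary seriousness label.
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Hyperparameters here are illustrative assumptions, not tuned values.
models = {
    "knn": KNeighborsClassifier(n_neighbors=5),
    "svm": SVC(kernel="rbf"),
    "logistic": LogisticRegression(max_iter=1000),
}

# Scale features first, since KNN and SVM are sensitive to feature scale.
scores = {}
for name, model in models.items():
    pipeline = make_pipeline(StandardScaler(), model)
    pipeline.fit(X_train, y_train)
    scores[name] = pipeline.score(X_test, y_test)  # mean test accuracy

# PCA on the scaled training features to see how variance is distributed.
pca_pipeline = make_pipeline(StandardScaler(), PCA(n_components=2))
pca_pipeline.fit(X_train)
explained = pca_pipeline.named_steps["pca"].explained_variance_ratio_

for name, score in scores.items():
    print(f"{name}: accuracy {score:.2f}")
print("PCA explained variance ratio:", explained)
```

Wrapping each model in a pipeline with `StandardScaler` keeps the scaling fitted on the training split only, which avoids leaking test data into the preprocessing.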

Challenges we ran into

The database we pulled from had many records with missing fields, and it did not provide a quantifiable metric for the impact of adverse effects.

Accomplishments that we're proud of

We overcame the challenge of filtering the database to account for rows with missing data, and we learned modeling methods for studying correlations.
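One simple way to handle rows with missing data is to drop any row that lacks a required field. The sketch below assumes that approach; the column names, the sample rows, and the `is_complete` helper are all hypothetical, and the actual filtering rules we used may have been different.

```python
# Hypothetical required columns for this sketch.
REQUIRED = ("sex", "age", "serious")

# Made-up rows; an empty string marks a missing value.
rows = [
    {"sex": "2", "age": "54", "serious": "1"},
    {"sex": "1", "age": "", "serious": "2"},
    {"sex": "", "age": "37", "serious": "1"},
    {"sex": "1", "age": "61", "serious": "2"},
]


def is_complete(row, required=REQUIRED):
    """Return True only if every required field is present and non-blank."""
    return all(str(row.get(field, "")).strip() for field in required)


complete_rows = [row for row in rows if is_complete(row)]
dropped = len(rows) - len(complete_rows)
print(f"kept {len(complete_rows)} rows, dropped {dropped}")
```

Dropping incomplete rows is the simplest option, but it shrinks the data set and can skew the demographics if missing values are not spread evenly.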

What we learned

We learned a lot about applying data science methods in Python, using scikit-learn for advanced modeling and for finding correlations in a data set.

What's next for Demographic vs Seriousness: Depression, Pain, & Hypertension

We would like to apply natural language processing to the reported resulting conditions to analyze how depression, pain, and hypertension relate to the severity of the adverse effects experienced.

Built With

  • colaboratory
  • openfda-api
  • python
  • sklearn