Pediatric age: Analysis of COVID-19 cases

Inspiration

What it does

The main goal of this project is to identify the features that best discriminate COVID cases from non-COVID cases. Although we do not have much data, the results can point to a promising direction for future analyses with more data.

How I built it

The project was built with Python and R. We first performed a statistical test to find the most relevant features. We then developed two models, a Random Forest and a Decision Tree, sacrificing some accuracy for the sake of simplicity and explainability.
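As an illustration of the feature-selection step, here is a minimal sketch of ranking binary features with a statistical test. The writeup does not name the test that was used, so the Pearson chi-square test of independence is an assumption here, and the column names (`fever`, `cough`, `covid`) and the toy records are hypothetical, not taken from the project data.

```python
from collections import Counter


def chi_square_2x2(feature, label):
    """Pearson chi-square statistic for two binary (0/1) sequences.

    No continuity correction is applied; this is an illustrative choice.
    """
    n = len(feature)
    observed = Counter(zip(feature, label))
    feature_totals = Counter(feature)
    label_totals = Counter(label)
    stat = 0.0
    for f in (0, 1):
        for l in (0, 1):
            expected = feature_totals[f] * label_totals[l] / n
            if expected > 0:
                stat += (observed[(f, l)] - expected) ** 2 / expected
    return stat


def rank_features(rows, label_key):
    """Return (feature, statistic) pairs sorted from most to least associated."""
    labels = [row[label_key] for row in rows]
    names = [key for key in rows[0] if key != label_key]
    scores = {
        name: chi_square_2x2([row[name] for row in rows], labels)
        for name in names
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy records, made up for this example only.
rows = [
    {"fever": 1, "cough": 1, "covid": 1},
    {"fever": 1, "cough": 0, "covid": 1},
    {"fever": 0, "cough": 1, "covid": 0},
    {"fever": 0, "cough": 0, "covid": 0},
]
ranking = rank_features(rows, "covid")
for name, stat in ranking:
    print(f"{name}: chi-square = {stat:.2f}")
```

A larger statistic suggests a stronger association with the label. With a small data set like the one described here, the expected counts can be too low for the chi-square approximation to be reliable, so such a ranking should be read as a rough guide only.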

Challenges I ran into

The data set is imbalanced, contains a lot of unnecessary data, and is very sparse. The main challenge was cleaning the data properly.
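One possible way to approach these two problems is sketched below. The writeup does not describe the actual cleaning procedure, so everything here is an assumption: columns with too many missing values are dropped using a hypothetical `max_missing` threshold, and class weights are computed as `n_samples / (n_classes * class_count)`, a common heuristic for imbalanced labels. The column names and records are made up.

```python
from collections import Counter


def drop_sparse_columns(rows, max_missing=0.5):
    """Keep only columns whose fraction of missing (None) values is <= max_missing."""
    n = len(rows)
    keep = [
        column
        for column in rows[0]
        if sum(row[column] is None for row in rows) / n <= max_missing
    ]
    return [{column: row[column] for column in keep} for row in rows]


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency."""
    counts = Counter(labels)
    n = len(labels)
    return {cls: n / (len(counts) * count) for cls, count in counts.items()}


# Hypothetical toy records: "rare_test" is mostly missing, and COVID cases are rare.
rows = [
    {"fever": 1, "rare_test": None, "covid": 0},
    {"fever": 0, "rare_test": None, "covid": 0},
    {"fever": 1, "rare_test": 1, "covid": 0},
    {"fever": 1, "rare_test": None, "covid": 1},
]
cleaned = drop_sparse_columns(rows, max_missing=0.5)
weights = balanced_class_weights([row["covid"] for row in cleaned])
print(list(cleaned[0]))
print(weights)
```

The resulting weights give the minority class more influence during training. The 0.5 threshold is arbitrary and would need to be tuned against the real data.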

Accomplishments that I'm proud of

We faced many problems with missing values and with an imbalanced dataset. Despite these problems, we arrived at a final model.

What I learned

We learned how much care an imbalanced, sparse data set with a lot of unnecessary data requires, and that cleaning the data properly is the main challenge.

What's next for Pediatric age: Analysis of COVID-19 cases

Further steps should be taken once a better dataset is available. The dataset should be more balanced to support strong conclusions. With a better dataset, it would be possible to gain more insight into which symptoms and environmental features are most relevant.