Inspiration
What it does
The main goal of this project is to identify the features that best discriminate COVID cases from non-COVID cases. Although we have little data, the results point to a promising direction for future analysis with more data.
How I built it
The project was built with Python and R. We ran a statistical test to find the most relevant features. We then developed two models, a Random Forest and a Decision Tree, sacrificing some accuracy for the sake of simplicity and explainability.
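The workflow described above (rank features with a statistical test, then fit two small, explainable models) could look roughly like the sketch below. It is not the project's actual code: the data is synthetic, the feature names are hypothetical, and the ANOVA F-test (`f_classif`) is an assumed stand-in, since the write-up does not say which test was used.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic, imbalanced stand-in for the real dataset (assumption).
X, y = make_classification(
    n_samples=400, n_features=12, n_informative=4, n_redundant=0,
    weights=[0.8, 0.2], random_state=0,
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # hypothetical names

# Stratify so both splits keep the same class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Step 1: score every feature with a univariate test and keep the top 4.
# f_classif is an assumed choice of test, fitted on the training split only.
selector = SelectKBest(score_func=f_classif, k=4).fit(X_train, y_train)
selected = [feature_names[i] for i in selector.get_support(indices=True)]
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)

# Step 2: two deliberately shallow models, trading accuracy for readability.
tree = DecisionTreeClassifier(
    max_depth=3, class_weight="balanced", random_state=0
).fit(X_train_sel, y_train)
forest = RandomForestClassifier(
    n_estimators=100, max_depth=4, class_weight="balanced", random_state=0
).fit(X_train_sel, y_train)

print("Selected features:", selected)
print(export_text(tree, feature_names=selected))  # human-readable decision rules
for name, model in [("tree", tree), ("forest", forest)]:
    score = balanced_accuracy_score(y_test, model.predict(X_test_sel))
    print(f"{name}: balanced accuracy = {score:.2f}")
```

Capping `max_depth` keeps the printed tree rules short enough to read, which is the point of choosing explainable models here. Balanced accuracy is reported instead of plain accuracy because the classes are uneven.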
Challenges I ran into
The dataset is imbalanced and contains a lot of unnecessary data. The data is also very sparse. The main challenge was cleaning the data properly.
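As a rough illustration of this kind of cleaning, the sketch below drops columns that are mostly empty, imputes the remaining gaps, and computes class weights to offset the imbalance. The column names, the 50% missing-value threshold, and the median imputation are all assumptions made for the example, not choices taken from the project.

```python
import numpy as np
import pandas as pd

# Tiny made-up table; column names are hypothetical.
df = pd.DataFrame({
    "fever": [1, 0, np.nan, 1, 0, 1, 0, 0],
    "cough": [0, np.nan, 1, 1, 0, 0, 1, 0],
    "rare_lab_value": [np.nan, np.nan, np.nan, 3.2, np.nan, np.nan, np.nan, np.nan],
    "covid": [1, 0, 0, 0, 0, 1, 0, 0],
})

# Drop columns where more than half of the values are missing
# (the 0.5 threshold is an assumed cut-off).
max_missing = 0.5
missing_share = df.isna().mean()
kept = df.loc[:, missing_share <= max_missing]

# Fill the remaining gaps with each column's median (one simple option).
features = kept.drop(columns="covid")
features = features.fillna(features.median())

# "Balanced" class weights: n_samples / (n_classes * class_count),
# so the rarer class gets the larger weight.
counts = df["covid"].value_counts()
class_weights = (len(df) / (counts.size * counts)).to_dict()

print("Dropped columns:", sorted(set(df.columns) - set(kept.columns)))
print("Class weights:", class_weights)
```

The weights computed here follow the same formula scikit-learn applies when a model is given `class_weight="balanced"`.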
Accomplishments that I'm proud of
We faced many problems with missing values and with an imbalanced dataset. Despite these problems, we arrived at a final model.
What I learned
The dataset is imbalanced and contains a lot of unnecessary data. The data is also very sparse. The main challenge was cleaning the data properly.
What's next for Pedriatic age: Analysis of COVID-19 cases
Further work should wait until a better dataset is available. The dataset should be more balanced before strong conclusions can be drawn. With a better dataset, it would be possible to gain more insight into which symptoms and environmental features are most relevant.
Built With
- colab
- jupyter
- python