Analyzing a country's literacy rate

Inspiration

I recall reading an article on the correlation between higher literacy rates and healthier populations with higher employment rates. Literacy is a form of liberation because it empowers people. When I read that the Cornucodia Hackathon called for a project related to social good, I knew I had to focus on analyzing literacy rates across countries and presenting meaningful insights through data visualization.

What it does

The Streamlit web application has two main components: an interactive map and a literacy rate predictor. The interactive map shows fifteen countries drawn from the Kaggle literacy rate dataset used in this project, with a marker at each country's location. The goal is to let users see which countries need advocacy for more educational resources. The predictor is an interactive machine learning tool: users adjust parameters (such as region, country, and age) and observe how the predicted literacy rate changes in response.

How we built it

Literacy Rate Model: I preprocessed the Kaggle literacy rate dataset in Google Colaboratory and trained the XGBoost regression model that powers the predictor in the same Colab environment. I built the web application and its user interface with the Streamlit package, installed in a virtual environment, and wrote the code in the Spyder IDE.
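
As a rough illustration of this pipeline (not the project's actual code), the sketch below one-hot encodes categorical inputs and fits an `XGBRegressor`. The column names (`region`, `age_group`), the helper names `one_hot` and `train_model`, and the hyperparameter values are all assumptions made for the example.

```python
def one_hot(row, categories):
    """Encode a dict of categorical values as a flat list of 0.0/1.0 features.

    `categories` maps each column name to the ordered list of values seen
    during training, so the output always has the same length and order.
    """
    features = []
    for column, values in categories.items():
        features.extend(1.0 if row[column] == value else 0.0 for value in values)
    return features


def train_model(rows, targets, categories):
    """Fit an XGBoost regressor on one-hot encoded rows (hypothetical helper)."""
    # Imported lazily so one_hot() can be used without xgboost installed.
    import numpy as np
    from xgboost import XGBRegressor

    X = np.array([one_hot(row, categories) for row in rows])
    y = np.array(targets)
    # Hyperparameters are illustrative, not the tuned values from the project.
    model = XGBRegressor(n_estimators=200, max_depth=4, learning_rate=0.1)
    model.fit(X, y)
    return model
```

In a Streamlit app, the values a user picks with widgets such as `st.selectbox` could be passed through the same `one_hot` function before calling `model.predict`, so the prediction input matches the training layout.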

Interactive Map: I stored each country's name, literacy rate, and latitude and longitude coordinates. The Streamlit-Folium package builds the interactive map from these coordinates, and additional parameters style the popups on the map.
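
A minimal sketch of how such a map might be assembled is shown here. The records are placeholders (the names, coordinates, and rates are made up, not taken from the dataset), and `popup_html` and `build_map` are hypothetical helper names.

```python
from html import escape

# Placeholder records: names, coordinates, and rates are invented for illustration.
COUNTRIES = [
    {"name": "Country A", "lat": 10.0, "lon": 20.0, "literacy_rate": 75.0},
    {"name": "Country B", "lat": -15.0, "lon": 45.0, "literacy_rate": 92.5},
]


def popup_html(record):
    """Return a small HTML snippet for a marker popup."""
    name = escape(record["name"])
    return f"<b>{name}</b><br>Literacy rate: {record['literacy_rate']:.1f}%"


def build_map(records):
    """Create a folium map with one marker per country."""
    # Imported lazily so popup_html() works without folium installed.
    import folium

    world_map = folium.Map(location=[20, 0], zoom_start=2)
    for record in records:
        folium.Marker(
            location=[record["lat"], record["lon"]],
            popup=folium.Popup(popup_html(record), max_width=250),
            tooltip=record["name"],
        ).add_to(world_map)
    return world_map
```

Inside the Streamlit app, the map could then be rendered with `st_folium(build_map(COUNTRIES), width=700)` from the `streamlit_folium` package.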

Challenges we ran into

Finding the best parameters for the XGBoost algorithm was somewhat difficult; I had to make sure that the training data did not leak into the testing data, which would have inflated the measured accuracy.
Constructing the interactive map was an early challenge. I was unaware of the Streamlit-Folium package at first, so I started by building some basic examples. I also wanted users to be able to interact with the map and see popups, and I worked through numerous errors before getting that to work.
Pushing the Python files to GitHub and deploying the model to Streamlit Share also caused problems. Streamlit reads the required packages from the requirements.txt file, and I ran into a case where one version of sklearn was not compatible with the XGBoost package.
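
To illustrate the train/test separation mentioned in the first challenge, here is a small standard-library sketch. It splits row indices into disjoint sets and computes a preprocessing statistic from the training rows only. The scaler is a stand-in for whatever preprocessing the project actually used, and all function names and numbers are assumptions.

```python
import random


def split_indices(n_rows, test_fraction=0.2, seed=42):
    """Shuffle row indices once and cut them into disjoint train/test sets."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_test = max(1, int(n_rows * test_fraction))
    return indices[n_test:], indices[:n_test]


def fit_scaler(values):
    """Compute mean and standard deviation from the given (training) values."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    # Fall back to 1.0 so a constant column does not cause division by zero.
    return mean, variance ** 0.5 or 1.0


def apply_scaler(values, mean, std):
    """Standardize values with statistics computed elsewhere."""
    return [(v - mean) / std for v in values]


# Toy feature column; the numbers are arbitrary.
feature = [float(i) for i in range(10)]
train_idx, test_idx = split_indices(len(feature))

# Statistics come from the training rows only, then are reused on the test rows.
mean, std = fit_scaler([feature[i] for i in train_idx])
train_scaled = apply_scaler([feature[i] for i in train_idx], mean, std)
test_scaled = apply_scaler([feature[i] for i in test_idx], mean, std)
```

The same idea applies to parameter search: candidate settings would be compared on a validation portion of the training data, with the test rows held back until the end.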

Accomplishments that we're proud of

I'm proud of creating an interactive web application that people can actually use, rather than a model that sits in a GitHub folder or a Colab file.
I'm also proud of fixing some of the errors in preprocessing the original dataset before feeding it into the model.

What we learned

Troubleshooting and problem solving
Social Good and Advocacy: I had to research every country's literacy rate, so I gained a deeper understanding of what resources are needed to improve educational opportunities.

What's next for Analyzing a country's literacy rate

Definitely adding more countries to the interactive map! I think as more data gets added, the model can become more accurate and the interactive map can contain more information!

Built With

numpy
python
streamlit

Updates

Private user started this project — Nov 28, 2023 04:58 PM EST
