All submissions are uploaded to GitHub, and the GitHub link is provided here.
Code for Venezuela
Due to profound policy uncertainty and economic turbulence, millions of Venezuelans suffer from shortages of food, basic supplies, and medical care. The situation is so severe that more than 3 million Venezuelans, nearly 10% of the whole population, have left the country. While the world community may debate the causes, there is no disputing that Venezuela is experiencing a serious humanitarian crisis.
Doctors For Health is a network of medical professionals in Venezuela who have come together to address the lack of data about the public health system. They independently collect information about Venezuela’s health system to fill the current information gap and generate solutions to improve the health of the general population. The data consists of weekly reports submitted by their network of doctors across Venezuela.
Challenge
Team Paz was assigned the task of framing the business questions for the general public and the press, cleaning and wrangling the data, and tailoring the dashboard to the audience’s needs.
Data Quality Issues
• The most difficult and time-consuming task was translating the data from Spanish to English, as the team had no Spanish speakers.
• Some questions on the Google Form used checkboxes, which let respondents select multiple answers where only one was intended.
• The data had many missing values. Some questions on the Google Form were not mandatory, which could be one reason for the missing values.
• The forms were filled in manually, which can introduce human error or bias into the collected data.
• Dropping columns – Only the columns important for the target audience (the general public and the press) were kept for the analysis.
• Unable to clean data row-wise – Some rows appeared to be duplicates. For example, the hospital BOL000 often showed two entries for the same reporting week, each with different values.
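The duplicate-row issue in the last bullet can be surfaced with a small grouping check. This is a minimal sketch, not the team's actual code: the field names (`hospital_code`, `report_week`, `operating_rooms`), the hospital code `XYZ001`, and all values are hypothetical and only illustrate the pattern described for BOL000.

```python
from collections import defaultdict

# Hypothetical sample rows; field names and values are illustrative only.
reports = [
    {"hospital_code": "BOL000", "report_week": 10, "operating_rooms": 2},
    {"hospital_code": "BOL000", "report_week": 10, "operating_rooms": 3},
    {"hospital_code": "BOL000", "report_week": 11, "operating_rooms": 2},
    {"hospital_code": "XYZ001", "report_week": 10, "operating_rooms": 1},
]


def find_conflicting_reports(rows):
    """Group rows by (hospital, week) and keep the groups with more than one row."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[(row["hospital_code"], row["report_week"])].append(row)
    return {key: group for key, group in grouped.items() if len(group) > 1}


conflicts = find_conflicting_reports(reports)
for (hospital, week), group in conflicts.items():
    print(f"{hospital}, week {week}: {len(group)} entries")
```

Listing the conflicting groups does not decide which entry is correct. It only shows where a manual review or a rule (for example, keeping the latest submission) would be needed.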
Data Cleaning
• Track 1 was chosen for analysis, and the metrics were chosen carefully to cater to the audience (the general public and the press).
• The columns were converted manually from Spanish to English using ‘.replace()’ in Python.
• Missing values were filled where related columns existed. For example, Renal replacement therapy availability and Renal replacement therapy operability are similar, so a missing value in one column can be filled based on the value in the preceding column.
• Questions with a ‘Yes’ (‘Si’) or ‘No’ answer were converted to 1 and 0, respectively.
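The cleaning steps above can be sketched in plain Python. This is an illustration under assumptions, not the team's code: the Spanish column names in `TRANSLATIONS`, the keys `rrt_availability` and `rrt_operability`, and the choice to copy the related column's value directly are all hypothetical.

```python
# Hypothetical Spanish-to-English phrase map; the real column names
# are not shown in this write-up.
TRANSLATIONS = {
    "Semana de reporte": "Report week",
    "Código del hospital": "Hospital code",
}


def translate_column(name):
    """Apply each Spanish-to-English substitution with str.replace()."""
    for spanish, english in TRANSLATIONS.items():
        name = name.replace(spanish, english)
    return name


def encode_yes_no(value):
    """Map 'Si' to 1 and 'No' to 0; leave other values (including None) unchanged."""
    mapping = {"si": 1, "sí": 1, "no": 0}
    if isinstance(value, str):
        return mapping.get(value.strip().lower(), value)
    return value


def fill_from_related(row, target, source):
    """Fill a missing target value by copying a related column (an assumed rule)."""
    if row.get(target) is None and row.get(source) is not None:
        row[target] = row[source]
    return row


# Usage: encode the answers first, then fill the gap from the related column.
row = {"rrt_availability": "No", "rrt_operability": None}
row = {key: encode_yes_no(value) for key, value in row.items()}
row = fill_from_related(row, "rrt_operability", "rrt_availability")
```

Copying the value is the simplest possible fill rule. Whether it is appropriate depends on how closely the two questions track each other in the real data.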
Data Wrangling
• The metrics most relevant to the audience were identified. These include the region, the hospital code, whether the hospital is approachable for an emergency, etc.
• The hospitals were considered in two categories: emergency services and regular services. The metric used for the analysis is based on the basic requirements a hospital must meet, such as having emergency supplies and being able to perform minor surgeries, for example for appendicitis, in emergency cases.
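One way to express the emergency metric described above is a simple rule over the 1/0 encoded answers. This is a hedged sketch: the field names `emergency_supplies` and `minor_surgery`, the hospital codes, and the rule that both requirements must be met are assumptions for illustration, not the team's documented definition.

```python
def is_emergency_ready(report):
    """Return True only if both assumed basic requirements are reported as 1."""
    return report.get("emergency_supplies") == 1 and report.get("minor_surgery") == 1


# Hypothetical reports with answers already encoded as 1 (yes) and 0 (no).
hospitals = [
    {"hospital_code": "H001", "emergency_supplies": 1, "minor_surgery": 1},
    {"hospital_code": "H002", "emergency_supplies": 1, "minor_surgery": 0},
    {"hospital_code": "H003", "emergency_supplies": None, "minor_surgery": 1},
]

status = {h["hospital_code"]: is_emergency_ready(h) for h in hospitals}
```

In this sketch a missing answer counts as not ready, which is the cautious choice for a dashboard aimed at people looking for emergency care.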
GitHub link: https://github.com/debasishbiswal/Guide-to-Health
