“A picture is worth a thousand words.” Hawaii State’s Open Data describes its goal this way: it “… provides residents, analysts, and civic developers with unparalleled access to State data for use in increasing transparency, driving civic innovation, and engaging participants in a more collaborative form of government.” However, much of the available data is raw: it is not aggregated or easily digestible. We believe that data visualizations will help users derive more insight and value from the rich datasets that are provided. Being able to “see” the data leads to better insight, which in turn can lead to better solutions for the community.
What it does
Our web application provides basic aggregation of raw data and visualization options in the form of Bar, Histogram, Pie, and Line charts.
How we built it
Python already has a lot of great data visualization and data handling libraries (e.g. pandas, matplotlib, plotly), so we decided to use it for the backend. We used Flask as the web server: it hosts the front end and calls back into the Python code to pull up the visualization.
Specifically, we used pandas to load in the data and generate summary statistics, and then used plotly to make the pretty pictures.
We actually got started by googling, finding this link, and working off of it.
We designed the repository around two Python packages, one to handle data processing and one to handle the visualizations, with the web server developed separately as the final piece. This allowed the team to work on different pieces in parallel and merge them into a final product easily.
Challenges we ran into
Some of us had to learn for the first time how to use Git and GitHub to code in collaboration and see each other’s work. It was difficult at first trying to figure out how to create, check out, and push branches, as well as learning new coding capabilities in Python (e.g. pandas, plotly, and Flask). But now we are much more comfortable with the different aspects of coding collaboration. For all of us, it is our first hackathon, so it is a learning experience all the way.
Another problem we dealt with was how to aggregate the given datasets. Before you can create visualizations, it is important to know what question you are trying to answer. Without knowing this, it was difficult to predict how to summarize certain fields. For example, a field called Gender (Female/Male) cannot be Summed, but you can Count the number of times ‘Female’ or ‘Male’ appears. Moreover, some datasets (e.g. Pesticide Products, View Tax Plat Maps) did not contain data that could be visualized.
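The Sum versus Count distinction can be sketched in pandas by checking a column's data type before choosing an operation. This is only one possible rule, not necessarily the one our app applies; `aggregate_field` and the sample rows are hypothetical.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def aggregate_field(df, field):
    """Pick a default aggregation: Sum for numbers, Count for categories."""
    series = df[field]
    if is_numeric_dtype(series):
        return {"operation": "sum", "result": series.sum()}
    # Text fields such as Gender cannot be summed, so count each value instead.
    return {"operation": "count", "result": series.value_counts().to_dict()}


# Hypothetical rows, invented for this example.
sample = pd.DataFrame(
    {
        "Gender": ["Female", "Male", "Female"],
        "Amount": [10, 20, 30],
    }
)

gender_summary = aggregate_field(sample, "Gender")
amount_summary = aggregate_field(sample, "Amount")
```

A type check like this only gives a default. Numeric IDs or years, for example, are numbers that usually should not be summed, which is why knowing the question first still matters.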
Accomplishments that we're proud of
We're proud to have completed a project, since this is our first hackathon. We learned a lot through this experience, and we had to try new things in order to meet the challenge requirements. Creating a web app with Flask was new to all of us, and we are proud of how far we have come since the beginning of this competition.
What we learned
Throughout the process, we were able to ask teammates questions when faced with a problem and learn how to effectively troubleshoot coding or technical issues. We learned different aspects of working with Git and GitHub, as well as the vast amount of functionality in the Python libraries. We also discovered how to best approach data visualization (which columns or values are the most helpful, how data tells a story, and how visualization exists to answer users’ questions). Also, Googling answers is your friend.
What's next for SquidBrainCharts
Things that we wanted to implement but didn’t get around to:
- Using the CKAN API to pull the data with more information
- Handling more types of data formats
- Adding the ability to correlate between data sources
- Making charts more customizable (colors, titles, etc.)
- Hardening the code and making it more robust (it’s rather fragile right now and errors aren’t gracefully handled)
- Providing suggestions for visualizations based on the data type (using expert opinions)
- Generating a unique URL for each data visualization for the user to copy