As college students, we place great emphasis on healthy living, but those motives can get lost in the inevitable deluge of work. Clear visualizations of health data help us (and others across the country) recognize what keeps humans well and act to preserve our fitness. The CDC produces such consistently robust datasets that we were immediately drawn to BRFSS, and it complemented our group's love of physical activity.
What it does
First, our program uses R to read 14 years of CDC BRFSS data (2004-2017), amounting to roughly 6 million observations across 360 fields. It then selects the eight variables we judged most relevant to healthy living, cleans the data, isolates those fields, and groups and averages them by year and state.
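The group-and-average step can be sketched as follows. Our actual pipeline was written in R; this Python version is only an illustration, and the field name `sleep_hours` and the sample records are made up rather than taken from BRFSS.

```python
from collections import defaultdict
from statistics import mean

def average_by_year_and_state(rows, field):
    """Group rows by (year, state) and average one field, skipping missing values."""
    groups = defaultdict(list)
    for row in rows:
        value = row.get(field)
        if value is None:  # None stands in for a missing or refused response
            continue
        groups[(row["year"], row["state"])].append(value)
    return {key: mean(values) for key, values in groups.items()}

# Made-up sample records, not real BRFSS values
rows = [
    {"year": 2004, "state": "NC", "sleep_hours": 7.0},
    {"year": 2004, "state": "NC", "sleep_hours": 6.0},
    {"year": 2004, "state": "VA", "sleep_hours": None},
    {"year": 2005, "state": "NC", "sleep_hours": 8.0},
]
averages = average_by_year_and_state(rows, "sleep_hours")
```

A state-year with no valid responses simply has no entry in the result, so the map can render it as "no data" instead of as zero.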
These data inform a web-based animation in which a user can choose the variable she wishes to visualize and watch as an animated map of the United States progresses through that variable (with state-level granularity) over time. The animation can be played, paused, or stopped, and descriptions accompany each variable.
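One way to feed such an animation is to arrange the state-level averages into one frame per year and map each value to a color intensity. The sketch below is hypothetical: the function names `build_frames` and `shade` and the 0-255 intensity scale are assumptions for illustration, not a description of our web app's code.

```python
def build_frames(averages):
    """Turn {(year, state): value} into a year-ordered list of (year, {state: value})."""
    by_year = {}
    for (year, state), value in averages.items():
        by_year.setdefault(year, {})[state] = value
    return sorted(by_year.items())

def shade(value, lo, hi):
    """Map a value to a 0-255 intensity, clamped to [lo, hi] (assumes hi > lo)."""
    clamped = min(max(value, lo), hi)
    return round(255 * (clamped - lo) / (hi - lo))

# Made-up averages, not real BRFSS values
frames = build_frames({(2005, "NC"): 7.0, (2004, "NC"): 6.0, (2004, "VA"): 8.0})
```

Playing, pausing, or stopping the animation then amounts to stepping through, holding, or resetting an index into `frames`.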
As an added dimension (pun somewhat intended), we implemented a Unity-based 3D visualizer that reads our custom-generated QR codes, translates the encoded bytes, and applies the on-screen data in real time to an AR map of the United States.
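To give a sense of what "translating the encoded bytes" can involve, here is a minimal sketch of packing one year of state values into a compact byte payload and unpacking it again. The layout (a 2-byte year followed by one quantized byte per state) is an assumption chosen for illustration and is not our actual QR format; generating and scanning the QR image itself is left out.

```python
import struct

def encode_frame(year, values, lo, hi):
    """Pack a year and per-state values into bytes (hypothetical layout, assumes hi > lo)."""
    span = hi - lo
    # Clamp each value to [lo, hi] and quantize it to a single byte (0-255)
    quantized = [round(255 * (min(max(v, lo), hi) - lo) / span) for v in values]
    return struct.pack(f">H{len(quantized)}B", year, *quantized)

def decode_frame(payload, lo, hi):
    """Reverse encode_frame: return the year and the approximate original values."""
    count = len(payload) - 2  # the first two bytes hold the year
    year, *quantized = struct.unpack(f">H{count}B", payload)
    return year, [lo + q * (hi - lo) / 255 for q in quantized]

# Three made-up values standing in for the full list of states
payload = encode_frame(2010, [5.0, 7.5, 10.0], lo=5.0, hi=10.0)
year, values = decode_frame(payload, lo=5.0, hi=10.0)
```

With a layout like this, a full frame for all 50 states fits in 52 bytes, at the cost of a small quantization error in each value.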
How we built it
We began by observing, reading, cleaning, and analyzing the data. Once we had a clear direction, we started developing the web app while simultaneously refreshing our rusty knowledge of Unity. Once each piece was in close-to-working order, we implemented the QR code system to link them together.
Challenges we ran into
First, preparing the CDC datasets for analysis was tedious. They were presented in fixed-width file (FWF) format, which is uncommon and difficult to work with, and the fields were encoded. Even worse, the encoding methodologies and field names were listed exclusively in PDFs and changed each year, which meant each year's dataset had to be analyzed in a totally different fashion. This lack of consistency was frustrating, particularly because it was often tantalizingly close to workable; some columns, for example, were only off by one or two positions out of 2,000+.
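The shifting column positions are easiest to handle with a separate layout table per year. The sketch below shows the idea in Python (our own processing was done in R); the column positions and field names are invented for illustration and do not come from the CDC codebooks.

```python
# Hypothetical per-year layouts: field name -> (start column, end column),
# 1-based and inclusive, as a codebook might list them.
LAYOUTS = {
    2004: {"state": (1, 2), "sleep_hours": (5, 6)},
    2005: {"state": (1, 2), "sleep_hours": (6, 7)},  # same field, shifted by one column
}

def parse_line(line, year):
    """Slice one fixed-width record using the layout for its year."""
    record = {}
    for name, (start, end) in LAYOUTS[year].items():
        raw = line[start - 1:end].strip()  # convert 1-based inclusive to a Python slice
        record[name] = raw or None         # blank fields become None
    return record

# Made-up records, not real BRFSS lines
record_2004 = parse_line("37xx07yyy", 2004)
record_2005 = parse_line("37xxx08yy", 2005)
```

Keeping the positions in data rather than in code means a one-column shift between years changes a single table entry instead of the parsing logic.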
We also had a lot of trouble getting the frame rate high enough that all 50 states could update at the same time after parsing a QR code and still look good. The other major challenge was working with Vuforia for an AR element, which ultimately did not pan out because of the complications of handling multiple video streams for different sides of data processing.
Encoding and reading the QR codes proved somewhat challenging.
Accomplishments that we're proud of
We are extremely proud of BRFSS: The Most Important Meal of the Day; each of us was able to utilize our unique skills. The data-oriented among us got to wrangle massive datasets; the heavy coders produced advanced QR code algorithms; and the visualizers got to work in both two and three dimensions.
What's next for BRFSS: The Most Important Meal of the Day
We'd love to apply our model not just to health data but also to other kinds of information, making them more accessible via dual 2D and 3D animations. Our framework is applicable to any dataset with geographic boundaries, so it would also be cool to generalize further and build an ML model that could recognize the type of dataset and display it properly without hard-coding specific fields.