How we built it
Interpreting the challenge statement, we began searching for datasets that would let us correlate the recovery of the travel and tourism industry in mid-to-late 2021 with regional differences in the impact of COVID-19. Beyond infection rates, we identified vaccine hesitancy, estimated at county and local levels within the United States, as a potentially pivotal factor in public travel sentiment.
We used both the “Vaccine Hesitancy for COVID-19: County and local estimates” and “United States COVID-19 Cases and Deaths” datasets from the Centers for Disease Control and Prevention (CDC) data catalog, which provided estimates of infection rates, vaccine hesitancy, and social vulnerability for each county and state in the country.
Our primary travel data came from the Bureau of Transportation Statistics, which provided various monthly metrics on flights from every U.S. airport from October 2018 to October 2021.
Finally, we used several smaller datasets from the Federal Reserve Economic Data (FRED) collection maintained by the Federal Reserve Bank of St. Louis. These datasets covered airline load factors, available seat miles, and producer price indexes, which helped us identify additional indicators, counter-indicators, and contributing factors.
Using these data sources, we computed correlations and modeled relationships between COVID-19 and the decline and subsequent recovery of the transportation and travel industry.
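To make the correlation and modeling step concrete, here is a minimal sketch of one way such an analysis could look. It is not our actual pipeline: the function names, the idea of a "recovery ratio" (2021 flights divided by 2019 flights for the same region), and the sample values are all assumptions for illustration, and the numbers are toy placeholders rather than real CDC or BTS figures.

```python
from math import sqrt


def recovery_ratio(flights_2021, flights_2019):
    """Hypothetical recovery metric: 2021 flight count relative to a 2019 baseline."""
    if flights_2019 <= 0:
        raise ValueError("baseline flight count must be positive")
    return flights_2021 / flights_2019


def _centered_sums(xs, ys):
    """Return (covariance sum, x variance sum, y variance sum, mean_x, mean_y)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov, var_x, var_y, mean_x, mean_y


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two sequences."""
    cov, var_x, var_y, _, _ = _centered_sums(xs, ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    cov, var_x, _, mean_x, mean_y = _centered_sums(xs, ys)
    if var_x == 0:
        raise ValueError("cannot fit a line when all x values are equal")
    slope = cov / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Toy placeholder values, NOT real data: estimated share of hesitant
    # residents per region, and flight counts for the same regions.
    hesitancy = [0.10, 0.15, 0.20, 0.25]
    flights_2019 = [1000, 800, 1200, 500]
    flights_2021 = [950, 720, 960, 375]

    recovery = [recovery_ratio(a, b) for a, b in zip(flights_2021, flights_2019)]
    r = pearson_r(hesitancy, recovery)
    slope, intercept = fit_line(hesitancy, recovery)
    print(f"r = {r:.3f}, slope = {slope:.3f}, intercept = {intercept:.3f}")
```

The sketch uses only the standard library so the arithmetic stays visible. In practice, joining county-level estimates to airport-level flight data would also need a mapping between counties and airports, which is not shown here.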
Challenges we ran into
Finding datasets that were both relevant and usable enough to support a meaningful conclusion took considerable time. Once we had them, we struggled to build proper visualizations and regression models, problems we eventually overcame despite labeling errors in some extremely long datasets.