We set out to test a specific hypothesis about JetBlue's customer satisfaction. Having recently had flights canceled due to poor weather, we wanted to explore how an airline's delays and cancellations affect customer experience, and to quantify how much a delay hurts customer satisfaction compared with a cancellation. This metric could then feed into an airline's real-time decision support system to allow more effective decision making.
What it does
We scraped over 200k tweets and 175k Tripadvisor reviews and ran them through Google's sentiment analysis API to get a normalized, day-by-day metric of JetBlue's customer satisfaction over time. We then wrote custom flight statistics scraping software to get each carrier's daily cancellations and delays. Finally, we analyzed how cancellations and delays affected customer satisfaction and wrote a simple front end for querying JetBlue customer tweets by keyword.
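One way to frame the last analysis step is as a regression of daily sentiment on daily delay and cancellation rates. The following is a minimal sketch under that assumption, not a record of the exact analysis: the linear model, the function name `fit_sentiment_model`, and the sample numbers are all made up for illustration.

```python
def fit_sentiment_model(delay_rates, cancel_rates, sentiments):
    """Least-squares fit of: sentiment = b0 + b1 * delay_rate + b2 * cancel_rate.

    Returns (b0, b1, b2). The linear form is an illustrative assumption.
    """
    rows = [(1.0, d, c) for d, c in zip(delay_rates, cancel_rates)]
    # Normal equations (X^T X) b = X^T y, stored as an augmented 3x4 matrix.
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * y for r, y in zip(rows, sentiments))]
        for i in range(3)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; cannot fit")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(3):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return tuple(aug[i][3] / aug[i][i] for i in range(3))


# Synthetic daily data (not real measurements): fraction of flights delayed,
# fraction canceled, and a sentiment value generated from known coefficients.
delays = [0.10, 0.20, 0.30, 0.15, 0.25]
cancels = [0.01, 0.05, 0.02, 0.04, 0.03]
sentiment = [0.6 - 0.5 * d - 2.0 * c for d, c in zip(delays, cancels)]

b0, b1, b2 = fit_sentiment_model(delays, cancels, sentiment)
```

Comparing `b1` with `b2` gives a rough sense of how much more a cancellation hurts sentiment than a delay, per unit of rate.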
How we built it
Our whole project is written in Python. We used Google Firestore to store all of our scraped data, and Selenium and PhantomJS for our custom flight statistics data acquisition. We ran our scraping on Google Cloud instances so we were not bombarding YHack's Wi-Fi network. We used Google's Natural Language Processing libraries to run sentiment analysis on our scraped text.
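Google's Natural Language API reports sentiment as a score between -1.0 and 1.0. As a minimal sketch of how per-post scores could be rolled up into a normalized daily metric, the example below averages scores by day and rescales them to a 0 to 1 range. The `(date, score)` record layout, the function name `daily_sentiment`, and the rescaling are assumptions for illustration; the API call and the Firestore reads are omitted.

```python
from collections import defaultdict
from datetime import date


def daily_sentiment(records):
    """Average per-post sentiment by day, rescaled from [-1, 1] to [0, 1].

    `records` is an iterable of (date, score) pairs; this layout is hypothetical.
    """
    totals = defaultdict(lambda: [0.0, 0])  # day -> [sum of scores, post count]
    for day, score in records:
        if not -1.0 <= score <= 1.0:
            raise ValueError(f"score out of range: {score}")
        totals[day][0] += score
        totals[day][1] += 1
    # Mean score per day, shifted and scaled so 0 is most negative, 1 most positive.
    return {day: (s / n + 1.0) / 2.0 for day, (s, n) in sorted(totals.items())}


# Made-up scores for two arbitrary days.
sample = [
    (date(2020, 1, 1), 0.5),
    (date(2020, 1, 1), -0.5),
    (date(2020, 1, 2), 1.0),
]
metric = daily_sentiment(sample)
```

Averaging before rescaling keeps the metric comparable across days with very different post volumes, although days with only a handful of posts will still be noisy.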
Challenges we ran into
Accomplishments that we are proud of
Our sheer volume of data is impressive. Our high-performance, low-latency Google Cloud instances let us scrape far more data than we initially thought possible. Our flight delay and cancellation statistics are also a uniquely crafted data source.
What we learned
How to build high-performance data pipelines and scraping systems. How to build web apps with Flask in Python.
What's next for SSS — Sentiment Support System
As we collect more data, the statistical significance and strength of our observed trends improve. We can currently only get flight information for the past 60 days, but with a longer history we would be able to solidify our models further. We would also love to do finer-grained analysis with keywords relating to specific parts of JetBlue's customer experience.