Project Story

Inspiration: When brainstorming project ideas, we wanted something that would be both enjoyable to work on and mutually beneficial. We decided to focus on a project that could help a company gain insights from a small amount of data, while still allowing us to expand our analysis and conduct our own research. This led us to choose StrataScratch, as it gave us the opportunity to familiarize ourselves with how companies operate using limited data.

How it works: This dataset helps us identify the key indicators that significantly impact whether a person who signs up with Uber will go on to start driving. It also lets us look more closely at how Uber can use these insights to target the right individuals, based on the factors that most influence activation, and ultimately improve its driver onboarding and retention strategies.

Built-in Foundation: We built this model using several predictive algorithms, including Random Forest classifiers and gradient-boosted decision trees (XGBoost/LightGBM), to evaluate accuracy and test which variables influence the dependent variable, started_driving. After identifying which features had the most significant impact and comparing model performance, we conducted further Exploratory Data Analysis (EDA) to understand why these indicators matter and what broader implications they may have beyond the dataset.
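
A minimal sketch of this workflow, assuming scikit-learn and synthetic stand-in data rather than the actual dataset. Every column name except started_driving is hypothetical, and the hyperparameters are illustrative only. It trains a Random Forest, reports held-out accuracy, and ranks features by importance:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; all feature names here are hypothetical.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "vehicle_added": rng.integers(0, 2, n),
    "days_to_background_check": rng.integers(0, 30, n),
    "signup_channel_code": rng.integers(0, 3, n),
})

# Make the target depend mostly on vehicle_added, with about 10% label noise.
flip = rng.random(n) < 0.1
df["started_driving"] = np.where(flip, 1 - df["vehicle_added"], df["vehicle_added"])

X = df.drop(columns="started_driving")
y = df["started_driving"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Held-out accuracy and impurity-based feature importances, highest first.
accuracy = accuracy_score(y_test, model.predict(X_test))
importances = pd.Series(
    model.feature_importances_, index=X.columns
).sort_values(ascending=False)

print(f"accuracy: {accuracy:.3f}")
print(importances)
```

The same train/evaluate/rank loop can be repeated with a gradient-boosting classifier to compare model performance on identical splits.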

Challenges: Our biggest challenge was cleaning and handling the data. We encountered a large number of missing values and had to decide whether to replace them with binary values (0 or 1) or fill them with the median to potentially improve model accuracy. On top of that, we had to engineer new columns for the dataset to enhance our analysis and make the insights more meaningful.
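
The two missing-value strategies and a simple engineered column can be sketched with pandas as follows. This is an illustration on a tiny made-up table, not the actual dataset: the column names (signup_date, bgc_date, vehicle_year) and the derived columns are assumptions.

```python
import numpy as np
import pandas as pd

# Tiny made-up table; column names are hypothetical.
df = pd.DataFrame({
    "signup_date": pd.to_datetime(
        ["2016-01-02", "2016-01-05", "2016-01-10", "2016-01-12"]
    ),
    "bgc_date": pd.to_datetime(["2016-01-04", None, "2016-01-15", None]),
    "vehicle_year": [2012.0, np.nan, 2015.0, 2010.0],
})

# Option 1: turn a mostly-missing column into a binary flag (1 = present, 0 = missing).
df["bgc_completed"] = df["bgc_date"].notna().astype(int)

# Option 2: fill missing numeric values with the column median.
median_year = df["vehicle_year"].median()
df["vehicle_year_filled"] = df["vehicle_year"].fillna(median_year)

# Engineered column: days between signup and background check (NaN if never completed).
df["days_to_bgc"] = (df["bgc_date"] - df["signup_date"]).dt.days

print(df[["bgc_completed", "vehicle_year_filled", "days_to_bgc"]])
```

A binary flag suits columns where missingness itself is informative (the step never happened), while median imputation suits numeric columns where a value exists but was not recorded.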

Accomplishments: One of our proudest accomplishments was collaborating effectively as a team to brainstorm and develop a solution that could genuinely benefit the company. We’re especially proud of the visualizations and predictive models we created, which clearly represent our findings and offer valuable insights to help Uber recruit and retain more drivers.

Learning Outcome: Overall, we learned how to apply different predictive models to identify which variables significantly impact whether a person will start driving. We also gained hands-on experience with Power BI, where we learned how to create effective data visualizations to support our presentation. Most importantly, we learned how to tell a compelling story with data so that our audience can clearly understand our analysis and insights.

What's next for the Unforeseen: The next step for the Unforeseen could involve comparing our dataset with a competitor’s, such as Lyft’s, if external data is available. This would allow us to analyze and contrast the significant factors influencing driver activation across both platforms. From there, we could identify key differences and suggest improvements that may help Uber better understand what motivates drivers to start driving.
