Big Data Analytics Project

Team Members-

Name              Net ID   University ID
Vaibhav Aggarwal  va771    N15019899
Ginni Malik       gm1908   N18379090

Project Choice – We have selected the project area “Exploring NYC Taxi Trips”. We will use the following datasets –

  • Yellow/Green Taxi Data
  • Taxi Data
  • Uber Data
  • CitiBike Data
  • Demographics

We are also planning to work with Stock Data and study various patterns.

Tasks to be carried out –

  • Analyze the taxi data to build statistical models, for example identifying regions with a large number of drop-offs during morning peak hours in order to locate work areas.
  • Report other statistics, such as the time interval of the day when the most taxis are used and the areas with the fewest taxi drop-offs.
  • Identify the most popular taxi destinations, such as tourist attractions.
  • Study how taxi usage is distributed across seasons.
  • Analyze weather information and correlate it with taxi usage.
  • Determine the mode of transport preferred by residents, e.g. Citi Bike or taxis.
  • Study trip distances for Uber, NYC taxis, and Citi Bike. This may reveal patterns, for example that residents prefer Citi Bike for short distances.
  • Study the stock prices for NYC cabs and Uber to find any patterns or any impact they have on each other.
  • Create a website to visualize the data.
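
As an illustration of the first two tasks, the following is a minimal sketch of how drop-offs could be counted per region during morning peak hours and how the busiest hour of the day could be found. The sample records, the zone names, the timestamp format, and the 7 to 10 a.m. peak window are all assumptions made for this example; the real datasets may use different fields (for instance latitude/longitude instead of named zones), and the actual analysis would run on a far larger volume of data.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample records: (drop-off timestamp, drop-off zone).
# The layout and zone names are illustrative, not taken from the real dataset.
TRIPS = [
    ("2015-06-01 08:15:00", "Midtown"),
    ("2015-06-01 08:40:00", "Midtown"),
    ("2015-06-01 09:05:00", "Financial District"),
    ("2015-06-01 14:30:00", "Midtown"),
    ("2015-06-01 23:10:00", "Williamsburg"),
    ("2015-06-02 07:55:00", "Financial District"),
    ("2015-06-02 08:20:00", "Midtown"),
]

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp format


def morning_peak_dropoffs(trips, start_hour=7, end_hour=10):
    """Count drop-offs per zone where start_hour <= hour < end_hour."""
    counts = Counter()
    for timestamp, zone in trips:
        hour = datetime.strptime(timestamp, TIME_FORMAT).hour
        if start_hour <= hour < end_hour:
            counts[zone] += 1
    return counts


def busiest_hour(trips):
    """Return the hour of day (0-23) with the most drop-offs."""
    hours = Counter(
        datetime.strptime(timestamp, TIME_FORMAT).hour for timestamp, _ in trips
    )
    return hours.most_common(1)[0][0]


if __name__ == "__main__":
    # Zones ranked by number of morning-peak drop-offs, highest first.
    for zone, n in morning_peak_dropoffs(TRIPS).most_common():
        print(zone, n)
    print("Busiest hour:", busiest_hour(TRIPS))
```

Zones that rank highest in the morning-peak count would be candidate work areas; the same counting approach could be repeated per season or per transport mode for the other tasks.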


Date      Details
April 11  Submit the project proposal and Google Doc, and create the GitHub repo
April 18  Perform the analyses, revise the initial plan accordingly, and develop new ideas along the way
May 2     Create a web app as the user interface so that the analyzed data can be visualized interactively
May 14    Test the analyses and visualizations and submit the final report
May 17    Present the project analyses