Inspiration
Our project is inspired by a project named Order Delivery Microservice (https://github.com/kbastani/order-delivery-microservice-example). We wanted to implement event stream processing, Change Data Capture (CDC), and real-time analytics dashboards.
What it does
Real-Time Taxi HotMap is a web application that harnesses the capabilities of MongoDB Atlas, Google's Pub/Sub, and BigQuery to visualize real-time taxi pickups and drop-offs on an engaging and interactive heatmap. By capturing and processing real-time changes in taxi data, our application provides users with dynamic insights into taxi movement patterns. The heatmap offers immediate visibility into areas of high demand, popular drop-off locations, and traffic patterns, empowering users to make data-driven decisions.
How we built it
Real-Time Taxi HotMap was built using a microservices architecture. We developed the backend in Node.js, using MongoDB Atlas change streams to capture real-time changes in taxi data. Google's Pub/Sub served as a reliable and scalable message broker, carrying the change events emitted by MongoDB. The data was then loaded into BigQuery for efficient aggregation and analysis. On the frontend, we used a modern JavaScript framework to fetch and render the real-time taxi data on the heatmap.
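To make the hand-off between MongoDB and Pub/Sub more concrete, here is a minimal sketch of the step that turns one change event into a message payload. It is written in Python for brevity, although our backend is Node.js, and it is not our production code. The `operationType` and `fullDocument` fields are part of MongoDB's change event format; the function name `change_event_to_message`, the set of forwarded operations, and the output layout are assumptions made for illustration.

```python
import json
from typing import Any, Optional

# Assumption: only events that carry a full document are worth forwarding.
FORWARDED_OPERATIONS = {"insert", "update", "replace"}


def change_event_to_message(event: dict[str, Any]) -> Optional[bytes]:
    """Convert a MongoDB change event into a Pub/Sub message payload.

    Returns None for events that should be skipped, such as deletes or
    updates that arrive without a full document.
    """
    operation = event.get("operationType")
    if operation not in FORWARDED_OPERATIONS:
        return None

    document = event.get("fullDocument")
    if document is None:
        return None

    # Hypothetical payload layout. default=str stringifies values that JSON
    # cannot encode directly, such as ObjectId or datetime instances.
    row = {"operation": operation, "document": document}

    # Pub/Sub message data is a byte string, so encode the JSON text.
    return json.dumps(row, default=str, sort_keys=True).encode("utf-8")
```

In a full pipeline, the returned bytes would be handed to the publisher client of whichever Pub/Sub library the backend uses, and a subscriber would load the decoded rows into BigQuery.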
Challenges we ran into
- Keeping the data accurate: We first tried to use the Google Maps API to estimate the trip duration between the pickup and drop-off locations. Because we are building a real-time hotmap, the simulator runs continuously, so our Google Maps API requests reached the daily limit and triggered a warning that a potential violation of the Acceptable Use Policy had been detected. We solved the problem by using the Haversine formula to estimate the trip distance, and from it the trip duration, instead of calling the Google Maps API.
- Limited storage capacity with MongoDB Atlas Free Tiers: MongoDB Atlas free tiers imposed storage limitations, preventing us from storing all past data. We needed a data warehousing solution to address this constraint.
- Data bottleneck when streaming and aggregating in MongoDB Atlas: Streaming and aggregating data simultaneously in MongoDB Atlas resulted in a data bottleneck. We sought a solution to enable efficient processing and analysis of real-time taxi data.
- Introducing CDC and BigQuery as a solution: To overcome the storage and data bottleneck challenges, we implemented Change Data Capture (CDC) and utilized BigQuery. CDC captured and replicated only data changes, reducing storage requirements. BigQuery facilitated efficient aggregations and insights from real-time and historical data.
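The Haversine workaround from the first challenge above can be sketched as follows. The formula itself gives the great-circle distance between two coordinates; turning that into a duration needs a speed, and the 20 km/h average used here is a hypothetical value chosen for illustration, not the figure from our simulator. The function names are also illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius
ASSUMED_AVG_SPEED_KMH = 20.0  # hypothetical city driving speed (assumption)


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)

    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def estimate_trip_minutes(
    lat1: float,
    lon1: float,
    lat2: float,
    lon2: float,
    speed_kmh: float = ASSUMED_AVG_SPEED_KMH,
) -> float:
    """Estimate trip duration in minutes from distance and an assumed speed."""
    return haversine_km(lat1, lon1, lat2, lon2) / speed_kmh * 60.0
```

A straight-line distance is always shorter than the route a taxi actually drives, so this estimate is a lower bound rather than a replacement for routed travel times. For a simulator that only needs plausible durations without external API calls, that trade-off was acceptable.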
Accomplishments that we're proud of
What we learned
Throughout the development process of Real-Time Taxi HotMap, we gained valuable insights and experiences that shaped our understanding and expertise:
- Effective utilization of limited storage capacity: Working with MongoDB Atlas free tiers, we learned how to efficiently manage and leverage limited storage capacity. We explored strategies to optimize data storage, ensuring that we captured and retained the most relevant information within the constraints.
- Implementation of Pub/Sub and CDC for real-time data processing: By integrating Google's Pub/Sub and implementing Change Data Capture (CDC), we deepened our understanding of real-time data processing. We learned how to capture and replicate data changes efficiently, enabling real-time updates and enhancing performance.
- Leveraging BigQuery for efficient data analysis: Utilizing BigQuery, we discovered the power of a robust data warehousing and analysis platform. We learned how to structure and query data effectively, enabling us to perform efficient data analysis, aggregations, and gain valuable insights from real-time and historical data.
- Importance of optimization in performance: Dealing with large volumes of real-time data streams, we encountered performance optimization challenges. Through careful analysis and experimentation, we learned techniques to optimize the processing and handling of real-time data, ensuring smooth performance and responsiveness.
- Collaborative teamwork and effective communication: Developing Real-Time Taxi HotMap as a team taught us the importance of collaborative teamwork and effective communication. We recognized the value of clear communication, streamlined coordination, and shared responsibilities in achieving our goals efficiently.
The knowledge and experiences gained from these lessons will undoubtedly guide us in future projects, ensuring the successful implementation of real-time data processing, efficient data analysis, and collaborative teamwork.
What's next for Real-Time Taxi HotMap
We want to introduce Chat2Query to our project in the future. It would let users query our BigQuery data conversationally. It leverages artificial intelligence to bridge the gap between human language and structured data, making it easier for users to look up information about New York Uber trips without having to learn complex query languages.
We have an exciting roadmap for Real-Time Taxi HotMap, focused on enhancing the user experience and unlocking the full potential of the data stored in BigQuery:
Search bar functionality: Our next major feature addition will be the integration of a search bar above the hotmap. This search bar will enable users to enter natural language queries, providing a user-friendly way to explore specific taxi pickup and drop-off patterns based on their preferences.
AI-driven query translation: To enable efficient querying and data aggregation, we plan to leverage the power of AI-driven natural language processing. By integrating either OpenAI or Vertex AI, we will utilize their capabilities to translate the user's natural language queries from the search bar into database queries. This AI-driven translation will facilitate seamless interaction with the application and enable users to express their queries in a more intuitive manner.
Continuous performance optimization: We remain committed to continuously optimizing the performance and scalability of Real-Time Taxi HotMap. Our focus will be on fine-tuning the system to handle larger volumes of real-time taxi data, ensuring smooth responsiveness, and delivering a seamless user experience.
WebSocket integration for real-time simulation: In order to simulate more realistic real-time data, we will incorporate WebSocket into the simulator and backend. This enhancement will enable a more dynamic and lifelike simulation of real-time taxi data, further enhancing the overall realism of the application.
These planned enhancements, including the AI-driven query translation, data aggregation in BigQuery, and WebSocket integration, will empower users to effortlessly explore real-time taxi data and uncover valuable insights. We are excited about the future of Real-Time Taxi HotMap and its potential to revolutionize the way users interact with and analyze taxi data.