Inspiration

In today's fast-paced manufacturing environment, machine downtime can lead to massive productivity loss. We were inspired to build MachInsight AI to empower industries with a real-time, intelligent monitoring system that not only detects potential faults in advance but also enables users to interact with their sensor data through natural language. Our goal was to bring together the best of data warehousing, machine learning, vector databases, and AI agents into a single unified platform.


What it does

MachInsight AI is a real-time IoT monitoring and diagnostics platform powered by AI. It:

  • Ingests real-time sensor data into BigQuery.
  • Aggregates data hourly and predicts fault probability using a Random Forest model.
  • Stores results and embeddings in MongoDB.
  • Builds a semantic search index with MongoDB’s vector search.
  • Provides an AI agent built using Google ADK and Gemini 2.0 Flash that answers queries like:
    • “Are there any machines with temperature greater than 45 today?”
    • “When did Machine 2 last turn faulty?”
    • “Find machines similar to Machine 5’s fault behavior.”
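
The similarity query in the last example comes down to comparing embedding vectors. The following is a minimal sketch in plain Python, assuming cosine similarity as the measure and a score cutoff of 0.9. The project itself relies on MongoDB's vector search; the helper names, the cutoff, and the three-dimensional vectors here are hypothetical and only illustrate the idea.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def find_similar(query_id, embeddings, min_score=0.9):
    """Rank the other machines by similarity to query_id, keeping scores >= min_score."""
    query = embeddings[query_id]
    scored = [
        (other_id, cosine_similarity(query, vec))
        for other_id, vec in embeddings.items()
        if other_id != query_id
    ]
    kept = [(machine, score) for machine, score in scored if score >= min_score]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical embeddings; real ones would come from an embedding model.
EMBEDDINGS = {
    "machine_5": [0.9, 0.1, 0.3],
    "machine_2": [0.8, 0.2, 0.3],
    "machine_7": [0.1, 0.9, 0.2],
}

if __name__ == "__main__":
    for machine, score in find_similar("machine_5", EMBEDDINGS):
        print(machine, round(score, 3))
```

Raising or lowering `min_score` trades recall for precision, which is the kind of threshold tuning a similarity search usually needs.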

How we built it

  • Data Ingestion: Sensor data is ingested in real time into BigQuery.
  • Data Processing & ML: Colab Enterprise notebooks on Vertex AI compute hourly summaries and run a trained Random Forest model to predict fault probability.
  • Storage & Embeddings: Results are stored in MongoDB; a second notebook creates embeddings and stores them in a vector index.
  • Agent & Tools: A Google ADK agent is configured with custom tools like:
    • machines_with_high_metrics
    • last_faulty_time
    • faulty_machines_summary
    • plot_machine_trend
    • find_similar_machine_events
  • Scheduling: Google Cloud Storage (GCS) and cron jobs run the notebooks in the correct order (snapshot → ML → embeddings).
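
To make the hourly-summary step and one of the agent tools concrete, here is a rough sketch in plain Python. It assumes each reading is a dict with `machine_id`, `timestamp`, and `temperature` fields, and that the tool filters on the hourly average. Only the tool name `machines_with_high_metrics` comes from the list above; its signature, the field names, and the sample data are assumptions, and the real tool reads from MongoDB rather than an in-memory list. The Random Forest prediction step is left out.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def hourly_summaries(readings):
    """Average each machine's temperature per clock hour."""
    buckets = defaultdict(list)
    for r in readings:
        # Truncate the timestamp to the start of its hour.
        hour = r["timestamp"].replace(minute=0, second=0, microsecond=0)
        buckets[(r["machine_id"], hour)].append(r["temperature"])
    return [
        {"machine_id": machine_id, "hour": hour, "avg_temperature": mean(temps)}
        for (machine_id, hour), temps in sorted(buckets.items())
    ]


def machines_with_high_metrics(summaries, threshold):
    """Return sorted IDs of machines with any hourly average above threshold."""
    return sorted(
        {s["machine_id"] for s in summaries if s["avg_temperature"] > threshold}
    )


# Hypothetical sample readings.
READINGS = [
    {"machine_id": 1, "timestamp": datetime(2025, 1, 1, 9, 5), "temperature": 40.0},
    {"machine_id": 1, "timestamp": datetime(2025, 1, 1, 9, 35), "temperature": 44.0},
    {"machine_id": 2, "timestamp": datetime(2025, 1, 1, 9, 10), "temperature": 47.0},
    {"machine_id": 2, "timestamp": datetime(2025, 1, 1, 10, 10), "temperature": 49.0},
]

if __name__ == "__main__":
    summaries = hourly_summaries(READINGS)
    print(machines_with_high_metrics(summaries, threshold=45))
```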

Challenges we ran into

  • Notebook Scheduling: Ensuring notebooks execute in the right sequence (snapshot → ML → embeddings) required custom orchestration with GCS and cron syntax.
  • Natural Language Flexibility: Parsing vague or flexible time-based user queries into accurate MongoDB queries took significant regex, date parsing, and validation effort.
  • Vector Search Fine-Tuning: Creating meaningful embeddings and tuning vector similarity thresholds for useful results took several iterations.
  • Access Issues: We hit signed-URL and permission errors when loading charts and accessing GCS files programmatically.
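
The time-phrase problem described above can be illustrated with a small sketch. It assumes only three kinds of phrases are supported ("today", "yesterday", and "last/past N hours or days") and that documents carry a datetime field named `hour`. The function names, the field name, and the supported phrases are hypothetical; the project's actual parsing covers more cases.

```python
import re
from datetime import datetime, timedelta


def parse_time_window(text, now):
    """Map a loose time phrase to a (start, end) pair, or None if unrecognized."""
    text = text.lower()
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if "yesterday" in text:
        return midnight - timedelta(days=1), midnight
    if "today" in text:
        return midnight, now
    match = re.search(r"(?:last|past)\s+(\d+)\s+(hour|day)s?", text)
    if match:
        amount, unit = int(match.group(1)), match.group(2)
        delta = timedelta(hours=amount) if unit == "hour" else timedelta(days=amount)
        return now - delta, now
    return None  # the caller should ask the user to rephrase


def to_mongo_filter(window, field="hour"):
    """Build a MongoDB-style range filter from a (start, end) window."""
    start, end = window
    return {field: {"$gte": start, "$lt": end}}


if __name__ == "__main__":
    now = datetime(2025, 1, 2, 15, 30)
    window = parse_time_window("temperature above 45 in the last 3 hours", now)
    print(to_mongo_filter(window))
```

Returning `None` for unrecognized phrases keeps validation explicit: the agent can ask a follow-up question instead of running a query over the wrong time range.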

Accomplishments that we're proud of

  • Built a seamless pipeline from data ingestion to AI agent interaction.
  • Created a modular agent with plug-and-play tools to analyze, visualize, and reason over sensor data.
  • Combined multiple technologies (BigQuery, Vertex AI, MongoDB, ADK, Gemini) into one working system.
  • Delivered a fully working prototype with real-time response capabilities for predictive maintenance.

What we learned

  • How to build and schedule machine learning pipelines in Vertex AI with Colab Enterprise.
  • How to integrate MongoDB vector search for semantic comparison of machine events.
  • How to use Google ADK to create agents with custom tools and enable multi-turn dialogue with contextual awareness.
  • Best practices for data modeling and real-time analytics on IoT sensor data.

What's next for MachInsight AI - Machine Sensing and Monitoring AI Agent

  • Integrate anomaly detection using autoencoders or LSTMs.
  • Add support for image/video sensor data for visual inspections.
  • Extend the agent to trigger automated actions like maintenance alerts or scheduling repair tasks.
  • Offer a user-facing dashboard powered by Looker Studio and an agent chat widget.
  • Explore Edge AI deployment to reduce latency and bandwidth usage.
