Inspiration
The rise of IoT and sensor-driven applications highlighted the need for a system that can process data streams in real time, detect anomalies quickly, and make insights available through natural-language conversation. We wanted to combine vector search, anomaly detection, and conversational AI into a single pipeline that not only monitors sensor data but also interacts with users through Slack.
What it does
- Ingests streaming sensor readings (simulated with Python).
- Stores records in TiDB with 1536-dimension embeddings for semantic similarity.
- Detects anomalies based on bootstrap rules and cosine distance thresholds.
- Triggers Slack alerts as soon as anomalies are detected.
- Provides a Slack bot where users can request visualizations, KPIs, or insights.
- Uses LLMs to classify each query as either a semantic (vector similarity) search or a direct SQL query, and returns the results as charts or numbers.
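
The detection step listed above can be illustrated with a minimal sketch. It assumes a reading is flagged when it breaks a bootstrap rule or when its embedding lies too far, by cosine distance, from a baseline embedding. The field names, value ranges, and the 0.3 threshold are hypothetical and are not taken from the project.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Hypothetical bootstrap rules: allowed (low, high) range per sensor field.
BOOTSTRAP_RULES: Dict[str, Tuple[float, float]] = {
    "temperature_c": (-10.0, 60.0),
    "humidity_pct": (0.0, 100.0),
}

# Hypothetical threshold; a real value would be tuned on observed data.
COSINE_DISTANCE_THRESHOLD = 0.3


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity (0 = same direction, 2 = opposite)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def violated_rules(reading: Dict[str, float]) -> List[str]:
    """Return the names of fields that fall outside their bootstrap range."""
    violations = []
    for field, (low, high) in BOOTSTRAP_RULES.items():
        value = reading.get(field)
        if value is not None and not (low <= value <= high):
            violations.append(field)
    return violations


def is_anomaly(
    reading: Dict[str, float],
    embedding: Sequence[float],
    baseline_embedding: Sequence[float],
) -> bool:
    """Flag a reading that breaks a rule or drifts too far from the baseline."""
    if violated_rules(reading):
        return True
    distance = cosine_distance(embedding, baseline_embedding)
    return distance > COSINE_DISTANCE_THRESHOLD
```

In this sketch the rule check runs first because it is cheap; the embedding comparison only decides the outcome for readings that pass every rule.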
How we built it
- Simulated streaming data using Python to mimic real-time sensor readings.
- Ingested the data into TiDB, generating embeddings and creating an HNSW-based vector index for fast similarity searches.
- Defined bootstrap rules for anomaly detection and built logic to compare embeddings with cosine distance.
- Integrated Slack for real-time alerts and built a Slack bot for interactive queries.
- Used LLMs to classify queries and generate embeddings, then ran semantic similarity searches in TiDB to retrieve results.
- Delivered results back to Slack in user-friendly formats like charts and KPI summaries.
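
The storage and similarity-search steps above might look roughly like the following. This is a sketch, not the project's actual schema: the table name, column names, and helper functions are hypothetical. The `VECTOR` column type, the `VECTOR INDEX` clause, and `VEC_COSINE_DISTANCE` follow TiDB's vector search SQL syntax; prerequisites for vector indexes depend on the deployment, so check TiDB's documentation before relying on it.

```python
from typing import Sequence, Tuple

EMBEDDING_DIM = 1536  # embedding dimension stated in the write-up

# Hypothetical table layout with an HNSW vector index on cosine distance.
CREATE_TABLE_SQL = f"""
CREATE TABLE IF NOT EXISTS sensor_readings (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    sensor_id VARCHAR(64) NOT NULL,
    reading DOUBLE NOT NULL,
    recorded_at DATETIME NOT NULL,
    embedding VECTOR({EMBEDDING_DIM}),
    VECTOR INDEX idx_embedding ((VEC_COSINE_DISTANCE(embedding)))
)
"""

# Nearest-neighbour lookup; %s placeholders are in the style used by
# MySQL-compatible Python drivers such as PyMySQL.
NEAREST_SQL = """
SELECT id, sensor_id, reading,
       VEC_COSINE_DISTANCE(embedding, %s) AS distance
FROM sensor_readings
ORDER BY distance
LIMIT %s
"""


def to_vector_literal(values: Sequence[float], dim: int = EMBEDDING_DIM) -> str:
    """Format an embedding as a '[v1,v2,...]' string for a VECTOR column."""
    if len(values) != dim:
        raise ValueError(f"expected {dim} values, got {len(values)}")
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


def nearest_params(query_embedding: Sequence[float], k: int) -> Tuple[str, int]:
    """Build the parameter tuple for NEAREST_SQL."""
    if k <= 0:
        raise ValueError("k must be positive")
    return (to_vector_literal(query_embedding), k)
```

With a MySQL-compatible driver, the query would then run as `cursor.execute(NEAREST_SQL, nearest_params(query_embedding, 5))`, returning the five stored readings closest to the query embedding.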
Challenges we ran into
- Designing bootstrap rules that are flexible yet effective for anomaly detection.
- Optimizing vector similarity search for both speed and accuracy using high-dimensional embeddings.
- Handling real-time ingestion and ensuring anomaly alerts are triggered with minimal latency.
- Seamlessly integrating Slack with both alerting and conversational query workflows.
- Ensuring that the LLM correctly classifies queries and generates meaningful responses.
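
One way to reduce the impact of a misclassified query, sketched here under assumptions, is to validate the label the LLM returns and fall back to a default route when it is not recognized. The label names, the prompt text, and the `llm` callable (any function that takes a prompt and returns text) are hypothetical stand-ins, not the project's actual interface.

```python
from typing import Callable

# Hypothetical label names for the two query routes.
ALLOWED_LABELS = {"semantic", "sql"}

# Hypothetical prompt; the wording is illustrative only.
CLASSIFY_PROMPT = (
    "Classify the user question as 'semantic' (similarity search over "
    "embeddings) or 'sql' (direct aggregate or filter query). "
    "Answer with one word.\n\nQuestion: {question}"
)


def classify_query(
    question: str,
    llm: Callable[[str], str],
    default: str = "semantic",
) -> str:
    """Return a validated route label, falling back to `default`."""
    raw = llm(CLASSIFY_PROMPT.format(question=question))
    # Normalize whitespace, surrounding punctuation, and case before checking.
    label = raw.strip().strip(".\"'").lower()
    return label if label in ALLOWED_LABELS else default


def route_query(
    question: str,
    llm: Callable[[str], str],
    semantic_handler: Callable[[str], object],
    sql_handler: Callable[[str], object],
) -> object:
    """Dispatch the question to the handler chosen by the classifier."""
    if classify_query(question, llm) == "sql":
        return sql_handler(question)
    return semantic_handler(question)
```

Because the model is passed in as a plain callable, the routing logic can be exercised with a stub function, without calling a real LLM.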
Accomplishments that we're proud of
- Built an end-to-end pipeline combining streaming ingestion, vector search, anomaly detection, and conversational analytics.
- Used TiDB’s HNSW-based vector index to run efficient similarity checks on embeddings.
- Created a Slack bot that not only delivers anomaly alerts but also answers user queries with charts and KPIs.
- Demonstrated a scalable solution that can be extended to real-world IoT and monitoring use cases.
What we learned
- How to leverage TiDB’s vector index for real-time semantic similarity searches.
- Best practices for embedding generation and anomaly detection using cosine distance.
- Practical integration of LLMs for query understanding and classification.
- The importance of user experience: delivering insights directly in Slack makes the system far more accessible.
What's next for Real-Time Data Monitoring & Interactive Analytics with TiDB
- Expand the anomaly detection framework with adaptive thresholds and ML-based models.
- Integrate real-world sensor data streams instead of simulation.
- Enhance the Slack bot with richer visualizations and natural language explanations.
- Add support for more collaboration platforms (e.g., Teams, Discord).
- Scale the architecture to handle larger datasets and higher ingestion rates.
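
As an illustration of what an adaptive threshold could look like, the sketch below replaces a fixed cosine distance cutoff with the rolling mean plus `k` standard deviations of recent distances. This is one possible approach and not a committed design; the class name, window size, `k`, and fallback value are all hypothetical.

```python
import statistics
from collections import deque


class AdaptiveThreshold:
    """Rolling mean + k * standard deviation over recent normal distances."""

    def __init__(
        self,
        window: int = 100,
        k: float = 3.0,
        min_samples: int = 10,
        fallback: float = 0.3,
    ) -> None:
        self.values = deque(maxlen=window)  # most recent non-flagged distances
        self.k = k
        self.min_samples = min_samples
        self.fallback = fallback  # fixed threshold used until enough data

    def current(self) -> float:
        """Return the threshold implied by the current window."""
        if len(self.values) < self.min_samples:
            return self.fallback
        mean = statistics.fmean(self.values)
        spread = statistics.pstdev(self.values)
        return mean + self.k * spread

    def observe(self, distance: float) -> bool:
        """Return True if `distance` exceeds the threshold; record it otherwise."""
        flagged = distance > self.current()
        if not flagged:
            # Only normal values update the baseline, so a burst of
            # anomalies does not inflate the threshold.
            self.values.append(distance)
        return flagged
```

Excluding flagged values from the window keeps the baseline stable, at the cost of adapting slowly if the sensor's normal behaviour genuinely shifts; that trade-off would need to be evaluated on real data.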
Built With
- flask
- llm
- natural-language-processing
- pysql
- python
- slack
- tidb
