Inspiration
We were inspired by the revenue e-commerce sites lose when anomalies in recommendation systems go undetected, such as latency spikes that lead to cart abandonment. Our aim was to automate monitoring with Vertex AI and Datadog so engineers can remediate faster.
What it does
Detects anomalies in Vertex AI recommendation engines, such as latency spikes (>200 ms), model drift, and fraud patterns. Streams telemetry to Datadog, visualizes it on dashboards, and triggers rule-based incidents that carry context for engineers.
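As a rough illustration of the latency rule and the telemetry step, the sketch below checks samples against the 200 ms threshold and builds a request body in the shape used by Datadog's v2 metrics intake (POST /api/v2/series). The metric name, tags, and helper functions are hypothetical and not taken from the project. Nothing is sent over the network.

```python
import time

LATENCY_THRESHOLD_MS = 200  # threshold stated in the write-up


def is_latency_spike(latency_ms, threshold_ms=LATENCY_THRESHOLD_MS):
    """Return True when a single request's latency exceeds the threshold."""
    return latency_ms > threshold_ms


def build_series_payload(latency_ms, timestamp=None, tags=None):
    """Build a body shaped like Datadog's v2 metrics intake request.

    The metric name and tags are illustrative assumptions.
    """
    ts = int(timestamp if timestamp is not None else time.time())
    return {
        "series": [
            {
                "metric": "recsys.prediction.latency_ms",  # hypothetical name
                "type": 3,  # 3 = gauge in the v2 intake schema
                "points": [{"timestamp": ts, "value": float(latency_ms)}],
                "tags": list(tags or []),
            }
        ]
    }


# Simulated latency samples in milliseconds.
samples = [120.0, 95.5, 340.2, 180.0]
spikes = [s for s in samples if is_latency_spike(s)]

# Tag the spike so a dashboard or monitor could filter on it.
payload = build_series_payload(
    spikes[0],
    timestamp=1_700_000_000,
    tags=["env:demo", "anomaly:latency_spike"],
)
```

In a real deployment the payload would be posted with an API key, and the alerting rule itself would more likely live in a Datadog monitor than in application code.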
How we built it
Built the recommendation engine on Vertex AI with Gemini models. Integrated the Datadog API for metric streaming, dashboard creation, and rule-based alerts. Simulated telemetry in Python and used scikit-learn's Isolation Forest for detection. Deployed on Google Cloud, with a demo on Vercel.
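A minimal sketch of the simulate-then-detect step, assuming a single latency feature and scikit-learn's IsolationForest. The distribution parameters, the contamination rate, and the number of injected spikes are illustrative assumptions, not the project's actual settings.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(seed=42)

# Simulated request latencies in ms: mostly normal traffic plus injected spikes.
normal = rng.normal(loc=80.0, scale=15.0, size=490)
spike_values = rng.uniform(low=400.0, high=600.0, size=10)
latencies = np.concatenate([normal, spike_values])
spike_idx = np.arange(490, 500)  # positions of the injected spikes

# IsolationForest expects a 2D array of shape (n_samples, n_features).
X = latencies.reshape(-1, 1)
model = IsolationForest(n_estimators=100, contamination=0.02, random_state=42)
preds = model.fit_predict(X)  # -1 = anomaly, 1 = normal

flagged_idx = np.flatnonzero(preds == -1)
caught = np.intersect1d(flagged_idx, spike_idx)
print(f"flagged {flagged_idx.size} points, "
      f"caught {caught.size} of {spike_idx.size} injected spikes")
```

The contamination parameter sets the fraction of points the model labels as anomalous, so it acts as the main knob for trading missed anomalies against false alerts.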
Challenges we ran into
Integrating real-time telemetry from Vertex AI to Datadog; handling noisy data for accurate anomaly rules; simulating realistic e-commerce anomalies without production access.
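One common way to keep noisy data from firing an alert on every isolated blip is to require several consecutive breaches before alerting. The sketch below illustrates that idea. It is an assumption about how such a rule could look, not the project's actual rule, and the function name and parameters are hypothetical.

```python
def debounced_alerts(values, threshold, min_consecutive=3):
    """Return the indices at which an alert fires.

    An alert fires once per sustained breach, at the sample where
    `min_consecutive` values in a row have exceeded `threshold`.
    """
    alerts = []
    run = 0  # length of the current streak of breaching samples
    for i, v in enumerate(values):
        if v > threshold:
            run += 1
            if run == min_consecutive:
                alerts.append(i)
        else:
            run = 0
    return alerts


# A single blip at index 1 is ignored; the sustained breach at 4..7 alerts once.
latencies_ms = [110, 250, 120, 130, 260, 270, 300, 310, 140, 150]
alerts = debounced_alerts(latencies_ms, threshold=200, min_consecutive=3)
```

A larger min_consecutive reduces false positives at the cost of slower detection, which is the same balance between alerts and noise that a fixed threshold has to strike.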
Accomplishments that we're proud of
Built end-to-end observability for LLM apps; achieved low false positives in detection; created actionable dashboards that reduce response time by 50%.
What we learned
Deepened our knowledge of Vertex AI telemetry, Datadog workflows, and anomaly detection techniques; learned how much threshold choice matters in balancing useful alerts against noise.
What's next for AI-Powered Anomaly Detection for E-Commerce
Add ML-based adaptive thresholds; integrate with more partners; expand to fraud detection in transactions; open-source components for community contributions.
Built With
- css3
- datadog
- datadog-api
- fastapi
- github
- google-cloud
- html5
- javascript
- jinja
- numpy
- pandas
- python
- scikit-learn-(isolation-forest)
- uvicorn
- vercel
- vertex