🔍 Inspiration
As AI adoption accelerates, engineering teams are flying blind: they have no visibility into what their LLMs are actually doing in production. How much are we spending? Which model is faster? Why did that agent fail? I built LLMWatch to answer these questions.
🏗️ What I Built
LLMWatch is a full-stack B2B LLM observability and orchestration platform featuring:
- Multi-Model Routing: Switch between self-hosted Qwen3.5-35B-A3B (via vLLM on AWS EC2) and Google Gemini 3 Flash with a single toggle
- Real-Time Analytics: Track cost, latency, request volume, and error rates live
- Reasoning Mode: See the LLM's chain-of-thought alongside its responses
- Autonomous ReAct Agent: Four tools (web search, code execution, DB query, doc analysis) with real-time SSE streaming
- Agent Trace Viewer: Full execution traces stored in DynamoDB with timeline visualization
- MLflow Integration: Every LLM call logged for experiment tracking and model comparison
- Multi-Tenant Security: JWT auth with company-scoped data isolation
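To make the routing and analytics features concrete, here is a minimal sketch of how a single toggle and per-call metrics could fit together. It is not LLMWatch's actual code: the `Router` and `CallRecord` names, the stub backends, and the per-token prices are all hypothetical placeholders, and real backends would call vLLM or the Gemini API instead.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class CallRecord:
    """One logged LLM call (hypothetical schema)."""
    model: str
    latency_s: float
    tokens: int
    cost_usd: float
    ok: bool


@dataclass
class Router:
    """Sends prompts to the active backend and records basic metrics."""
    # Each backend maps a prompt to (response_text, tokens_used).
    backends: Dict[str, Callable[[str], Tuple[str, int]]]
    # Assumed price per 1,000 tokens; placeholder values, not real pricing.
    price_per_1k: Dict[str, float]
    active: str
    records: List[CallRecord] = field(default_factory=list)

    def toggle(self, model: str) -> None:
        """Switch the active backend; reject unknown names."""
        if model not in self.backends:
            raise KeyError(f"unknown model: {model}")
        self.active = model

    def complete(self, prompt: str) -> str:
        """Call the active backend, timing it and logging the outcome."""
        start = time.perf_counter()
        text, tokens, ok = "", 0, True
        try:
            text, tokens = self.backends[self.active](prompt)
        except Exception:
            ok = False  # count the failure instead of propagating it
        latency = time.perf_counter() - start
        cost = tokens / 1000 * self.price_per_1k[self.active]
        self.records.append(CallRecord(self.active, latency, tokens, cost, ok))
        return text

    def summary(self) -> Dict[str, dict]:
        """Aggregate request volume, cost, latency, and error rate per model."""
        out: Dict[str, dict] = {}
        for r in self.records:
            s = out.setdefault(
                r.model,
                {"requests": 0, "errors": 0, "cost_usd": 0.0, "latency_s": 0.0},
            )
            s["requests"] += 1
            s["errors"] += 0 if r.ok else 1
            s["cost_usd"] += r.cost_usd
            s["latency_s"] += r.latency_s
        for s in out.values():
            s["avg_latency_s"] = s.pop("latency_s") / s["requests"]
            s["error_rate"] = s["errors"] / s["requests"]
        return out


# Stub backends standing in for the self-hosted and hosted models.
def fake_qwen(prompt: str) -> Tuple[str, int]:
    return "qwen: " + prompt, 2000


def fake_gemini(prompt: str) -> Tuple[str, int]:
    raise RuntimeError("simulated upstream failure")


router = Router(
    backends={"qwen": fake_qwen, "gemini": fake_gemini},
    price_per_1k={"qwen": 0.1, "gemini": 0.5},
    active="qwen",
)
first = router.complete("hello")
router.toggle("gemini")
second = router.complete("hello")
stats = router.summary()
```

Keeping the metrics next to the dispatch call means every request is measured the same way regardless of which model served it, which is what makes a side-by-side comparison meaningful.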
⚙️ How I Built It
Backend: FastAPI + LangChain + MLflow + AWS DynamoDB + vLLM
Frontend: React 19 + TypeScript + TailwindCSS v4 + shadcn/ui + Framer Motion
Infrastructure: AWS EC2 (GPU for Qwen) + DynamoDB + Docker + Nginx
AI Tools: Google Antigravity (Gemini 3.1 Pro + Claude Sonnet 4.6)
🚧 Challenges
- Implementing real-time SSE streaming for the ReAct agent while maintaining DynamoDB trace storage
- Self-hosting Qwen3.5-35B-A3B on EC2 with vLLM and 4-bit quantization
- Building a multi-tenant architecture where every query is company-scoped
- Completing a production-grade full-stack platform solo in 24 hours
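The first and third challenges can be illustrated together: each agent step is persisted under a company-scoped key before it is emitted as a Server-Sent Events message. This is a rough sketch under stated assumptions, not the project's implementation. `InMemoryTraceStore` is a hypothetical stand-in for a DynamoDB table, and the step schema, function names, and IDs are invented for illustration. In FastAPI, a generator like this could be wrapped in a `StreamingResponse` with `media_type="text/event-stream"`.

```python
import json
from typing import Dict, Iterable, Iterator, List, Tuple


class InMemoryTraceStore:
    """Hypothetical stand-in for a DynamoDB table keyed by (company_id, trace_id)."""

    def __init__(self) -> None:
        self._items: Dict[Tuple[str, str], List[dict]] = {}

    def append_step(self, company_id: str, trace_id: str, step: dict) -> None:
        self._items.setdefault((company_id, trace_id), []).append(step)

    def get_trace(self, company_id: str, trace_id: str) -> List[dict]:
        # The tenant ID is part of the key, so a lookup made with another
        # company's ID returns nothing.
        return list(self._items.get((company_id, trace_id), []))


def format_sse(event: str, data: dict) -> str:
    """Encode one SSE message: an event name, a JSON data line, a blank line."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def stream_agent_run(
    steps: Iterable[dict],
    store: InMemoryTraceStore,
    company_id: str,
    trace_id: str,
) -> Iterator[str]:
    """Persist each agent step, then yield it to the client as an SSE message."""
    count = 0
    for step in steps:
        record = {"index": count, **step}
        # Write first, so the stored trace never lags behind what was streamed.
        store.append_step(company_id, trace_id, record)
        yield format_sse("step", record)
        count += 1
    yield format_sse("done", {"trace_id": trace_id, "steps": count})


# Example run with two made-up agent steps.
store = InMemoryTraceStore()
steps = [
    {"type": "thought", "content": "I should search the web."},
    {"type": "tool", "name": "web_search", "content": "3 results"},
]
events = list(stream_agent_run(steps, store, "acme", "trace-1"))
```

Writing to the store before yielding trades a little latency per step for a trace that stays complete even if the client disconnects mid-stream.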
📚 What I Learned
- MLOps patterns for production LLM deployments
- LangChain ReAct agent architecture with custom callback handlers
- AWS infrastructure design for AI workloads
- The power of AI-assisted development with Google Antigravity
Built With
- aws-dynamodb
- aws-ec2
- docker
- fastapi
- framer-motion
- google-gemini-api
- jwt
- langchain
- mlflow
- nginx
- pydantic
- python
- qwen3.5
- react
- recharts
- shadcn/ui
- tailwindcss
- typescript
- vllm