Inspiration

MLOps teams spend 90% of their time on infrastructure setup instead of actual ML research. We saw data scientists struggling with:

  • 40% of time lost configuring SageMaker, IAM roles, and S3 buckets
  • Decision paralysis choosing from 100+ instance types and 500K+ Hugging Face models
  • No automatic budget enforcement, which leads to cost overruns
  • Deep AWS expertise required just to launch a training job

Our vision: What if you could just say "Train a NER model under $10" and have an AI agent handle everything?

What it does

llmops-agent is an intelligent, conversation-driven MLOps platform that automates the complete ML lifecycle through natural language. Give it a single sentence like:

"Train a Named Entity Recognition model on the ciER dataset. Budget: $10, Time: 1 hour, F1 > 85%"

From that one sentence, the agent:

  • ✅ Discovers and validates datasets
  • ✅ Selects optimal model architecture
  • ✅ Provisions cost-effective GPU infrastructure
  • ✅ Launches SageMaker training with LoRA optimization
  • ✅ Delivers: F1: 87.3%, Cost: $4.20, Time: 42 minutes

42 minutes from idea to production model. Zero infrastructure management. 58% under budget.
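
To make the constraint handling concrete, here is a minimal sketch of how a request like the one above could be turned into hard limits and checked against a finished run. It is an illustration only, not llmops-agent's actual code: the names Constraints, parse_request, meets, and pct_under_budget are hypothetical, and the regex-based parsing assumes the request uses the exact "Budget: / Time: / F1 >" wording from the example.

```python
import re
from dataclasses import dataclass


@dataclass
class Constraints:
    """Hypothetical container for the limits stated in a request."""
    budget_usd: float
    time_limit_min: float
    min_f1_pct: float


def parse_request(text: str) -> Constraints:
    """Extract budget, time limit, and F1 target from a request sentence.

    Assumes the 'Budget: $N, Time: N hour, F1 > N%' phrasing of the example.
    """
    budget = re.search(r"Budget:\s*\$(\d+(?:\.\d+)?)", text)
    time = re.search(r"Time:\s*(\d+(?:\.\d+)?)\s*(hour|minute)", text)
    f1 = re.search(r"F1\s*>\s*(\d+(?:\.\d+)?)%", text)
    if not (budget and time and f1):
        raise ValueError("request is missing a budget, time, or F1 constraint")
    # Normalize the time limit to minutes.
    minutes = float(time.group(1)) * (60 if time.group(2) == "hour" else 1)
    return Constraints(float(budget.group(1)), minutes, float(f1.group(1)))


def meets(c: Constraints, cost_usd: float, minutes: float, f1_pct: float) -> bool:
    """True when a finished run satisfies every stated constraint."""
    return (
        cost_usd <= c.budget_usd
        and minutes <= c.time_limit_min
        and f1_pct > c.min_f1_pct
    )


def pct_under_budget(c: Constraints, cost_usd: float) -> float:
    """Share of the budget left unspent, as a percentage."""
    return round((c.budget_usd - cost_usd) / c.budget_usd * 100, 1)


request = (
    "Train a Named Entity Recognition model on the ciER dataset. "
    "Budget: $10, Time: 1 hour, F1 > 85%"
)
limits = parse_request(request)
print(meets(limits, cost_usd=4.20, minutes=42, f1_pct=87.3))  # True
print(pct_under_budget(limits, cost_usd=4.20))                # 58.0
```

With the numbers reported above ($4.20 spent of a $10 budget), the unspent share is (10 - 4.20) / 10 = 58%, which is where the "58% under budget" figure comes from.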

Built With

  • chat-driven
  • fast
  • mlops
  • nvidia-nim-integration
  • real-time-streaming
  • sagemaker