About the Project
Inspiration
Engineering teams generate hundreds of GitHub events every day: commits, pull requests, reviews, issues, comments, QA cycles. But converting all that activity into actionable workflows, insights, and SOPs still requires manual effort, tribal knowledge, and hours of documentation.
We wanted to build an AI system that can:
- Observe how engineering teams actually work
- Understand bottlenecks and inefficiencies
- Learn from historical workflows and SOPs
- Generate accurate, actionable operating procedures
- Store everything in a production-ready content system
The idea felt powerful:
What if your engineering team had a Production Copilot that could think, learn, and document your processes for you?
This is how Dev-Copilot was born.
What We Built
Dev-Copilot is a production-grade multi-agent system that transforms engineering activity into structured, validated SOPs.
Our system:
- Ingests real engineering data (GitHub PRs, issues, reviews, timelines)
- Analyzes bottlenecks using statistical metrics
- Retrieves relevant knowledge with Redis Vector Search (RAG)
- Generates a complete SOP using Claude 3.5 Sonnet
- Evaluates SOP quality (completeness, actionability, consistency)
- Refines SOP automatically if scores are low
- Stores everything in Sanity as version-controlled reports
It acts like an AI engineering manager, documenting the workflow for you.
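As a rough illustration of the "statistical metrics" step, the sketch below summarizes time-to-merge from pull request timestamps. The `created_at` and `merged_at` field names follow GitHub's REST API, but the function names and the choice of median and 90th percentile are assumptions, not the project's actual metrics.

```python
from datetime import datetime
from statistics import median, quantiles


def _parse(ts):
    # GitHub timestamps look like "2024-01-01T00:00:00Z".
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def hours_to_merge(prs):
    """Return hours between creation and merge for every merged PR."""
    durations = []
    for pr in prs:
        if not pr.get("merged_at"):
            continue  # still open, or closed without merging
        delta = _parse(pr["merged_at"]) - _parse(pr["created_at"])
        durations.append(delta.total_seconds() / 3600)
    return durations


def merge_time_summary(prs):
    """Summarize merge latency as count, median, and 90th percentile."""
    durations = hours_to_merge(prs)
    if not durations:
        return {"count": 0, "median_hours": None, "p90_hours": None}
    if len(durations) >= 2:
        # Last of the nine decile cut points is the 90th percentile.
        p90 = quantiles(durations, n=10, method="inclusive")[-1]
    else:
        p90 = durations[0]
    return {
        "count": len(durations),
        "median_hours": median(durations),
        "p90_hours": p90,
    }
```

A long tail (p90 far above the median) is one simple signal that a few pull requests are stalling in review.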
How We Built It
1. Multi-Agent Architecture
We designed a pipeline of specialized agents:
- GitHub Agent: fetches real engineering data
- Bottleneck Agent: computes workflow metrics
- Cache Agent: semantic caching using embedding similarity
- RAG Agent: retrieves similar SOP examples from Redis
- SOP Generation Agent: produces structured procedures with Anthropic
- Evaluation Agent: scores completeness, consistency, actionability
- Refinement Agent: improves SOP if quality is low
- Persistence Agent: stores SOP in Sanity CMS
These agents communicate through a shared AgentContext, passing structured data from one stage to the next.
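A minimal sketch of this hand-off pattern is shown below. Only the name `AgentContext` comes from the project; its fields, the two stub agents, and `run_pipeline` are hypothetical stand-ins chosen to show how a shared context can flow through ordered stages.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class AgentContext:
    """Shared state handed from one agent to the next (fields are illustrative)."""
    repo: str
    data: Dict[str, Any] = field(default_factory=dict)
    trace: List[str] = field(default_factory=list)


Agent = Callable[[AgentContext], AgentContext]


def github_agent(ctx: AgentContext) -> AgentContext:
    # Placeholder data; a real agent would call the GitHub API here.
    ctx.data["pull_requests"] = [{"number": 1, "hours_to_merge": 30.0}]
    return ctx


def bottleneck_agent(ctx: AgentContext) -> AgentContext:
    # Flag PRs slower than an assumed 24-hour target.
    prs = ctx.data.get("pull_requests", [])
    ctx.data["slow_prs"] = [pr["number"] for pr in prs if pr["hours_to_merge"] > 24]
    return ctx


def run_pipeline(ctx: AgentContext, agents: List[Agent]) -> AgentContext:
    """Run agents in order, recording which stages have touched the context."""
    for agent in agents:
        ctx = agent(ctx)
        ctx.trace.append(agent.__name__)
    return ctx
```

Keeping every agent to the same `AgentContext -> AgentContext` signature makes stages easy to reorder, skip, or test in isolation.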
2. Retrieval-Augmented Generation (RAG)
We used Redis Vector Search to store:
- historical SOPs
- documentation examples
- engineering guidelines
Retrieval ranked stored documents by the cosine similarity between the query embedding and each document embedding:
similarity = cos(E_query, E_doc)
These retrieved snippets give Claude richer grounding and reduce hallucinations.
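To make the similarity formula concrete, here is a small pure-Python sketch of cosine-similarity ranking. In the real system Redis Vector Search performs this ranking server-side; the in-memory `docs` dictionary, the `top_k` helper, and the `min_score` cutoff are assumptions used only for illustration.

```python
import math


def cosine_similarity(a, b):
    """cos(a, b) = (a . b) / (|a| * |b|); returns 0.0 for a zero vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, docs, k=3, min_score=0.0):
    """Return up to k (doc_id, score) pairs, best match first."""
    scored = [(cosine_similarity(query_vec, vec), doc_id) for doc_id, vec in docs.items()]
    scored = [pair for pair in scored if pair[0] >= min_score]
    scored.sort(reverse=True)
    return [(doc_id, score) for score, doc_id in scored[:k]]
```

Raising `min_score` trades recall for precision: fewer snippets reach the prompt, but the ones that do are closer to the query.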
3. SOP Quality Scoring
We built a custom SOP Scoring Engine that evaluates:
- Completeness (0-100)
- Consistency with metrics (0-100)
- Actionability (number of extracted steps)
This helped us refine SOPs automatically using a feedback loop.
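The feedback loop can be sketched as a bounded score-and-refine cycle. The threshold of 80, the limit of three rounds, and the `score_fn` / `refine_fn` callables are all assumptions; in the project these roles would be played by the Evaluation and Refinement agents.

```python
def refine_until_good(draft, score_fn, refine_fn, threshold=80, max_rounds=3):
    """Refine a draft while its score is below threshold, up to max_rounds times.

    score_fn(draft) -> number; refine_fn(draft, score) -> new draft.
    Returns (final_draft, final_score, rounds_used).
    """
    score = score_fn(draft)
    rounds = 0
    while score < threshold and rounds < max_rounds:
        draft = refine_fn(draft, score)
        score = score_fn(draft)
        rounds += 1
    return draft, score, rounds
```

Capping the number of rounds matters because each refinement is another model call, and a draft that never reaches the threshold would otherwise loop forever.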
4. Production-Ready Storage
We used Sanity CMS to persist:
- SOPs
- metric summaries
- bottleneck reports
- workflow diagrams
- version history
This allows teams to build dashboards or share SOPs instantly.
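One way the persistence step might shape its data is sketched below. It only builds a payload in the form Sanity's HTTP mutation API accepts (`createOrReplace` with `_id` and `_type`); it does not send it. The `sopReport` type name, the field names, and the hashing scheme for the id are hypothetical, not the project's actual schema.

```python
import hashlib


def build_sop_mutation(repo, sop_markdown, scores):
    """Build a Sanity-style createOrReplace mutation for one SOP report."""
    # Deterministic id so re-running the pipeline updates the same document.
    digest = hashlib.sha256(repo.encode("utf-8")).hexdigest()[:12]
    document = {
        "_id": f"sop-{digest}",
        "_type": "sopReport",  # hypothetical schema type
        "repo": repo,
        "body": sop_markdown,
        "scores": scores,
    }
    return {"mutations": [{"createOrReplace": document}]}
```

A stable `_id` per repository means each run overwrites the current report in place rather than creating duplicates.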
What We Learned
- How to build multi-agent intelligence beyond simple LLM calls
- Best practices for RAG and semantic memory
- How to structure production APIs with FastAPI
- How to integrate Redis Vector Search effectively
- How to translate unstructured GitHub events into insights and procedures
- How to design self-evaluating and self-refining agent loops
Most importantly, we learned how to turn raw engineering noise into clean, intelligent, and actionable workflows.
Challenges We Faced
Designing agent-to-agent context flow
Ensuring that each agent receives structured data and passes back a structured context was harder than expected.
RAG signal quality
We had to tune embeddings, chunk sizes, and search thresholds to avoid irrelevant retrieval.
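Chunk size is one of the knobs mentioned above. The sketch below splits text into overlapping word-based chunks; counting words rather than model tokens, and the default sizes, are simplifying assumptions for illustration.

```python
def chunk_words(text, chunk_size=200, overlap=40):
    """Split text into chunks of chunk_size words, sharing overlap words between neighbors."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks
```

Smaller chunks tend to retrieve more precisely but lose surrounding context, while the overlap keeps sentences that straddle a boundary retrievable from either side.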
SOP consistency scoring
Building an evaluation engine that can detect missing sections and inconsistent SLAs required multiple regex and semantic checks.
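The regex side of these checks might look like the sketch below. The required section names, the "within N hours" SLA phrasing, and the rule that a stated SLA tighter than the observed median is suspicious are all assumptions; the semantic checks are not shown.

```python
import re

# Hypothetical list of sections every SOP is expected to contain.
REQUIRED_SECTIONS = ("Purpose", "Scope", "Procedure", "Escalation")


def missing_sections(sop_text, required=REQUIRED_SECTIONS):
    """Return required section names that never appear as a heading line."""
    missing = []
    for name in required:
        pattern = rf"^\s*#*\s*{re.escape(name)}\b"
        if not re.search(pattern, sop_text, flags=re.IGNORECASE | re.MULTILINE):
            missing.append(name)
    return missing


def sla_hours(sop_text):
    """Extract every 'within N hours' SLA mentioned in the text."""
    found = re.findall(r"within\s+(\d+)\s+hours?", sop_text, flags=re.IGNORECASE)
    return [int(n) for n in found]


def sla_conflicts(sop_text, observed_median_hours):
    """Return stated SLAs that are tighter than what the team actually achieves."""
    return [h for h in sla_hours(sop_text) if h < observed_median_hours]
```

For example, an SOP promising review "within 4 hours" for a team whose median review time is 10 hours would be flagged for refinement.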
GitHub data variability
Different repositories structure issues differently, so normalization became necessary.
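A normalization step along these lines could flatten each payload into a predictable record. The input keys (`number`, `title`, `state`, `labels`, `pull_request`, `created_at`, `closed_at`) are fields of GitHub's REST issue objects; the output shape and the `normalize_issue` name are assumptions.

```python
def normalize_issue(raw):
    """Map a GitHub issue or PR payload onto a flat, predictable record."""
    labels = []
    for label in raw.get("labels") or []:
        # The API returns label objects; some exports flatten them to strings.
        name = label.get("name") if isinstance(label, dict) else label
        if name:
            labels.append(str(name).strip().lower())
    return {
        "number": raw.get("number"),
        "title": (raw.get("title") or "").strip(),
        # GitHub marks pull requests in issue listings with a "pull_request" key.
        "kind": "pull_request" if "pull_request" in raw else "issue",
        "state": raw.get("state", "unknown"),
        "labels": sorted(set(labels)),
        "created_at": raw.get("created_at"),
        "closed_at": raw.get("closed_at"),
    }
```

Lower-casing and de-duplicating labels is what lets downstream agents treat "Bug" in one repository and "bug" in another as the same category.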
Time constraints
Designing a full multi-agent system in under 24 hours demanded strict prioritization and architectural discipline.
What's Next
- Add real-time GitHub webhooks for continuous SOP updates
- Add Slack notifications for bottlenecks
- Build a visual workflow dashboard
- Add advanced diagram generation
- Deploy fully on AWS
- Add multiple agent personalities (QA agent, DevOps agent, PM agent)