About the Project

🚀 Inspiration

Engineering teams generate hundreds of GitHub events every day: commits, pull requests, reviews, issues, comments, and QA cycles. But converting all that activity into actionable workflows, insights, and standard operating procedures (SOPs) still requires manual effort, tribal knowledge, and hours of documentation.

We wanted to build an AI system that can:

  • Observe how engineering teams actually work
  • Understand bottlenecks and inefficiencies
  • Learn from historical workflows and SOPs
  • Generate accurate, actionable operating procedures
  • Store everything in a production-ready content system

The idea felt powerful:

What if your engineering team had a Production Copilot that could think, learn, and document your processes for you?

This is how Dev-Copilot was born.


🧠 What We Built

Dev-Copilot is a production-grade multi-agent system that transforms engineering activity into structured, validated SOPs.

Our system:

  1. Ingests real engineering data (GitHub PRs, issues, reviews, timelines)
  2. Analyzes bottlenecks using statistical metrics
  3. Retrieves relevant knowledge with Redis Vector Search (RAG)
  4. Generates a complete SOP using Claude 3.5 Sonnet
  5. Evaluates SOP quality (completeness, actionability, consistency)
  6. Refines SOP automatically if scores are low
  7. Stores everything in Sanity as version-controlled reports

It acts like an AI engineering manager, documenting the workflow for you.
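
As a rough illustration of step 2, the sketch below computes the median and 90th-percentile time from PR open to first review. The field names `opened_at` and `first_review_at` and both function names are assumptions made for this example, not the project's actual schema.

```python
from datetime import datetime
from statistics import median, quantiles


def review_latency_hours(prs):
    """Hours from PR open to first review; PRs without a review are skipped."""
    latencies = []
    for pr in prs:
        if pr.get("first_review_at") is None:
            continue
        opened = datetime.fromisoformat(pr["opened_at"])
        reviewed = datetime.fromisoformat(pr["first_review_at"])
        latencies.append((reviewed - opened).total_seconds() / 3600)
    return latencies


def summarize(latencies):
    """Return count, median, and p90 (p90 needs at least two data points)."""
    if not latencies:
        return {"count": 0, "median_h": None, "p90_h": None}
    if len(latencies) >= 2:
        p90 = quantiles(latencies, n=10)[-1]  # last decile cut point
    else:
        p90 = latencies[0]
    return {"count": len(latencies), "median_h": median(latencies), "p90_h": p90}
```

A large gap between the median and the p90 is one simple signal that a minority of PRs wait much longer than the rest.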


πŸ—οΈ How We Built It

🟦 1. Multi-Agent Architecture

We designed a pipeline of specialized agents:

  • GitHub Agent → fetches real engineering data
  • Bottleneck Agent → computes workflow metrics
  • Cache Agent → semantic caching using embedding similarity
  • RAG Agent → retrieves similar SOP examples from Redis
  • SOP Generation Agent → produces structured procedures with Anthropic's Claude
  • Evaluation Agent → scores completeness, consistency, actionability
  • Refinement Agent → improves the SOP if quality is low
  • Persistence Agent → stores the SOP in Sanity CMS

These agents communicate through a shared AgentContext, passing structured data from one stage to the next.
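
A minimal sketch of how such a shared context could be passed along a pipeline is shown below. Only the name `AgentContext` comes from the project; its fields, the two stub agents, and `run_pipeline` are hypothetical and stand in for the real implementations (the GitHub stub makes no API call).

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class AgentContext:
    """Shared state handed from one agent to the next (fields are assumed)."""
    repo: str
    data: Dict[str, Any] = field(default_factory=dict)
    trace: List[str] = field(default_factory=list)


Agent = Callable[[AgentContext], AgentContext]


def github_agent(ctx: AgentContext) -> AgentContext:
    # Placeholder data; a real agent would fetch this from the GitHub API.
    ctx.data["prs"] = [{"number": 1, "review_hours": 30.0}]
    return ctx


def bottleneck_agent(ctx: AgentContext) -> AgentContext:
    hours = [pr["review_hours"] for pr in ctx.data["prs"]]
    ctx.data["metrics"] = {"avg_review_hours": sum(hours) / len(hours)}
    return ctx


def run_pipeline(ctx: AgentContext, agents: List[Agent]) -> AgentContext:
    """Run agents in order and record which ones touched the context."""
    for agent in agents:
        ctx = agent(ctx)
        ctx.trace.append(agent.__name__)
    return ctx
```

Keeping every agent to the same `AgentContext -> AgentContext` signature makes stages easy to reorder or skip, and the `trace` list gives a cheap audit trail of what ran.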


🟥 2. Retrieval-Augmented Generation (RAG)

We used Redis Vector Search to store:

  • historical SOPs
  • documentation examples
  • engineering guidelines

Embedding retrieval looked like:

similarity = cos(E_query, E_doc)

These retrieved snippets give Claude richer grounding and reduce hallucinations.
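
For illustration, the cosine scoring above can be written out in plain Python. In the actual system Redis Vector Search would perform this ranking server-side; the in-memory `top_k` helper and its `min_score` parameter are assumptions made for this sketch.

```python
import math


def cosine_similarity(a, b):
    """cos(a, b) = (a . b) / (|a| * |b|); returns 0.0 for a zero vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, docs, k=3, min_score=0.0):
    """Rank documents (id -> embedding) by similarity to the query embedding."""
    scored = [(cosine_similarity(query_vec, vec), doc_id) for doc_id, vec in docs.items()]
    scored = [item for item in scored if item[0] >= min_score]
    scored.sort(reverse=True)  # highest similarity first
    return [(doc_id, score) for score, doc_id in scored[:k]]
```

Raising `min_score` trades recall for precision: fewer snippets reach the prompt, but the ones that do are closer to the query.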


🟩 3. SOP Quality Scoring

We built a custom SOP Scoring Engine that evaluates:

  • Completeness (0–100)
  • Consistency with metrics (0–100)
  • Actionability (number of extracted steps)

This helped us refine SOPs automatically using a feedback loop.
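
The feedback loop could look roughly like the sketch below. The required section names, the thresholds, and the `refine` callable (a stand-in for the LLM refinement call) are all assumptions; the consistency-with-metrics score is left out for brevity.

```python
import re

# Assumed section names; the project's real SOP template may differ.
REQUIRED_SECTIONS = ("Purpose", "Scope", "Procedure", "Escalation")


def completeness(sop):
    """Score 0-100: share of required section headings present in the SOP."""
    found = sum(
        1
        for name in REQUIRED_SECTIONS
        if re.search(rf"^{re.escape(name)}\b", sop, re.MULTILINE | re.IGNORECASE)
    )
    return round(100 * found / len(REQUIRED_SECTIONS))


def actionability(sop):
    """Count numbered steps such as '1. Do X' or '2) Do Y'."""
    return len(re.findall(r"^[ \t]*\d+[.)][ \t]+\S", sop, re.MULTILINE))


def refine_until_good(sop, refine, min_completeness=75, min_steps=3, max_rounds=2):
    """Call refine(sop) until both thresholds pass or max_rounds is reached."""
    for _ in range(max_rounds):
        if completeness(sop) >= min_completeness and actionability(sop) >= min_steps:
            break
        sop = refine(sop)
    return sop
```

Capping the loop with `max_rounds` keeps cost bounded when a draft never reaches the thresholds.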


🟧 4. Production-Ready Storage

We used Sanity CMS to persist:

  • SOPs
  • metric summaries
  • bottleneck reports
  • workflow diagrams
  • version history

This allows teams to build dashboards or share SOPs instantly.


💡 What We Learned

  • How to build multi-agent intelligence beyond simple LLM calls
  • Best practices for RAG and semantic memory
  • How to structure production APIs with FastAPI
  • How to integrate Redis Vector Search effectively
  • How to translate unstructured GitHub events into insights and procedures
  • How to design self-evaluating and self-refining agent loops

Most importantly, we learned how to turn raw engineering noise into clean, intelligent, and actionable workflows.


⚠️ Challenges We Faced

🔸 Designing agent-to-agent context flow

Ensuring that each agent receives structured data and passes back a structured context was harder than expected.

🔸 RAG signal quality

We had to tune embeddings, chunk sizes, and search thresholds to avoid irrelevant retrieval.
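
One of the knobs mentioned above, chunk size, can be illustrated with a simple word-based splitter with overlap. The function name and the default values are assumptions for this sketch; the values the team actually settled on are not stated.

```python
def chunk_words(text, size=200, overlap=40):
    """Split text into chunks of `size` words, each sharing `overlap` words
    with the previous chunk so context is not cut at chunk boundaries."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks
```

Smaller chunks tend to give sharper matches but less context per hit, which is why chunk size and the similarity threshold usually have to be tuned together.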

🔸 SOP consistency scoring

Building an evaluation engine that can detect missing sections and inconsistent SLAs required multiple regex and semantic checks.
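
A regex check of this kind might look like the sketch below. The "within N hours/days" phrasing, the pattern, and the rule of comparing the tightest stated SLA against a measured median are assumptions chosen for illustration; it treats every match as the same SLA, which a real checker would need to refine, and the semantic checks are not shown.

```python
import re

# Assumed phrasing: SLAs are written as "within N hours" or "within N days".
SLA_PATTERN = re.compile(r"within\s+(\d+)\s*(hours?|days?)", re.IGNORECASE)


def extract_sla_hours(sop):
    """Return every stated SLA, converted to hours, in document order."""
    values = []
    for amount, unit in SLA_PATTERN.findall(sop):
        factor = 24 if unit.lower().startswith("day") else 1
        values.append(int(amount) * factor)
    return values


def sla_issues(sop, measured_median_hours):
    """Flag a missing SLA, conflicting SLAs, and SLAs tighter than reality."""
    slas = extract_sla_hours(sop)
    if not slas:
        return ["no SLA stated"]
    issues = []
    distinct = sorted(set(slas))
    if len(distinct) > 1:
        issues.append(f"conflicting SLAs (hours): {distinct}")
    if distinct[0] < measured_median_hours:
        issues.append(
            f"tightest SLA ({distinct[0]}h) is below the measured median "
            f"({measured_median_hours}h)"
        )
    return issues
```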

🔸 GitHub data variability

Different repositories structure issues differently, so normalization became necessary.
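
As a sketch of what such normalization might involve, the function below maps a raw issue payload onto one fixed shape. The input keys (`number`, `title`, `body`, `labels`, `closed_at`, `pull_request`) follow the GitHub REST API issue object; the `NormalizedIssue` type and the cleaning rules are assumptions for this example.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class NormalizedIssue:
    """Hypothetical uniform shape consumed by downstream agents."""
    number: int
    title: str
    body: str
    labels: List[str]
    is_pull_request: bool
    closed_at: Optional[str]


def normalize_issue(raw: Dict[str, Any]) -> NormalizedIssue:
    labels = []
    for label in raw.get("labels") or []:
        # Labels may arrive as objects with a "name" key or as plain strings.
        name = label.get("name") if isinstance(label, dict) else str(label)
        if name:
            labels.append(name.strip().lower())
    return NormalizedIssue(
        number=int(raw["number"]),
        title=(raw.get("title") or "").strip(),
        body=(raw.get("body") or "").strip(),  # body can be null
        labels=sorted(set(labels)),            # dedupe, stable order
        is_pull_request="pull_request" in raw, # PRs carry this extra key
        closed_at=raw.get("closed_at"),
    )
```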

🔸 Time constraints

Designing a full multi-agent system in under 24 hours demanded strong prioritization and architectural discipline.


🌟 What’s Next

  • Add real-time GitHub webhooks for continuous SOP updates
  • Add Slack notifications for bottlenecks
  • Build a visual workflow dashboard
  • Add advanced diagram generation
  • Deploy fully on AWS
  • Add multiple agent personalities (QA agent, DevOps agent, PM agent)
