Inspiration
Every developer knows this feeling: you have a great idea, spend hours searching GitHub for the best tools, weigh the pros and cons of several libraries, and then start from square one. Or worse, you ask an AI for help, and it confidently suggests tools that are outdated or don’t even exist.
RepoMind was created from one simple question: What if AI agents could research actual open-source software repositories, explain their choices like a seasoned engineer, and use proven tools instead of imaginary ones?
We set out to build an autonomous architect that doesn’t just write code—it understands the open-source software ecosystem, makes smart decisions, and shows its reasoning.
What it does
RepoMind is an autonomous AI architect that turns ideas or design sketches into production-ready code using smart open-source research.
Core Capabilities:
Vision Analysis
- Upload a wireframe or sketch (drag-and-drop or paste)
- Gemini Vision extracts UI components, user flows, and backend needs
- Transforms visual ideas into technical specifications
Intelligent OSS Research
- Multi-agent orchestration breaks your idea into research tasks
- GitHub Hunter: Searches over 100 million repositories for tried-and-true libraries
- NPM Scout: Finds packages based on real download statistics
- Grounding Engine: Uses Google Search to confirm modern best practices
- Uncovers tools you might not find on your own (sorted by stars and updates)
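The "sorted by stars and updates" ordering above can be sketched as a plain comparator. This is a minimal illustration, not RepoMind's actual code: the `RepoCandidate` shape, its field names, and the sample repositories are all assumptions.

```typescript
// Hypothetical shape for one research result; field names are assumptions.
interface RepoCandidate {
  fullName: string;   // e.g. "owner/repo"
  stars: number;      // star count reported by the search API
  lastPushed: string; // ISO 8601 timestamp of the most recent push
}

// Sort by stars (descending), breaking ties by the most recent push.
// Returns a new array so the caller's list is left untouched.
function rankCandidates(candidates: RepoCandidate[]): RepoCandidate[] {
  return [...candidates].sort((a, b) => {
    if (b.stars !== a.stars) return b.stars - a.stars;
    return Date.parse(b.lastPushed) - Date.parse(a.lastPushed);
  });
}

// Made-up sample data, for illustration only.
const ranked = rankCandidates([
  { fullName: "example/alpha", stars: 1200, lastPushed: "2024-01-10T00:00:00Z" },
  { fullName: "example/beta", stars: 5400, lastPushed: "2023-06-01T00:00:00Z" },
  { fullName: "example/gamma", stars: 1200, lastPushed: "2024-03-05T00:00:00Z" },
]);
console.log(ranked.map((r) => r.fullName));
```

A real ranking would probably weigh more signals (open issues, download counts, license), but the tie-breaking idea stays the same.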
The "Why Engine" (Architecture Design)
- Analyzes research results to design your tech stack
- For EVERY component, explains:
  - Why this tool was selected (features, community, star count)
  - Runner-up options considered
  - Why the runner-up was not chosen (trade-off analysis)
- Creates Mermaid.js architecture diagrams
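One way to picture the "Why Engine" output is a small record per component plus a function that turns those records into Mermaid flowchart text. The sketch below assumes a hypothetical `StackDecision` type and placeholder tool names; it is not the project's real schema.

```typescript
// Hypothetical record for one architecture decision; all names are assumptions.
interface StackDecision {
  component: string; // role in the architecture, e.g. "Frontend"
  chosen: string;    // the selected tool
  why: string;       // reason the tool was selected
  runnerUp: string;  // alternative that was considered
  whyNot: string;    // trade-off that ruled the runner-up out
}

// Emit Mermaid flowchart text that chains the components left to right.
function toMermaid(decisions: StackDecision[]): string {
  const lines = ["graph LR"];
  decisions.forEach((d, i) => {
    lines.push(`  n${i}["${d.component}: ${d.chosen}"]`);
    if (i > 0) lines.push(`  n${i - 1} --> n${i}`);
  });
  return lines.join("\n");
}

// Placeholder decisions, for illustration only.
const diagram = toMermaid([
  { component: "Frontend", chosen: "tool-a", why: "placeholder", runnerUp: "tool-b", whyNot: "placeholder" },
  { component: "API", chosen: "tool-c", why: "placeholder", runnerUp: "tool-d", whyNot: "placeholder" },
]);
console.log(diagram);
```

Keeping `why`, `runnerUp`, and `whyNot` as required fields means a decision without a stated trade-off fails type checking instead of being rendered silently.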
Marathon Builder (Verified Code Generation)
- Produces starter kit files (e.g., package.json, README.md, core components)
- Autonomous verification loop: Generate → Review → Fix → Verify
- Identifies syntax errors, missing imports, and logic issues
- Only delivers code that passes the AI code check
Live Agent Visualization
- Real-time console showing agent progress
- Task management with status tracking
- Open research process (see every discovered repository)
How we built it
Tech Stack
- Frontend: React 18, TypeScript, Vite
- AI Engine: Google Gemini API (gemini-3-flash-preview)
- Styling: TailwindCSS with custom glass-morphism design
- Diagrams: Mermaid.js for architecture visualization
- APIs: GitHub REST API, NPM Registry, Google Grounding
Architecture Highlights
1. Multi-Agent Orchestration
// Orchestrator breaks down user goal, worker agents execute tasks
const tasks = await geminiService.orchestratePlan(userInput, visionContext);
for (const task of tasks) {
  const result = await geminiService.executeTask(task);
  // Agents use function calling to access real tools
}
2. Function Calling for Real Tool Access
- Defined GitHub and NPM searches as Gemini function declarations
- Agents decide when to call external APIs
- Results are integrated back into the context for synthesis
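As a rough sketch of the function-calling pattern described above: a tool is declared with a JSON-schema-style parameter description, and calls issued by the model are routed to registered handlers. The types here (`ToolDeclaration`, `ToolCall`, `ToolHandler`) are defined locally and are assumptions; the exact declaration format should be taken from the Gemini SDK in use, and real handlers would be asynchronous API calls rather than the synchronous stub shown.

```typescript
// Locally defined, hypothetical types modeled on JSON-schema-style tool declarations.
interface ToolDeclaration {
  name: string;
  description: string;
  parameters: {
    type: "object";
    properties: Record<string, { type: string; description: string }>;
    required: string[];
  };
}

interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

type ToolHandler = (args: Record<string, unknown>) => string;

// Declaration the model would see; the name and fields are illustrative.
const searchGithubDeclaration: ToolDeclaration = {
  name: "search_github",
  description: "Search GitHub repositories by keyword",
  parameters: {
    type: "object",
    properties: { query: { type: "string", description: "Search keywords" } },
    required: ["query"],
  },
};

// Route a model-issued call to its handler; unknown tool names are
// rejected with an error payload instead of being guessed at.
function dispatchToolCall(call: ToolCall, handlers: Map<string, ToolHandler>): string {
  const handler = handlers.get(call.name);
  if (!handler) return JSON.stringify({ error: `unknown tool: ${call.name}` });
  return handler(call.args);
}

// Stub handler standing in for a real GitHub API request.
const handlers = new Map<string, ToolHandler>([
  [searchGithubDeclaration.name, (args) => JSON.stringify({ query: args["query"], items: [] })],
]);
console.log(dispatchToolCall({ name: "search_github", args: { query: "charts" } }, handlers));
```

The handler's return value is what gets fed back into the model's context for synthesis.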
3. Vision and Grounding Fusion
// Analyze sketch with Gemini Vision
const visionData = await analyzeSketch(imageBase64);
// Grounding enriches research with web sources
const { text, urls } = await deepResearch(query);
4. Marathon Verification Loop
// Generate, verify, and fix (up to 2 attempts)
let attempts = 0;
let isValid = false;
let code = "";
while (attempts < 2 && !isValid) {
  code = await generateFile(filename, context);
  const { valid, critique } = await verifyCode(code);
  isValid = valid;
  if (!valid) context += critique; // Self-correcting: the critique feeds the next attempt
  attempts++;
}
5. Streaming UX with Real-Time Logs
- Agent transitions trigger updates in the UI
- Live console displays tool usage and research progress
- Glassmorphic design with animated transitions
Challenges we ran into
1. Preventing AI Hallucinations
- Problem: LLMs often suggest non-existent libraries like "react-super-awesome-lib"
- Solution: Function calling ensures agents only reference actual GitHub API responses
- Result: 100% real tool suggestions
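The grounding idea can be illustrated with a simple post-filter: keep a suggested library only if its name appears among the names actually returned by tool calls. This is a sketch under assumptions (the function name and the case-insensitive matching are ours); it is not necessarily how RepoMind enforces the constraint.

```typescript
// Split suggestions into those backed by tool results and those that are not.
// Matching is case-insensitive; that choice is an assumption of this sketch.
function filterVerified(
  suggested: string[],
  verifiedNames: string[],
): { kept: string[]; dropped: string[] } {
  const known = new Set(verifiedNames.map((n) => n.toLowerCase()));
  const kept: string[] = [];
  const dropped: string[] = [];
  for (const name of suggested) {
    (known.has(name.toLowerCase()) ? kept : dropped).push(name);
  }
  return { kept, dropped };
}

// "react-super-awesome-lib" never appeared in any API response, so it is dropped.
const verdict = filterVerified(["react", "react-super-awesome-lib"], ["React", "vite"]);
console.log(verdict);
```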
2. Code Verification Without Running It
- Problem: It’s not safe to run arbitrary generated code in the browser
- Solution: Use a second Gemini instance as a "code reviewer"
- Challenge: Balancing thoroughness with speed (limited to 2 verification attempts)
3. Multi-Agent Coordination
- Problem: Parallel agents sometimes researched overlapping tools
- Solution: The orchestrator gives distinct, non-overlapping tasks to each agent
- Learning: Better prompts are more effective than complex coordination logic
4. Vision to Code Context Gap
- Problem: Sketch analysis can be subjective; it’s tough to translate "a card layout" into "use shadcn/ui"
- Solution: Vision output informs the research phase, enabling agents to find specific libraries that match visual patterns
5. Rate Limits and API Costs
- Problem: GitHub API has strict rate limits; Gemini token costs add up quickly
- Solution: Client-side caching, controlled searches, and efficient prompt design
- Trade-off: Gave up some parallelization to control costs
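Client-side caching of search results could look something like the time-to-live cache below. It is a minimal sketch: the class name, the TTL value, and the injected clock (used here so expiry can be shown without waiting) are all assumptions.

```typescript
// Minimal time-to-live cache; entries expire ttlMs after being stored.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // The clock is injectable so expiry can be exercised deterministically.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (this.now() >= hit.expiresAt) {
      this.entries.delete(key); // stale entry: evict so the caller refetches
      return undefined;
    }
    return hit.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Demo with a fake clock and a made-up query key.
let fakeTime = 0;
const searchCache = new TtlCache<string[]>(60000, () => fakeTime);
searchCache.set("q=charts", ["example/alpha"]);
console.log(searchCache.get("q=charts")); // fresh: served from the cache
fakeTime = 60000;
console.log(searchCache.get("q=charts")); // expired: the caller would hit the API again
```

Keying the cache on the normalized query string means repeated or overlapping agent searches cost one API request instead of several.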
Accomplishments we’re proud of
Zero Hallucinations: Every suggested library is verified against real APIs
The "Why Engine": The first AI architect that explains trade-offs like a senior engineer
Sketch to Stack Pipeline: Smooth transition from vision analysis to research to code
Self-Correcting Code: The Marathon Builder detects and fixes its own errors
Transparent Reasoning: Users see every repository discovered and every decision made
Production-Ready Output: Generated code passes verification before delivery
Shipped in 72 Hours: Fully functional multi-agent system with an attractive UI
What we learned
Technical Insights
- Function calling is better than RAG for tool discovery: Direct API access surpasses embeddings for real-time data
- Vision models grasp wireframes: Gemini Vision consistently extracts UI structure from sketches
- LLM as a Judge is effective: Having AI review AI-generated code works surprisingly well
- Streaming UX is important: Live logs make AI much more transparent
AI Engineering Lessons
- Prompt engineering is crucial: Small wording changes can have a big impact on agent performance
- Structured output prevents drift: JSON schemas yield better responses than freeform text
- Multi-turn context improves quality: Breaking tasks into conversational steps works better than a single large prompt
- Grounding helps bridge knowledge gaps: Combining training data with real-time searches yields the best results
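To make the structured-output lesson concrete, here is a small sketch of validating a model's JSON reply before using it. The `ResearchTask` shape and its agent labels are hypothetical, loosely named after the agents described earlier; malformed entries are discarded rather than repaired.

```typescript
// Hypothetical task shape the orchestrator might request from the model.
interface ResearchTask {
  id: string;
  agent: "github" | "npm" | "grounding";
  query: string;
}

// Runtime type guard: checks each field instead of trusting the model's output.
function isResearchTask(value: unknown): value is ResearchTask {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const agent = v["agent"];
  return (
    typeof v["id"] === "string" &&
    typeof v["query"] === "string" &&
    (agent === "github" || agent === "npm" || agent === "grounding")
  );
}

// Parse a raw model reply; invalid JSON or non-array replies yield no tasks.
function parseTasks(raw: string): ResearchTask[] {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return [];
  }
  if (!Array.isArray(data)) return [];
  return data.filter(isResearchTask);
}

// The second entry lacks the required fields and is filtered out.
const tasks = parseTasks('[{"id":"1","agent":"npm","query":"charts"},{"id":2}]');
console.log(tasks);
```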
Product Insights
- Developers prefer AI that shows its work: Explanations of "why" are better than opaque recommendations
- Visual input makes it easier for users: Uploading a sketch feels more intuitive than writing specifications
- Real data builds trust: Displaying GitHub star counts validates recommendations
What's next for RepoMind
Backend Integration: Move the Marathon Builder to a FastAPI server for safer code execution
More Tools: Add agents for researching Docker and database schemas
Export Options: Allow full starter kit downloads as a ZIP file
Deployment Assistant: Automatically generate Vercel and Railway configurations
Built With
- fastapi
- framer-motion
- gemini-vision
- github-api
- google-gemini-api
- google-grounding
- lucide-react
- mermaid.js
- npm-registry-api
- python
- react
- react-markdown
- reactflow
- tailwindcss
- typescript
- vite