Inspiration
As open-source contributors, we often found it difficult to understand large codebases and resolve issues efficiently. Inspired by how Perplexity AI intelligently reasons over web content, we envisioned a tool that could bring a similar capability to GitHub repositories. Our goal was to build a "mini Perplexity" for code: a tool that helps contributors navigate codebases and resolve GitHub issues effectively by understanding context from both the code and related online documentation.
What it does
RepoHelper assists open-source contributors by:
- Accepting a GitHub repository URL and issue ID.
- Fetching the issue details and exploring the codebase using the GitHub GraphQL API.
- Chunking the code intelligently and combining it with the issue context.
- Sending this context to Perplexity Sonar API to generate a helpful response.
- Displaying the response with citations and links to relevant documentation and code files.
How we built it
- Frontend: A simple Streamlit web app for input and output.
- Backend:
- Used GitHub GraphQL and REST APIs to fetch issue details and explore the repository files recursively.
- Chunked the source code for better input formatting.
- Constructed prompts combining code snippets and issue context.
- Called the Perplexity Sonar API with this enriched context.
- Displayed the generated response and source citations on the frontend.
Challenges we ran into
- We were unable to implement a knowledge graph-based indexing system due to time and resource constraints.
- The local repo crawler for semantic indexing was not completed.
- Multi-turn conversation support was planned but could not be implemented.
- Building meaningful semantic connections between related issues and source code proved to be complex.
Accomplishments that we're proud of
- Built an end-to-end prototype that integrates GitHub APIs and Perplexity Sonar API effectively.
- Developed a working tool that fetches relevant code and issue context and generates meaningful answers.
- Successfully structured code chunking and citation handling to enhance answer quality.
What we learned
- Deep insights into using GitHub GraphQL API and its potential for code exploration.
- Prompt engineering techniques for combining code and issue context.
- Real-world constraints of building context-aware developer tools and limitations of current LLM APIs.
- Importance of well-designed context windows and citation management for trustworthy AI assistance.
What's next for RepoHelper
- Implement a semantic code indexing system using a knowledge graph.
- Build a local code crawler to enhance context and provide deeper code understanding.
- Add multi-turn conversation support to allow ongoing dialog between the contributor and the AI assistant.
- Integrate issue relation mining to incorporate insights from related issues and past solutions.
- Improve the UI/UX to make the experience smoother and more intuitive for developers.
Log in or sign up for Devpost to join the conversation.