Inspiration

As open-source contributors, we often found it difficult to understand large codebases and resolve issues efficiently. Inspired by how Perplexity AI intelligently reasons over web content, we envisioned a tool that could bring a similar capability to GitHub repositories. Our goal was to build a "mini Perplexity" for code: a tool that helps contributors navigate codebases and resolve GitHub issues effectively by understanding context from both the code and related online documentation.

What it does

RepoHelper assists open-source contributors by:

  • Accepting a GitHub repository URL and issue ID.
  • Fetching the issue details and exploring the codebase using the GitHub GraphQL API.
  • Chunking the code intelligently and combining it with the issue context.
  • Sending this context to Perplexity Sonar API to generate a helpful response.
  • Displaying the response with citations and links to relevant documentation and code files.

How we built it

  • Frontend: A simple Streamlit web app for input and output.
  • Backend:
    • Used GitHub GraphQL and REST APIs to fetch issue details and explore the repository files recursively.
    • Chunked the source code for better input formatting.
    • Constructed prompts combining code snippets and issue context.
    • Called the Perplexity Sonar API with this enriched context.
    • Displayed the generated response and source citations on the frontend.

Challenges we ran into

  • We were unable to implement a knowledge graph-based indexing system due to time and resource constraints.
  • The local repo crawler for semantic indexing was not completed.
  • Multi-turn conversation support was planned but could not be implemented.
  • Building meaningful semantic connections between related issues and source code proved to be complex.

Accomplishments that we're proud of

  • Built an end-to-end prototype that integrates GitHub APIs and Perplexity Sonar API effectively.
  • Developed a working tool that fetches relevant code and issue context and generates meaningful answers.
  • Successfully structured code chunking and citation handling to enhance answer quality.

What we learned

  • Deep insights into using GitHub GraphQL API and its potential for code exploration.
  • Prompt engineering techniques for combining code and issue context.
  • Real-world constraints of building context-aware developer tools and limitations of current LLM APIs.
  • Importance of well-designed context windows and citation management for trustworthy AI assistance.

What's next for RepoHelper

  • Implement a semantic code indexing system using a knowledge graph.
  • Build a local code crawler to enhance context and provide deeper code understanding.
  • Add multi-turn conversation support to allow ongoing dialog between the contributor and the AI assistant.
  • Integrate issue relation mining to incorporate insights from related issues and past solutions.
  • Improve the UI/UX to make the experience smoother and more intuitive for developers.

Built With

Share this project:

Updates