Inspiration
Our inspiration came from the desire to demystify open-source contributions. We asked ourselves, echoing GitLab's own innovative spirit: "What if we could guide developers directly to open-source issues where their unique skills and passions can make a real impact?" We envisioned an AI-powered compass, leveraging MongoDB Vector Search, to navigate the vast landscape of projects and help developers grow their careers through meaningful contributions.
What it does
Open Source Compass is an AI-powered discovery tool that connects developers with relevant open-source issues. Users input their skills, interests, or project preferences, and our system uses MongoDB Vector Search to semantically match these queries against a database of open-source project details and issues. It helps users find tailored opportunities to contribute, build their portfolios, and gain valuable experience.
How we built it
Data Foundation: We set up a MongoDB Atlas cluster and defined a schema for open-source project information (name, description, READMEs) and issues (title, body, labels).
Ingestion & Embeddings: We wrote Python scripts to fetch data from open-source repositories. We then generated vector embeddings for key textual data (issue descriptions, READMEs) using a sentence transformer model and stored them in MongoDB.
AI Search Core: We created a Vector Search Index in MongoDB Atlas. The core search logic uses MongoDB Aggregation Pipelines with the $vectorSearch stage to find semantically similar issues based on user input.
User Interface: We built a simple frontend with Streamlit that takes user queries and displays matched open-source opportunities.
Collaboration: We used GitLab for version control and, as a team, used AI tools like GitLab Duo to speed up development and brainstorming.
Tech Stack: MongoDB Atlas (Vector Search), Python, Sentence Transformers, Streamlit, GitLab.
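The AI Search Core step can be illustrated with a minimal sketch of how such an aggregation pipeline might be assembled. This is not the project's actual code: the index name `issue_vector_index`, the `embedding` field, the default limits, and the helper name `build_issue_search_pipeline` are all assumptions made for illustration. Only the `$vectorSearch` stage and the issue fields (title, labels) come from the description above.

```python
from typing import Any, Sequence


def build_issue_search_pipeline(
    query_vector: Sequence[float],
    index_name: str = "issue_vector_index",  # hypothetical index name
    embedding_path: str = "embedding",       # hypothetical field holding the issue embedding
    limit: int = 10,
    num_candidates: int = 100,
) -> list[dict[str, Any]]:
    """Build an aggregation pipeline that returns the issues closest to query_vector."""
    # Atlas expects numCandidates to be at least as large as limit.
    if limit > num_candidates:
        raise ValueError("limit must not exceed num_candidates")

    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": embedding_path,
                "queryVector": list(query_vector),
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        {
            # Keep only the fields the UI needs, plus the similarity score.
            "$project": {
                "_id": 0,
                "title": 1,
                "labels": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]
```

In use, the query vector would come from the same sentence transformer model used at ingestion time (for example, the model's `encode(query).tolist()`), and the returned list would be passed to a PyMongo collection's `aggregate()` method. The model and collection names are left out here because the write-up does not specify them.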
Challenges we ran into
Data Variability: Open-source issues and READMEs vary widely in format and quality, which made it tricky to generate effective embeddings from them.
Tuning Semantic Relevance: Achieving truly meaningful semantic matches, beyond just keywords, required careful iteration on our embedding strategy and search queries.
Time Constraints: Prioritizing core features and managing scope was a constant challenge.
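One way to reduce the data variability described above is to normalize issue and README text before embedding it. The sketch below is an assumption about what such preprocessing could look like, not the cleaning the project actually performed. The function name `clean_issue_text`, the specific rules (dropping fenced code, HTML comments, and images, and keeping only link text), and the 2000-character cap are all illustrative choices.

```python
import re

# Built from single backticks so this example can itself sit inside a fenced block.
_FENCE_MARK = "`" * 3
_FENCED_CODE = re.compile(_FENCE_MARK + r".*?" + _FENCE_MARK, re.DOTALL)
_HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)  # e.g. issue template hints
_IMAGE = re.compile(r"!\[[^\]]*\]\([^)]*\)")
_LINK = re.compile(r"\[([^\]]+)\]\([^)]*\)")
_WHITESPACE = re.compile(r"\s+")


def clean_issue_text(raw: str, max_chars: int = 2000) -> str:
    """Strip common Markdown noise so the embedding focuses on the prose."""
    text = _FENCED_CODE.sub(" ", raw)      # drop fenced code blocks
    text = _HTML_COMMENT.sub(" ", text)    # drop hidden template comments
    text = _IMAGE.sub(" ", text)           # drop images entirely
    text = _LINK.sub(r"\1", text)          # keep link text, drop the URL
    text = _WHITESPACE.sub(" ", text).strip()
    return text[:max_chars]                # crude cap on input length
```

Whether code blocks should be dropped or kept is a judgment call: for issues that are mostly stack traces, keeping a truncated version may give better matches.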
Accomplishments that we're proud of
Successfully implementing MongoDB Vector Search to perform effective semantic matching on real-world open-source data.
Building a functional end-to-end application, from data ingestion to a user-facing AI search, within the hackathon timeline.
Creating a tool that has the potential to genuinely help developers find and engage with open-source projects.
Effectively collaborating and utilizing tools like GitLab Duo to build "Software. Faster."
What we learned
The immense power and practicality of MongoDB Vector Search for building AI-driven applications.
Best practices for data modeling in MongoDB, especially when incorporating vector embeddings.
The efficiency of Python and Streamlit for rapid AI/ML-powered application development.
The importance of clear data preprocessing for quality AI model inputs.
How collaborative AI tools can significantly boost development productivity.
What's next for Open Source Compass
Expanding Data Sources: Ingesting issues and project details directly from GitLab.com open-source repositories using the GitLab API.
Personalized User Profiles: Allowing users to save their skills and interests for more tailored, proactive recommendations.
Advanced Filtering & Ranking: Incorporating more sophisticated filters (project activity, "good first issue" tags, required tech stack) and ranking algorithms.
Feedback Loop: Allowing users to provide feedback on match quality to further refine the AI models.
GitLab Integration: Potentially suggesting Open Source Compass as a tool for contributors within the GitLab ecosystem or even using it to identify contribution opportunities within GitLab itself.
Enterprise Compass: Beyond open-source projects, this tool could also be integrated with GitLab and used within enterprises, feeding an organization's repositories into a common tool and using AI ranking so that core developers can showcase their projects and contribute across the organization.
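As a rough illustration of the Advanced Filtering & Ranking idea, search results could be re-ranked after the vector search so that issues carrying preferred labels move up. This is only a sketch of one possible approach to planned work: the function name `rerank_with_label_boost`, the additive boost of 0.1, and the `score` and `labels` keys on each result are assumptions, not part of the current tool.

```python
from typing import Any, Iterable


def rerank_with_label_boost(
    results: Iterable[dict[str, Any]],
    preferred_labels: Iterable[str] = ("good first issue",),
    boost: float = 0.1,  # illustrative value; would need tuning on real data
) -> list[dict[str, Any]]:
    """Sort results by similarity score, nudging up issues with preferred labels."""
    preferred = {label.lower() for label in preferred_labels}

    def boosted_score(doc: dict[str, Any]) -> float:
        labels = {str(label).lower() for label in doc.get("labels", [])}
        bonus = boost if labels & preferred else 0.0
        return doc.get("score", 0.0) + bonus

    # sorted() is stable, so ties keep their original (vector search) order.
    return sorted(results, key=boosted_score, reverse=True)
```

An additive boost is simple to reason about, but it assumes scores are on a comparable scale across queries. A filter inside the search stage would be the alternative when labels should be a hard requirement, not a preference.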