Inspiration
"Need: AI that scans PR #1024 and comments: 'This is a duplicate of PR #20' Kinda hard sometimes to find who fixed a bug or implemented a feature first and give attribution." ~ Shadcn.
Open source maintainers are fighting a losing battle against contributors who make slight modifications to an existing PR, reimplement one or, worse, duplicate it outright. This cannot continue, and Pudding sets out to solve the problem.
What it does
Pudding is a Gemini-powered assistant that identifies duplicate PRs by understanding their intent, not just by comparing code diffs. Bag-of-Words embeddings and a Jaccard file-overlap filter first narrow thousands of PRs down to a few candidates in milliseconds. On those few candidates, Gemini extracts structured intent (problem, component, behavior, code approach) and performs a semantic comparison to verify whether the PRs are duplicates. The result is then written up as a comment.
How we built it
We built a funnel-based architecture:
Stage 1-2 (Fast Local Filters): Bag-of-Words embeddings and Jaccard file overlap filter thousands of PRs down to a few candidates in milliseconds
Stage 3-5 (LLM Reasoning): Gemini 3 extracts structured intent (problem, component, behavior, code approach) and performs semantic comparison on remaining candidates
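A minimal sketch of how the two local filters could look. This is not the project's actual code: the vector size, the hash function, the thresholds, the choice to require both filters to pass, and every name here (`embed`, `jaccard`, `findCandidates`, `PullRequest`) are assumptions made for illustration.

```typescript
// Hypothetical names and constants throughout; the write-up does not show the real code.
const DIMENSIONS = 256; // assumed embedding size

interface PullRequest {
  number: number;
  title: string;
  body: string;
  files: string[]; // paths of changed files
}

// Lowercase the text and split it on anything that is not a letter or digit.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// FNV-1a style hash, reduced to a bucket index in [0, DIMENSIONS).
function hashToken(token: string): number {
  let h = 2166136261;
  for (let i = 0; i < token.length; i++) {
    h ^= token.charCodeAt(i);
    h = Math.imul(h, 16777619);
  }
  return (h >>> 0) % DIMENSIONS;
}

// Hashing-trick Bag-of-Words: every token increments one bucket,
// so the same text always produces the same vector.
function embed(text: string): number[] {
  const vec = new Array<number>(DIMENSIONS).fill(0);
  for (const token of tokenize(text)) {
    vec[hashToken(token)] += 1;
  }
  return vec;
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // empty text has no direction
  return dot / Math.sqrt(normA * normB);
}

// Jaccard overlap of changed file paths: |A ∩ B| / |A ∪ B|.
function jaccard(filesA: string[], filesB: string[]): number {
  const a = new Set(filesA);
  const b = new Set(filesB);
  if (a.size === 0 && b.size === 0) return 0;
  let intersection = 0;
  a.forEach((f) => {
    if (b.has(f)) intersection++;
  });
  return intersection / (a.size + b.size - intersection);
}

// Keep only PRs that pass both cheap filters (assumed thresholds).
function findCandidates(
  target: PullRequest,
  others: PullRequest[],
  minCosine = 0.3,
  minJaccard = 0.2
): PullRequest[] {
  const targetVec = embed(`${target.title} ${target.body}`);
  return others.filter(
    (pr) =>
      cosineSimilarity(targetVec, embed(`${pr.title} ${pr.body}`)) >= minCosine &&
      jaccard(target.files, pr.files) >= minJaccard
  );
}
```

Only the PRs returned by `findCandidates` would need to go on to the LLM stages, which is what keeps the expensive Gemini calls to a handful per PR.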
Challenges we ran into
Mock embeddings producing garbage: Initial Math.random() vector generation made all embeddings nearly orthogonal, so no two PRs ever looked similar. Replaced with deterministic Bag-of-Words (Hashing Trick) to ensure similar text yields similar vectors.
Gemini JSON response inconsistency: Gemini sometimes returned JSON wrapped in markdown code blocks instead of raw JSON. Had to implement regex stripping and fallback parsing with JSON.parse() wrapped in try-catch.
TypeScript generic inference: The generateJSON() function lacked a generic type parameter, causing TypeScript errors at call sites that passed a type argument to generateJSON(prompt). Fixed by adding a generic type parameter to the function signature.
GitHub API diff gaps: The Files API returns patch (unified diff) only for text files under a size limit. Large files, binary files, and renamed files can come back with no patch data, which put undefined into our diff concatenation.
Confidence floor logic: Initial weighted scoring produced confusingly low scores (60%) for clear duplicates when one factor was low. Added a "confidence floor" (85% minimum) when Gemini's semantic similarity exceeds 90%.
Array response normalization: Gemini occasionally returned [{pr1: ...}, {pr2: ...}] instead of {pr1: ..., pr2: ...}. Frontend had to detect arrays and reduce() them into a single object.
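The response-handling fixes above (fence stripping, try-catch parsing, the generic type parameter, and array normalization) can be sketched together roughly as follows. The function names (`stripCodeFence`, `normalizeArray`, `parseModelJSON`) and the `DuplicateVerdict` shape are hypothetical, and returning null on a parse failure is an assumed fallback rather than the project's documented behavior.

```typescript
// Hypothetical helpers; the project's own generateJSON() is not shown in this write-up.

// Three backticks, built at runtime so this listing contains no literal fence.
const FENCE = "`".repeat(3);
const FENCED = new RegExp(
  "^" + FENCE + "(?:json)?\\s*([\\s\\S]*?)\\s*" + FENCE + "$",
  "i"
);

// Remove a surrounding markdown code fence if the model added one.
function stripCodeFence(raw: string): string {
  const trimmed = raw.trim();
  const match = trimmed.match(FENCED);
  return match ? match[1] : trimmed;
}

// Merge [{pr1: ...}, {pr2: ...}] into {pr1: ..., pr2: ...}; pass anything else through.
function normalizeArray(value: unknown): unknown {
  if (!Array.isArray(value)) return value;
  return value.reduce<Record<string, unknown>>(
    (merged, item) => ({ ...merged, ...(item as Record<string, unknown>) }),
    {}
  );
}

// Generic parser: the caller names the expected shape with a type argument.
// The cast is unchecked, so the result is only as trustworthy as the prompt.
function parseModelJSON<T>(raw: string): T | null {
  try {
    return normalizeArray(JSON.parse(stripCodeFence(raw))) as T;
  } catch {
    return null; // assumed fallback: signal failure instead of throwing
  }
}

// Example shape, invented for illustration only.
interface DuplicateVerdict {
  isDuplicate: boolean;
  similarity: number;
}

const verdict = parseModelJSON<DuplicateVerdict>(
  FENCE + 'json\n{"isDuplicate": true, "similarity": 0.93}\n' + FENCE
);
```

Handling all three quirks in one place means call sites only have to check for null.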
Accomplishments that we're proud of
97%+ accuracy on test duplicate pairs with the weighted scoring system
Sub-second filtering for the first 2 stages using local embeddings
Intent-aware analysis: The system understands that "fix auth bug for special chars" and "handle symbols in passwords" are duplicates even with different code
What we learned
Response format enforcement: Adding responseMimeType: 'application/json' to Gemini config drastically reduces markdown-wrapped responses, but doesn't eliminate them entirely.
Temperature = 0.1 for consistency: Low temperature makes Gemini's structured outputs much more reproducible. Higher values caused random field ordering and inconsistent scoring.
Few-shot isn't always needed: Explicit JSON schema in the prompt with field descriptions worked better than few-shot examples for our use case.
Retry on 503, not 429: 503 (overloaded) is transient and worth retrying. 429 (rate limit) means back off and wait. Each needs its own error handling.
Confidence floors prevent confusion: Raw weighted averages can produce misleading scores. Floors and boosts based on key signals improve user trust.
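To make the confidence floor concrete, here is a small sketch. The 90% trigger and the 85% floor come from the description above. The three signals, their weights, and the names `Signals`, `WEIGHTS`, and `confidence` are assumptions, since the actual weighting is not given here.

```typescript
// All scores are fractions in [0, 1]. The weights are assumed for illustration.
interface Signals {
  semantic: number; // Gemini's semantic similarity
  embedding: number; // Bag-of-Words cosine similarity
  fileOverlap: number; // Jaccard overlap of changed files
}

const WEIGHTS = { semantic: 0.5, embedding: 0.2, fileOverlap: 0.3 }; // hypothetical

const SEMANTIC_TRIGGER = 0.9; // floor applies when semantic similarity exceeds 90%
const CONFIDENCE_FLOOR = 0.85; // minimum reported confidence in that case

function confidence(s: Signals): number {
  const weighted =
    s.semantic * WEIGHTS.semantic +
    s.embedding * WEIGHTS.embedding +
    s.fileOverlap * WEIGHTS.fileOverlap;
  // A strong semantic match should not be dragged down by one weak factor.
  if (s.semantic > SEMANTIC_TRIGGER) {
    return Math.max(weighted, CONFIDENCE_FLOOR);
  }
  return weighted;
}
```

With these assumed weights, a pair scoring 0.95 on semantic similarity but only 0.1 on file overlap and 0.5 on embedding similarity would average to roughly 0.6, and the floor lifts it to 0.85.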
What's next for Pudding
GitHub App integration: Automatic PR comments when duplicates are detected
Feedback loop: Learn from user corrections to improve thresholds per-repository
Semantic code similarity: Add AST-level analysis alongside text diff comparison
Multi-repo support: Detect duplicates across forks and related repositories
Real-time GitHub webhooks: Auto-trigger analysis on new PR events
Built With
- express.js
- gemini-api
- github-api
- javascript
- jest
- node.js
- typescript
- vercel