Inspiration
Code reviews are often treated as a "one-size-fits-all" process, but not all Pull Requests carry the same risk. We noticed that teams frequently struggle to prioritize reviews because a simple Lines of Code (LOC) count is a poor proxy for actual complexity. A change of 50 lines spread across 10 different files is significantly riskier and harder to grasp than 100 lines in a single file. We were inspired to create a tool that quantifies this "Blast Radius," giving reviewers immediate context and helping teams identify when a task has grown too complex.
What it does
Pull Request Scoring for Bitbucket automatically calculates a Complexity Score for every PR. Rather than measuring volume alone, the app evaluates the structural impact of a change using a weighted formula we call the Blast Radius Factor:
$$\text{ComplexityScore} = (\text{linesAdded} + \text{linesRemoved}) \times \text{filesChanged}^{1.2}$$
The app renders a native panel in the Bitbucket PR view that provides:
- Live Complexity Dashboard: A visual traffic-light system (Green < 100, Yellow 100-1000, Red > 1000) to signal cognitive load.
- Smart Metrics Breakdown: Displays essential data points including Lines Added/Removed, Files Changed, and Comment Count to give reviewers a full snapshot of the PR's health at a glance.
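The formula and thresholds above can be sketched in a few lines of TypeScript. This is a minimal illustration rather than the app's actual source: the `PrMetrics` interface and the function names are assumptions, and the boundary handling (exactly 100 and exactly 1000 both count as Yellow) follows the ranges as listed.

```typescript
// Hypothetical shape for per-PR metrics; field names mirror the formula above.
interface PrMetrics {
  linesAdded: number;
  linesRemoved: number;
  filesChanged: number;
}

type TrafficLight = "green" | "yellow" | "red";

// Exponent taken from the Blast Radius Factor formula.
const BLAST_RADIUS_EXPONENT = 1.2;

// ComplexityScore = (linesAdded + linesRemoved) * filesChanged^1.2
function complexityScore(m: PrMetrics): number {
  const linesTouched = m.linesAdded + m.linesRemoved;
  return linesTouched * Math.pow(m.filesChanged, BLAST_RADIUS_EXPONENT);
}

// Green < 100, Yellow 100-1000, Red > 1000.
function trafficLight(score: number): TrafficLight {
  if (score < 100) return "green";
  if (score <= 1000) return "yellow";
  return "red";
}

// 50 changed lines spread over 10 files score roughly 792.
const spreadOut = complexityScore({ linesAdded: 30, linesRemoved: 20, filesChanged: 10 });
console.log(Math.round(spreadOut), trafficLight(spreadOut)); // 792 yellow
```

Keeping the score and the color mapping as separate pure functions would make it straightforward to swap in custom thresholds later.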
How we built it
The application is built on the Atlassian Forge platform using Custom UI. This architectural choice was intentional: it granted us the creative freedom to build a sophisticated frontend while keeping the app securely within the Atlassian infrastructure.
Our technical backbone relies on tRPC for type-safe communication between the backend and frontend. To match Bitbucket's look and feel, we built a custom Tailwind CSS plugin that integrates Atlassian Design Tokens directly into our utility-first styling workflow. This lets us keep Tailwind's flexibility while staying consistent with the Atlassian Design System's semantic tokens and theming (light and dark mode).
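One way such a token mapping could look is sketched below. It is not the plugin itself: it only builds a plain object shaped like the `theme.extend.colors` section of a Tailwind config, with each color pointing at a CSS custom property. The three variable names and the `ds-` class prefix are assumptions for illustration and should be checked against the Atlassian Design System token reference.

```typescript
// Assumed token-to-CSS-variable pairs; verify the names before relying on them.
const tokenVars = {
  text: "--ds-text",
  "text-subtle": "--ds-text-subtle",
  surface: "--ds-surface",
} as const;

// Build a Tailwind-style color map in which every entry resolves to a CSS variable,
// so the rendered color follows whichever theme currently defines that variable.
function toTailwindColors(vars: Record<string, string>): Record<string, string> {
  const colors: Record<string, string> = {};
  for (const name of Object.keys(vars)) {
    colors[`ds-${name}`] = `var(${vars[name]})`;
  }
  return colors;
}

// Fragment shaped like the `theme` section of a Tailwind config.
const themeExtension = {
  extend: {
    colors: toTailwindColors(tokenVars),
  },
};
```

With a mapping like this, utility classes such as `text-ds-text` or `bg-ds-surface` would pick up light or dark values without any per-theme class names, because only the variable values change between themes.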
Technical Architecture & Stack
- Atlassian Forge Custom UI: Chosen for a rich, flexible user interface that goes beyond standard UI kits.
- TypeScript & tRPC: An end-to-end type-safe bridge between our Forge backend resolvers and the React frontend.
- Tailwind CSS + Custom Plugin: Mapping Atlassian Design Tokens to Tailwind classes for native-feeling UI.
- TanStack Query: Managing async state, caching, and updates for PR metrics.
- Atlaskit: Using official React components for complex UI patterns like panels and icons.
- Zod & i18next: For robust schema validation and full internationalization support.
Challenges we ran into
Defining a formula that felt "fair" across different projects was a major challenge. We iterated on the Blast Radius exponent to ensure that increasing the number of files changed penalized the score more heavily than just adding lines to a single file. Technically, setting up a type-safe tRPC architecture within the specific serverless constraints of Forge also required significant research and custom configuration.
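A quick check shows why any exponent above 1 has this effect. Writing $L$ for linesAdded + linesRemoved and $F$ for filesChanged (shorthand introduced here, not part of the app), doubling each input in turn gives:

$$\frac{(2L) \times F^{1.2}}{L \times F^{1.2}} = 2 \qquad \frac{L \times (2F)^{1.2}}{L \times F^{1.2}} = 2^{1.2} \approx 2.30$$

Doubling the line count doubles the score, while doubling the file count multiplies it by about 2.3. Applied to the example from the Inspiration section, 50 lines across 10 files score $50 \times 10^{1.2} \approx 792$, whereas 100 lines in a single file score $100 \times 1^{1.2} = 100$.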
Accomplishments that we're proud of
We successfully created a seamless integration that feels like a native part of Bitbucket. Achieving a real-time calculation with an end-to-end type-safe stack (tRPC + Zod) on Forge is a technical milestone we are particularly proud of.
What we learned
We gained deep expertise in the Atlassian Forge ecosystem and serverless architecture. We also learned that providing "at-a-glance" visual cues—powered by a custom Design Tokens integration—is far more effective for developer productivity than raw data tables.
What's next for Pull Request Scoring for Bitbucket
Our roadmap includes:
- Customization: Allowing users to define their own scoring variables, formulas, and threshold colors.
- Smart File Filters: Implementing a mechanism to exclude specific files from the complexity calculation. This will allow teams to ignore noise from auto-generated files, lockfiles such as package-lock.json, or documentation assets, ensuring the score reflects only meaningful code changes.
- Jira Integration: Comparing PR Complexity against Jira Story Points to identify "under-estimated" tasks. This would make the app cross-product, although integrations with existing estimation apps are also being considered.
- Advanced Approvals: Automatically requiring more senior reviewers for PRs that exceed a "Red" complexity threshold.
- AI workflows: The biggest risk in AI-driven development and vibe coding is the lack of visibility for problems introduced by copilots, including duplicated code, inconsistencies, or file entropy. PR Scoring has the potential to become a best practice in these workflows by offering strong indicators of risk and therefore accelerating the success of software teams.
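As a rough idea of how the Smart File Filters item on this roadmap might work, the sketch below drops excluded files before applying the scoring formula. Everything here is hypothetical, since the feature is not built yet: the `FileChange` shape, the default patterns, and the choice to count only the remaining files toward the file total are all assumptions.

```typescript
// Hypothetical per-file diff statistics.
interface FileChange {
  path: string;
  linesAdded: number;
  linesRemoved: number;
}

// Assumed default exclusions: npm lockfiles anywhere, and anything under docs/.
const DEFAULT_EXCLUDES: RegExp[] = [/(^|\/)package-lock\.json$/, /^docs\//];

function isExcluded(path: string, excludes: RegExp[]): boolean {
  return excludes.some((pattern) => pattern.test(path));
}

// Apply (linesAdded + linesRemoved) * filesChanged^1.2 to the non-excluded files only.
function filteredScore(
  changes: FileChange[],
  excludes: RegExp[] = DEFAULT_EXCLUDES,
): number {
  const kept = changes.filter((change) => !isExcluded(change.path, excludes));
  const lines = kept.reduce(
    (sum, change) => sum + change.linesAdded + change.linesRemoved,
    0,
  );
  return lines * Math.pow(kept.length, 1.2);
}
```

In a PR that edits one source file and regenerates a lockfile, filtering like this would keep the lockfile churn from pushing an otherwise small change into the Red band.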
Built With
- atlaskit
- forge
- i18next
- node.js
- react
- tailwindcss
- tanstack-query
- trpc
- typescript
- zod
