Momus is a PR-native unit test generator that turns “ship it” into “ship it with tests.” It watches what changed in a pull request, generates the missing tests, runs them, checks coverage, and then opens a clean PR containing only the test updates, so reviewers can merge with confidence instead of crossed fingers.
The problem
Even good teams get stuck in the same loop:
- features land fast
- tests lag behind
- reviewers ask for coverage
- devs scramble, CI breaks, everyone loses momentum
And when “AI test generation” is tried, it often produces:
- tests that don’t run
- tests that don’t match project conventions
- tests that are hard to trust
Momus is built to be measurable, reviewable, and CI-friendly.
What Momus does
Diff-aware test generation
- Focuses on what actually changed in the PR (not the whole repo).
- Targets the affected modules and branches.
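To make "diff-aware" concrete, here is a minimal sketch of how a PR diff could be reduced to the changed Python files and the line numbers added in each. It is not Momus's actual mapping logic: the function name `changed_lines` is hypothetical, it assumes a standard unified diff, and it ignores edge cases such as renames or content lines that happen to start with `+++ `.

```python
import re
from collections import defaultdict

# Matches a unified-diff hunk header such as "@@ -10,4 +12,6 @@".
HUNK_RE = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def changed_lines(diff_text: str) -> dict[str, list[int]]:
    """Map each changed .py file to the new-side line numbers that were added."""
    result: dict[str, list[int]] = defaultdict(list)
    current_file = None   # only set while inside a .py file's section
    new_lineno = 0        # line number on the "after" side of the diff
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            path = line[4:].strip()
            path = path[2:] if path.startswith("b/") else path
            current_file = path if path.endswith(".py") else None
        elif line.startswith("--- "):
            continue  # old-side file header
        elif (m := HUNK_RE.match(line)):
            new_lineno = int(m.group(1))
        elif current_file is None or line.startswith("\\"):
            continue  # non-Python file, or "\ No newline at end of file"
        elif line.startswith("+"):
            result[current_file].append(new_lineno)
            new_lineno += 1
        elif line.startswith("-"):
            continue  # removed lines do not exist on the new side
        else:
            new_lineno += 1  # unchanged context line
    return dict(result)
```

The resulting line numbers could then be matched against function and branch boundaries to decide which modules need new tests.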
Quality loop, not one-shot
- Generate tests --> run tests --> measure coverage --> refine.
- Stops when targets are met or progress stalls (so it doesn’t loop forever).
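The stopping rule above can be sketched as a small loop. This is an illustration under assumptions, not the real engine: `step` is a hypothetical callback standing in for one generate/run/measure round, and the target, minimum gain, and patience values are made-up defaults.

```python
from typing import Callable


def refine_until_done(
    step: Callable[[int], float],
    target: float = 0.80,
    min_gain: float = 0.01,
    patience: int = 2,
    max_rounds: int = 10,
) -> tuple[float, str]:
    """Repeat rounds until coverage hits the target or progress stalls.

    `step(round_no)` is assumed to generate or refine tests, run them,
    and return the measured coverage as a fraction between 0 and 1.
    """
    best = 0.0
    stalled = 0  # consecutive rounds without a meaningful gain
    for round_no in range(1, max_rounds + 1):
        coverage = step(round_no)
        if coverage >= target:
            return coverage, "target_met"
        if coverage - best < min_gain:
            stalled += 1
            if stalled >= patience:
                return max(best, coverage), "stalled"
        else:
            stalled = 0
        best = max(best, coverage)
    return best, "max_rounds"
```

The hard `max_rounds` cap is a second safety net: even if coverage creeps up by tiny amounts each round, the loop still ends.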
Sandboxed execution
- Runs tests in an isolated environment so generated code can’t nuke your machine or hang forever.
- Produces coverage artifacts for visibility.
Reliability scoring
- Combines static signals (syntax/lint/type checks), runtime results, and uncertainty/confidence signals to label each output as trusted, needs review, or discard.
- The point: you get a signal, not a surprise.
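One way such a label could be computed is shown below. The weights and thresholds are invented for illustration, and the `Signals` fields are assumed names; the actual scoring in Momus may combine its signals quite differently.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    static_ok: bool     # syntax/lint/type checks all passed
    pass_rate: float    # fraction of generated tests that passed, 0..1
    confidence: float   # generator confidence estimate, 0..1


def label(signals: Signals, trust_at: float = 0.8,
          review_at: float = 0.5) -> str:
    """Return 'trusted', 'needs review', or 'discard' (assumed thresholds)."""
    # Code that fails static checks is never worth a reviewer's time.
    if not signals.static_ok:
        return "discard"
    # Assumed weighting: runtime results count more than self-reported confidence.
    score = 0.6 * signals.pass_rate + 0.4 * signals.confidence
    if score >= trust_at:
        return "trusted"
    if score >= review_at:
        return "needs review"
    return "discard"
```

Treating static failures as a hard gate, rather than one more weighted term, keeps broken code from being rescued by a high confidence number.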
PR workflow that feels natural
- Momus doesn’t mutate your branch behind your back.
- It creates an AI branch and opens a tests-only PR back into your feature branch.
- Review is clean: “here are the tests I added for your change.”
Why this is different from “just generate tests”
Momus isn’t trying to be a magical test vending machine. It’s built like a teammate:
- it shows its work (coverage + logs + artifacts)
- it fails loudly with actionable errors (bad auth, missing deps, failing tests)
- it’s incremental and measurable, not “spray tests and pray”
Architecture (high level)
Forge app (Bitbucket Cloud integration)
- Triggered via PR comment or manual webtrigger fallback.
- Kicks off work and posts results back to the PR.
External worker
- Clones the repo at the PR commit.
- Runs the pipeline: generate --> run --> measure --> PR back with changes.
- Authenticates to Bitbucket with auth tokens (modern token-based auth).
Core test-gen engine (QUEST)
- Multi-agent loop (generator/supervisor/enhancer style).
- Static analysis + execution feedback + reliability scoring.
- Observability with artifact logging and a Streamlit dashboard.
What you get as a developer
- A comment trail on the PR that’s actually useful:
  - job started
  - results, coverage, and failures if any
  - link to the generated “tests-only” PR
- Faster merges because the tests show up with the change, not in a follow-up scramble.
Current MVP vs roadmap
MVP (hackathon-ready)
- Bitbucket Cloud + Forge trigger
- Python support
- Tests-only PR creation
- Coverage + logs + artifact hygiene
Roadmap
- Add adapters for JS/Jest, Java/JUnit, etc.
- Smarter diff-to-test mapping (symbol tracing + existing test discovery)
- Mutation testing as a first-class signal
- “Merge check” enforcement (block merges if coverage drops)
Built With
- atlassian
- bitbucket
- fastapi
- pytest
- python
- typescript