Sentry metric: https://youtu.be/ctUCtMgi9pA
We use AI coding tools like Codex all the time. They are great at writing snippets, but the process around the code is still manual: we still have to run it, debug it, open PRs, and wait for reviews.
Aadit heard a Principal Engineer talk about how he uses agents to create PRs at night and reviews them when he wakes up in the morning, which makes him very efficient.
We thought: we should build this!
What if you could drop in a repo link and a task, and an AI system would not just write code, but also run it, trace it, and ship it as a PR?
That idea became Daytona PR Copilot.
How We Built It
The project connects a few tools into one automated pipeline:
- A Daytona sandbox spins up and clones the repo
- An LLM generates the code changes for the task
- The project runs inside the sandbox
- Sentry captures runtime errors and traces
- If things look good, a branch is pushed and a PR is opened automatically
- CodeRabbit is triggered to do an instant AI code review
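The steps above can be sketched as a single orchestration function. This is a minimal illustration, not the project's actual code: `PipelineSteps`, `RunReport`, and `runPipeline` are hypothetical names, each step is injected rather than calling a real Daytona, Sentry, GitHub, or CodeRabbit API, and "things look good" is assumed to mean a zero exit code with no captured errors.

```typescript
// Hypothetical shapes: none of these names come from the Daytona, Sentry,
// GitHub, or CodeRabbit SDKs. Real clients would sit behind this interface.
interface RunReport {
  exitCode: number;         // exit code of the project run inside the sandbox
  capturedErrors: string[]; // runtime errors reported during the run
}

interface PipelineSteps {
  createSandbox(repoUrl: string): Promise<string>; // resolves to a sandbox id
  generateChanges(sandboxId: string, task: string): Promise<void>;
  runProject(sandboxId: string): Promise<RunReport>;
  openPullRequest(sandboxId: string, task: string): Promise<string>; // resolves to a PR URL
  requestReview(prUrl: string): Promise<void>;
}

type PipelineResult =
  | { status: "pr_opened"; prUrl: string }
  | { status: "blocked"; report: RunReport };

async function runPipeline(
  steps: PipelineSteps,
  repoUrl: string,
  task: string
): Promise<PipelineResult> {
  const sandboxId = await steps.createSandbox(repoUrl);
  await steps.generateChanges(sandboxId, task);
  const report = await steps.runProject(sandboxId);

  // Assumed gate: only ship when the run exited cleanly and nothing was captured.
  if (report.exitCode !== 0 || report.capturedErrors.length > 0) {
    return { status: "blocked", report };
  }

  const prUrl = await steps.openPullRequest(sandboxId, task);
  await steps.requestReview(prUrl);
  return { status: "pr_opened", prUrl };
}
```

Injecting the steps keeps the ordering and the gate testable without a live sandbox, and a failed run returns the report instead of opening a PR.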
So instead of “AI wrote some code,” you get “AI shipped a tested, traced pull request.”
Win: we reduced the PR cycle time from 2 hours to 3 minutes! We are definitely going to use it.
What We Learned
The biggest thing I learned is that AI coding gets much more powerful when it can see runtime behavior, not just source code.
I also learned how much of software development is actually workflow. Environments, testing, and reviewing are just as important as writing code, and those steps can be automated too.
Productionizing the product:
- Integrate with enterprise version control systems
- Simulate customer traffic in sandboxes
- Run Sentry within the sandbox
Challenges
One challenge was connecting code generation with real execution. LLMs do not naturally understand runtime failures, so I had to rely on sandbox runs and Sentry traces as feedback signals.
Another challenge was automating all the small developer habits like structuring commits and PRs in a way that still feels human.
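One way to keep generated PRs readable is to derive the branch name, title, and body from the task with fixed rules. The sketch below is illustrative only: `draftPullRequest`, `slugify`, the `copilot/` branch prefix, and the 72-character title limit are assumptions, not the conventions the project actually uses.

```typescript
// Hypothetical helper types and names, for illustration only.
interface PullRequestDraft {
  branch: string;
  title: string;
  body: string;
}

// Turn free text into a short, git-safe slug (lowercase, hyphen-separated).
function slugify(text: string, maxLength = 40): string {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything non-alphanumeric into one hyphen
    .replace(/^-+|-+$/g, "")     // trim leading and trailing hyphens
    .slice(0, maxLength)
    .replace(/-+$/g, "");        // the cut may leave a trailing hyphen
}

function draftPullRequest(
  task: string,
  changedFiles: string[],
  runSummary: string
): PullRequestDraft {
  // Normalize whitespace so the title reads like a hand-written one.
  const summary = task.trim().replace(/\s+/g, " ");
  // Assumed limit: keep titles to 72 characters, truncating with "...".
  const title = summary.length > 72 ? summary.slice(0, 69) + "..." : summary;
  const body = [
    "## Task",
    summary,
    "",
    "## Changed files",
    ...changedFiles.map((file) => `- ${file}`),
    "",
    "## Sandbox run",
    runSummary,
  ].join("\n");
  return { branch: `copilot/${slugify(summary)}`, title, body };
}
```

For example, `draftPullRequest("Fix login redirect bug!", ["src/auth.ts"], "exit code 0")` yields the branch `copilot/fix-login-redirect-bug` and a body with one section each for the task, the changed files, and the sandbox run.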
LLMs hallucinate fixes when they can't see runtime state. Our solution: we pipe Sentry stack traces directly into the prompt context, letting the model see exactly what failed and why.
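A rough sketch of what piping a stack trace into the prompt could look like follows. `CapturedError` and `StackFrame` are simplified, hypothetical shapes, not Sentry's actual event schema, and the prompt wording, the five-frame limit, and the innermost-first frame order are all assumptions.

```typescript
// Simplified, hypothetical error shape; real Sentry events are more deeply nested.
interface StackFrame {
  file: string;
  line: number;
  fn: string;
}

interface CapturedError {
  type: string;
  message: string;
  frames: StackFrame[]; // assumed to be ordered innermost frame first
}

// Render one error as a compact, stack-trace-style text block.
function formatError(error: CapturedError, maxFrames = 5): string {
  const frames = error.frames
    .slice(0, maxFrames) // cap the frames so the prompt stays small
    .map((frame) => `    at ${frame.fn} (${frame.file}:${frame.line})`);
  return [`${error.type}: ${error.message}`, ...frames].join("\n");
}

// Build the follow-up prompt that shows the model what failed at runtime.
function buildRepairPrompt(task: string, errors: CapturedError[]): string {
  if (errors.length === 0) {
    return `Task: ${task}\n\nThe last sandbox run reported no runtime errors.`;
  }
  // Use an arrow function so map's index is not passed as maxFrames.
  const traces = errors.map((error) => formatError(error)).join("\n\n");
  return [
    `Task: ${task}`,
    "",
    "The last sandbox run failed with these runtime errors:",
    traces,
    "",
    "Fix the cause of these errors. Change only what the traces point to.",
  ].join("\n");
}
```

Capping the number of frames is a trade-off: the model sees where the failure originated without the prompt filling up with framework internals.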
This project explores a simple idea:
Task→Code→Run→Trace→PR
Not just AI that writes code, but AI that finishes the job.
Built With
- claude
- coderabbit
- daytona
- express.js
- github
- node.js
- sentry
- sse
- tsx
- typescript