Coding Cold Cases Cracker

Main product identity: noir case desk, "Vet > Vibe > Validate" thesis.
Explains the app flow and Lark/Kiro roles.
Shows the browser terminal case desk and menu.

Inspiration

We've all been there: you're working on a coding project and hit a wall, so you search Stack Overflow. You find a question whose title matches your problem exactly. But when you open it, there is no answer, or at least no satisfactory or highly upvoted one. Where does that leave you? I created this project to help solve that conundrum.

What it does

Coding Cold Cases Cracker treats those posts as cold support incidents rather than trivia questions. A user picks a cold case from a curated backlog, creates an isolated workspace, and starts the casework pipeline. Kiro acts as the investigator and repair engineer: it reconstructs the smallest responsible failing project, studies the evidence, and proposes the fix. Lark acts as the forensic lab: it runs the reproduction workflow before the fix, captures pass/fail evidence and logs, then runs verification after the fix. A case is only closed when Lark verification passes.
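The closure rule described above can be sketched as a small gate. This is a minimal illustration, not the project's actual code: the `Verdict` and `CaseFile` names, the status strings, and the idea of passing Lark results in as plain values are all assumptions made for the example. It also assumes that a successfully reproduced bug shows up as a failing reproduction run before the fix.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Verdict(Enum):
    PASS = "pass"
    FAIL = "fail"


@dataclass
class CaseFile:
    """Hypothetical record of one cold case run (all names are illustrative)."""
    case_id: str
    reproduction: Optional[Verdict] = None   # evidence captured before the fix
    verification: Optional[Verdict] = None   # evidence captured after the fix
    log: List[str] = field(default_factory=list)

    def status(self) -> str:
        if self.reproduction is None:
            return "open"
        # Assumption: a reproduced bug appears as a FAIL before the fix.
        if self.reproduction is Verdict.PASS:
            return "not reproduced"
        # The only path to "closed" is a passing verification run.
        if self.verification is Verdict.PASS:
            return "closed"
        return "reproduced"


case = CaseFile("example-case")
case.reproduction = Verdict.FAIL
print(case.status())  # reproduced
case.verification = Verdict.PASS
print(case.status())  # closed
```

The point of the sketch is that nothing the repair agent says about its own fix can move a case to "closed"; only the independent verification verdict can.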

How we built it

The project is a working Dockerized prototype, not just a slide deck. It includes the web shell, browser terminal, case index parsing, isolated run workspaces, Kiro prompts, Lark workflow provisioning/execution paths, GitHub publishing hooks, and gallery/report surfaces.
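As a rough illustration of what "isolated run workspaces" can mean in practice, here is a minimal sketch that gives every run of a case its own fresh directory. The function name and the `runs/<case_id>/<run_id>/` layout with `project/` and `evidence/` subfolders are hypothetical, chosen only for the example; the real prototype may organize its workspaces differently.

```python
import tempfile
import uuid
from pathlib import Path


def create_run_workspace(root: Path, case_id: str) -> Path:
    """Create a fresh directory for one run of one case.

    The layout used here is a hypothetical example, not the
    project's real structure.
    """
    run_id = uuid.uuid4().hex[:8]
    workspace = root / "runs" / case_id / run_id
    # exist_ok=False turns a name collision into an error instead of
    # silently reusing state left behind by an earlier run.
    workspace.mkdir(parents=True, exist_ok=False)
    (workspace / "project").mkdir()    # reconstructed failing project
    (workspace / "evidence").mkdir()   # logs and verdicts from test runs
    return workspace


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        first = create_run_workspace(Path(tmp), "case-001")
        second = create_run_workspace(Path(tmp), "case-001")
        print(first.name, second.name)  # two different run ids
```

Keeping each run in its own directory means a second attempt at the same case never inherits files from the first.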

Built with:

Lark CLI and Lark workflow groups
Kiro CLI with phase-specific agent prompts
Docker Compose
Node.js
ttyd browser terminal
Java 21
Maven and Gradle-ready case runners
GitHub workspace publishing
Markdown evidence reports and case files

Challenges we ran into

There are many reasons a Stack Overflow question remains unsolved. One of them is the difficulty of reproducing project setups that rely on very specific dependencies, drivers, or even devices. Our system tries hard to assemble the required execution environment, but it still struggles in some of the harder cases.

Accomplishments that we're proud of

The system has successfully resolved many Java cold cases. Used on its own to fix bugs, Kiro tends to get stuck: it fixates on the wrong "fix", does not dig deep enough, or relies on workarounds rather than long-term fixes. Lark puts the bug-fixing agent back on track by vetting the reproduction steps and validating the proposed fix.

What we learned

The most compelling agentic developer tool demos are not the ones where an AI says it solved something; they are the ones where the system produces replayable evidence. Lark is powerful in that role because it can turn a plain-English testing intent into workflow execution, logs, artifacts, and a verdict that a separate coding agent must answer to.

What's next for Coding Cold Cases Cracker

Unanswered developer-support incidents are costly because they lack reproducible evidence. This project turns unresolved public bug reports into replayable labs with an independent testing verdict, making answers more trustworthy than an AI-generated explanation alone.

The same pattern can become a support automation product for developer tools teams: ingest GitHub issues, Stack Overflow posts, Discord/Slack reports, or Linear tickets; reconstruct the failure; run Lark evidence workflows; propose a fix; and publish a verified case file or PR.


Updates

Eliel Goco started this project — May 28, 2026 12:59 PM EDT
