Inspiration

Every engineering team has felt the same pain: CI fails, logs are noisy, the failure is hard to reproduce, and the real root cause hides somewhere between code, config, dependencies, fixtures, and environment drift.

We built LarkOps AI because reproducing the failure is usually the bottleneck. A general-purpose chatbot can summarize logs, but it cannot turn a broken test into a durable, replayable engineering workflow. LarkOps AI uses the Lark CLI and MCP as its execution layer so that every investigation is structured, reproducible, explainable, and review-ready.

What it does

LarkOps AI is an autonomous debugging command center for software failures.

A developer can open a failing CI run or user-reported issue, start an investigation, and watch the full lifecycle:

  • Failure received from CI
  • Lark workflow initialized
  • Environment prepared
  • Failing test reproduced
  • Logs analyzed
  • Multi-agent investigation started
  • Root cause identified
  • Patch suggested
  • Regression test generated
  • Fix verified
  • PR summary prepared
  • Lark replay and workflow metadata captured
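
One way to picture the lifecycle above is as an ordered list of stages with a timeline that can only move forward one step at a time. The sketch below is illustrative only: the stage names, `StageEvent`, `nextStage`, and `advance` are hypothetical and are not taken from the LarkOps AI codebase or from any Lark API.

```typescript
// Hypothetical stage identifiers mirroring the lifecycle list; not the product's real names.
const STAGES = [
  "failure_received",
  "workflow_initialized",
  "environment_prepared",
  "failure_reproduced",
  "logs_analyzed",
  "investigation_started",
  "root_cause_identified",
  "patch_suggested",
  "regression_test_generated",
  "fix_verified",
  "pr_summary_prepared",
  "replay_captured",
] as const;

type Stage = (typeof STAGES)[number];

interface StageEvent {
  stage: Stage;
  at: string; // ISO 8601 timestamp
}

// Returns the stage that follows `current`, or null once the last stage is reached.
function nextStage(current: Stage): Stage | null {
  const i = STAGES.indexOf(current);
  return STAGES[i + 1] ?? null;
}

// Appends an event only if it is the expected next stage, so a timeline cannot skip steps.
function advance(timeline: StageEvent[], stage: Stage, at: string): StageEvent[] {
  const last = timeline[timeline.length - 1];
  const expected = last ? nextStage(last.stage) : STAGES[0];
  if (stage !== expected) {
    throw new Error(`expected ${expected ?? "no further stage"}, got ${stage}`);
  }
  return [...timeline, { stage, at }];
}
```

Keeping the order in one constant means the timeline view and any validation logic read from the same source, which is the property a replayable record needs.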

The main demo investigates an auth middleware failure where code assumes user.profile.name always exists. LarkOps AI reproduces the failure, identifies the null dereference, proposes a safe patch, generates a regression test, verifies the result, analyzes blast radius, and prepares a senior-engineer-quality PR summary.
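
The demo's failure class can be illustrated with a small sketch. The `User` shape, both function names, and the "Unknown user" fallback below are assumptions made for illustration; the actual middleware, patch, and generated regression test are not shown in this write-up.

```typescript
// Hypothetical user model: profile and name may both be absent.
interface User {
  id: string;
  profile?: { name?: string | null } | null;
}

// Failing pattern: assumes user.profile.name always exists.
// The non-null assertions silence the compiler, so a missing profile throws a TypeError at runtime.
function displayNameUnsafe(user: User): string {
  return user.profile!.name!.trim();
}

// Guarded version: optional chaining plus an explicit fallback value.
function displayName(user: User): string {
  const name = user.profile?.name?.trim();
  return name && name.length > 0 ? name : "Unknown user";
}

// Regression-style check: true when the unsafe version throws a TypeError for this input.
function reproduces(user: User): boolean {
  try {
    displayNameUnsafe(user);
    return false;
  } catch (err) {
    return err instanceof TypeError;
  }
}
```

A regression test in this style would assert that `reproduces` is true for a user without a profile and that `displayName` returns the fallback for the same input.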

How we built it

We built LarkOps AI as a polished Vite + React + TypeScript application with a premium developer-infrastructure feel inspired by tools like Linear, Vercel, Sentry, Datadog, and Warp.

The product uses:

  • React and TypeScript for the frontend
  • Vite for fast development and production builds
  • Lark CLI for real workflow creation and execution
  • Lark workflow metadata, execution IDs, and events as core product primitives
  • Gemini as the default AI provider path
  • Deterministic demo mode so the hackathon demo remains reliable even if external AI quota is unavailable
  • GitHub Pages for deployment

We created and invoked a real Lark workflow from the CLI:

  • Workflow ID: wflw_sEkzdOqOLAutmd5dXsbxUkua
  • Execution ID: wflw_exec_i8dNNfBsn0p1gvgkIgA8wwdp
  • Execution result: success

That was important to us because Lark is not just a logo in the UI. It is the execution and replay layer behind the product story.

Challenges we ran into

The biggest challenge was making the product feel real rather than like a fake dashboard.

We had to design a workflow that clearly showed why Lark matters: not just “AI analyzed logs,” but “a reproducible workflow executed, produced evidence, generated events, and left behind a replayable investigation record.”

We also had to balance live integrations with demo reliability. Gemini was connected, but live API quota can be unpredictable during a hackathon. To avoid a broken judging experience, we built a deterministic demo mode with fixed outputs while still showing the real Gemini and Lark connection status.
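
A minimal sketch of this kind of fallback, assuming the live provider is wrapped in a single async function. `Provider`, `analyze`, and `DEMO_ROOT_CAUSE` are hypothetical names; no real Gemini SDK call is shown, and the project's actual demo mode may be structured differently.

```typescript
interface Analysis {
  source: "live" | "deterministic"; // surfaced in the UI so the fallback stays visible
  rootCause: string;
}

// Hypothetical wrapper type around whichever AI provider is configured.
type Provider = (logs: string) => Promise<string>;

// Fixed output for the demo scenario (assumed wording).
const DEMO_ROOT_CAUSE = "Null dereference: user.profile.name is read without a guard";

// Tries the live provider first; on any error (quota, network) returns the fixed demo output.
async function analyze(logs: string, live: Provider | null): Promise<Analysis> {
  if (live !== null) {
    try {
      return { source: "live", rootCause: await live(logs) };
    } catch {
      // Fall through to the deterministic path below.
    }
  }
  return { source: "deterministic", rootCause: DEMO_ROOT_CAUSE };
}
```

Tagging each result with its `source` keeps the demo honest: the output stays stable, but the interface can still show whether it came from the live provider or the fixed path.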

Another challenge was deployment. GitHub Pages needed a branch-based deployment setup for this repository, so we configured the production build and Pages deployment path carefully.

Accomplishments that we're proud of

We are proud that LarkOps AI feels like a real developer product, not a generic AI wrapper.

The investigation detail page is the heart of the product. It includes:

  • Lark workflow timeline
  • Terminal reproduction logs
  • Multi-agent investigation
  • Root cause explanation
  • Proposed patch
  • Regression test
  • Verification results
  • Blast radius analysis
  • Confidence gate
  • Failure memory
  • Lark usage events
  • Incident postmortem export
  • Review-ready PR preview

We are also proud that we verified a real Lark CLI workflow and surfaced that workflow ID and execution ID inside the product.

What we learned

We learned that the most valuable AI developer tools are not just chat interfaces. Developers need systems that produce evidence, verification, and repeatability.

A failure investigation becomes much more useful when it has:

  • a workflow ID
  • an execution timeline
  • logs
  • replayability
  • generated regression coverage
  • confidence gates
  • risk analysis
  • a PR-ready artifact
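
The items above could be gathered into a single record type with a check for what is still missing. This is a sketch under stated assumptions: the field names, the `missingEvidence` helper, and the 0.8 confidence threshold are hypothetical and do not describe the product's real schema.

```typescript
// Hypothetical record shape covering the evidence listed above.
interface InvestigationRecord {
  workflowId: string;
  executionId: string;
  timeline: { step: string; at: string }[];
  logs: string[];
  replayable: boolean;
  regressionTest: string | null; // generated test source, if any
  confidence: number; // 0 to 1
  risk: "low" | "medium" | "high";
  prSummary: string | null;
}

// Lists the evidence still missing before a record counts as review-ready.
// The default threshold of 0.8 is an assumed value, not one taken from the product.
function missingEvidence(r: InvestigationRecord, minConfidence = 0.8): string[] {
  const missing: string[] = [];
  if (r.timeline.length === 0) missing.push("execution timeline");
  if (r.logs.length === 0) missing.push("logs");
  if (!r.replayable) missing.push("replayability");
  if (r.regressionTest === null) missing.push("regression coverage");
  if (r.confidence < minConfidence) missing.push("confidence gate");
  if (r.prSummary === null) missing.push("PR-ready artifact");
  return missing;
}
```

An empty result means every item is present; anything else names exactly what a reviewer would still be missing.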

Lark is a strong fit for this because it gives AI-assisted debugging an execution layer instead of leaving it as a conversation.

What's next for LarkOps AI

Next, we would turn LarkOps AI into a real engineering workflow platform:

  • GitHub App that comments on failing CI runs with Lark investigation results
  • Linear integration for user-reported bugs
  • Slack bot that can trigger a Lark reproduction workflow from a failure report
  • Automatic PR creation after verification passes
  • Team-level failure memory across repositories
  • Flaky test classifier trained on past Lark executions
  • Scheduled Lark workflows that monitor test suites and open fixes automatically
  • Rich Lark replay pages attached to pull requests and postmortems

The long-term vision is simple: every software failure should become a replayable investigation, and every verified fix should come with evidence.

Built With

  • ai-agents
  • ci
  • gemini-api
  • github-actions
  • json/sqlite-mock-data
  • lark-cli
  • lark-mcp
  • next.js
  • node.js
  • shadcn/ui
  • tailwind-css
  • typescript
  • vercel
  • workflow