Inspiration

Modern development teams move fast, but QA often becomes the bottleneck. Developers either spend hours writing repetitive tests or skip proper validation entirely to ship faster. Existing testing tools are powerful, but most are reactive: someone still has to create and trigger the workflows manually.

We wanted to build something proactive: an AI-powered QA agent that automatically understands code changes, creates meaningful test workflows, runs them instantly, and blocks risky merges before bugs reach production.

The idea for LarkGuard came from a simple question:

“What if pull requests could test themselves?”

By combining GitHub Actions, Claude, and the Lark MCP ecosystem, we built a system where QA becomes autonomous instead of manual.


What it does

LarkGuard is an AI-native QA automation agent for GitHub pull requests.

Whenever a PR is opened or updated, LarkGuard:

  • Reads and analyzes the code diff
  • Detects changed APIs, features, or UI flows
  • Uses Claude with Lark MCP access to generate intelligent test workflows automatically
  • Executes those workflows instantly using Lark
  • Posts pass/fail reports directly back to GitHub
  • Blocks merges when critical workflows fail
  • Receives real-time execution events through Lark webhooks
  • Auto-comments on pull requests with actionable failure summaries

This allows teams to get instant automated QA coverage without manually writing tests or configuring workflows upfront.
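
As a rough illustration of the merge-blocking step, the sketch below maps a list of workflow results to a GitHub check conclusion. It is not the LarkGuard source: the function name checkConclusion and the result shape (name, passed, critical) are assumptions made for this example. The strings "success", "failure" and "neutral" are standard GitHub check-run conclusions.

```javascript
// Hypothetical result shape: { name: string, passed: boolean, critical: boolean }.
// Returns one of the GitHub check-run conclusions "success", "failure" or "neutral".
function checkConclusion(results) {
  const failed = results.filter((r) => !r.passed);
  // Any failed critical workflow fails the whole check.
  if (failed.some((r) => r.critical)) return "failure";
  // Non-critical failures are reported without failing the check.
  if (failed.length > 0) return "neutral";
  return "success";
}

const sampleResults = [
  { name: "login-flow", passed: true, critical: true },
  { name: "profile-page", passed: false, critical: false },
];
console.log(checkConclusion(sampleResults));
```

A failing check only prevents merging when the repository's branch protection rules mark that check as required.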


How we built it

We built LarkGuard using a modular Node.js architecture integrated deeply with GitHub and the Lark ecosystem.

Core Stack

  • Node.js
  • Express.js
  • GitHub Actions
  • Claude API
  • Lark MCP
  • Lark Webhooks
  • Vercel for deployment

Workflow

  1. A developer opens or updates a pull request.
  2. GitHub Actions triggers the LarkGuard pipeline.
  3. The PR diff is extracted and passed to Claude.
  4. Claude analyzes the changed code paths and determines which workflows are needed.
  5. Using Lark MCP tools, workflows are automatically created and invoked.
  6. Results are collected and summarized.
  7. LarkGuard posts structured PR checks and comments directly on GitHub.
  8. Real-time webhook events from Lark keep PR status updated continuously.
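
To make step 3 more concrete, here is a minimal sketch of pulling the list of changed files out of a PR diff before handing it to the model. It assumes the diff arrives as unified-diff text in git format; changedFiles is a hypothetical helper name, not taken from the LarkGuard code.

```javascript
// Hypothetical helper: list the files touched by a unified diff (git format).
// Assumes each file section starts with a "diff --git a/<path> b/<path>" header.
// Paths that themselves contain " b/" are not handled by this simple pattern.
function changedFiles(diffText) {
  const files = new Set();
  for (const line of diffText.split("\n")) {
    const match = /^diff --git a\/(.+) b\/(.+)$/.exec(line);
    if (match) files.add(match[2]); // keep the "b/" (post-change) path
  }
  return [...files];
}

const sampleDiff = [
  "diff --git a/src/api/users.js b/src/api/users.js",
  "--- a/src/api/users.js",
  "+++ b/src/api/users.js",
  "@@ -1,3 +1,4 @@",
  "+const limit = 50;",
  "diff --git a/README.md b/README.md",
].join("\n");

console.log(changedFiles(sampleDiff));
```

A list like this can be used to scope the prompt to the affected code paths instead of sending the whole repository.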

We also designed the system prompt carefully so the AI agent could:

  • Distinguish between regressions and new features
  • Reuse workflows intelligently
  • Generate deterministic or AI-driven tests depending on context
  • Return machine-readable outputs for CI automation
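
The last point, machine-readable output, is easiest to see with an example. The sketch below validates an agent reply before CI acts on it. The JSON shape (a status field plus a workflows array of name/passed entries) is an assumption made for illustration, not the actual LarkGuard schema, and parseAgentResult is a hypothetical name.

```javascript
// Hypothetical reply shape:
//   { "status": "pass" | "fail", "workflows": [{ "name": string, "passed": boolean }] }
// Returns { ok: true, status, failed } or { ok: false, error }.
function parseAgentResult(rawText) {
  let data;
  try {
    data = JSON.parse(rawText);
  } catch (err) {
    return { ok: false, error: "output is not valid JSON" };
  }
  if (data === null || typeof data !== "object" || !Array.isArray(data.workflows)) {
    return { ok: false, error: "missing 'workflows' array" };
  }
  if (data.status !== "pass" && data.status !== "fail") {
    return { ok: false, error: "'status' must be 'pass' or 'fail'" };
  }
  const wellFormed = data.workflows.every(
    (w) =>
      w !== null &&
      typeof w === "object" &&
      typeof w.name === "string" &&
      typeof w.passed === "boolean"
  );
  if (!wellFormed) return { ok: false, error: "malformed workflow entry" };
  // Collect the names of failed workflows for the PR comment.
  const failed = data.workflows.filter((w) => !w.passed).map((w) => w.name);
  return { ok: true, status: data.status, failed };
}

const reply = '{"status":"fail","workflows":[{"name":"login","passed":false},{"name":"signup","passed":true}]}';
console.log(parseAgentResult(reply));
```

Rejecting anything that does not match the expected shape keeps free-form model text from being interpreted as a test result.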

Challenges we ran into

One of the biggest challenges was making the AI-generated workflows reliable instead of generic. Initially, the agent produced workflows that were too broad or duplicated existing tests.

We solved this by:

  • Improving prompt engineering
  • Structuring PR diff analysis more carefully
  • Adding naming and reuse logic
  • Separating regression workflows from exploratory AI-driven workflows
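
One way the naming and reuse logic could work is sketched below: derive a stable workflow name from each changed file, then check it against the names that already exist. The naming scheme, the "regression" prefix, and both function names are assumptions for this example rather than LarkGuard's actual rules.

```javascript
// Hypothetical naming scheme: "<kind>-<path slug>", e.g. "regression-src-api-users".
function workflowNameFor(filePath, kind) {
  const slug = filePath
    .toLowerCase()
    .replace(/\.[a-z0-9]+$/, "") // drop the file extension
    .replace(/[^a-z0-9]+/g, "-") // collapse separators into dashes
    .replace(/^-+|-+$/g, ""); // trim leading and trailing dashes
  return `${kind}-${slug}`;
}

// Split the changed files into workflows to reuse and workflows to create.
function planWorkflows(filePaths, existingNames) {
  const existing = new Set(existingNames);
  const seen = new Set();
  const plan = { reuse: [], create: [] };
  for (const filePath of filePaths) {
    const name = workflowNameFor(filePath, "regression");
    if (seen.has(name)) continue; // avoid planning the same workflow twice
    seen.add(name);
    (existing.has(name) ? plan.reuse : plan.create).push(name);
  }
  return plan;
}

const plan = planWorkflows(
  ["src/api/Users.js", "src/ui/Checkout.jsx"],
  ["regression-src-api-users"]
);
console.log(plan);
```

Because the name depends only on the file path, the same change produces the same workflow name on every run, which is what makes duplicates detectable.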

Another challenge was synchronizing asynchronous workflow execution with GitHub PR statuses. Since Lark workflows can take time to complete, we needed a clean system for:

  • Waiting for executions
  • Polling results
  • Handling webhook events
  • Updating PR comments in real time
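
The waiting and polling part can be sketched as a small helper that checks an execution's status until it finishes or a deadline passes. This is an illustration under stated assumptions: waitForResult, the status strings ("running", "passed", "failed", "timed_out") and the fetchStatus callback are hypothetical, and no real Lark API call is shown.

```javascript
// Polls fetchStatus() until it reports a terminal state or the timeout elapses.
// fetchStatus is a caller-supplied async function (hypothetical here) that
// resolves to "running", "passed" or "failed".
async function waitForResult(fetchStatus, { intervalMs = 1000, timeoutMs = 60000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const status = await fetchStatus();
    if (status === "passed" || status === "failed") return status;
    // Stop if the next poll would land after the deadline.
    if (Date.now() + intervalMs > deadline) return "timed_out";
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Usage with a fake status source that finishes on the third poll.
const fakeStatuses = ["running", "running", "passed"];
waitForResult(async () => fakeStatuses.shift(), { intervalMs: 10 }).then((status) =>
  console.log(status)
);
```

Returning "timed_out" as a value, rather than throwing, lets the caller report a stalled execution on the PR in the same way as a failure.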

We also had to ensure the entire developer experience remained simple enough that teams with zero Lark setup could onboard immediately.


Accomplishments that we're proud of

We are proud that LarkGuard turns QA into an autonomous workflow rather than another engineering task.

Some highlights include:

  • Automatically generating workflows directly from PR diffs
  • Fully integrating GitHub Actions with Lark MCP
  • Real-time webhook-driven PR feedback
  • Creating a zero-setup QA experience for teams
  • Designing an AI agent that understands engineering context instead of blindly generating tests

We are especially proud of the overall developer experience: developers can simply open a PR and immediately receive intelligent QA validation without writing additional configuration.


What we learned

This project taught us a lot about building reliable AI agents for developer tooling.

Some key lessons:

  • Prompt engineering matters significantly when building autonomous coding workflows
  • Developers value actionable feedback more than verbose AI outputs
  • Real-time integrations and event-driven architecture are critical for CI/CD automation
  • AI works best when combined with deterministic infrastructure rather than replacing it entirely

We also learned how powerful the Lark MCP ecosystem can be when paired with modern LLMs like Claude.


What's next for LarkGuard

We want to evolve LarkGuard into a fully autonomous AI QA platform.

Our roadmap includes:

  • Workflow memory and historical regression learning
  • Smarter flaky-test detection
  • Support for multi-repository and monorepo environments
  • Parallel workflow orchestration
  • Slack and Discord notifications
  • Visual QA dashboards
  • Security and performance workflow generation
  • AI-generated reproduction steps for failed tests
  • Support for more CI providers beyond GitHub Actions

Long term, we envision LarkGuard becoming an intelligent QA teammate that continuously protects production systems while reducing manual testing overhead for developers.
