Inspiration
Modern development teams move fast, but QA often becomes the bottleneck. Developers either spend hours writing repetitive tests or skip proper validation entirely just to ship faster. Existing testing tools are powerful, but most are reactive: someone still has to create and trigger workflows by hand.
We wanted to build something proactive: an AI-powered QA agent that automatically understands code changes, creates meaningful test workflows, runs them instantly, and blocks risky merges before bugs reach production.
The idea for LarkGuard came from a simple question:
“What if pull requests could test themselves?”
By combining GitHub Actions, Claude, and the Lark MCP ecosystem, we built a system where QA becomes autonomous instead of manual.
What it does
LarkGuard is an AI-native QA automation agent for GitHub pull requests.
Whenever a PR is opened or updated, LarkGuard:
- Reads and analyzes the code diff
- Detects changed APIs, features, or UI flows
- Uses Claude with Lark MCP access to generate intelligent test workflows automatically
- Executes those workflows instantly using Lark
- Posts pass/fail reports directly back to GitHub
- Blocks merges when critical workflows fail
- Receives real-time execution events through Lark webhooks
- Auto-comments on pull requests with actionable failure summaries
This allows teams to get instant automated QA coverage without manually writing tests or configuring workflows upfront.
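The reporting and merge-blocking steps above can be sketched as one small pure function. This is an illustration under assumptions, not LarkGuard's actual code: the function name `summarizeResults` and the result shape (`name`, `status`, `critical`) are hypothetical. The returned `conclusion` strings are chosen to match GitHub check run conclusion values.

```javascript
// Hypothetical result shape (an assumption for this sketch):
//   { name: string, status: "passed" | "failed", critical: boolean }
function summarizeResults(results) {
  const failed = results.filter((r) => r.status === "failed");
  const criticalFailures = failed.filter((r) => r.critical);

  // Only critical failures block the merge; other failures are reported but do not gate.
  let conclusion = "success";
  if (criticalFailures.length > 0) conclusion = "failure";
  else if (failed.length > 0) conclusion = "neutral";

  // Build a short plain-text summary suitable for a PR comment.
  const lines = results.map((r) => {
    const mark = r.status === "passed" ? "PASS" : "FAIL";
    return `- ${mark} ${r.name}${r.critical ? " (critical)" : ""}`;
  });
  const passedCount = results.length - failed.length;
  const header = `LarkGuard: ${passedCount}/${results.length} workflows passed`;

  return { conclusion, comment: [header, ...lines].join("\n") };
}
```

Keeping this logic free of network calls makes the gating rule easy to unit test separately from the GitHub integration.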
How we built it
We built LarkGuard using a modular Node.js architecture integrated deeply with GitHub and the Lark ecosystem.
Core Stack
- Node.js
- Express.js
- GitHub Actions
- Claude API
- Lark MCP
- Lark Webhooks
- Vercel for deployment
Workflow
- A developer opens or updates a pull request.
- GitHub Actions triggers the LarkGuard pipeline.
- The PR diff is extracted and passed to Claude.
- Claude analyzes the changed code paths and determines which workflows are needed.
- Using Lark MCP tools, workflows are automatically created and invoked.
- Results are collected and summarized.
- LarkGuard posts structured PR checks and comments directly on GitHub.
- Real-time webhook events from Lark keep PR status updated continuously.
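As a rough sketch of the diff-extraction step, the function below collects the file paths touched by a unified diff before anything is handed to the model. The name `changedFiles` is hypothetical, and the parsing is deliberately simple: it assumes standard `git diff` output with `a/` and `b/` prefixes, and it skips deleted files, whose new side is `/dev/null`.

```javascript
// Collect the paths touched by a unified diff (the format `git diff` emits).
// Simplification: an added content line that itself begins with "++ b/"
// would be misread as a file header.
function changedFiles(diffText) {
  const files = new Set();
  for (const line of diffText.split("\n")) {
    // New-side header looks like "+++ b/src/app.js"; deleted files use "+++ /dev/null".
    const match = /^\+\+\+ b\/(.+)$/.exec(line);
    if (match) files.add(match[1]);
  }
  return [...files];
}
```

A list like this could be used to decide which parts of the diff are worth sending to the model, although how LarkGuard actually scopes its analysis is not specified here.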
We also designed the system prompt carefully so the AI agent could:
- Distinguish between regressions and new features
- Reuse workflows intelligently
- Generate deterministic or AI-driven tests depending on context
- Return machine-readable outputs for CI automation
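To illustrate the last point, here is a minimal sketch of validating a machine-readable agent response before CI acts on it. The schema (a `verdict` field plus a `workflows` array with `name` and `kind`) and the function name `parseAgentOutput` are assumptions made for this example; the real output format is not described in this write-up.

```javascript
// Assumed schema for this sketch:
//   { "verdict": "pass" | "fail",
//     "workflows": [{ "name": string, "kind": "regression" | "exploratory" }] }
const VERDICTS = new Set(["pass", "fail"]);
const KINDS = new Set(["regression", "exploratory"]);

function parseAgentOutput(text) {
  let data;
  try {
    data = JSON.parse(text);
  } catch (err) {
    throw new Error(`Agent output is not valid JSON: ${err.message}`);
  }
  if (data === null || typeof data !== "object" || !VERDICTS.has(data.verdict)) {
    throw new Error("Agent output is missing a valid verdict");
  }
  if (!Array.isArray(data.workflows)) {
    throw new Error("Agent output is missing a workflows array");
  }
  for (const wf of data.workflows) {
    if (typeof wf?.name !== "string" || !KINDS.has(wf.kind)) {
      throw new Error("Agent output contains a malformed workflow entry");
    }
  }
  return data;
}
```

Failing loudly on malformed output keeps a vague or truncated model response from being mistaken for a passing check.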
Challenges we ran into
One of the biggest challenges was making the AI-generated workflows specific and reliable rather than generic. Initially, the agent produced workflows that were too broad or duplicated existing tests.
We solved this by:
- Improving prompt engineering
- Structuring PR diff analysis more carefully
- Adding naming and reuse logic
- Separating regression workflows from exploratory AI-driven workflows
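The naming and reuse idea can be sketched as follows. This is one possible approach, not necessarily the one LarkGuard uses: the naming scheme (workflow kind plus a slug of the file path) and the helper names `workflowName` and `planWorkflows` are assumptions for illustration.

```javascript
// Derive a stable workflow name from its kind and the path it covers.
function workflowName(kind, filePath) {
  const slug = filePath
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse separators and punctuation
    .replace(/^-+|-+$/g, ""); // trim leading/trailing dashes
  return `${kind}-${slug}`;
}

// Split the wanted workflows into ones that already exist (reuse)
// and ones that still need to be created.
function planWorkflows(existingNames, wanted) {
  const existing = new Set(existingNames);
  const seen = new Set();
  const plan = { reuse: [], create: [] };
  for (const { kind, filePath } of wanted) {
    const name = workflowName(kind, filePath);
    if (seen.has(name)) continue; // same workflow requested twice in one run
    seen.add(name);
    (existing.has(name) ? plan.reuse : plan.create).push(name);
  }
  return plan;
}
```

Because the name depends only on the kind and path, the same change in a later PR maps back to the same workflow, and prefixing the kind keeps regression and exploratory workflows in separate namespaces.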
Another challenge was synchronizing asynchronous workflow execution with GitHub PR statuses. Since Lark workflows can take time to complete, we needed a clean system for:
- Waiting for executions
- Polling results
- Handling webhook events
- Updating PR comments in real time
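A minimal sketch of the waiting and polling part is shown below. It assumes a hypothetical async `getStatus` function that returns `"running"`, `"passed"`, or `"failed"`; how the status is actually fetched from Lark is not shown, and the interval and attempt limits are placeholder values.

```javascript
// Default sleep; callers (and tests) can inject a faster one.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Poll `getStatus` until it reports a terminal state or attempts run out.
// `getStatus` is a hypothetical async function supplied by the caller.
async function waitForExecution(
  getStatus,
  { intervalMs = 5000, maxAttempts = 60, sleepFn = sleep } = {}
) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const status = await getStatus();
    if (status === "passed" || status === "failed") {
      return { status, attempts: attempt };
    }
    // Do not sleep after the final attempt; report the timeout right away.
    if (attempt < maxAttempts) await sleepFn(intervalMs);
  }
  return { status: "timed_out", attempts: maxAttempts };
}
```

In a setup like the one described, polling of this kind would most likely act as a fallback, with webhook events updating the PR as soon as they arrive.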
We also had to ensure the entire developer experience remained simple enough that teams with zero Lark setup could onboard immediately.
Accomplishments that we're proud of
We are proud that LarkGuard turns QA into an autonomous workflow rather than another engineering task.
Some highlights include:
- Automatically generating workflows directly from PR diffs
- Fully integrating GitHub Actions with Lark MCP
- Real-time webhook-driven PR feedback
- Creating a zero-setup QA experience for teams
- Designing an AI agent that understands engineering context instead of blindly generating tests
We are especially proud of the overall developer experience: developers can open a PR and immediately receive intelligent QA validation without writing any additional configuration.
What we learned
This project taught us a lot about building reliable AI agents for developer tooling.
Some key lessons:
- Prompt engineering matters significantly when building autonomous coding workflows
- Developers value actionable feedback more than verbose AI outputs
- Real-time integrations and event-driven architecture are critical for CI/CD automation
- AI works best when combined with deterministic infrastructure rather than replacing it entirely
We also learned how powerful the Lark MCP ecosystem can be when paired with modern LLMs like Claude.
What's next for LarkGuard
We want to evolve LarkGuard into a fully autonomous AI QA platform.
Our roadmap includes:
- Workflow memory and historical regression learning
- Smarter flaky-test detection
- Support for multi-repository and monorepo environments
- Parallel workflow orchestration
- Slack and Discord notifications
- Visual QA dashboards
- Security and performance workflow generation
- AI-generated reproduction steps for failed tests
- Support for more CI providers beyond GitHub Actions
Long term, we envision LarkGuard becoming an intelligent QA teammate that continuously protects production systems while reducing manual testing overhead for developers.
Built With
- actions
- api
- axios
- claude
- dotenv
- express.js
- github
- javascript
- lark
- local
- mcp
- ngrok
- rest
- vercel
- webhook
- webhooks