Scriptless.ai - Hackathon Submission

💡 Inspiration

The inspiration for Scriptless.ai came from a frustrating reality we've all experienced: testing is tedious, time-consuming, and often neglected.

We noticed that:

  • Developers spend countless hours writing and maintaining UI tests instead of building features
  • Non-technical team members struggle to validate product behavior without coding skills
  • Failed tests are cryptic and require deep debugging to understand what went wrong
  • Manual testing is error-prone and doesn't scale with product complexity

We asked ourselves: What if AI could do all the heavy lifting? What if you could connect your GitHub repo and instantly get a full suite of executable browser tests, complete with failure diagnostics and hands-free voice control?

That's how Scriptless.ai was born — to make testing as effortless as possible, so teams can focus on building great products instead of battling brittle test scripts.


🎯 What it does

Scriptless.ai is an AI-driven testing workspace that automates the entire test lifecycle for web applications:

  1. Connect Your Repository — Authorize GitHub OAuth and select any repository
  2. AI Generates Test Cases — Scans your source code and creates structured test scenarios using NVIDIA Nemotron AI
  3. Run Tests in Cloud Browsers — Executes tests in real Browserbase sessions with Playwright automation
  4. AI Script Generation — Google Gemini generates resilient Playwright scripts on-demand
  5. Watch Session Replays — Every test run is recorded for visual debugging
  6. AI Explains Failures — NVIDIA Vision models analyze screenshots and suggest fixes
  7. Voice Commands — Control everything hands-free with Speechmatics real-time transcription

In short: You connect a repo, AI generates and runs tests, and you get actionable insights — all without writing a single line of test code.


🔨 How we built it

Building Scriptless.ai required orchestrating multiple cutting-edge technologies into a cohesive workflow:

Frontend

  • Next.js 16 (App Router) with React 19 and TypeScript for a modern, type-safe UI
  • Tailwind CSS for rapid, responsive styling
  • Radix UI components for accessible dialogs, accordions, and controls
  • Framer Motion for smooth animations

Backend & Database

  • Neon Postgres (serverless) for scalable data storage
  • Drizzle ORM for type-safe database operations
  • Clerk for authentication and user management
  • GitHub OAuth for secure repository access

AI Pipeline (Multi-Model Approach)

  1. NVIDIA Nemotron (nvidia/nemotron-3-super-120b-a12b) — Scans repo files and generates structured test cases
  2. Google Gemini (gemini-3.1-flash-lite) — Generates executable Playwright scripts with resilient selectors
  3. NVIDIA Vision (nvidia/nemotron-3-nano-omni-30b-a3b-reasoning) — Analyzes failure screenshots and suggests fixes
  4. Speechmatics Real-Time API — Transcribes voice commands with natural language parsing

Automation & Execution

  • Browserbase SDK — Cloud browser infrastructure for running tests at scale
  • Playwright Core — Browser automation engine for executing scripts
  • HLS.js — Session replay video player for watching test runs

Payments & Credits

  • Stripe — Checkout sessions and webhook handling (foundation for monetization)
  • Custom Credit System — Deducts credits for test generation (200) and execution (70-100)

Deployment

  • Vercel — Continuous deployment with edge runtime support

Development Workflow

  • Built API routes for GitHub integration, test generation, execution, and voice commands
  • Designed a credit-based usage model to control AI costs
  • Implemented real-time voice transcription with command parsing
  • Created a multi-step user journey from onboarding to test execution

🚧 Challenges we ran into

Building Scriptless.ai came with several tough challenges:

1. AI Model Coordination

Orchestrating four different AI services (NVIDIA Nemotron for text, NVIDIA vision, Google Gemini, and Speechmatics) was complex. Each has its own API interface, rate limits, and response format, so we had to build robust error handling and fallback chains to keep the pipeline reliable.
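As a rough illustration of the fallback-chain idea (a minimal sketch, not the project's actual code), each model can be wrapped behind one shared signature and tried in order. The `Provider` type and `withFallback` function are hypothetical names introduced here for illustration.

```typescript
// Hypothetical provider shape: each entry wraps one model API behind the same signature.
type Provider<T> = {
  name: string;
  call: (prompt: string) => Promise<T>;
};

// Try each provider in order and return the first success.
// If every provider fails, throw one error that lists all failures.
async function withFallback<T>(
  providers: Provider<T>[],
  prompt: string,
): Promise<{ provider: string; result: T }> {
  const errors: string[] = [];
  for (const provider of providers) {
    try {
      const result = await provider.call(prompt);
      return { provider: provider.name, result };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      errors.push(`${provider.name}: ${message}`);
    }
  }
  throw new Error(`All providers failed: ${errors.join("; ")}`);
}
```

Returning the provider name alongside the result makes it easy to log which model actually served a request.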

2. GitHub OAuth + Clerk Integration

Integrating GitHub OAuth while using Clerk for authentication required careful token management. We stored GitHub tokens in HTTP-only cookies, so client-side scripts (including any injected through XSS) cannot read them, while our server-side API routes can still use them for GitHub calls.
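For readers unfamiliar with the cookie attributes involved, here is a small sketch of how such a cookie header could be assembled. The function name, the cookie name, and the choice of `SameSite=Lax` are assumptions made for illustration; the writeup only states that the cookie is HTTP-only.

```typescript
// Build a Set-Cookie header value for a token that browser JavaScript must not read.
// HttpOnly hides the cookie from document.cookie; Secure restricts it to HTTPS.
function buildTokenCookie(name: string, token: string, maxAgeSeconds: number): string {
  return [
    `${name}=${encodeURIComponent(token)}`,
    "HttpOnly",
    "Secure",
    "SameSite=Lax", // assumption: the writeup does not say which SameSite policy was used
    "Path=/",
    `Max-Age=${maxAgeSeconds}`,
  ].join("; ");
}

// Example with a hypothetical cookie name:
// buildTokenCookie("gh_token", "abc 123", 3600)
// → "gh_token=abc%20123; HttpOnly; Secure; SameSite=Lax; Path=/; Max-Age=3600"
```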

3. Playwright Script Generation Reliability

Getting AI to generate working Playwright scripts was challenging. Early scripts used brittle selectors that broke easily. We solved this by:

  • Providing detailed prompt engineering with examples
  • Creating helper functions (firstVisibleLocator, resilientClick) that the AI could use
  • Implementing a caching system so scripts improve over time
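The helper names above come from our codebase, but the signatures below are assumptions: this is a simplified sketch of how helpers like these could work, written against minimal `PageLike` and `LocatorLike` interfaces (hypothetical types that mirror the `locator`, `isVisible`, and `click` methods found on Playwright's page and locator objects) so the example stays self-contained.

```typescript
// Minimal structural types, introduced here so the sketch runs without Playwright installed.
interface LocatorLike {
  isVisible(): Promise<boolean>;
  click(): Promise<void>;
}
interface PageLike {
  locator(selector: string): LocatorLike;
}

// Return the first candidate selector that currently resolves to a visible element.
async function firstVisibleLocator(
  page: PageLike,
  selectors: string[],
): Promise<LocatorLike | null> {
  for (const selector of selectors) {
    const candidate = page.locator(selector);
    if (await candidate.isVisible()) return candidate;
  }
  return null;
}

// Click the first visible candidate, re-resolving the selectors on each attempt.
async function resilientClick(
  page: PageLike,
  selectors: string[],
  attempts = 3,
): Promise<void> {
  let lastError: unknown = new Error(`No visible element for: ${selectors.join(", ")}`);
  for (let i = 0; i < attempts; i++) {
    const target = await firstVisibleLocator(page, selectors);
    if (!target) continue;
    try {
      await target.click();
      return;
    } catch (err) {
      lastError = err; // keep the most recent failure for the final error
    }
  }
  throw lastError;
}
```

Passing several selectors per action lets a generated script fall back from a preferred selector (such as a test ID) to a looser one when the markup changes.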

4. Browserbase Session Management

Managing cloud browser sessions, handling timeouts, and capturing session recordings required careful orchestration. We had to implement retry logic and proper cleanup to avoid leaking resources.
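The retry-and-cleanup pattern can be sketched roughly as follows. `Session`, `withSession`, and the default of three attempts are hypothetical, and the real Browserbase SDK calls are deliberately left out; the point is only that `finally` closes every session, whether the task succeeds or throws.

```typescript
// Hypothetical session shape: anything with an id that can be closed.
interface Session {
  id: string;
  close(): Promise<void>;
}

// Run a task inside a fresh session, retrying on failure.
// Each attempt opens its own session and always closes it, so nothing leaks.
async function withSession<T>(
  open: () => Promise<Session>,
  task: (session: Session) => Promise<T>,
  maxAttempts = 3,
): Promise<T> {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const session = await open();
    try {
      return await task(session);
    } catch (err) {
      lastError = err;
    } finally {
      await session.close(); // runs on success and on failure
    }
  }
  throw lastError;
}
```

A production version would likely add a delay between attempts and distinguish retryable errors (timeouts) from permanent ones; both are omitted here for brevity.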

5. Voice Command Parsing

Converting raw Speechmatics transcripts into actionable commands was tricky. We built a custom NLP parser with:

  • Stopword removal
  • Word stemming
  • Fuzzy intent matching
  • Handling natural language variations ("run tests" vs "execute the test cases")
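The steps above can be sketched as a small pipeline. Everything here is illustrative: the stopword list, the intent names, the crude suffix-stripping `stem` function, and the edit-distance threshold are assumptions standing in for the real parser.

```typescript
// Hypothetical vocabulary; the real parser's word lists are not shown in this writeup.
const STOPWORDS = new Set(["the", "a", "an", "all", "my", "me", "please"]);

// An intent matches when every group has at least one keyword present in the transcript.
type Intent = { name: string; groups: string[][] };
const INTENTS: Intent[] = [
  { name: "run_tests", groups: [["run", "execute", "start"], ["test", "case"]] },
  { name: "show_results", groups: [["show", "open", "display"], ["result", "report"]] },
];

// Crude suffix stripper standing in for a real stemmer.
function stem(word: string): string {
  return word.length > 3 ? word.replace(/(ing|ed|s)$/, "") : word;
}

// Levenshtein edit distance using a single rolling row.
function editDistance(a: string, b: string): number {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0];
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j];
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diagonal + cost);
      diagonal = above;
    }
  }
  return row[b.length];
}

// Exact match for short keywords; allow one edit for keywords of four or more letters.
function fuzzyEquals(token: string, keyword: string): boolean {
  return token === keyword || (keyword.length >= 4 && editDistance(token, keyword) <= 1);
}

// Normalize, drop stopwords, stem, then return the first intent whose groups all match.
function parseCommand(transcript: string): string | null {
  const tokens = transcript
    .toLowerCase()
    .replace(/[^a-z\s]/g, " ")
    .split(/\s+/)
    .filter((word) => word.length > 0 && !STOPWORDS.has(word))
    .map(stem);
  const match = INTENTS.find((intent) =>
    intent.groups.every((group) =>
      group.some((keyword) => tokens.some((token) => fuzzyEquals(token, stem(keyword)))),
    ),
  );
  return match ? match.name : null;
}
```

With these definitions, both "run tests" and "Execute the test cases" resolve to the same `run_tests` intent, and a slightly mistranscribed "exicute tests" still matches because "execute" tolerates one edit.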

6. Real-Time Credit Tracking

Ensuring credit deductions happened atomically and accurately across concurrent requests required careful database transaction management with Drizzle ORM.
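One common way to make a deduction atomic in Postgres is a single conditional `UPDATE` that only succeeds when the balance is sufficient. The sketch below illustrates that technique rather than our exact Drizzle code: the `users` table, the `credits` column, and the `Query` executor type are all hypothetical.

```typescript
// Hypothetical executor: runs parameterized SQL and reports how many rows changed.
type Query = (sql: string, params: unknown[]) => Promise<{ rowCount: number }>;

// The balance check and the deduction happen in one statement, so two concurrent
// requests cannot both spend the same credits. Table and column names are assumed.
const DEDUCT_SQL =
  "UPDATE users SET credits = credits - $1 WHERE id = $2 AND credits >= $1";

// Returns true when the credits were deducted, false when the balance was too low.
async function deductCredits(query: Query, userId: string, cost: number): Promise<boolean> {
  const { rowCount } = await query(DEDUCT_SQL, [cost, userId]);
  return rowCount === 1;
}
```

A caller would check the boolean before starting the paid operation, for example refusing a 200-credit test generation when `deductCredits` returns `false`.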

7. File Context Extraction

Deciding which files to send to the AI for test generation was challenging. We implemented filtering that prioritizes routes, components, and pages while respecting a 5,000-character limit per file.
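A simplified sketch of this kind of filtering is shown below. The path patterns, the `maxFiles` cap, and the decision to truncate (rather than skip) oversized files are assumptions made for illustration; only the priority order and the 5,000-character limit come from the description above.

```typescript
const MAX_CHARS_PER_FILE = 5000; // per-file limit mentioned above

type RepoFile = { path: string; content: string };

// Hypothetical path patterns, listed in priority order: routes, components, pages.
const PRIORITY_PATTERNS: RegExp[] = [
  /(^|\/)(routes?|api)(\/|\.)/,
  /(^|\/)components\//,
  /(^|\/)(pages\/|page\.)/,
];

// Keep only files matching a pattern, order them by priority, cap the count,
// and truncate each file's content to the per-file limit.
function selectContext(files: RepoFile[], maxFiles = 20): RepoFile[] {
  return files
    .map((file) => ({
      file,
      rank: PRIORITY_PATTERNS.findIndex((pattern) => pattern.test(file.path)),
    }))
    .filter((entry) => entry.rank !== -1)
    .sort((a, b) => a.rank - b.rank)
    .slice(0, maxFiles)
    .map(({ file }) => ({
      path: file.path,
      content: file.content.slice(0, MAX_CHARS_PER_FILE),
    }));
}
```

Files that match none of the patterns (a README, for example) are dropped entirely in this sketch, which keeps the prompt focused on code that drives user-visible behavior.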


🏆 Accomplishments that we're proud of

1. End-to-End AI Test Automation

We built a complete pipeline from code scanning to test execution to failure analysis, all powered by AI. We are not aware of another tool that combines all of these capabilities in a single workflow.

2. Multi-Model AI Architecture

We successfully orchestrated four different AI models, each chosen for its specific task. This modular approach gives us flexibility and keeps costs manageable.

3. Voice-Controlled Testing

We're especially proud of the voice command feature. It's not just a gimmick — it genuinely makes testing more accessible and hands-free, which is perfect for demos and exploratory testing.

4. Resilient Script Generation

Our Playwright scripts include helper functions that make them more robust than typical AI-generated code. The firstVisibleLocator and resilientClick patterns significantly reduce flakiness.

5. AI Vision Failure Analysis

Vision models diagnose failures automatically from screenshots of the failed run. Instead of staring at cryptic error logs, users get plain-English explanations of what went wrong.

6. Production-Ready Infrastructure

We didn't just build a demo — we built a scalable, credit-based SaaS platform with:

  • User authentication
  • GitHub integration
  • Payment processing (Stripe foundation)
  • Database-backed state management
  • Cloud browser execution at scale

7. Comprehensive Documentation

We created detailed documentation (README, HACKATHON.md) that explains every aspect of the system, making it easy for developers to understand and contribute.


📚 What we learned

Technical Learnings

  1. Multi-Model AI Orchestration

    • How to design fallback chains for reliability
    • When to use specialized models vs. general-purpose ones
    • Cost optimization through strategic model selection
  2. Prompt Engineering for Code Generation

    • The importance of providing examples and patterns
    • How to structure prompts for consistent JSON output
    • Iterative refinement based on real-world failures
  3. Browser Automation at Scale

    • Managing cloud browser sessions with Browserbase
    • Handling timeouts, retries, and resource cleanup
    • Capturing and serving session recordings
  4. Real-Time Voice Processing

    • Streaming audio to transcription APIs
    • Building NLP parsers for command recognition
    • Handling microphone permissions and audio formats
  5. Credit-Based Monetization

    • Designing fair pricing models for AI-powered features
    • Atomic credit deduction with database transactions
    • Balancing free tier generosity with sustainability

Product Learnings

  1. User Onboarding is Critical

    • Clear empty states guide users through the first steps
    • Starter credits lower the barrier to trying the product
    • GitHub OAuth makes setup nearly frictionless
  2. AI Transparency Builds Trust

    • Showing generated scripts builds confidence
    • Explaining failures helps users understand AI limitations
    • Credit visibility prevents surprise costs
  3. Accessibility Matters

    • Voice commands aren't just cool — they're genuinely useful
    • Status indicators help users understand system state
    • Clear error messages prevent frustration

Team & Process Learnings

  1. Start with the Core Loop

    • We prioritized the connect → generate → run → analyze flow
    • Polish came after we validated the core value proposition
  2. Iterate Based on Real Usage

    • Early test scripts were too brittle — we refined the prompt
    • Users wanted custom instructions — we added per-run overrides
  3. Documentation is a Force Multiplier

    • Good docs make the project accessible to contributors
    • Clear architecture decisions prevent future confusion

🚀 What's next for Scriptless.ai

We have an ambitious roadmap to make Scriptless.ai the go-to testing platform for modern web applications:

Short-Term (Next 3 Months)

  1. CI/CD Integration

    • GitHub Actions workflow for auto-running tests on every PR
    • GitLab CI and Jenkins support
    • Commit status checks with pass/fail results
  2. Improved Test Intelligence

    • Automatic selector healing when UI changes
    • Flaky test detection and retry logic
    • Learning from past runs to improve accuracy
  3. Better Analytics

    • Historical pass rate trends
    • Test execution time tracking
    • Credit usage breakdown dashboard
  4. Multi-Browser Support

    • Chrome, Firefox, Safari, Edge testing matrix
    • Mobile browser emulation (iOS, Android)
    • Parallel execution across browsers

Medium-Term (6-12 Months)

  1. Team Collaboration

    • Shared workspaces with role-based permissions
    • Comments and annotations on test cases
    • Approval workflows for test changes
  2. Advanced AI Features

    • Natural language test creation: "Test the login flow" → generates test case
    • Auto-fix failed tests with AI-suggested patches
    • Visual regression testing with pixel-diff comparison
  3. Expanded Testing Capabilities

    • API testing (REST and GraphQL endpoints)
    • Performance testing (load times, resource usage)
    • Accessibility testing (WCAG compliance checks)
  4. Enhanced Voice Control

    • Multilingual support (Spanish, French, German, etc.)
    • Chained commands: "Run failed tests then show me the results"
    • Custom voice shortcuts

Long-Term (1-2 Years)

  1. Enterprise Features

    • SSO support (SAML, OAuth)
    • On-premise deployment options
    • Custom AI model training on organization codebases
    • SOC 2 compliance and audit logs
  2. Integration Marketplace

    • Connect to Jira, Linear, Notion, Asana
    • Slack notifications for test results
    • Webhook integrations for custom workflows
  3. Test Coverage Insights

    • Identify untested routes and components
    • Suggest new test cases based on code changes
    • Coverage reports with visualization
  4. AI-Powered Test Maintenance

    • Automatically update tests when UI changes
    • Detect duplicate or redundant tests
    • Suggest test consolidation opportunities

Monetization & Growth

  • Launch paid tiers (Pro, Team, Enterprise) with Stripe integration
  • Partner with agencies to offer managed testing services
  • Build a community around AI-powered testing best practices
  • Open-source core components to drive adoption

🎯 Our Vision

We envision a future where testing is no longer a bottleneck. Where developers ship with confidence, knowing that AI has their back. Where non-technical team members can validate product behavior without learning to code. Where failed tests aren't cryptic errors but clear, actionable insights.

Scriptless.ai is the first step toward that future.

