Inspiration

Automation is key to productivity, but we noticed a significant gap. Powerful automation tools exist, yet they often come with steep learning curves, requiring users to learn complex scripting languages or navigate intricate interfaces. We were inspired by the simplicity of natural language and the power of modern AI, and asked ourselves: "What if anyone could automate their digital tasks just by describing them in plain English?" That question led to GeniusQA, a tool designed to make automation accessible to everyone, from casual users to power users, by turning simple conversations into executable actions.

What it does

GeniusQA is a lightweight desktop application for Windows and macOS that acts as your personal automation assistant. Through a minimalist GUI, users can instruct a Large Language Model (LLM) to perform tasks on their computer.

Here’s what it can do:

  • Control Mouse and Keyboard: It can move the mouse, click, scroll, type text, and press keyboard shortcuts, automating repetitive actions across any application.
  • Natural Language to Script: Users simply type a command like, "Open Chrome, go to google.com, and search for the latest news." GeniusQA translates this into an executable script.
  • Record and Capture: The application includes built-in tools to record a video of the automation process or take screenshots at specific steps, making it easy to document or share workflows.
  • Simple Interface: It features a small, always-on-top window with a chat interface and a clean toolbar containing just three buttons: New, Save Video, and Screen shots.

How we built it

GeniusQA is built on a modern, multi-process architecture to ensure both a responsive user experience and powerful backend functionality.

  • Frontend (UI): We used React Native for Windows and macOS to create a native, high-performance, and visually consistent user interface. This choice allows us to maintain a single codebase for both platforms while avoiding the overhead of web-based frameworks like Electron.
  • Automation Core: The heart of the automation is a Python process. We leveraged the robust pyautogui library to handle all mouse and keyboard interactions. This core runs in the background, listening for commands.
  • Backend Server: A Node.js server acts as the central nervous system. It manages user requests from the React Native client and communicates with the LLM API (such as OpenAI's GPT or Google's Gemini) to translate natural language into scripts. It is also planned to handle user data.
  • Database: We chose Firebase for its real-time capabilities and ease of setup, allowing us to quickly implement features like chat history synchronization. MySQL is planned for future, more structured data needs.
  • Inter-Process Communication (IPC): To connect the React Native frontend with the Python core, we established an IPC channel, enabling the UI to send executable commands to the automation engine seamlessly.
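As a rough illustration of the Automation Core described above, the sketch below dispatches a list of steps to pyautogui. The step format (`action`, `x`, `y`, `amount`, `text`, `keys`) and the function names are assumptions made for this example, not GeniusQA's actual schema. Only the pyautogui calls themselves (`moveTo`, `click`, `scroll`, `write`, `hotkey`) are real library functions.

```python
from typing import Any, Callable, Dict, List


def load_pyautogui_backend() -> Any:
    """Import pyautogui lazily so the dispatcher can be tested without a display."""
    import pyautogui  # third-party: pip install pyautogui
    return pyautogui


def build_handlers(gui: Any) -> Dict[str, Callable[[Dict[str, Any]], None]]:
    """Map hypothetical action names to calls on a pyautogui-like object."""
    return {
        "move": lambda s: gui.moveTo(s["x"], s["y"]),
        "click": lambda s: gui.click(s["x"], s["y"]),
        "scroll": lambda s: gui.scroll(s["amount"]),
        "type": lambda s: gui.write(s["text"]),
        "hotkey": lambda s: gui.hotkey(*s["keys"]),
    }


def run_script(steps: List[Dict[str, Any]], gui: Any) -> int:
    """Execute steps in order; raise on an unknown action. Returns the step count."""
    handlers = build_handlers(gui)
    for index, step in enumerate(steps):
        action = step.get("action")
        if action not in handlers:
            raise ValueError(f"step {index}: unknown action {action!r}")
        handlers[action](step)
    return len(steps)
```

Passing the GUI object in, rather than importing pyautogui at module level, keeps the dispatcher testable on machines without a display. In the real process one would call `run_script(steps, load_pyautogui_backend())`.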

Challenges we ran into

One of the biggest challenges was designing a "prompt engineering" strategy for the LLM. It was crucial that the AI not only understood the user's intent but also consistently returned a machine-readable script (e.g., a JSON object or a Python code snippet) that our automation core could execute without errors. This required extensive testing and refinement of the instructions we send to the AI.
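One way to enforce a machine-readable reply is to validate it before anything reaches the automation core. The sketch below assumes the LLM is asked for a JSON array of steps. The action names and required fields in `REQUIRED_FIELDS` are hypothetical and only illustrate the idea of a strict contract.

```python
import json
from typing import Any, Dict, List

FENCE = "`" * 3  # Markdown code fence that models often wrap JSON in

# Hypothetical contract: action name -> required fields and their types.
REQUIRED_FIELDS: Dict[str, Dict[str, type]] = {
    "click": {"x": int, "y": int},
    "type": {"text": str},
    "hotkey": {"keys": list},
}


def strip_fence(reply: str) -> str:
    """Remove a surrounding Markdown fence, if the reply has one."""
    text = reply.strip()
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]  # drop the opening fence line
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # drop the closing fence line
        text = "\n".join(lines)
    return text


def parse_llm_script(reply: str) -> List[Dict[str, Any]]:
    """Return the validated list of steps, or raise ValueError."""
    try:
        data = json.loads(strip_fence(reply))
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, list):
        raise ValueError("script must be a JSON array of steps")
    for index, step in enumerate(data):
        if not isinstance(step, dict):
            raise ValueError(f"step {index}: must be a JSON object")
        action = step.get("action")
        if not isinstance(action, str) or action not in REQUIRED_FIELDS:
            raise ValueError(f"step {index}: unknown action {action!r}")
        for name, expected in REQUIRED_FIELDS[action].items():
            if not isinstance(step.get(name), expected):
                raise ValueError(
                    f"step {index}: field {name!r} must be {expected.__name__}"
                )
    return data
```

Rejecting a malformed reply at this boundary means a bad generation can be retried or reported to the user instead of producing stray clicks and keystrokes.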

Another hurdle was creating a reliable IPC bridge between the JavaScript-based React Native environment and the Python process, especially in a way that was cross-platform compatible and performant.
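A common cross-platform approach to this kind of bridge is newline-delimited JSON over the child process's standard input and output. The sketch below shows what the Python side of such a channel could look like. The message shape (`id`, `ok`, `result`, `error`) is an assumption for illustration, and the source does not say which IPC mechanism GeniusQA actually uses.

```python
import json
from typing import Any, Callable, Dict, TextIO


def serve(
    reader: TextIO,
    writer: TextIO,
    handle: Callable[[Dict[str, Any]], Any],
) -> None:
    """Read one JSON request per line and write one JSON response per line."""
    for line in reader:
        line = line.strip()
        if not line:
            continue  # ignore blank lines between messages
        request_id = None
        try:
            request = json.loads(line)
            request_id = request.get("id")
            response = {"id": request_id, "ok": True, "result": handle(request)}
        except Exception as exc:
            # Report the failure to the caller instead of ending the loop.
            response = {"id": request_id, "ok": False, "error": str(exc)}
        writer.write(json.dumps(response) + "\n")
        writer.flush()  # the UI process should see each reply immediately
```

In a real process this would run as `serve(sys.stdin, sys.stdout, handle)`. Echoing the request `id` in each response lets the UI match replies to pending requests, and flushing after every line avoids replies sitting in a buffer.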

Accomplishments that we're proud of

We are incredibly proud of creating a fully functional prototype that validates our core concept: turning natural language into desktop automation. The current version successfully translates user commands into precise mouse and keyboard actions.

We are also proud of the minimalist and intuitive design. By stripping away all non-essential elements, we've created an experience that is truly "plug-and-play," requiring virtually no learning curve. The successful integration of React Native for Windows and macOS with a Python backend is a significant technical achievement for our team.

What we learned

This project reinforced the power of a modular architecture. By separating the UI, the AI logic, and the automation engine, we were able to develop, test, and debug each component independently. We also learned a great deal about the nuances of prompt engineering and the importance of defining a strict data contract between the LLM and our application code. Furthermore, we gained valuable experience in cross-platform desktop development and the intricacies of inter-process communication.

What's next for GeniusQA

We have a clear roadmap for GeniusQA:

  • Enhanced AI Capabilities: We plan to fine-tune a model to better understand context, handle more complex, multi-step commands, and even learn from user corrections.
  • Introducing "Vision": We will integrate computer vision capabilities, allowing GeniusQA to not just operate on coordinates but to "see" and interact with UI elements like buttons and text fields by name (e.g., "Click the 'Submit' button").
  • Community Script Library: We envision a platform where users can save, share, and download automation scripts, creating a collaborative ecosystem of workflows.
  • Expanding Integrations: We aim to add native integrations with popular apps and services through APIs, combining UI automation with direct data manipulation for even more powerful workflows.

Updates

posted an update

Task 14 Implementation Summary

I have successfully implemented Task 14: Create comprehensive integration tests for the AI Test Case Generator. Here's what was accomplished:

Integration Tests Created

Complete Requirements Processing Workflow Test

  • Tests the end-to-end workflow from requirements input to test case generation
  • Validates input processing, configuration management, and monitoring integration
  • Verifies generation options and performance tracking

Complete Action Log Documentation Workflow Test

  • Tests conversion of recorded actions to human-readable documentation
  • Validates action format compatibility and metadata preservation
  • Ensures integration with existing GeniusQA systems

Error Handling Across System Boundaries Test

  • Tests error scenarios across different system components
  • Validates error logging and monitoring integration
  • Ensures graceful handling of configuration and validation errors

Concurrent Operations and Performance Test

  • Tests system performance under concurrent load
  • Validates that operations don't block each other
  • Ensures proper async operation handling and timeout management

System Integration with Existing GeniusQA Components Test

  • Tests compatibility with Desktop Recorder integration
  • Validates test case management workflow integration
  • Ensures AI Script Builder pattern compatibility

End-to-End Workflow with Timeout Handling Test

  • Tests complete workflows with proper timeout enforcement
  • Validates error handling and performance monitoring
  • Ensures operations complete within expected timeframes

Property-Based Integration Workflow Consistency Test

  • Uses property-based testing to verify workflow consistency across different inputs
  • Tests with various complexity levels, project types, and generation options
  • Validates monitoring and error handling consistency

Key Features of the Integration Tests

  • Comprehensive Coverage: Tests cover all major workflows and system boundaries
  • Real System Integration: Tests use actual service instances and configuration managers
  • Error Scenario Testing: Validates proper error handling and recovery mechanisms
  • Performance Validation: Ensures operations complete within reasonable timeframes
  • Concurrent Operation Testing: Verifies system behavior under concurrent load
  • Property-Based Testing: Uses randomized inputs to test workflow consistency

Test Results

All 9 integration tests are now passing successfully:

  • ✅ Integration test helper creation
  • ✅ Complete requirements workflow
  • ✅ Complete action log workflow
  • ✅ Error handling across boundaries
  • ✅ Concurrent operations performance
  • ✅ System integration with GeniusQA
  • ✅ End-to-end workflow with timeouts
  • ✅ Property-based workflow consistency
  • ✅ Setup test environment

The integration tests provide comprehensive validation of the AI Test Case Generator's functionality and ensure it integrates properly with the existing GeniusQA system architecture.
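To make the concurrency and timeout checks above more concrete, here is a small Python sketch of the general pattern: run several operations at once, give each its own deadline, and confirm that a slow one does not hold up the others. It is an illustration only. The actual tests belong to the Rust-based service, and the names `run_with_timeout` and `run_all` are invented for this example.

```python
import asyncio
from typing import Awaitable, Callable, List

Operation = Callable[[], Awaitable[None]]


async def run_with_timeout(make_op: Operation, timeout_s: float) -> str:
    """Run one operation and report whether it met its deadline."""
    try:
        await asyncio.wait_for(make_op(), timeout=timeout_s)
        return "ok"
    except asyncio.TimeoutError:
        return "timeout"


async def run_all(ops: List[Operation], timeout_s: float) -> List[str]:
    """Run all operations concurrently, each with its own timeout."""
    results = await asyncio.gather(
        *(run_with_timeout(op, timeout_s) for op in ops)
    )
    return list(results)
```

Because the operations run concurrently, the total wall-clock time is bounded by roughly one timeout rather than the sum of all operation durations, which is the property a "don't block each other" test would assert.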

posted an update

The AI Test Case Generator is a Rust-based backend service integrated into GeniusQA's Tauri desktop application. It uses the Google Gemini API to automatically generate comprehensive test case documentation. The system provides two primary workflows: converting natural language requirements into structured test cases, and transforming recorded automation logs into human-readable test documentation.

The design follows GeniusQA's existing architecture patterns, utilizing Rust for type-safe backend processing, Tauri for frontend-backend communication, and React for the user interface. The system integrates seamlessly with existing AI services while providing enhanced documentation capabilities specifically focused on test case generation rather than automation script creation.
