Inspiration

I was inspired by the reliability and precision of industrial software systems and by the possibility of extending them with intelligent capabilities without altering their core functionality. Recent progress in AI agents and large language models made it clear that natural language could become a practical interface for automation. Kiro also influenced the direction of the project. After seeing how well it supports organized, testable development for complex engineering work, I made it the backbone of the build.

What it does

AEGIS is an intent-driven robotic process automation (RPA) system that interprets natural language instructions and executes desktop automation tasks. Its backend uses an LLM to analyze user intent, and a desktop agent layer performs the resulting operations on existing software. Users issue commands and monitor automation runs through a Flutter desktop interface.

How we built it

AEGIS consists of two main components:

Backend: aegis-back

  • Built with FastAPI as the central orchestration service
  • Uses the Google Agent Development Kit (ADK) with Gemini to interpret natural language and plan actions
  • Relies on PyAutoGUI and pywinauto for deterministic desktop automation
  • Contains an internal workflow layer that converts interpreted intent into GUI-level actions
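
To illustrate the last point, here is a minimal sketch of how a workflow layer might turn an interpreted intent into GUI-level actions. It is not the actual aegis-back code: the names `Intent`, `GuiAction`, `RecordingDriver`, `PLANNERS`, and the `save_as` intent are all hypothetical, and the keyboard shortcut is only an example. The driver records actions instead of touching the desktop; a real driver would dispatch to PyAutoGUI or pywinauto at that point.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Protocol


@dataclass(frozen=True)
class Intent:
    """Structured result of the interpretation step (hypothetical shape)."""
    name: str
    params: Dict[str, str]


@dataclass(frozen=True)
class GuiAction:
    """One GUI-level step."""
    kind: str   # "hotkey", "type", or "press"
    value: str


class Driver(Protocol):
    def perform(self, action: GuiAction) -> None: ...


class RecordingDriver:
    """Stand-in driver: records actions rather than driving the desktop.

    Assumption: a real driver would call PyAutoGUI or pywinauto here.
    """

    def __init__(self) -> None:
        self.log: List[GuiAction] = []

    def perform(self, action: GuiAction) -> None:
        self.log.append(action)


def plan_save_as(params: Dict[str, str]) -> List[GuiAction]:
    # Fixed, deterministic sequence for one known intent (example shortcut).
    return [
        GuiAction("hotkey", "ctrl+shift+s"),
        GuiAction("type", params["filename"]),
        GuiAction("press", "enter"),
    ]


# Registry of supported intents; anything else is rejected.
PLANNERS: Dict[str, Callable[[Dict[str, str]], List[GuiAction]]] = {
    "save_as": plan_save_as,
}


def to_actions(intent: Intent) -> List[GuiAction]:
    planner = PLANNERS.get(intent.name)
    if planner is None:
        raise ValueError(f"unsupported intent: {intent.name}")
    return planner(intent.params)


def execute(intent: Intent, driver: Driver) -> None:
    for action in to_actions(intent):
        driver.perform(action)
```

Keeping the planner table separate from the driver means the LLM only ever selects among known intents, while the GUI steps themselves stay fixed and reviewable.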

Frontend: aegis-front

  • Implemented using Flutter for desktop
  • Provides a simple and intuitive interface for issuing commands and observing execution states in real time
  • Communicates with the backend through a clean API layer
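
As a rough sketch of what "execution states" could look like on the wire, the snippet below models a task with a small set of states and the payload a frontend might poll or receive. The state names, the `Task` class, and the payload fields are assumptions for illustration, not the actual AEGIS API.

```python
from enum import Enum
from typing import Dict, Set


class TaskState(str, Enum):
    QUEUED = "queued"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


# Allowed transitions; terminal states have no successors.
ALLOWED: Dict[TaskState, Set[TaskState]] = {
    TaskState.QUEUED: {TaskState.RUNNING, TaskState.FAILED},
    TaskState.RUNNING: {TaskState.SUCCEEDED, TaskState.FAILED},
    TaskState.SUCCEEDED: set(),
    TaskState.FAILED: set(),
}


class Task:
    def __init__(self, task_id: str, command: str) -> None:
        self.task_id = task_id
        self.command = command
        self.state = TaskState.QUEUED

    def advance(self, new_state: TaskState) -> None:
        """Move to new_state, rejecting transitions the table does not allow."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"cannot go from {self.state.value} to {new_state.value}"
            )
        self.state = new_state

    def to_payload(self) -> dict:
        """JSON-serializable view of the task for the frontend."""
        return {
            "id": self.task_id,
            "command": self.command,
            "state": self.state.value,
        }
```

An explicit transition table keeps the interface from ever showing an impossible sequence, such as a finished task returning to "running".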

Kiro was used throughout planning and development to manage the workspace, maintain structure, and streamline debugging.

Challenges we ran into

  • Hook execution within Kiro often blocked new processes until each hook completed, which slowed development cycles
  • Coordinating multiple layers of the system (intent parsing, backend logic, RPA execution) required careful architectural planning
  • Desktop automation introduced challenges related to screen state, application focus, and repeatability
  • Ensuring that natural language commands produced consistent, reliable actions required ongoing refinement
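
One common way to address the screen-state and repeatability problems above is to verify the expected state after each action and retry if it has not been reached. The helper below is a generic sketch of that pattern, not code from AEGIS; `run_with_verification` and its parameters are hypothetical, and the `verify` callback stands in for whatever check fits the application (a window title, a focused control, and so on).

```python
import time
from typing import Callable


def run_with_verification(
    action: Callable[[], None],
    verify: Callable[[], bool],
    attempts: int = 3,
    delay_s: float = 0.5,
) -> int:
    """Run action until verify() returns True.

    Returns the number of attempts used, or raises RuntimeError if the
    expected state is never observed.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        action()
        if verify():
            return attempt
        if attempt < attempts:
            # Give the UI time to settle before trying again.
            time.sleep(delay_s)
    raise RuntimeError(f"state not reached after {attempts} attempts")
```

Note that retrying is only safe for actions that can be repeated without side effects; a step such as "submit form" would need a verification check before the action as well.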

Accomplishments that we're proud of

  • Successfully integrated LLM-powered intent interpretation with traditional RPA techniques
  • Built a stable, working prototype capable of automating real desktop applications through natural language
  • Established a clean modular structure that separates interpretation, orchestration, and execution
  • Demonstrated that AI-driven RPA can extend existing software without altering or instrumenting it

What we learned

  • How to combine LLM-driven reasoning with deterministic automation in a safe and controlled way
  • The importance of isolating the interpretation layer from the execution layer
  • Practical lessons on orchestrating agents using ADK and Gemini
  • Techniques for improving reliability in GUI automation workflows
  • How to design an automation system that remains predictable even when driven by natural language
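
A simple way to keep the interpretation layer isolated from the execution layer is to treat the model's output as untrusted data and validate it against an allowlist before anything runs. The sketch below assumes the plan arrives as a JSON array of steps; the action names and required fields in `ALLOWED_ACTIONS` are made up for illustration and are not the AEGIS schema.

```python
import json
from typing import Any, Dict, List, Set

# Hypothetical allowlist: action name -> fields that step must carry.
ALLOWED_ACTIONS: Dict[str, Set[str]] = {
    "click": {"target"},
    "type": {"text"},
    "hotkey": {"keys"},
}


def validate_plan(raw: str) -> List[Dict[str, Any]]:
    """Parse an LLM-produced plan and reject anything outside the allowlist."""
    try:
        plan = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"plan is not valid JSON: {exc}") from exc
    if not isinstance(plan, list):
        raise ValueError("plan must be a JSON array of steps")
    for i, step in enumerate(plan):
        if not isinstance(step, dict):
            raise ValueError(f"step {i} is not an object")
        action = step.get("action")
        if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
            raise ValueError(f"step {i}: action {action!r} is not allowed")
        fields = set(step) - {"action"}
        if fields != ALLOWED_ACTIONS[action]:
            raise ValueError(f"step {i}: unexpected or missing fields")
    return plan
```

With a gate like this, the executor only ever sees steps it already knows how to perform, which is what keeps behavior predictable even when the input is free-form language.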

What's next for AEGIS

  • Expanding AEGIS into a multi-agent architecture with parallel task execution
  • Adding self-correction capabilities where the agent adjusts actions based on visual or state feedback
  • Improving robustness by incorporating computer vision for screen understanding
  • Extending the Flutter app into a full command center with task templates, logs, and replay features
  • Exploring secure enterprise integrations to connect AEGIS with real industrial workflows
