AEGIS

Intro page with skip.
Intro page scrolled.
Landing page awaiting user instructions in natural language.
Task in progress page.
Tasks execution history.
Task tracking and subtask logging.
Swagger documentation for the API.

Inspiration

I was inspired by the reliability and precision of industrial software systems and by the possibility of extending them with intelligent capabilities without altering their core functionality. Recent progress in AI agents and large language models made it clear that natural language could become a practical interface for automation. Kiro also influenced the direction of the project: after I saw how well it supports organized, testable development for complex engineering work, it became the backbone of the build.

What it does

AEGIS is a cognitive, intent-driven RPA system that interprets natural language instructions and executes desktop automation tasks. It uses an intelligent backend to analyze user intent and a desktop agent layer to perform operations on existing software. The system allows users to command and monitor automation processes through a Flutter desktop interface.

How we built it

AEGIS consists of two main components:

Backend: aegis-back

Built with FastAPI as the central orchestration service
Uses the Google Agent Development Kit (ADK) with Gemini to interpret natural language and plan actions
Relies on PyAutoGUI and pywinauto for deterministic desktop automation
Contains an internal workflow layer that converts interpreted intent into GUI-level actions
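
To make the workflow layer more concrete, here is a minimal sketch of how a structured plan could be turned into GUI-level steps and dispatched. It is not the actual aegis-back code: the Step schema, plan_to_steps, run_steps, and the action names are all hypothetical, and the handlers are injected so the sketch runs without a desktop session.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Step:
    """One GUI-level action such as a click or a keystroke (hypothetical schema)."""
    action: str
    params: Dict[str, Any] = field(default_factory=dict)


def plan_to_steps(plan: List[Dict[str, Any]]) -> List[Step]:
    """Convert a structured plan (as a model might return it) into Step objects."""
    steps = []
    for item in plan:
        # Everything except the action name is treated as a parameter.
        params = {k: v for k, v in item.items() if k != "action"}
        steps.append(Step(action=item["action"], params=params))
    return steps


def run_steps(steps: List[Step], handlers: Dict[str, Callable[..., None]]) -> List[str]:
    """Execute steps in order through injected handlers; return the actions that ran."""
    log = []
    for step in steps:
        handler = handlers.get(step.action)
        if handler is None:
            raise ValueError(f"unsupported action: {step.action}")
        handler(**step.params)
        log.append(step.action)
    return log
```

In a real deployment the handlers could wrap automation calls such as pyautogui.click(x=..., y=...) or pyautogui.write(text). Keeping them behind a dictionary means the planning side never touches the automation library directly.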

Frontend: aegis-front

Implemented using Flutter for desktop
Provides a simple and intuitive interface for issuing commands and observing execution states in real time
Communicates with the backend through a clean API layer
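
As an illustration of the kind of execution state the interface could display, the sketch below models a task with subtasks and serializes it to JSON. The Task and Subtask classes, the field names, and the four status values are assumptions made for this example, not the project's actual API schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Subtask:
    description: str
    status: str = "pending"  # assumed states: pending, running, done, failed


@dataclass
class Task:
    instruction: str
    subtasks: List[Subtask] = field(default_factory=list)

    @property
    def status(self) -> str:
        """Derive the overall task status from the subtask statuses."""
        states = [s.status for s in self.subtasks]
        if "failed" in states:
            return "failed"
        if states and all(s == "done" for s in states):
            return "done"
        if any(s in ("running", "done") for s in states):
            return "running"
        return "pending"

    def to_json(self) -> str:
        """Serialize the task, including the derived status, for a client."""
        payload = asdict(self)
        payload["status"] = self.status
        return json.dumps(payload)
```

Deriving the task status from the subtasks, rather than storing it separately, avoids the two ever disagreeing when a client polls for updates.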

Kiro was used throughout planning and development to manage the workspace, maintain structure, and streamline debugging.

Challenges we ran into

Hook execution within Kiro often blocked new processes until each hook completed, which slowed development cycles
Coordinating multiple layers of the system (intent parsing, backend logic, RPA execution) required careful architectural planning
Desktop automation introduced challenges related to screen state, application focus, and repeatability
Ensuring that natural language commands produced consistent, reliable actions required ongoing refinement
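
One common way to handle the screen-state and focus problems above is to verify a postcondition after each action and retry when it fails. The helper below is a generic sketch of that idea, not code from AEGIS; run_with_check and its parameters are hypothetical names.

```python
import time
from typing import Callable


def run_with_check(action: Callable[[], None],
                   check: Callable[[], bool],
                   attempts: int = 3,
                   delay: float = 0.0) -> int:
    """Run action until check() passes; return the number of attempts used."""
    for attempt in range(1, attempts + 1):
        action()
        if check():
            return attempt
        # Give the UI time to settle before trying again.
        time.sleep(delay)
    raise RuntimeError(f"postcondition not met after {attempts} attempts")
```

Here check could, for example, confirm that the expected window is in the foreground before the next step runs. How that is detected depends on the automation library in use.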

Accomplishments that we're proud of

Successfully integrated LLM-powered intent interpretation with traditional RPA techniques
Built a stable, working prototype capable of automating real desktop applications through natural language
Established a clean modular structure that separates interpretation, orchestration, and execution
Demonstrated that AI-driven RPA can extend existing software without altering or instrumenting it

What we learned

How to combine LLM-driven reasoning with deterministic automation in a safe and controlled way
The importance of isolating the interpretation layer from the execution layer
Practical lessons on orchestrating agents using ADK and Gemini
Techniques for improving reliability in GUI automation workflows
How to design an automation system that remains predictable even when driven by natural language
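
The separation between interpretation and execution can be illustrated by validating a model-produced plan against a fixed allowlist before anything touches the desktop. This is a minimal sketch under assumed names: ALLOWED_ACTIONS, validate_plan, and the action vocabulary are hypothetical and only show the pattern.

```python
from typing import Any, Dict, List, Set

# Hypothetical allowlist: action name -> required parameter names.
ALLOWED_ACTIONS: Dict[str, Set[str]] = {
    "click": {"x", "y"},
    "type": {"text"},
    "focus_window": {"title"},
}


def validate_plan(plan: List[Dict[str, Any]]) -> List[str]:
    """Return a list of problems; an empty list means the plan may be executed."""
    problems = []
    for index, step in enumerate(plan):
        action = step.get("action")
        if action not in ALLOWED_ACTIONS:
            problems.append(f"step {index}: action {action!r} is not allowed")
            continue
        missing = ALLOWED_ACTIONS[action] - set(step)
        if missing:
            problems.append(f"step {index}: missing {sorted(missing)}")
    return problems
```

Because the executor only ever sees plans that pass this check, the set of things the system can do stays fixed and reviewable, regardless of how the natural language instruction was phrased.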

What's next for AEGIS

Expanding AEGIS into a multi-agent architecture with parallel task execution
Adding self-correction capabilities where the agent adjusts actions based on visual or state feedback
Improving robustness by incorporating computer vision for screen understanding
Extending the Flutter app into a full command center with task templates, logs, and replay features
Exploring secure enterprise integrations to connect AEGIS with real industrial workflows

Built With

adk
dart
fastapi
flutter
material3
pyautogui
python
pywin32
uvicorn

Updates

Marble Eagle started this project — Dec 05, 2025 04:52 PM EST
