Training Fleet and Cloud Resource Manager

### 1. Elevator Pitch (Slogan): 194 characters (limit < 200)

Cloud GPU orchestrator and humanoid training manager on UiPath Maestro BPMN. Automates ML on AWS/GCP/AMD, monitors anomalies to prevent idle spend, and routes checkpoints via HITL Action Center.

──────

### 2. Built With (Technology Tags)

uipath, uipath-maestro, uipath-agent-builder, uipath-action-center, bpmn, agentic-ai, cloud-gpu, humanoid-robotics, reinforcement-learning, mujoco, aws, gcp, amd-developer-cloud, python, docker, claude, gemini

──────

### 3. Project Story (Markdown format)

You can copy the text below and paste it directly into the Project Story field on Devpost:

## Inspiration

Training advanced neural skills (such as walking, running, or grasping) for bipedal humanoid robots (like the Unitree G1) in reinforcement learning (RL) simulation environments is compute-heavy and expensive. ML research teams can waste thousands of dollars renting cloud GPU instances (AWS, GCP, AMD Developer Cloud) that sit idle, or running RL training jobs that silently diverge or collapse early without anyone noticing for hours.

Managing these asynchronous simulation containers, monitoring live training telemetry, and protecting cloud budgets is a major operational burden. UiPath AgentHack gave us the opportunity to bring order to this chaos by building an autonomous, governed, and agentic BPMN orchestration backbone on UiPath Maestro that automates and optimizes the entire humanoid training lifecycle.

## What it does

**Training Fleet and Cloud Resource Manager** is an autonomous cloud GPU orchestrator and humanoid RL training lifecycle manager built end-to-end on UiPath Maestro BPMN.

The operational flow is as follows:

1. **Training Ingestion:** A robotics researcher requests a new skill training run (e.g., walking) via a simple form, specifying target metrics.
2. **Cloud Resource Check:** An RPA robot queries live APIs across AWS, GCP, and AMD Developer Cloud to locate the most cost-effective available high-performance GPUs (like AMD MI300X or NVIDIA H100).
3. **Container Deployment:** Once a GPU is selected, the system spins up the simulation container (`g1-mujoco-rl-training`) on the remote cloud instance via secure SSH.
4. **Telemetry & AI Anomaly Detection:** The system continuously monitors simulation logs and reward curves. If an AI Quality Agent detects a training anomaly (e.g., gradient collapse, loss divergence, motor heat spikes, or continuous falling loops), it triggers an emergency SSH shutdown to stop GPU spend immediately and fires an alert.
5. **HITL Checkpoints & Action Center:** Successful checkpoints or completed models generate an interactive card in UiPath Action Center. The researcher reviews the final reward curves and joint heat maps to approve the model, which triggers weight archival, cloud-instance teardown, and resource release.
6. **Governance & Audit Logs:** A complete, auditable log of GPU utilization, training efficiency, and budget consumption is archived automatically.
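
Steps 2 and 4 of the flow above can be illustrated with a minimal Python sketch. This is not the project's actual code: the `GpuOffer` type, the `pick_offer` and `detect_anomaly` helpers, the prices, and the thresholds are all hypothetical names and values chosen for illustration, and the anomaly rule is a simple windowed-mean heuristic rather than the AI Quality Agent itself.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class GpuOffer:
    """Hypothetical, normalized view of one GPU offer from a cloud provider."""
    provider: str       # e.g. "aws", "gcp", "amd"
    gpu: str            # e.g. "H100", "MI300X"
    hourly_usd: float   # illustrative price, not a real quote
    available: bool


def pick_offer(offers: Sequence[GpuOffer]) -> Optional[GpuOffer]:
    """Return the cheapest available offer, or None if nothing is available."""
    candidates = [o for o in offers if o.available]
    return min(candidates, key=lambda o: o.hourly_usd, default=None)


def detect_anomaly(rewards: Sequence[float], window: int = 5,
                   max_drop: float = 0.5) -> Optional[str]:
    """Flag non-finite rewards or a sharp drop in the windowed mean reward.

    Assumed heuristic: compare the mean of the last `window` rewards with
    the mean of the `window` rewards before them.
    """
    if any(not math.isfinite(r) for r in rewards):
        return "non_finite_reward"
    if len(rewards) < 2 * window:
        return None  # not enough history to compare two windows
    previous = sum(rewards[-2 * window:-window]) / window
    recent = sum(rewards[-window:]) / window
    if previous > 0 and recent < previous * (1 - max_drop):
        return "reward_collapse"
    return None
```

In a setup like the one described, a non-`None` result from `detect_anomaly` would be the signal that triggers the emergency shutdown and the alert.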

## How we built it

**Stack:**

- **UiPath Maestro BPMN:** the orchestration core, coordinating simulation containers as subprocesses and cloud providers as swimlanes.
- **UiPath Agent Builder:** powers the Telemetry Analyst Agent (watching logs and rewards for anomalies) and the Budget/Risk Agent.
- **UiPath Action Center:** interactive human-in-the-loop (HITL) dashboards for researchers to review reward curves and approve weight registration.
- **RPA Robots & API Workflows:** automate cloud provisioning, instance queries, and container SSH management across AWS, GCP, and AMD.
- **Docker & MuJoCo / RL Environment:** the remote bipedal humanoid walking training simulation containers (`g1-mujoco-rl-training`).
- **Claude (primary) + Gemini (fallback):** LLM providers powering the log analysis and reasoning behind training health classification.
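
The primary/fallback arrangement of LLM providers can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: providers are modeled as plain callables so that no real SDK signature is implied, and `classify_with_fallback`, `primary_stub`, and `fallback_stub` are hypothetical names.

```python
from typing import Callable, Sequence, Tuple

# A provider is any callable that takes a prompt and returns text.
# Real Claude or Gemini clients would be wrapped to fit this shape.
Provider = Callable[[str], str]


def classify_with_fallback(
    prompt: str, providers: Sequence[Tuple[str, Provider]]
) -> Tuple[str, str]:
    """Try providers in order; return (name, response) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def primary_stub(prompt: str) -> str:
    """Stand-in for the primary provider, simulating an outage."""
    raise TimeoutError("simulated outage")


def fallback_stub(prompt: str) -> str:
    """Stand-in for the fallback provider, returning a fixed label."""
    return "healthy"


used, label = classify_with_fallback(
    "Classify this training log excerpt.",
    [("claude", primary_stub), ("gemini", fallback_stub)],
)
```

Keeping providers behind one callable shape means the ordering (primary first, fallback second) is just the order of the list, and the stubs make the behavior testable without network access.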

## Challenges we ran into

- **Standardizing Telemetry:** Translating complex, high-frequency reinforcement learning simulation logs (MuJoCo/Stable-Baselines3) into a structured JSON schema that UiPath Maestro can ingest and parse.
- **Low-Latency Terminations:** Designing a reliable, fast SSH action pipeline that terminates cloud instances as soon as training diverges, preventing runaway billing on expensive hardware.
- **Action Center Visualizations:** Creating rich, lightweight data visualizations (reward curves, joint heat maps) that fit cleanly into UiPath Action Center for rapid researcher decision-making.
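
The telemetry standardization challenge can be illustrated with a small parser. This is a sketch only: it assumes the logs arrive as the pipe-delimited key/value tables that Stable-Baselines3 prints in its human-readable output, and the JSON field names (`run_id`, `metrics`) and the helper names are hypothetical, not the project's actual schema.

```python
import json
import re
from typing import Dict

# Matches metric rows such as "|    ep_rew_mean     | 12.5     |".
# Section rows like "| rollout/ |  |" have no number and are skipped.
_ROW = re.compile(
    r"^\|\s*([\w/]+)\s*\|\s*(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)\s*\|$"
)


def parse_log_table(text: str) -> Dict[str, float]:
    """Extract numeric metrics from a pipe-delimited log table."""
    metrics: Dict[str, float] = {}
    for line in text.splitlines():
        match = _ROW.match(line.strip())
        if match:
            metrics[match.group(1)] = float(match.group(2))
    return metrics


def to_telemetry_json(run_id: str, text: str) -> str:
    """Wrap parsed metrics in an assumed JSON envelope for downstream ingestion."""
    payload = {"run_id": run_id, "metrics": parse_log_table(text)}
    return json.dumps(payload, sort_keys=True)
```

A flat, numeric-only `metrics` object keeps the payload easy to validate against a JSON schema before it reaches the orchestration layer.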

## Accomplishments that we're proud of

- **High-Performance Robotics meets Enterprise BPMN:** Bridging humanoid reinforcement learning pipelines with enterprise-grade process orchestration on UiPath Maestro BPMN.
- **Active Cost Containment:** Building an autonomous cost containment loop that can save robotics labs thousands of dollars in wasted cloud GPU hours by terminating bad runs as soon as they are detected.
- **Generic, Anonymous Open Source:** Building a fully generic, public, and open-source blueprint with no commercial claims or IP restrictions, making cloud GPU management accessible to any robotics lab.

## What we learned

- **BPMN for ML Pipelines:** BPMN 2.0 is well suited to long-running asynchronous ML jobs, which traditionally rely on fragile, custom-written shell scripts.
- **The True Power of HITL:** Human-in-the-loop is vital in robotics, where AI-generated models must be verified against physics constraints before being pushed to physical hardware.

## What's next

- **Decentralized Compute Integration:** Expand cloud integrations to include decentralized GPU networks (like Vast.ai, Akash) for even cheaper training.
- **Multi-Agent Co-Training:** Support multi-agent reinforcement learning (MARL) where fleets of humanoid robots co-train in parallel simulation instances.
- **Multi-Framework Orchestration:** Transition the prototype into an open-source, generic cloud orchestrator for any compute-heavy training workloads beyond robotics.

──────

Built With

aerospace
agentic-ai
anthropic
bpmn
claude
cmii
ecm
gemini
json-schema
langchain
manufacturing
pharma
plm
python
regulated-industry
uipath
uipath-action-center
uipath-agent-builder
uipath-maestro

Created by

I designed and built the project end-to-end: the UiPath Maestro BPMN orchestration (8 containers, 9 roles), the Agent Builder + coded analyst/risk/impact agents, and the live Orchestrator queue integration. First time modeling long-running ML as a governed BPMN process — learned a ton about gateways, HITL with Action Center, and wiring coded agents under low-code governance.

Forenly AI
The Skill Layer for Humanoids
"I developed critical backend integrations and queue consumption logic for our agentic orchestrator. Specifically, I wired our analyst agents to live Claude and Gemini APIs, built comprehensive test coverage, and implemented the CRM supplier lookup for the action stage. Additionally, I built the UiPath Orchestrator queue consumer—extending the API client with get/set transaction methods and writing the engine that dequeues items, processes them through the analyst AI, and updates their success/failure status in UiPath."

Swati Gupta
I was responsible for the UiPath Orchestrator live integration of the project.

My contributions included:

- Configured and validated the live UiPath Automation Cloud environment.
- Set up OAuth 2.0 client credentials authentication using a Confidential Application.
- Configured the required Orchestrator environment variables (organization, tenant, folder, client ID, and client secret).
- Integrated the Python pipeline with UiPath Orchestrator Queues.
- Successfully connected the application to the live IncidentReports queue.
- Tested and verified that anomaly events are pushed successfully from the Python application into the live UiPath queue.
- Validated the end-to-end workflow by confirming that the generated incident transactions appear in UiPath Orchestrator.
- Assisted in debugging authentication, organization configuration, folder selection, and queue integration issues during development.
- Performed end-to-end testing of the project to ensure the complete pipeline worked correctly with the live cloud environment.

This work enabled the project to move beyond local execution and communicate with a real UiPath Automation Cloud instance, demonstrating a production-style orchestration workflow.

LAWAN HARUNA DUNDU
Diya Majee

Updates

Forenly AI started this project — Jun 29, 2026 02:12 AM EDT
