Inspiration
Picture Hannah, an accounts-payable analyst. A supplier invoice lands: material RM27, 50 pieces at £27.50; material RM16, 6 pieces at £2.00. She can't pay it until she answers one question - does this match the purchase order of record in SAP? So she opens S/4, finds PO 4500000021, and reads down the lines by hand. The PO says £25.00, not £27.50 - a 10% price increase nobody flagged. It says quantity 5, not 6 - the supplier over-delivered. Now she has to decide: absorb it, query it, or renegotiate. Multiply that by a few hundred invoices a week.
That reconciliation is judgment work, and it's slow precisely because the truth lives in SAP and the claim lives in the invoice, and only a person currently bridges them. Most "agentic" demos paper over this by reconciling against synthetic data - two made-up numbers the demo author already knows agree. That proves nothing. The hard, real, valuable part is reasoning against the actual system of record at runtime, and doing it in a way a finance team would actually let near their ledger.
So we built the opposite of a synthetic demo: an agent that reaches into the live PO it has never seen, and a governance spine that makes it structurally incapable of writing back on its own.
What it does
Every capability below maps to code in the repo, and each is labelled live, mocked, or aspirational so there's no ambiguity about what runs today.
Reconciles against a real SAP PO it reads at runtime - LIVE. The variance agent (variance-agent/main.py) is a single-node LangGraph graph: a bounded MCP tool-calling loop plus structured output. It is fed only the supplier's numbers. It pulls the real purchase order from SAP S/4HANA Cloud over MCP (variance-agent/mcp_client.py, tool execute-entity-operation against API_PURCHASEORDER_PROCESS_SRV / A_PurchaseOrderItem), then reports the PO side - unit price £25.00, quantity 5 - values it could only know by reading S/4 live. It classifies the discrepancies (Item 10: price-variance +10%; Item 20: over-delivery +20%), scores its confidence (0.95), and prepares the exact S/4 corrections (NetPriceAmount 25→27.50, OrderQuantity 5→6) - marked ready, but held. This ran as a real UiPath Orchestrator serverless agent job (job dbedd8aa-2429-4854-a0bd-6dedaa62101b, Successful, ~64s, 2026-06-29).
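The percentages above follow directly from the PO and invoice figures. A minimal sketch of that arithmetic, assuming a simple signed percentage change from the PO value; the `Line`, `pct_delta`, and `classify` names are hypothetical and are not taken from variance-agent/main.py:

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Line:
    """One PO or invoice line (hypothetical shape, for illustration only)."""
    item: str
    unit_price: Decimal
    quantity: Decimal


def pct_delta(po_value: Decimal, invoice_value: Decimal) -> Decimal:
    """Signed percentage change from the PO value to the invoice value."""
    return (invoice_value - po_value) / po_value * Decimal(100)


def classify(po: Line, invoice: Line) -> list[tuple[str, Decimal]]:
    """Label each discrepancy using the category names from the write-up."""
    findings = []
    price_change = pct_delta(po.unit_price, invoice.unit_price)
    qty_change = pct_delta(po.quantity, invoice.quantity)
    if price_change != 0:
        findings.append(("price-variance", price_change))
    if qty_change > 0:
        findings.append(("over-delivery", qty_change))
    return findings


# Figures from the example: item 10 is RM27, item 20 is RM16.
po_10 = Line("10", Decimal("25.00"), Decimal("50"))
inv_10 = Line("10", Decimal("27.50"), Decimal("50"))
po_20 = Line("20", Decimal("2.00"), Decimal("5"))
inv_20 = Line("20", Decimal("2.00"), Decimal("6"))

print(classify(po_10, inv_10))  # one price-variance finding of +10%
print(classify(po_20, inv_20))  # one over-delivery finding of +20%
```

In the real agent the PO side comes from the live S/4 read rather than from literals; only the invoice side is supplied as input.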
Stays read-only on SAP by construction - LIVE. The agents' system prompts forbid writing (variance-agent/main.py: "you NEVER write to SAP and you NEVER post"), and the only write path in the entire system is one deterministic, post-approval executor (variance-agent/post_correction.py) that the agents never call. Read is live; write is held ("armed, not fired" - see Challenges).
Orchestrates three coded agents under a governance spine - BUILT, VALIDATED, DEPLOYED. The Maestro BPMN (ExchangeReconSolution/ExchangeReconBpmn/ExchangeReconBpmn.bpmn) wires three live LangGraph agents - matching (matching-agent/main.py), variance (variance-agent/main.py), posting-prep (posting-prep-agent/main.py) - via StartAgentJob nodes bound by release key (bindings_v2.json). Each has now run as its own Successful Orchestrator serverless job reading PO 4500000021 from real S/4 over MCP, co-located in Shared/ExchangeReconDemo (matching 750a5c3e, ~52s; variance c51ac7fa, ~126s; posting-prep d7b8891e, ~52s) - three separate jobs, not yet one Maestro instance end-to-end (see Challenges). Between the agents sits a deterministic JavaScript tolerance check (Task_Tolerance: priceTolerancePct = 2.0, qtyTolerance = 1.0), a human approval gate, exclusive gateways, and error/escalation boundary events. The split is the whole point: a deterministic rule decides what counts as a variance, the agent supplies judgment and explanation, and a human holds authority.
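The deterministic tolerance check is implemented as a JavaScript script task in the BPMN; the same idea can be sketched in Python. Only the two threshold values come from Task_Tolerance. Everything else is an assumption: that the price tolerance is a percentage of the PO price, that the quantity tolerance is an absolute number of units, and that a line is flagged only when the deviation strictly exceeds the threshold.

```python
PRICE_TOLERANCE_PCT = 2.0  # from Task_Tolerance: priceTolerancePct
QTY_TOLERANCE = 1.0        # from Task_Tolerance: qtyTolerance (unit assumed)


def exceeds_tolerance(po_price: float, inv_price: float,
                      po_qty: float, inv_qty: float) -> dict:
    """Return which dimensions fall outside tolerance.

    Assumption: strict '>' comparison. How the real script treats a
    deviation exactly equal to the threshold is not stated in the write-up.
    """
    price_dev_pct = abs(inv_price - po_price) / po_price * 100.0
    qty_dev = abs(inv_qty - po_qty)
    return {
        "price": price_dev_pct > PRICE_TOLERANCE_PCT,
        "quantity": qty_dev > QTY_TOLERANCE,
    }


# The 25.00 -> 27.50 price line is a 10% deviation, well past 2%.
print(exceeds_tolerance(25.0, 27.5, 50, 50))
```

Keeping this rule outside the agent means the definition of a variance does not drift with the model's wording; the agent only explains lines the rule has already flagged.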
Posts the correction back to S/4 - MOCKED / aspirational. The BPMN's "Update PO item in S/4" node (Task_UpdatePO) is a script returning a confirmation string. The real PATCH lives in post_correction.py and is currently blocked upstream (see Challenges), so the S/4 write is not live today.
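The held write path can be pictured as a guard that refuses to describe a PATCH until an approval is on record. This is a hypothetical sketch, not the contents of post_correction.py: the `Correction`, `NotApproved`, and `build_patch` names and the shape of the returned dict are all assumptions, and nothing here touches the network.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Correction:
    """A prepared S/4 field change (hypothetical shape)."""
    purchase_order: str
    item: str
    field: str
    new_value: str


class NotApproved(Exception):
    """Raised when a write is attempted without a recorded human approval."""


def build_patch(correction: Correction, approved: bool) -> dict:
    """Describe the PATCH for an approved correction; refuse otherwise."""
    if not approved:
        raise NotApproved(
            f"PO {correction.purchase_order} item {correction.item}: "
            "correction is ready but held"
        )
    return {
        "method": "PATCH",
        "entity": "A_PurchaseOrderItem",
        "key": {
            "PurchaseOrder": correction.purchase_order,
            "PurchaseOrderItem": correction.item,
        },
        "body": {correction.field: correction.new_value},
    }


qty_fix = Correction("4500000021", "20", "OrderQuantity", "6")
print(build_patch(qty_fix, approved=True))
```

The point of the shape is that approval is an explicit argument supplied by the process, not something the agent can assert for itself.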
Gives operations a cockpit - LIVE (two surfaces). The default Reconciliation view (src/lib/reconDemo.ts) is a self-contained, offline rendering of the agents' actual captured outputs - 100% reliable for a demo, no network. A second "Live tenant" tab uses the real @uipath/uipath-typescript SDK (src/lib/sdk.ts, src/lib/exchange.ts) to list Maestro instances and complete the human-gate task over OAuth PKCE - real code, gated on operator login and config. The "live S/4 · via MCP" badges on the demo tab are static labels over seeded data; the cockpit itself does not call SAP.
How we built it
Problem-first, but here's the spine that makes it real.
UiPath Maestro BPMN (Track 2) is the orchestration backbone - the process model that sequences the agents, runs the deterministic tolerance gate, branches on the gateways, and routes escalations. It's the hero: governed agency expressed as a process, not a prompt.
UiPath Coded Agents (LangGraph / uipath-langchain) are the three reasoning units - bounded MCP tool-calling loops with strict structured output (Pydantic models in each main.py). The LLM is Azure OpenAI gpt-4o-2024-11-20 via the UiPath LLM Gateway (UiPathAzureChatOpenAI), so there's no raw model key in the loop.
MCP is what makes the reasoning honest. Instead of being wired to a hard-coded endpoint, the agent talks to a BTP-hosted SAP OData→MCP server and operates the right service and entity through execute-entity-operation (mcp_client.py). Auth is XSUAA client-credentials; config resolves from environment locally and from UiPath Orchestrator assets when deployed (so a deployed agent with no .env still authenticates).
UiPath Orchestrator runs the agents as serverless jobs - that's where the live SAP read actually executed. UiPath Action Center / a Maestro message human-gate is the approval seam; the cockpit can complete it via the SDK (decideGate in src/lib/exchange.ts).
The cockpit is Vite + React + TypeScript, talking to the tenant through the @uipath/uipath-typescript SDK.
The system of record is SAP S/4HANA Cloud on SAP BTP, over OData (API_PURCHASEORDER_PROCESS_SRV / A_PurchaseOrderItem), secured with XSUAA.
And the build tool itself: the whole thing was assembled with Claude Code via UiPath for Coding Agents - spec-driven, small verified commits, the agent reading the SDK instead of guessing at it.
Challenges we ran into
These are the real ones, not the polished ones.
Binding StartAgentJob to real agents by release key. This morning the Maestro BPMN was a placeholder stub - never validated, not bound to anything. Turning it into a genuine orchestrator meant giving each StartAgentJob node the correct uipath:bindings process binding and the deployed agent's release key (bindings_v2.json, the ReleaseKey references in ExchangeReconBpmn.bpmn). Get a key wrong and the node points at nothing. Getting all three right is what moved the file from "stub" to "validated and deployable."
A deploy-time folder/packaging collision. Packing and publishing the solution didn't go clean on the first pass - the project layout collided at pack/deploy time and had to be reconciled before uip solution pack --dry-run returned Valid and the solution would publish. (Process detail, not a capability claim; the verifiable end state is the passing dry-run and the deployed solution.)
The SAP write-back is blocked by an upstream MCP bug - and we diagnosed it precisely. Every key-based operation (update/read-single) 404s because the MCP server returns keyProperties: [] for A_PurchaseOrderItem. Root cause: a JSDOM CSS-selector bug in the server's metadata parser - a mixed-case compound selector that silently matches nothing, so the entity key never gets extracted and the OData key URL can't be built. Collection reads with $filter don't need the key, which is exactly why everything reads but can't write. Full diagnosis and the one-line upstream fix are in .agent/submission/writeback-plan.md; our side (post_correction.py) is written and lands unchanged the moment the server exposes the key. Hence "armed, not fired."
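Why reads survive an empty key list while writes do not can be shown in a few lines. This sketch is not the MCP server's code: `key_url` and `filter_url` are hypothetical helpers, the key names PurchaseOrder and PurchaseOrderItem are assumed to be the entity's key properties, and the ValueError stands in for whatever the server does internally before the 404 surfaces.

```python
from urllib.parse import quote


def key_url(entity_set: str, key_properties: list[str], values: dict) -> str:
    """Address a single entity, e.g. Set(Key1='a',Key2='b').

    Needs the key property names from metadata; with none, there is
    nothing to build the key predicate from.
    """
    if not key_properties:
        raise ValueError(f"no key properties known for {entity_set}")
    predicate = ",".join(f"{name}='{values[name]}'" for name in key_properties)
    return f"{entity_set}({predicate})"


def filter_url(entity_set: str, values: dict) -> str:
    """Query the collection with $filter; no key metadata required."""
    clause = " and ".join(f"{name} eq '{value}'" for name, value in values.items())
    return f"{entity_set}?$filter={quote(clause)}"


line = {"PurchaseOrder": "4500000021", "PurchaseOrderItem": "10"}

# Collection read: works even when keyProperties is [].
print(filter_url("A_PurchaseOrderItem", {"PurchaseOrder": "4500000021"}))

# Single-entity address: works only once the key names are known.
print(key_url("A_PurchaseOrderItem", ["PurchaseOrder", "PurchaseOrderItem"], line))
```

An update has to target one entity by key, so it takes the second path and fails; the agents' reads take the first.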
The runtime-capacity wall - the honest ceiling. The deployed 3-agent Maestro instance launches on the tenant, and an earlier version executed the start event and reached the matching-agent task. But the full end-to-end run does not complete green: the staging tenant has no allocated ProcessOrchestration/Agent runtime, so the instance hangs Pending. We will not dress this up. The honest line is: the agents are live on real SAP; the 3-agent Maestro is built, validated, and deployed, and an instance launches; running it to completion needs allocated agent runtime. We never claim the full Maestro ran end-to-end or that the gate "cleared" in a real procurement run.
Accomplishments that we're proud of
All three agents reason over the real SAP system of record - and we can prove it. Each ran as its own Successful Orchestrator job reading PO 4500000021 from live S/4 over MCP: matching read the PO lines (£25.00/£2.00, qty 50/5, confidence 1.0; job 750a5c3e); variance was given only the supplier's invoice and returned PO-side values (£25.00, qty 5) it could not have known without reading live S/4 at runtime - classifying price-variance +10% / over-delivery +20% (variance-agent/main.py, mcp_client.py; job c51ac7fa); and posting-prep read the current OrderQuantity (5) and prepared the exact A_PurchaseOrderItem 5→6 update, held (job d7b8891e). That's the difference between a demo and a system: there was no synthetic ground truth to lean on.
Placeholder stub → real-bound, validated, deployed Maestro in a single pass. The BPMN went from an unvalidated stub to a process genuinely bound to three deployed agents by release key, passing uip solution pack --dry-run (Valid) and published live - ExchangeReconBpmn.bpmn + bindings_v2.json tell that story.
Governed agency as the architecture, not a disclaimer. Deterministic check (Task_Tolerance), agent judgment (the LangGraph agents), human authority (the gate) - kept structurally separate, with the agents read-only on SAP by prompt and the single write path quarantined to a post-approval executor.
What we learned
MCP is the right abstraction for letting an agent reason over an enterprise system of record. A discovery-driven OData server lets the agent find and operate the correct service and entity rather than being soldered to one endpoint - and that flexibility is what makes the reasoning real instead of staged.
The interesting question isn't "can the agent do it" - it's "what is the agent allowed to do." In a real procure-to-pay shop, an agent that cannot write to SAP is more deployable than one that can. Governance is a feature, not a tax.
Deploying changes the claim. "Runs on the platform, against the real system" is a different sentence from "runs on my laptop," and the constraints we hit - release-key binding, packaging, runtime allocation - are exactly the ones that separate a prototype from something an enterprise would actually adopt.
What's next for Exchange Recon Cockpit
A roadmap, clearly labelled as not-yet-done:
- Allocate Agent runtime to a shared folder and run all three agents as one live Maestro instance end-to-end - each agent has already run live as its own Orchestrator job; the orchestration is built and validated today, so this is a capacity allocation, not a rebuild.
- Unblock the write-back - land the one-line upstream MCP fix (writeback-plan.md) so an approved correction actually PATCHes the live PO, then capture the approve→post→read-back loop.
- Widen to a goods-receipt three-way match, still read-only on the agent side.
- Confidence-routed autonomy - auto-clear high-confidence, within-policy lines; reserve the human gate for genuine judgment calls.
- Move tolerance and approval policy into Data Fabric instead of code.
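As an illustration of the confidence-routed autonomy item above, the routing decision could be as small as the sketch below. None of this exists in the repo today: the `route` function, the route labels, and the 0.9 threshold are hypothetical placeholders, and the real policy would presumably live in configuration rather than code.

```python
AUTO_CLEAR_THRESHOLD = 0.9  # hypothetical value, not taken from the project


def route(confidence: float, within_policy: bool) -> str:
    """Send a line to auto-clear only when both conditions hold.

    Anything out of policy, or below the confidence threshold,
    still goes to the human gate.
    """
    if within_policy and confidence >= AUTO_CLEAR_THRESHOLD:
        return "auto-clear"
    return "human-gate"


print(route(0.95, within_policy=True))
print(route(0.95, within_policy=False))
```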
Built With
- agents
- btp
- claude
- client-credentials
- cloud
- coded
- context
- gateway
- human-gate
- langchain-mcp-adapters
- llm
- maestro
- model
- openai
- orchestrator
- protocol
- react
- s/4hana
- sdk
- typescript
- uipath
- uipath-coded-agents
- vite
- xsuaa
