Inspiration

Traditional accounts payable (AP) automation is notoriously brittle. Standard RPA bots frequently break or fail silently the moment a vendor changes an invoice layout or when data matches ambiguously, leading to costly processing backlogs. We were inspired to solve this structural vulnerability by moving away from rigid, hardcoded templates. Our goal was to design an intelligent, self-healing AP Control Tower that utilizes generative AI to act as a cognitive firewall, seamlessly handling messy, real-world finance documentation.

What it does

Core Functionality

AP Control Tower: Cognitive Guardrails converts passive, brittle accounts payable workflows into a resilient, self-healing financial security system:

Intelligent Document Ingestion: Automatically processes unstructured invoice data without relying on rigid, hardcoded templates.

Semantic Fraud Detection: Leverages Gemini 2.5 Flash to contextually cross-reference line items against master PO records and historical vendor data, calculating a dynamic Fraud Risk Score (0-100) and a natural-language Reasoning Timeline.

Human-in-the-Loop Safeguards: If high-risk behavioral anomalies or layout discrepancies are detected, the system safely routes the case to UiPath Action Center, generating a side-by-side human validation interface.

Asynchronous Execution: Once a financial supervisor reviews and submits a resolution, a cloud webhook safely unpauses the robot to seamlessly execute down-funnel ERP data entry, protecting the corporate ledger from silent automation failures.
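The pause-and-resume behaviour described above can be sketched in a few lines of Python. This is only an illustration of the pattern, not our UiPath workflow: the class names, the "approve"/"reject" decision strings, and the in-memory ledger standing in for the ERP are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SuspendedJob:
    """A robot job parked while a supervisor reviews the invoice (hypothetical)."""
    invoice_id: str
    status: str = "suspended"


class ReviewQueue:
    """In-memory stand-in for the suspend/resume cycle; not a UiPath API."""

    def __init__(self):
        self._jobs = {}      # task_id -> SuspendedJob awaiting a decision
        self.ledger = []     # stand-in for downstream ERP data entry

    def suspend(self, task_id, invoice_id):
        """Park a job until the matching review task is resolved."""
        self._jobs[task_id] = SuspendedJob(invoice_id)

    def on_webhook(self, task_id, decision):
        """Resume the parked job once the supervisor's resolution arrives."""
        job = self._jobs.pop(task_id)  # raises KeyError for an unknown task
        if decision == "approve":
            self.ledger.append(job.invoice_id)  # only approved invoices are posted
            job.status = "posted"
        else:
            job.status = "rejected"
        return job.status
```

The point of the pattern is that nothing reaches the ledger until a human decision has been recorded for that specific task.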

How we built it

We engineered a cloud-native, asynchronous architecture that bridges low-code orchestration with advanced semantic intelligence:

The AI Brain: We integrated Gemini 2.5 Flash via Google Vertex AI directly into our workflow canvas to analyze unstructured invoice PDF data.

Cross-Reference Engine: Instead of simple regex matching, Gemini contextually evaluates extracted data directly against a master backend database table.

Human-in-the-Loop Routing: We configured a robust conditional workflow. If an invoice matches perfectly, it logs a success message; if Gemini detects a discrepancy, the robot gracefully steps aside and triggers a side-by-side validation task natively inside UiPath Action Center.
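The routing decision can be sketched roughly as follows. The actual workflow is built on the low-code canvas, so this Python version is purely illustrative; the field names, the `REVIEW_THRESHOLD` value, and the two route labels are assumptions rather than values taken from the project.

```python
from dataclasses import dataclass, field

# Hypothetical cut-off; the threshold actually used is not stated in this write-up.
REVIEW_THRESHOLD = 50


@dataclass
class Assessment:
    """Stand-in for the model's verdict on one invoice (field names assumed)."""
    invoice_id: str
    fraud_risk_score: int                       # 0-100 Fraud Risk Score
    discrepancies: list = field(default_factory=list)


def route(assessment: Assessment) -> str:
    """Return 'auto_approve' for a clean match, otherwise 'human_review'."""
    if not 0 <= assessment.fraud_risk_score <= 100:
        raise ValueError("fraud_risk_score must be between 0 and 100")
    # Any reported discrepancy or a high score sends the case to a person.
    if assessment.discrepancies or assessment.fraud_risk_score >= REVIEW_THRESHOLD:
        return "human_review"
    return "auto_approve"
```

For example, `route(Assessment("INV-1", 5))` takes the automatic path, while the same invoice with a listed discrepancy would go to human review.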

Challenges we ran into

Our biggest hurdle was handling complex background cloud telemetry. During testing, passing raw extraction objects directly into the task manager caused runtime JSON serialization failures, throwing severe file-resource errors right at the moment of job completion. We worked around this platform constraint with VB.NET string sanitization expressions (using .Replace() and .Trim() routines) that strip out formatting breaks and flatten the data payload before it maps to the Orchestrator metadata.
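The idea behind the sanitization step looks roughly like this in Python. Our real expressions were written in VB.NET and are not reproduced here; the function name is hypothetical, and collapsing repeated spaces is an extra step added for the illustration.

```python
import re


def flatten_payload(raw: str) -> str:
    """Flatten extracted text to a single line before it is serialized."""
    # Swap line breaks and tabs for spaces (the role .Replace() played for us).
    text = raw.replace("\r", " ").replace("\n", " ").replace("\t", " ")
    # Collapse the runs of spaces left behind (an addition in this sketch).
    text = re.sub(r" {2,}", " ", text)
    # Drop leading and trailing whitespace (the role .Trim() played for us).
    return text.strip()
```

A payload such as `"  Vendor: Acme\r\nTotal:\t100.00 \n"` comes out as `"Vendor: Acme Total: 100.00"`.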

What we learned

This project opened our eyes to the true power of Asynchronous Closed-Loop AI. We learned that building enterprise-grade automation isn't about creating a perfect bot that never encounters an error; it's about building an architecture that knows how to handle exceptions gracefully. By combining the processing speed of Gemini 2.5 Flash with the strategic compliance oversight of human decision-making, we proved that AI can safely scale corporate finance operations without losing human control.

What's next for AP Control Tower: Cognitive Guardrails

  1. Hardening the Platform Infrastructure (Production Transition)

Resolve Vendor-Side Serialization Platform Bugs: Move the workflow from a cloud-native preview runner to a dedicated UiPath Windows-Legacy or .NET 8 Unattended Robot hosting environment. This would give us deeper control over local assembly caching, and we expect it to resolve the JobAttachmentBuilder file-stream mapping errors.

Transition to Strict JSON Schema Output: Rather than parsing raw string payloads from Gemini, configure the Google Vertex AI connection block to enforce a strict structured JSON schema output (Response Schema). This should remove the need for the VB.NET string sanitization workarounds, because the engine would return a clean, predictable key-value object on every call.

  2. Deepening the Cognitive Intelligence Layer

Multi-Modal Document Line-Item Analysis: Expand Gemini's evaluation from processing simple text data tables to evaluating raw visual elements on the invoice, such as identifying altered typography, overlapping text, or missing corporate watermarks that signify a forged billing document.

Longitudinal Vendor Behavioral Phenotyping: Connect the pipeline to our historical Google Sheets database to track invoice trends over time. This would let Gemini calculate a moving baseline for each vendor and automatically assign a high fraud score when an invoice bypasses standard payment routing or introduces sudden, unusual line-item cost spikes.

  3. Creating a Closed-Loop Machine Learning Feedback System

Action Center Correction Learning: Implement a data feedback loop that captures the correction data whenever a supervisor manually overrides or corrects a discrepancy inside the UiPath Action Center interface.

Automated Prompt Fine-Tuning: Use those human-validated edge cases to dynamically adjust the system's prompt context, continuously lowering false-positive flags and sharpening the accuracy of the control tower over time.
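As a rough illustration of what strict structured output (the platform-hardening item above) would buy us downstream, the Python sketch below validates a model response against a fixed set of expected fields before anything else touches it. The field names and types are hypothetical, and the check uses only the standard library rather than the Vertex AI response-schema feature itself.

```python
import json

# Hypothetical fields; the real response schema has not been defined yet.
EXPECTED_FIELDS = {
    "fraud_risk_score": int,
    "reasoning_timeline": list,
    "discrepancies": list,
}


def parse_assessment(raw: str) -> dict:
    """Parse a model response and reject anything that does not fit the schema."""
    data = json.loads(raw)  # malformed JSON raises a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for name, expected_type in EXPECTED_FIELDS.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        if not isinstance(data[name], expected_type):
            raise ValueError(f"wrong type for field: {name}")
    score = data["fraud_risk_score"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(score, bool) or not 0 <= score <= 100:
        raise ValueError("fraud_risk_score must be an integer from 0 to 100")
    return data
```

With a guarantee like this at the boundary, the string-flattening workaround becomes unnecessary because malformed responses are rejected before they reach the task payload.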

Built With

  • gemini2.5flash
  • googlecloudvertexai
  • json
  • uipathactioncenter
  • uipathstudioweb
  • vb.net