The Inspiration
I've watched talented people spend the majority of their time not building AI — but preparing data for it. Cleaning columns, hunting for PII, fixing imbalances, running the same quality checks over and over on slightly different datasets. It's the unglamorous work that sits between raw data and real intelligence, and it was crying out for automation.
But not just automation. Reasoning.
Traditional pipelines execute fixed workflows. They don't push back. They don't catch their own mistakes. They hand you bad output and move on. I wanted to build something that actually thinks about what it's doing — and URIS is my answer to that.
What I Built
URIS is a multi-agent orchestration platform that autonomously diagnoses, fixes, and validates datasets for AI readiness. Five specialized agents powered by Amazon Nova 2 Lite work in sequence, and sometimes in opposition. Unlike conventional orchestration pipelines, where agents simply execute steps in order, URIS agents evaluate each other's outputs, reject failing strategies, and force the Planner to revise its approach. That distinction between executing and reasoning is the core of what URIS is.
The core architecture is a three-tier system:
- Frontend — Next.js 14 dashboard with real-time pipeline visualization
- Backend — NestJS REST API with PostgreSQL for run persistence
- Agents Microservice — Python FastAPI service housing all five agents
The agents communicate through structured JSON messages containing confidence scores, risk assessments, and natural language reasoning traces. Every decision is logged. Nothing is hidden.
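To make the message format concrete, here is a minimal sketch of what one such message could look like. The class name, field names, and example values are assumptions made for illustration, not the actual URIS schema.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class AgentMessage:
    """Hypothetical message envelope; field names are illustrative only."""
    agent: str         # which agent produced the message
    verdict: str       # e.g. "approve" or "reject"
    confidence: float  # expected to lie in [0.0, 1.0]
    risk: str          # e.g. "low", "medium", "high"
    reasoning: str     # natural language reasoning trace

    def __post_init__(self) -> None:
        # Reject malformed scores early so downstream agents can trust them.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)


msg = AgentMessage(
    agent="validator",
    verdict="reject",
    confidence=0.92,
    risk="high",
    reasoning="Synthetic output contains exact copies of real rows.",
)

# Every decision is appended to an audit log as serialized JSON.
audit_log = [msg.to_json()]
```

Keeping the envelope small and typed means each message can be logged as-is and validated before the next agent reads it.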
The ADFI Score
One of my favorite innovations in URIS is the Autonomous Data Fitness Index — a single score that tells you exactly how AI-ready your dataset is. It's a weighted combination of:
$$\mathrm{ADFI} = w_1\,\mathrm{Completeness} + w_2\,\mathrm{Uniqueness} + w_3\,\mathrm{Balance} - w_4\,\mathrm{PrivacyRisk} - w_5\,\mathrm{CorrelationDrift}$$
A score above 0.9 means your data is ready. Below 0.7 means the agents have significant work to do.
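The formula translates directly into code. The sketch below assumes every metric is already normalized to the range 0 to 1. The weights are placeholder values chosen for illustration, since the real weights are not given here. Scores between 0.7 and 0.9 are reported as "unspecified" because only the two outer thresholds are defined above.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    completeness: float
    uniqueness: float
    balance: float
    privacy_risk: float
    correlation_drift: float


# Placeholder weights w_1..w_5 (assumed values, not the ones URIS uses).
WEIGHTS = (0.4, 0.3, 0.3, 0.2, 0.1)


def adfi(m: Metrics, weights=WEIGHTS) -> float:
    """Weighted sum of quality terms minus weighted risk terms."""
    w1, w2, w3, w4, w5 = weights
    return (
        w1 * m.completeness
        + w2 * m.uniqueness
        + w3 * m.balance
        - w4 * m.privacy_risk
        - w5 * m.correlation_drift
    )


def readiness(score: float) -> str:
    """Map a score onto the two thresholds stated in the text."""
    if score > 0.9:
        return "ready"
    if score < 0.7:
        return "significant work needed"
    return "unspecified"  # the band in between is not defined in the write-up


clean = Metrics(1.0, 1.0, 1.0, 0.0, 0.0)
messy = Metrics(0.8, 0.9, 0.5, 0.6, 0.3)
print(readiness(adfi(clean)), "/", readiness(adfi(messy)))
```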
The Compliance Policy Engine
One thing I'm particularly proud of is the policy rule builder. Users can define compliance rules in a visual interface — Block, Mask, Flag, Generalise, or Drop — targeting specific PII types with conditional logic. These compile into executable policy directives that the Compliance Agent enforces before any data is modified.
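One way to picture the compile step is as a function that matches rules against detected PII and emits one directive per column. The sketch below is a simplified illustration under stated assumptions: the class names, the `min_confidence` condition, the PII type labels, and the "Age" detection are all hypothetical, and the real rule builder supports richer conditional logic.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    BLOCK = "block"
    MASK = "mask"
    FLAG = "flag"
    GENERALISE = "generalise"
    DROP = "drop"


@dataclass(frozen=True)
class PolicyRule:
    pii_type: str
    action: Action
    min_confidence: float = 0.0  # the rule fires only at or above this score


@dataclass(frozen=True)
class Detection:
    column: str
    pii_type: str
    confidence: float


def compile_directives(rules, detections):
    """Return {column: Action}, using the first rule that matches each detection."""
    directives = {}
    for det in detections:
        for rule in rules:
            if rule.pii_type == det.pii_type and det.confidence >= rule.min_confidence:
                directives[det.column] = rule.action
                break
    return directives


rules = [
    PolicyRule("direct_identifier", Action.MASK, min_confidence=0.9),
    PolicyRule("quasi_identifier", Action.FLAG),
]
detections = [
    Detection("Name", "direct_identifier", 0.99),
    Detection("Age", "quasi_identifier", 0.60),  # hypothetical detection
]
directives = compile_directives(rules, detections)
```

Because the directives are plain data, they can be computed and reviewed before any column is actually modified.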
When I ran the Titanic dataset through URIS, it caught the Name column as a direct identifier with 99% confidence and flagged five other columns for re-identification risk. That's not a programmed rule — that's the agent reasoning about privacy exposure.
How I Built It
I started with the agents microservice because that's where the intelligence lives. Each agent has a clear job:
- Planner — decomposes the task, defines constraints, routes work
- Evaluator — calculates quality metrics and ADFI
- Compliance — scans for PII, enforces policy rules
- Synthesizer — generates statistically similar synthetic data using SDV
- Validator — approves or rejects synthesis output
Amazon Nova 2 Lite powers all the reasoning. The choice was deliberate: constraint-aware strategy revision requires a model that can hold multiple competing objectives simultaneously, evaluate an output against those constraints, and generate a structurally different alternative when the first approach fails. That is not prompt chaining. That is reasoning. Nova 2 Lite's extended thinking capabilities are what make the rejection-revision cycle possible, rather than just the happy path.
The frontend was built to surface the intelligence visually — real-time agent status cards, live reasoning output in a terminal-style view, correlation drift charts, and a risk summary panel that updates as each agent completes.
The Challenges
Synthesis quality was the hardest problem. GaussianCopula — the default strategy — works poorly on small tabular datasets with heavy categoricals. My first runs produced hundreds of exact row matches between real and synthetic data, which is a privacy violation. I had to build post-generation deduplication, switch strategies based on dataset characteristics, and implement multiple validation checks before the output was trustworthy.
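The post-generation deduplication step is the simplest of those safeguards to illustrate. The sketch below uses plain Python tuples as rows and a hypothetical function name. A production version would more likely operate on DataFrames, and exact-match removal is only one of several checks.

```python
from typing import List, Sequence, Tuple

Row = Tuple


def drop_exact_matches(
    real_rows: Sequence[Row], synthetic_rows: Sequence[Row]
) -> Tuple[List[Row], int]:
    """Remove synthetic rows that are exact copies of a real row.

    Returns the surviving rows and the number of rows removed.
    """
    real = {tuple(row) for row in real_rows}
    kept = [tuple(row) for row in synthetic_rows if tuple(row) not in real]
    return kept, len(synthetic_rows) - len(kept)


# Made-up rows for illustration.
real_rows = [("x", 10), ("y", 20)]
synthetic_rows = [("x", 10), ("z", 30), ("y", 25)]

kept, removed = drop_exact_matches(real_rows, synthetic_rows)
```

The removed count is worth reporting alongside the cleaned output, since a high count is itself a signal that the synthesis strategy is memorizing.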
The correlation baseline problem caught me off guard. I was computing correlation drift against the original schema, but imputation and column drops change the schema mid-pipeline. The comparison was invalid. Fixing it meant recomputing the baseline after every transformation step.
Closed-loop revision was architecturally harder than I expected. Getting agents to reject outputs and feed structured reasoning back to the Planner — in a way that actually changed the next strategy rather than just retrying the same approach — required careful message design and state management across the pipeline.
Key Engineering Decisions
Why structured JSON agent communication. The naive approach passes full agent output directly to the next agent. On a medium dataset, the Evaluator alone produces enough output to bloat context significantly by the time the Validator runs. URIS extracts a compact typed handoff object at each step, containing only the fields the next agent actually needs. This eliminated the hallucinations I had been seeing, where the model contradicted its earlier reasoning once the context grew too large.
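A compact handoff can be sketched as a small frozen dataclass plus a projection function. The field names and values below are assumptions chosen for illustration. The point is only that verbose fields, such as per-column statistics and free-text traces, never reach the next agent.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class EvaluatorHandoff:
    """Hypothetical handoff: only what the next agent needs."""
    adfi: float
    completeness: float
    dropped_columns: Tuple[str, ...]


def to_handoff(full_output: Dict[str, Any]) -> EvaluatorHandoff:
    # Project the verbose output down to the typed fields; everything else is left behind.
    return EvaluatorHandoff(
        adfi=float(full_output["adfi"]),
        completeness=float(full_output["completeness"]),
        dropped_columns=tuple(full_output.get("dropped_columns", ())),
    )


# Made-up evaluator output for illustration.
full_output = {
    "adfi": 0.74,
    "completeness": 0.91,
    "dropped_columns": ["col_x"],
    "per_column_stats": {"col_y": {"missing": 12}},
    "reasoning": "long free-text trace ...",
}
handoff = to_handoff(full_output)
```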
Why GaussianCopula first, CTGAN on failure. GaussianCopula is faster and works well on numeric-heavy datasets. For categorical-heavy datasets like Titanic, it memorizes rather than generalizes — producing exact row matches that violate privacy thresholds. The Validator catches this and the Planner revises to CTGAN, which handles categorical distributions better at the cost of longer synthesis time. The choice is deliberate: attempt the cheaper strategy first, escalate on failure.
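The escalate-on-failure pattern can be sketched independently of any synthesis library. In the snippet below, the synthesizer and validator are stand-in functions with hard-coded behaviour, and all names are hypothetical. The real system calls SDV and lets the Planner decide the next strategy from the Validator's reasoning.

```python
from typing import Callable, List, Sequence, Tuple

Row = Tuple
Verdict = Tuple[bool, str]  # (approved, reason)


def synthesize_with_escalation(
    strategies: Sequence[str],
    synthesize: Callable[[str], List[Row]],
    validate: Callable[[List[Row]], Verdict],
):
    """Try strategies in order, cheapest first, until one is approved."""
    rejections = []  # (strategy, reason) pairs to feed back to the planner
    for strategy in strategies:
        output = synthesize(strategy)
        approved, reason = validate(output)
        if approved:
            return strategy, output, rejections
        rejections.append((strategy, reason))
    return None, [], rejections


# Stand-ins for the real synthesizer and validator (assumed behaviour).
REAL_ROWS = [("a", 1), ("b", 2)]


def fake_synthesize(strategy: str) -> List[Row]:
    # Pretend the cheaper strategy memorizes a real row and the fallback does not.
    if strategy == "gaussian_copula":
        return [("a", 1), ("c", 3)]
    return [("c", 3), ("d", 4)]


def fake_validate(rows: List[Row]) -> Verdict:
    copies = sum(1 for row in rows if row in REAL_ROWS)
    if copies:
        return False, f"{copies} exact row match(es) with real data"
    return True, "no exact matches"


chosen, rows, rejections = synthesize_with_escalation(
    ["gaussian_copula", "ctgan"], fake_synthesize, fake_validate
)
```

Returning the rejection reasons alongside the result keeps the audit trail intact even when the first strategy fails.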
Why the correlation baseline recomputes after imputation. Computing drift against the original schema after imputation and column drops produces invalid comparisons — the matrices have different shapes. URIS recomputes the baseline correlation matrix after every transformation step, ensuring drift is measured against what the data actually looked like at that point in the pipeline, not what it looked like at upload.
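The shape mismatch is easy to reproduce with a toy example. The sketch below represents a table as a dict of numeric columns and uses maximum absolute difference in pairwise Pearson correlation as the drift measure. Both are simplifying assumptions, and the function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def corr_pairs(table):
    """Correlation for every unordered pair of columns."""
    return {
        (a, b): pearson(table[a], table[b])
        for a, b in combinations(sorted(table), 2)
    }


def drift(baseline, candidate):
    """Largest absolute change in any pairwise correlation."""
    if baseline.keys() != candidate.keys():
        raise ValueError("correlation matrices cover different columns")
    return max((abs(baseline[k] - candidate[k]) for k in baseline), default=0.0)


original = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 9], "c": [5, 1, 4, 2]}
transformed = {k: v for k, v in original.items() if k != "c"}  # column drop
synthetic = {"a": [1, 2, 3, 4], "b": [2, 5, 6, 8]}

baseline = corr_pairs(transformed)  # recomputed after the transformation
score = drift(baseline, corr_pairs(synthetic))
```

Calling `drift(corr_pairs(original), corr_pairs(synthetic))` raises here, because the original still contains column `c`. That is the invalid comparison described above.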
Why a custom policy DSL over hardcoded rules. Hardcoded compliance rules would make URIS brittle — every new regulation or organizational policy would require a code change. The policy rule builder compiles user-defined directives into executable policy objects that the Compliance Agent enforces. The same engine that enforces GDPR today can enforce a custom internal data governance policy tomorrow without touching the agent code.
What I Learned
The biggest lesson was that autonomy requires failure handling more than it requires success handling. Any pipeline can execute a happy path. What makes a system truly autonomous is what it does when something goes wrong — does it fail silently, or does it reason about the failure and try something different?
Every architectural decision in URIS was made in service of that question. The structured JSON agent protocol, the confidence scores, the rejection mechanism, the audit log — all of it exists so that when the system fails, it fails transparently and recovers intelligently.
Amazon Nova 2 Lite made the reasoning layer possible. Without a model capable of genuine multi-step reasoning under constraints, the agents would just be glorified functions. With it, they make judgment calls.
What's Next
The closed-loop validation is working. The compliance engine is working. The ADFI scoring is working. What comes next is making the loop smarter — better synthesis strategies, faster agent communication, and opening the pipeline to team-level use so that the same autonomous intelligence one data scientist uses today becomes shared infrastructure for an entire organization.
Data preparation shouldn't be the bottleneck between humans and AI. URIS removes it.
Built With
- amazon-nova
- nestjs
- nextjs
- postgresql
- supabase