## Inspiration

Every retailer in Vietnam sits on the same goldmine and the same problem: years of sales, customer, and inventory data — and no one with the time to turn it into a decision by Monday morning. Hiring a data team isn't realistic for most SMEs and mid-market enterprises, so the data just sits there and decisions get made on gut feel.

We asked a sharper question: what if the data engineer were an agent? Not a chatbot that answers questions, but an agent that ingests raw business data, cleans it, reasons over it, and hands a store manager a ranked list of what to do this week, in plain Vietnamese, with the money impact attached.

## What it does

Kaori Retail Agent turns a retailer's raw sales data into prioritized, explainable next-best-actions — no data engineer required.

  1. Upload anything — messy Excel/CSV of sales, customers, transactions.
  2. Auto-pipeline (12-stage Medallion: Bronze → Silver → Gold) — schema detection, cleaning, Vietnamese-aware PII redaction, a 7-dimension quality gate, and semantic enrichment. Bad data is flagged, not silently trusted.
  3. Agentic reasoning — surfaces revenue at risk (which customers are churning, which SKUs are bleeding margin) and generates next-best-actions ranked by money impact, each with an explanation and an audit trail.
  4. Decide in manager language — outputs speak Vietnamese business terms ("doanh thu có nguy cơ mất", "khách cần giữ chân"), not "inference" and "dtype".
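
As a rough illustration of step 3, ranking by money impact can be sketched as below. This is not the Kaori implementation: the `NextBestAction` class, its fields, and the sample figures are hypothetical and exist only to show the shape of the output a manager would see.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NextBestAction:
    """Hypothetical record for one recommended action."""
    label: str                 # manager-facing wording
    revenue_at_risk_vnd: int   # estimated money impact, in VND
    explanation: str           # why the agent suggests this


def rank_actions(actions):
    """Return actions ordered from largest to smallest money impact."""
    return sorted(actions, key=lambda a: a.revenue_at_risk_vnd, reverse=True)


# Made-up sample data, for illustration only.
sample = [
    NextBestAction("SKU bleeding margin", 12_000_000, "Margin fell for 3 weeks"),
    NextBestAction("khách cần giữ chân", 45_000_000, "No purchase in 60 days"),
]

for action in rank_actions(sample):
    print(f"{action.revenue_at_risk_vnd:>12,} VND  {action.label}: {action.explanation}")
```

Sorting on a single money field keeps the ranking easy to explain: the first row is always the action with the most revenue at stake.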

The North Star is deliberately blunt: revenue at risk that a human actually actioned.

## How we built it

Kaori is a production multi-tenant B2B platform, not a weekend prototype:

  • 6 services — Java Spring Cloud Gateway + Auth, and Python FastAPI for the data pipeline, AI orchestrator, LLM gateway, and notifications.
  • Local-first LLM — Qwen 2.5 14B + BGE-M3 embeddings run on our own infra via Ollama by default, so customer data never leaves the system. External vendors are strictly opt-in, gated behind consent + PII masking.
  • CDFL reasoning engine ("học 1 hiểu 10" — learn one, understand ten) — a grounding layer with an |OR| coverage gate: if the agent lacks enough grounded knowledge to answer, it declines instead of hallucinating.
  • Memory palace + knowledge aging — the agent consolidates experience, reinforces what's verified, and decays what's stale.
  • PostgreSQL 15 + pgvector with row-level-security tenant isolation, plus Redis, Kafka, Temporal, MinIO, ClickHouse, and OpenTelemetry tracing throughout.
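
One way to picture knowledge aging is exponential decay with reinforcement on verification. The sketch below is only that: the half-life, the boost size, and the `MemoryItem` fields are assumptions made for illustration, not the formula Kaori uses.

```python
from dataclasses import dataclass

HALF_LIFE_DAYS = 30.0  # assumed value; the real decay rate is not stated


@dataclass(frozen=True)
class MemoryItem:
    """Hypothetical memory entry."""
    fact: str
    strength: float           # 0.0 (forgotten) to 1.0 (fully trusted)
    last_verified_day: float  # day number of the last verification


def current_strength(item, today):
    """Strength halves for every HALF_LIFE_DAYS without verification."""
    age = max(0.0, today - item.last_verified_day)
    return item.strength * 0.5 ** (age / HALF_LIFE_DAYS)


def reinforce(item, today, boost=0.2):
    """Verification adds a boost to the decayed strength and resets the clock."""
    decayed = current_strength(item, today)
    return MemoryItem(item.fact, min(1.0, decayed + boost), today)
```

Under these assumptions, a fact stored at strength 0.8 drops to about 0.4 after 30 unverified days, and a verification on that day lifts it back to about 0.6.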

## Challenges we ran into

  • Grounding without hallucination. Our first coverage gate let the quantity of weak matches compensate for quality — so we switched the |OR| gate to max-aggregation: one strong grounded citation now beats ten weak ones.
  • Running an LLM locally, fast enough. Keeping inference inside a 30-second budget meant bounding LLM calls in the request path and degrading gracefully per item instead of failing the whole run.
  • Vietnamese-aware privacy. Names, phone numbers, and ID numbers in Vietnamese text have to be redacted correctly before any reasoning step.
  • Multi-tenant isolation as an invariant, not a hope. Making "zero cross-tenant leak" something we test on every query, not something we trust.
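
The difference between the two coverage gates described in the first bullet can be shown with a toy comparison. The scores, the threshold of 0.7, and the function names are hypothetical, and the sketch does not model what the |OR| measure computes internally; it only contrasts sum-style aggregation with max-aggregation.

```python
def coverage_sum(scores):
    """Sum-style aggregation: many weak matches can add up to 'enough'."""
    return min(1.0, sum(scores))


def coverage_max(scores):
    """Max-aggregation: only the single strongest match counts."""
    return max(scores, default=0.0)


def should_answer(scores, threshold=0.7, aggregate=coverage_max):
    """Answer only when aggregated coverage clears the threshold; else decline."""
    return aggregate(scores) >= threshold


ten_weak = [0.1] * 10   # ten weak matches
one_strong = [0.85]     # one strong grounded citation

print(should_answer(ten_weak, aggregate=coverage_sum))  # the sum gate lets weak evidence through
print(should_answer(ten_weak))                          # the max gate declines
print(should_answer(one_strong))                        # the max gate answers
```

With max-aggregation, piling up more weak matches never changes the outcome, which is the property the gate needs in order to decline reliably.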

## Accomplishments that we're proud of

  • It's built to be deployed, not just demoed. Because this event is about deployment conversations, we built the governance an enterprise actually needs to say yes. EU AI Act compliance is built in: risk classification per AI use case, human-oversight gates before high-risk side effects, machine-readable AI-output disclosure, Annex IV model cards, an incident register, and bias examination inside the quality gate.
  • Every automated decision is auditable — confidence, alternatives, and lineage are logged, so a manager can always ask "why did the AI say this?"
  • A real working platform — multi-tenant, privacy-first, with thousands of automated tests across services and a multi-language frontend (vi / en / ja / ko / zh).
  • An agent that knows when to say "I don't know" — the discipline to decline turned out to be the hardest and most valuable thing we shipped.

## What we learned

Teaching an agent a concept and letting it generalize beats hardcoding rules — a single "money" principle let it reason across cases we never explicitly coded. And the gap between an impressive demo and a deployable system is almost entirely trust: isolation, auditability, and the discipline to decline. Production-readiness isn't a feature you add at the end — it's the thing you design around from the first commit.

## What's next for Kaori Retail Agent — Decisions from Your Sales Data

  • A guided pilot with a Vietnamese retailer (the Retail track brief).
  • Self-hosted LLM tuning for Vietnamese retail vocabulary.
  • Deeper process-mining and adoption analytics to close the loop from decision to measured outcome.
  • Rolling out the full multilingual UI (i18n already in place across 5 languages) for regional ASEAN expansion.
