## Inspiration
Every retailer in Vietnam sits on the same goldmine and the same problem: years of sales, customer, and inventory data — and no one with the time to turn it into a decision by Monday morning. Hiring a data team isn't realistic for most SMEs and mid-market enterprises, so the data just sits there and decisions get made on gut feel.
We asked a sharper question: what if the data engineer were an agent? Not a chatbot that answers questions, but an agent that ingests raw business data, cleans it, reasons over it, and hands a store manager a ranked list of what to do this week, in plain Vietnamese, with the money impact attached.
## What it does
Kaori Retail Agent turns a retailer's raw sales data into prioritized, explainable next-best-actions — no data engineer required.
- Upload anything — messy Excel/CSV of sales, customers, transactions.
- Auto-pipeline (12-stage Medallion: Bronze → Silver → Gold) — schema detection, cleaning, Vietnamese-aware PII redaction, a 7-dimension quality gate, and semantic enrichment. Bad data is flagged, not silently trusted.
- Agentic reasoning — surfaces revenue at risk (which customers are churning, which SKUs are bleeding margin) and generates next-best-actions ranked by money impact, each with an explanation and an audit trail.
- Decide in manager language — outputs speak Vietnamese business terms ("doanh thu có nguy cơ mất", "khách cần giữ chân"), not "inference" and "dtype".
The North Star is deliberately blunt: revenue at risk that a human actually actioned.
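As a rough illustration of "ranked by money impact", the ordering step can be sketched as a plain sort over candidate actions. This is a minimal sketch, not the platform's actual code: the `NextBestAction` class, its field names, and the sample figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NextBestAction:
    # Hypothetical fields: the write-up only says each action carries
    # a money impact and an explanation.
    title: str
    revenue_at_risk_vnd: int
    explanation: str


def rank_actions(actions: list[NextBestAction]) -> list[NextBestAction]:
    """Return actions ordered by money impact, largest first.

    sorted() is stable, so actions with equal impact keep their input order.
    """
    return sorted(actions, key=lambda a: a.revenue_at_risk_vnd, reverse=True)


# Illustrative values only, not real results.
sample_actions = [
    NextBestAction("Reprice SKU A", 12_000_000, "Margin negative for 3 weeks"),
    NextBestAction("Call churning customers", 45_000_000, "No purchase in 60 days"),
    NextBestAction("Restock SKU B", 8_000_000, "Stock-outs on weekends"),
]
ranked = rank_actions(sample_actions)
```

Keeping the explanation on the same record as the amount means the ranked list a manager sees always carries its own "why".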
## How we built it
Kaori is a production multi-tenant B2B platform, not a weekend prototype:
- 6 services — Java Spring Cloud Gateway + Auth, and Python FastAPI for the data pipeline, AI orchestrator, LLM gateway, and notifications.
- Local-first LLM — Qwen 2.5 14B + BGE-M3 embeddings run on our own infra via Ollama by default, so customer data never leaves the system. External vendors are strictly opt-in, gated behind consent + PII masking.
- CDFL reasoning engine ("học 1 hiểu 10": learn one, understand ten): a grounding layer with an |OR| coverage gate. If the agent lacks enough grounded knowledge to answer, it declines instead of hallucinating.
- Memory palace + knowledge aging: the agent consolidates experience, reinforces what's verified, and decays what's stale.
- PostgreSQL 15 + pgvector with row-level-security tenant isolation, plus Redis, Kafka, Temporal, MinIO, ClickHouse, and OpenTelemetry tracing throughout.
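One way to picture knowledge aging is a confidence score that decays with time since last verification and is pushed back up when a fact is re-verified. The sketch below assumes exponential decay with a half-life and a simple reinforcement rule; `MemoryItem`, the 30-day half-life, and the `boost` parameter are assumptions for illustration, not details from the platform.

```python
from dataclasses import dataclass


@dataclass
class MemoryItem:
    fact: str
    confidence: float      # in [0, 1]
    age_days: float = 0.0  # days since the fact was last verified


def decayed_confidence(item: MemoryItem, half_life_days: float = 30.0) -> float:
    """Confidence after exponential decay: halves every half_life_days."""
    return item.confidence * 0.5 ** (item.age_days / half_life_days)


def reinforce(item: MemoryItem, boost: float = 0.2) -> MemoryItem:
    """Move confidence a fraction `boost` of the way toward 1.0 and reset age."""
    new_confidence = item.confidence + boost * (1.0 - item.confidence)
    return MemoryItem(item.fact, new_confidence, age_days=0.0)
```

Under these assumptions a fact that nobody re-verifies fades smoothly instead of being dropped at a hard cutoff, while a verified fact never exceeds a confidence of 1.0.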
## Challenges we ran into
- Grounding without hallucination. Our first coverage gate let the quantity of weak matches compensate for quality, so we switched the |OR| gate to max-aggregation: one strong grounded citation now beats ten weak ones.
- Running an LLM locally, fast enough. Keeping inference inside a 30s budget meant bounding LLM calls in the request path and degrading gracefully per-item instead of failing the whole run.
- Vietnamese-aware privacy. Correctly redacting names, phones, and IDs in Vietnamese before any reasoning step.
- Multi-tenant isolation as an invariant, not a hope. Making "zero cross-tenant leak" something we test on every query, not something we trust.
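The difference between the two coverage-gate behaviours can be sketched in a few lines. This assumes the earlier gate behaved like a capped sum of match scores (the write-up only says quantity could compensate for quality); the function names, the score scale of 0 to 1, and the 0.7 threshold are hypothetical.

```python
def coverage_sum(scores: list[float]) -> float:
    """Assumed earlier behaviour: many weak matches add up to high coverage."""
    return min(1.0, sum(scores))


def coverage_max(scores: list[float]) -> float:
    """Max-aggregation: only the single strongest grounded match counts."""
    return max(scores, default=0.0)


def answer_or_decline(scores: list[float], threshold: float = 0.7) -> str:
    """Decline whenever no single citation is strong enough."""
    return "answer" if coverage_max(scores) >= threshold else "decline"


ten_weak = [0.2] * 10
one_strong = [0.9]
```

With these illustrative scores, `coverage_sum(ten_weak)` reaches the cap and would pass the gate, while `coverage_max(ten_weak)` stays at 0.2 and the agent declines; `one_strong` passes under max-aggregation.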
## Accomplishments that we're proud of
- It's built to be deployed, not just demoed. Because this event is about deployment conversations, we built the governance an enterprise actually needs to say yes: EU AI Act compliance built in — risk classification per AI-use, human-oversight gates before high-risk side effects, machine-readable AI-output disclosure, Annex IV model cards, an incident register, and bias examination inside the quality gate.
- Every automated decision is auditable — confidence, alternatives, and lineage are logged, so a manager can always ask "why did the AI say this?"
- A real working platform — multi-tenant, privacy-first, with thousands of automated tests across services and a multi-language frontend (vi / en / ja / ko / zh).
- An agent that knows when to say "I don't know" — the discipline to decline turned out to be the hardest and most valuable thing we shipped.
## What we learned
Teaching an agent a concept and letting it generalize beats hardcoding rules — a single "money" principle let it reason across cases we never explicitly coded. And the gap between an impressive demo and a deployable system is almost entirely trust: isolation, auditability, and the discipline to decline. Production-readiness isn't a feature you add at the end — it's the thing you design around from the first commit.
## What's next for Kaori Retail Agent — Decisions from Your Sales Data
- A guided pilot with a Vietnamese retailer (the Retail track brief).
- Self-hosted LLM tuning for Vietnamese retail vocabulary.
- Deeper process-mining and adoption analytics to close the loop from decision to measured outcome.
- Rolling out the full multilingual UI (i18n already in place across 5 languages) for regional ASEAN expansion.
## Built With
- docker
- fastapi
- java
- kafka
- nextjs
- ollama
- postgresql
- python
- springboot
- sql
- typescript