Inspiration
Shinhan Bank Vietnam holds three rare, compounding advantages:
- 2 million live SOL-app users — distribution most fintechs would kill for.
- A direct consumer-lending licence — zero dependency on third-party originators.
- The cheapest cost of funds in market (3–5% via deposits) — a structural margin moat.
And yet — Shinhan has no embedded BNPL at merchant checkout, no AI-native money coach in the SOL app, and no working-capital product for SMEs. Meanwhile MoMo, Fundiin, Atome, and Kredivo are quietly eating a $2.7B Vietnamese BNPL market projected to hit $7.1B by 2031 (CAGR 21.4%); 5 million Vietnamese SMEs sit on rich KiotViet / Sapo POS data but cannot get a working-capital loan because they lack audited statements; and the SOL app's engagement gap versus super-apps keeps widening.
We saw the wedge: one Qwen-powered AI platform that turns Shinhan's distribution + capital advantage into three new revenue lines — without three engineering teams, three ledgers, or three compliance regimes. Built on Alibaba Cloud and Qwen3 from day one, so the same agent runtime, same knowledge graph, and same Circular 39 audit pipe serve consumer BNPL today, AI coaching tomorrow, and SME credit in Year 2.
That is Shinhan SOL Intelligence.
What it does
The platform exposes 12 live use cases organised into 4 strategic pillars. Every use case is wired to the same Postgres ledger, the same Apache AGE knowledge graph, the same SSE streaming bus, and the same Qwen model gateway.
| # | Code | Use case | Pillar | Primary Qwen model | Key metric |
|---|---|---|---|---|---|
| 1 | SB6 | 6-second embedded BNPL at merchant checkout | Acquisition | Qwen3-VL-Plus + Qwen3-Max-Thinking | <8s decision · 3 revenue streams (MDR + origination + interest) |
| 2 | SB1 | Agentic AI money coach in SOL app | Retention | Qwen3-Max + Qwen-Plus | 78th-percentile peer benchmark in <2s · 60/40 proactive/reactive |
| 3 | SB5 | Agentic BI co-pilot for executives | Intelligence | Qwen-Plus + Qwen-Flash + text-embedding-v3 | 0.806 relevance · regulation-cited answers |
| 4 | SB2 | 11-journey customer-lifecycle engine | Engagement | Qwen-Plus | 44 content variants · 11 DAGs · cross-sell uplift |
| 5 | SB9 | SME pre-approved credit from POS data | SME | Qwen3-Max | 50–500M VND lines · daily-sales auto-sweep |
| 6 | SB3 | Loyalty + dynamic single-use QR offers | Engagement | Qwen-Plus + text-embedding-v3 | Thompson-sampling bandit · per-segment offer ranking |
| 7 | SF1+SF2 | Loan IDP + 5-signal forgery detection | Compliance | Qwen3-VL-Plus | 0.91 authenticity · <3s per doc |
| 8 | SF4 | Next-Best-Action CRM (RFM × KG × sector) | Retention | Qwen3-Max | 0.80 top-NBA score · graph-grounded rationale |
| 9 | SF11 | Earned Wage Access (EWA) | SME / HR | Qwen3-Max | Up to 50% earned-but-unpaid drawdown · self-amortising |
| 10 | SS5 | RegTech compliance scanner | Compliance | text-embedding-v3 + Qwen3-Max | Every alert cites the exact Circular 39 article |
| 11 | Meta | Cross-box trace replay (one customer × four pillars × 30 days) | Ops | Qwen-Plus | Full audit trail per customer · colour-coded by pillar |
| 12 | BNPL-SDK | Drop-in JavaScript checkout SDK + demo merchant | Acquisition | (consumes SB6) | HMAC-signed webhooks · idempotent disbursement |
The four pillars compound:
┌─────────────────────────────┐
│ Banking Knowledge Graph │
│ (Apache AGE on Postgres) │
└──────┬───────────┬──────────┘
│ │
┌───────────────┘ └───────────────┐
▼ ▼
ACQUISITION INTELLIGENCE
┌──────────┐ ┌──────────┐
│ SB6 │ │ SB5 │
│ BNPL │ ──── customer events ────────▶ │ Agentic │
│ SF1/SF2 │ │ BI │
│ BNPL-SDK│ │ SS5 │
└────┬─────┘ └────┬─────┘
│ creates customers + loans │ feeds insights back
▼ ▼
RETENTION ENGAGEMENT
┌──────────┐ ┌──────────┐
│ SB1 │ │ SB2 │
│ Coach │ ──── intents trigger ─────────▶ │ Journeys│
│ SF4 NBA │ │ SB3 QR │
│ SF11 EWA│ │ SF11 │
└──────────┘ └──────────┘
How we built it
Three sub-sections: (A) Qwen application & effectiveness, (B) Alibaba Cloud architecture, (C) Knowledge graph topology.
(A) Qwen application — which model, where, and why
We use five Qwen models with strict, cost-aware routing. No single Qwen model is best for everything; the platform routes per task.
| Model | Endpoint | Where it runs | Why this model | Approx cost |
|---|---|---|---|---|
| Qwen3-VL-Plus | DashScope SG | CCCD OCR · payslip extraction · selfie-to-CCCD face match · invoice IDP · forgery scoring (5 signals in 1 call) | Vietnamese diacritic OCR ~6% CER — beats GPT-4V & Gemini on Vietnamese CCCD; native multimodal scoring | ~$0.015 / 1K tok |
| Qwen3-Max-Thinking | DashScope SG | Credit decisioning rationale · NBA explanation · EWA reasoning · SB1 deep coaching turns | Strongest reasoning + thinking trace for adverse-action explanations regulators can audit | Premium reasoning tier |
| Qwen-Plus | DashScope SG | BI planner · BI synthesiser · NL → Cypher · NL → SQL · SB1 small-talk · SB2 copy variants · SB3 offer copy | 0.6s p50 latency, ~$0.004/turn — sweet spot for high-volume reasoning where ultra-frontier isn't required | ~$0.004 / 1K tok |
| Qwen-Flash | DashScope SG | BI intent classification · cheap routing · structured-extract micro-tasks | Sub-300ms cold turn; we use it before any expensive reasoning to gate traffic | Lowest tier |
| Qwen text-embedding-v3 | DashScope SG | pgvector RAG over SBV regulations · transaction semantic search · NBA peer similarity · SS5 regulation lookup | 1024-dim, multilingual; IVFFlat cosine indexed in Postgres | ~$0.0002 / 1K tok |
All Qwen calls go through one OpenAI-compatible client at backend/app/alibaba/dashscope.py:
client = AsyncOpenAI(
api_key=settings.dashscope_api_key,
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
timeout=30.0,
max_retries=0, # we own retries via tenacity
)
Why a hybrid design (Qwen + hard rules) — not pure LLM scoring. Pure-LLM credit decisioning is non-shippable in a regulated lending product. Our LangGraph topology enforces this contractually:
┌─────────────────┐ no ┌──────────┐
start ──▶│ extract_cccd │─────────▶│ decide │ (reject if extraction fails)
└────────┬────────┘ │ (final) │
│ ok └──────────┘
▼ ▲
┌─────────────────┐ fail │
│ face_match │───────────────┤ (reject if similarity < 0.75)
└────────┬────────┘ │
│ ok │
▼ │
┌─────────────────┐ │
│ extract_payslip │ │
└────────┬────────┘ │
▼ │
┌─────────────────┐ (parallel) │
│ fetch_signals │ CIC + telco │
└────────┬────────┘ │
▼ │
┌─────────────────┐ DTI > 50% │
│ calculate_dti │───────────────┤
└────────┬────────┘ │
▼ │
┌─────────────────┐ risk>0.80 │
│ run_fraud_check │───────────────┤
└────────┬────────┘ │
▼ │
┌─────────────────────┐ │
│ decide_credit_limit │ ──────────┘
│ (Qwen3-Max-Thinking)│
└─────────────────────┘
Hard rules — enforced in Python, never in the prompt:
- $\text{age} \in [18, 65]$
- $\text{face_match} \geq 0.75$
- $\text{fraud_risk} \leq 0.80$
- $\text{DTI} \leq 0.50$ where $\text{DTI} = \dfrac{\text{existing_monthly_obligation} + \text{new_monthly_payment}}{\text{monthly_income}}$
- No NPL history (SBV groups 3–5)
Tier assignment (deterministic; the LLM cannot move you up a tier — only down with reason):
$$ \text{tier} = \begin{cases} \text{prime} & \text{if } \text{cic} \geq 720 \land \text{DTI} < 0.30 \quad\text{(0\% APR, cap 80M VND)}\ \text{near-prime} & \text{if } 650 \leq \text{cic} < 720 \lor 0.30 \leq \text{DTI} < 0.40 \quad\text{(18\% APR, cap 50M)}\ \text{sub-prime} & \text{otherwise} \quad\text{(26\% APR, cap 10M)} \end{cases} $$
Effectiveness numbers we measured:
- Vietnamese OCR: Qwen3-VL-Plus ~6% CER on Vietnamese CCCD vs measurably worse for GPT-4V / Gemini.
- End-to-end credit decision latency: target p95 6.8 s; observed median 6.2 s.
- BI agent grounding lift: KG-grounded Cypher vs. naïve text-to-SQL — 3.4× accuracy lift (16.7% → 56.2%).
- Graphiti temporal-memory benchmark (applied to SB1): +18.5 pp accuracy, 90% latency reduction, 98.6% context compression vs full-conversation baseline.
- Cost per BNPL decision: ~$0.05 in Qwen tokens (CCCD + face + payslip + reasoning).
(B) Alibaba Cloud architecture — production-grade, 100% native
┌──────────────────────────────────────────────────┐
│ Cloudflare DNS + flexible SSL │
│ shinhansol.vsf.vin · api.shinhansol… │
└────────────────────────┬─────────────────────────┘
│ HTTPS
▼
┌──────────────────────────────────────────────────────────────┐
│ Alibaba Cloud · ap-southeast-1 (Singapore) │
│ │
│ ┌────────────────────┐ ┌──────────────────────┐ │
│ │ ECS (Docker host) │ │ Model Studio │ │
│ │ ┌────────────────┐ │ HTTPS │ (DashScope SG) │ │
│ │ │ Next.js 15 │◀┼────────┤ Qwen3-VL-Plus │ │
│ │ │ (frontend) │ │ │ Qwen3-Max-Thinking │ │
│ │ └────────┬───────┘ │ │ Qwen-Plus / Flash │ │
│ │ │ SSE │ │ text-embedding-v3 │ │
│ │ ┌────────▼───────┐ │ └──────────────────────┘ │
│ │ │ FastAPI │ │ │
│ │ │ + LangGraph │◀┼──── presigned PUT ───────┐ │
│ │ └────┬───────────┘ │ │ │
│ └──────┼─────────────┘ │ │
│ │ ┌───────────▼────────┐│
│ │ │ OSS ││
│ │ │ cccd/, selfie/, ││
│ │ │ payslip/ ││
│ │ │ KMS SSE · 5-min ││
│ │ │ presigned URL ││
│ ▼ │ · private ACL ││
│ ┌──────────────────────────┐ └────────────────────┘│
│ │ ApsaraDB RDS Postgres │ │
│ │ Serverless (0.5–14 RCU) │ │
│ │ VPC-only, private subnet │ │
│ │ Extensions: │ │
│ │ • Apache AGE │ ◀── Cypher (banking_kg) │
│ │ • pgvector (1024-dim) │ ◀── embedding RAG │
│ └──────────────────────────┘ │
│ ┌──────────────────────────┐ │
│ │ Redis │ idempotency · sessions · │
│ │ │ slowapi · SSE replay │
│ └──────────────────────────┘ │
│ ┌──────────────────────────┐ │
│ │ ACR · SLS · ActionTrail │ CI/CD · logs · audit · │
│ │ Cloud Monitor · KMS │ metrics · CMK encryption │
│ └──────────────────────────┘ │
│ │
│ (PoC roadmap, not in MVP) PAI-EAS — XGBoost fraud model │
└──────────────────────────────────────────────────────────────┘
| Layer | Service | What we use it for | Where it's wired |
|---|---|---|---|
| AI gateway | Model Studio (DashScope, SG) | Every Qwen call, OpenAI-compatible | backend/app/alibaba/dashscope.py · vision.py · reasoning.py |
| Compute | ECS (Singapore) | Single-host docker-compose for MVP; horizontally-scalable design | .github/workflows/deploy.yml · infra/docker-compose.yml |
| Registry | ACR | Image push from CI; pull on ECS | CI workflow + IMAGE_REGISTRY env |
| Data | ApsaraDB RDS PostgreSQL Serverless | Banking ledger, AGE graph, pgvector embeddings | app/db/* and app/graph/* |
| Storage | OSS | CCCD / selfie / payslip images, presigned PUT direct from browser | app/services/storage_service.py |
| Encryption | KMS | Customer-managed key for OSS SSE-KMS | OSS bucket policy |
| Cache | Redis | Idempotency keys, SSE replay buffer, slowapi limiter | app/services/redis_client.py |
| Logging | SLS | Structured app logs (request_id, agent_node, latency_ms, cost_usd) | app/services/logging_service.py |
| Audit | ActionTrail | API call audit per user / per endpoint | Account-level |
| Metrics | Cloud Monitor | RDS, ECS, OSS metrics + alerts | Dashboards |
| Edge | Cloudflare | Flexible SSL, DNS, DDoS shielding | Cloudflare zone |
| ML inference (roadmap) | PAI-EAS | XGBoost fraud model — replaces rule stub at PoC W7 | Planned |
Security & compliance posture:
- PII at rest: OSS objects encrypted with KMS-managed CMK (SSE-KMS); private ACL; 5-min presigned URLs only.
- PII in transit: HTTPS everywhere; SSE frames pass through a PII scrubber in
app/services/astream_mapper.pybefore leaving the process. - PII in DB: CCCD, phone, email tokenised in BI views; column-level role gating.
- Idempotency: Redis-backed; second POST/GET on same
Idempotency-Keyreplays the cached response — no double lending. - Audit: every loan write produces (a) double-entry ledger row, (b)
agent_tracesjsonb row with full tool sequence, (c) ActionTrail entry, (d) Circular 39 mapped row available via CSV export. - Compliance frameworks mapped: SBV Circular 39/2016/TT-NHNN (lending), Decree 13/2023 (PDPL), Decree 94/2023 (sandbox).
(C) Knowledge graph topology — the moat
Vanilla text-to-SQL hits 10–20% accuracy in production. Our agentic BI grounds queries in a banking knowledge graph so "why" questions become traversal problems, not aggregation guesses.
Schema — 12 vertex labels, 16 edge labels (10 currently seeded):
| Vertex | Key properties |
|---|---|
| Customer | id, name, risk_tier |
| Loan | id, principal, status, dpd, outstanding |
| Merchant | id, name, category |
| Sector | id, code (VSIC), name, npl_rate, risk_level |
| Region | id, code, name, urban_rural (63 VN provinces) |
| Employer | id, name, employee_count |
| Regulation | id, name, circular_number, issuer |
| RiskEvent | id, name, severity, occurred_at |
| LoanCategory | id, name, group_number, provision_rate (SBV groups 1–5) |
| CreditScore | id, score, tier, source, valid_from, valid_to |
| ComplianceRule | id, type, description, loan_group |
| MerchantCategory | id, name |
| Edge | Direction | Cardinality |
|---|---|---|
| HAS_LOAN | Customer → Loan | 1 : N |
| WORKS_AT | Customer → Employer | N : 1 |
| LIVES_IN | Customer → Region | N : 1 |
| IN_SECTOR | Employer → Sector | N : 1 |
| LOCATED_IN | Employer → Region | N : 1 |
| ORIGINATED_AT | Loan → Merchant | N : 1 |
| CLASSIFIED_AS | Loan → LoanCategory | N : 1 |
| AFFECTS | RiskEvent → Sector | N : M |
| REQUIRES | Regulation → ComplianceRule | 1 : N |
| APPLIES_TO | ComplianceRule → LoanCategory | N : M |
Runtime query pipeline (NL → Cypher → answer):
"Why did manufacturing NPL increase in Q3?"
│
▼
classify_intent_node ── Qwen-Flash ──▶ intent = "causal"
│
▼
plan_tasks_node ── Qwen-Plus ──▶ DAG: [T1 sql_query, T2 kg_traverse, T3 chart]
│
▼
execute_tools_node ──▶ T2: kg_traverse_tool
│
▼
Qwen-Plus generates Cypher with KG_SCHEMA_SUMMARY in context
│
▼
validate_and_sanitize_cypher()
• forbid CREATE/MERGE/DELETE/SET/DROP/CALL db./LOAD CSV
• inject LIMIT (max_rows = 10 000)
• cap hops (bi_max_hops = 4)
│
▼
age_client.cypher_query(graph='banking_kg', cypher=...)
SQL: SELECT * FROM cypher('banking_kg', $$ ... $$) AS (col agtype, ...)
│
▼
_rows_to_graph() → {nodes, edges, causal_chain}
│
▼
synthesize_node ── Qwen-Plus ──▶ Markdown narrative + citations + chart configs
Topology of the headline scenario — Manufacturing NPL Cascade:
Regulation(Circular 39/2016/TT-NHNN)
│ REQUIRES
▼
ComplianceRule(provisioning, group=3, rate=0.50)
│ APPLIES_TO
▼
LoanCategory(group_number=3, "Sub-standard")
▲
│ CLASSIFIED_AS
│
Loan(status=defaulted, dpd≥90, outstanding=14.1B VND)
▲ │
│ HAS_LOAN │ ORIGINATED_AT
│ ▼
Customer ──WORKS_AT──▶ Employer ──IN_SECTOR──▶ Sector(VSIC "C", Manufacturing, npl_rate=0.068)
│ LOCATED_IN ▲
▼ │ AFFECTS
Region(Binh Duong) RiskEvent("Supply Chain Disruption", 2024-07-01)
Pre-planted discovery scenarios (each verified by plant_discoveries() at seed time):
- Manufacturing NPL cascade — 142 NPL loans, 97 in Manufacturing, 68 in Binh Duong; 12.3B VND exposure traced through 4 hops to a single Risk Event.
- Merchant quality divergence — Electronics merchants 2.1% NPL vs Fashion 8.7% in HCMC; correlates with customer income segment.
- Regulatory provisioning gap — Circular 39 requires 50% provision for Group 3; current pool sits at 20%, generating a 4.23B VND shortfall computed end-to-end on the graph.
Why this is hard to fake:
- Apache AGE on Postgres means the graph lives in the same RDS as the ledger — same backup, same VPC, same audit trail. No second datastore to break.
- pgvector alongside AGE means semantic regulation lookup and graph traversal answer the same question in one transaction.
- Qwen-Plus writes the Cypher; Python validates it; AGE runs it; Qwen-Plus narrates the result. No step trusts the next blindly.
Challenges we ran into
- Vietnamese OCR. GPT-4V and Gemini both choke on Vietnamese diacritics in CCCD photos. Qwen3-VL-Plus delivered ~6% CER — the unlock that made the entire BNPL flow commercially viable.
- Letting an LLM near a credit decision. Pure-LLM scoring is non-shippable for a regulated lender. We solved it with a hybrid guardrail topology where Python owns every hard rule and the Qwen reasoning is checked for contradictions before being persisted as the rationale.
- Text-to-SQL hallucination at scale. Vanilla NL→SQL on a 20+ table ledger sat at ~10–20% accuracy in our tests. Routing causal questions through Apache AGE pushed accuracy to ~56% — a 3.4× lift on the same questions.
- Real-time UX over a multi-step pipeline. Customers abandon if approval feels like a black box. We built
astream_events v2→ SSE so every tool start/end and every Qwen reasoning chunk hits the browser live, with full replay from Redis on reconnect. - One developer, 12 use cases, one week. We collapsed three would-be apps into one platform — shared agent runtime, shared ledger, shared knowledge graph, shared compliance pipe — so SB1 / SB5 / SB9 became configuration + new tools on top of the SB6 spine.
- Cost discipline. Per-tool
ToolCallMetacost tracking; cheap intents routed to Qwen-Flash so only ambiguous reasoning hits Qwen3-Max-Thinking.
Accomplishments that we're proud of
- A live, production-grade demo — not a hardcoded mockup. Judges can upload a real Vietnamese CCCD at https://shinhansol.vsf.vin and see real Qwen3-VL tokens spent in the Alibaba Console.
- <8 s end-to-end credit decision with real OCR, real face match, real DTI math, real ledger write, and real Circular 39 audit row — every single time.
- One unified platform serving 12 use cases across 4 pillars on shared infra. Year-1 plan: 100K BNPL transactions, 1,000 tỷ VND GMV, target NPL <3%.
- A real banking knowledge graph in Apache AGE — 12 vertex labels, 10+ edge labels, three fully-traversable causal scenarios pre-planted and verified at seed time.
- 3.4× agentic-BI accuracy lift vs naïve text-to-SQL on the same questions, by grounding queries in the graph.
- Radical "real vs mocked" transparency. Every component labelled. Real today: Qwen3-VL OCR + face match + payslip, LangGraph agent, RDS double-entry ledger, OSS+KMS, Circular 39 export, Apache AGE traversal, SSE streaming. Mocked with PoC-week placeholders: CIC bureau, telco, NAPAS disbursement, fraud ML model.
- 100% Alibaba Cloud native — Model Studio + RDS + OSS + KMS + SLS + ACR + ECS + ActionTrail + Cloud Monitor in
ap-southeast-1. A genuine end-to-end Qwen + Alibaba showcase, not a multi-cloud collage.
What we learned
- Hard rules + LLM reasoning beats either alone in regulated domains. Determinism owns compliance; the LLM owns explanation.
- Vietnamese-language AI is now a moat, not a constraint. Qwen3-VL on Vietnamese CCCD beats Western frontier models — that flips the build-vs-buy maths for VN fintech.
- Knowledge graphs save text-to-SQL. Grounding the agent in entity relationships (Apache AGE) raised analytical accuracy more than any prompt-engineering trick.
- Streaming is product, not telemetry. The trace panel where users watch the agent work raised trust in user-testing more than any landing-page copy.
- One platform, four pillars is dramatically cheaper than four apps when the agent runtime, ledger, and compliance fabric are shared from day one.
- Cost-aware Qwen routing — Flash for intent, Plus for everything moderate, Max-Thinking only when explanation must withstand audit — keeps unit economics shippable.
What's next for Shinhan SOL Intelligence
12-week Shinhan InnoBoost PoC roadmap (target 200–300M VND grant + commercial partnership):
- W1–W3 — Pilot integration with one tier-1 merchant (TGDD / Nguyen Kim / Long Chau).
- W4 — Replace Graphiti mock with real temporal-memory backend for SB1 coach.
- W4–W6 — Real CIC bureau API + Shinhan core-banking disbursement (NAPAS).
- W7 — XGBoost fraud model on Alibaba PAI-EAS replacing the rule stub.
- W8–W9 — Live telco-data enrichment + behavioural signals.
- W10 — Auto-debit repayment via Shinhan core.
- W11–W12 — Security review, audit, GA readiness.
- Year 1 GA — 1–3 merchant partners, 100K transactions, 1,000 tỷ VND GMV, blended margin ~3% on volume.
- Year 2–3 — Expand SB1 coach to 300K DAU; launch SB9 SME credit with KiotViet / Sapo (target: 5K SMEs, 2,000 tỷ VND disbursed by Year 3).
Built With
- alembic
- alibaba-cloud-acr
- alibaba-cloud-actiontrail
- alibaba-cloud-cloud-monitor
- alibaba-cloud-ecs
- alibaba-cloud-kms
- alibaba-cloud-model-studio
- alibaba-cloud-oss
- alibaba-cloud-sls
- apache-age
- cloudflare
- cypher
- dashscope
- docker
- docker-compose
- fastapi
- github-actions
- langgraph
- nextjs
- pgvector
- postgresql
- python
- qwen-flash
- qwen-plus
- qwen-text-embedding-v3
- qwen3-max-thinking
- qwen3-vl-plus
- radix-ui
- react
- redis
- server-sent-events
- sql
- sqlalchemy
- tailwindcss
- typescript

Log in or sign up for Devpost to join the conversation.