SOLmate: Empowering SOHOs through Real-time AI Credit Intelligence

Inspiration

In Vietnam, Small Office/Home Office (SOHO) businesses are the lifeblood of the retail economy. However, they face significant hurdles in accessing formal credit due to a lack of collateral and cumbersome traditional appraisal processes, which typically take 3 to 5 days. From the perspective of a bank such as Shinhan Bank, the absence of transparent data makes lending to this segment inherently risky.

We believe that "Transaction data is the best collateral." This inspired us to build SOLmate, a financial ecosystem that unlocks capital for SOHOs by transforming Point-of-Sale (POS) data into digital creditworthiness in real time.

Real-life Problem (Market Context)

Vietnam’s SME and SOHO sector, under the regulatory scope of the State Bank of Vietnam (SBV), plays a critical role in driving retail economic activity and employment. However, access to formal credit remains a major bottleneck. Most small merchants lack audited financial statements, standardized accounting systems, or sufficient collateral, placing them in the “thin-file” segment.

Traditional lending models rely on static, backward-looking financial data, resulting in:

  • Long approval cycles (typically 3–5 days)
  • High operational costs for banks
  • Inaccurate risk assessment due to lack of real-time visibility

This leads to a structural inefficiency where many high-cashflow businesses are underserved, while banks face elevated uncertainty when lending to this segment.

Corporate Context ([SB9] AI-Driven SME Credit Scoring via POS Data)

In alignment with the strategic direction of Shinhan Bank, the [SB9] use case focuses on leveraging POS and payment transaction data to transform SME credit assessment.

The solution connects banks with POS systems to collect real-time merchant sales data and applies AI models to evaluate business performance dynamically. Based on this, the system generates pre-approved credit limits without requiring traditional financial statements.

Key stakeholders involved include:

  • SOHO Department
  • Retail Lending Department
  • Risk Management Division
  • Digital Business Unit (Digital Corporate Cell)

Expected business outcomes:

  • Increase SME loan approval rate, especially for thin-file customers
  • Reduce credit assessment turnaround time from days to near real-time
  • Improve risk accuracy using real transaction data instead of static financials
  • Expand SME lending portfolio and market share
  • Unlock new customer segments (micro & informal businesses)

What it does

SOLmate is a digital credit scoring and instant loan approval platform. It operates on a highly resilient dual AI-engine architecture:

  • Quantitative Risk Assessment (XGBoost via ONNX):

This pipeline turns real-time POS transactions into a quantitative credit decision. It uses 12 engineered features to predict an objective CIC credit score (150-750) with XGBoost, then converts that score into a risk label and Probability of Default (PD).

  1. Revenue_mean_30d: Short-term average revenue (latest 30 days).
  2. Revenue_mean_90d: Baseline average revenue (latest 90 days).
  3. Txn_frequency: Average daily transaction count.
  4. Growth_value: Revenue growth ratio = (30d - 90d) / 90d, clipped to [-1, 1].
  5. Growth_score: Normalized growth in [0, 1].
  6. CV_value: Revenue volatility proxy (coefficient of variation).
  7. CV_score: Stability score (higher CV -> lower score).
  8. Spike_ratio: Dependence on abnormal revenue spikes (p95/median by default).
  9. Spike_score: Smoothness score (more spikes -> lower score).
  10. Txn_freq_score: Log-normalized transaction intensity.
  11. Years_score: Business maturity score from years_in_business.
  12. Industry_score: Structural industry risk score from industry-risk mapping.
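As a rough illustration of how the behaviour features in this list could be derived from daily aggregates, here is a minimal Python sketch. Only the Growth_value formula, its [-1, 1] clip, and the p95/median spike ratio come from the list above. The score mappings (a linear map for Growth_score, `1 / (1 + CV)` for CV_score, `1 / Spike_ratio` for Spike_score) and the nearest-rank percentile are assumptions chosen for the example, not the exact transforms used in SOLmate. The function name `behavior_features` is hypothetical, and both inputs are assumed to be non-empty lists ordered oldest to newest.

```python
import math
import statistics


def clip(value, low, high):
    """Clamp value into the closed interval [low, high]."""
    return max(low, min(high, value))


def behavior_features(daily_revenue, daily_txn_counts):
    """Derive a subset of the listed features from daily aggregates."""
    last_30 = daily_revenue[-30:]
    last_90 = daily_revenue[-90:]
    mean_30 = statistics.fmean(last_30)
    mean_90 = statistics.fmean(last_90)

    # Growth_value = (30d - 90d) / 90d, clipped to [-1, 1] (from the feature list).
    growth_value = clip((mean_30 - mean_90) / mean_90, -1.0, 1.0) if mean_90 > 0 else 0.0
    # Assumption: a linear map from [-1, 1] to [0, 1].
    growth_score = (growth_value + 1.0) / 2.0

    # CV_value: coefficient of variation = standard deviation / mean.
    cv_value = statistics.pstdev(last_90) / mean_90 if mean_90 > 0 else 0.0
    # Assumption: any decreasing map works; higher CV gives a lower score.
    cv_score = 1.0 / (1.0 + cv_value)

    # Spike_ratio: p95 / median, using a nearest-rank 95th percentile.
    ordered = sorted(last_90)
    p95 = ordered[math.ceil(0.95 * len(ordered)) - 1]
    median = statistics.median(ordered)
    spike_ratio = p95 / median if median > 0 else 1.0
    # Assumption: more spikes give a lower score, bounded to (0, 1].
    spike_score = 1.0 / max(spike_ratio, 1.0)

    return {
        "Revenue_mean_30d": mean_30,
        "Revenue_mean_90d": mean_90,
        "Txn_frequency": statistics.fmean(daily_txn_counts[-90:]),
        "Growth_value": growth_value,
        "Growth_score": growth_score,
        "CV_value": cv_value,
        "CV_score": cv_score,
        "Spike_ratio": spike_ratio,
        "Spike_score": spike_score,
    }
```

A merchant with perfectly flat revenue would get Growth_value 0, CV_value 0 and Spike_ratio 1 under these assumptions, which makes a convenient sanity check.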

End-to-End Process

  1. Ingest raw POS transaction logs per business.
  2. Aggregate daily revenue and transactions over rolling 90-day context.
  3. Compute behavior metrics (growth, volatility, spikes, activity).
  4. Build the 12-feature vector for each business.
  5. Predict CIC score with XGBoost regression.
  6. Map CIC score to a risk label.
  7. Compute PD from label-based base risk plus feature-score adjustments.
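Steps 1 and 2 of this process can be sketched as follows. The sketch assumes each raw POS record has already been reduced to a `(date, amount)` pair, and that days without sales should count as zero revenue. The real log schema is not described here, so both points, along with the function name `aggregate_daily`, are assumptions.

```python
from collections import defaultdict
from datetime import date, timedelta


def aggregate_daily(transactions, as_of, window_days=90):
    """Roll raw (date, amount) records up into zero-filled daily series.

    Returns (daily_revenue, daily_txn_counts), both ordered oldest to newest
    and covering exactly window_days days ending on as_of.
    """
    start = as_of - timedelta(days=window_days - 1)
    revenue = defaultdict(float)
    counts = defaultdict(int)
    for day, amount in transactions:
        # Ignore anything outside the rolling window.
        if start <= day <= as_of:
            revenue[day] += amount
            counts[day] += 1
    days = [start + timedelta(days=offset) for offset in range(window_days)]
    return [revenue[d] for d in days], [counts[d] for d in days]


# Usage with three in-window records and one stale record.
records = [
    (date(2024, 3, 31), 10.0),
    (date(2024, 3, 31), 5.0),
    (date(2024, 3, 30), 7.0),
    (date(2023, 1, 1), 99.0),
]
daily_revenue, daily_counts = aggregate_daily(records, as_of=date(2024, 3, 31))
```

The two returned series are the natural inputs for the behaviour metrics computed in step 3.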

How PD Is Calculated

  • Start with base PD by label (higher risk label -> higher base PD).
  • Apply adjustment from key behavior scores: Growth_score, CV_score, Spike_score, Txn_freq_score, Years_score.
  • Use nonlinear scaling and clipping to keep PD in [0.01, 0.99].
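One way to realise these three steps is to adjust the base PD in log-odds space and clip the result, as in the sketch below. The label names, CIC cut-offs, base PD values and the `sensitivity` parameter are all hypothetical placeholders. Only the overall shape (base PD by label, an adjustment from the five behaviour scores, nonlinear scaling, and the [0.01, 0.99] clip) follows the description above.

```python
import math

# Hypothetical base PD per label; a higher risk label gets a higher base PD.
BASE_PD = {"low": 0.03, "medium": 0.10, "high": 0.25}


def label_from_cic(score):
    """Map a CIC score (150-750) to a risk label using assumed cut-offs."""
    if score >= 600:
        return "low"
    if score >= 450:
        return "medium"
    return "high"


def probability_of_default(label, scores, sensitivity=2.0):
    """Adjust the base PD with behaviour scores, each expected in [0, 1].

    An average score of 0.5 is treated as neutral. Better scores move the
    log-odds down and worse scores move them up (assumed logistic scaling).
    """
    base = BASE_PD[label]
    avg_score = sum(scores.values()) / len(scores)
    logit = math.log(base / (1.0 - base)) + sensitivity * (1.0 - 2.0 * avg_score)
    pd_value = 1.0 / (1.0 + math.exp(-logit))
    # Clip to [0.01, 0.99] as stated in the text.
    return max(0.01, min(0.99, pd_value))


# Usage with neutral behaviour scores for a medium-risk merchant.
neutral = {name: 0.5 for name in (
    "Growth_score", "CV_score", "Spike_score", "Txn_freq_score", "Years_score")}
pd_medium = probability_of_default(label_from_cic(500), neutral)
```

Working in log-odds keeps the adjusted value strictly between 0 and 1 before clipping, which is one plausible reading of "nonlinear scaling".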

Why This Works for Credit Risk

  • Data-driven: score is learned from transaction behavior, not manual-only rules.
  • Multi-dimensional risk view: combines scale, trend, stability, abnormality, maturity, and industry context.
  • Actionable output: delivers both a credit score and PD for lending decisions.

  • Qualitative Risk Evaluation (Qwen LLM Agent): Acts as an AI Underwriter. It ingests the quantitative metrics and contextual data to generate comprehensive, natural language risk reports and lending recommendations.

Cognitive Underwriting Vision

SMEs are the backbone of the economy but remain underserved due to the lack of standardized financial data. The AI Underwriter transforms real-time POS transactions into a dynamic credit profile, replacing static financial statements. This enables instant, data-driven pre-approvals, increases loan approval rates, and unlocks micro and informal business segments.

Microservices & Data Intelligence Architecture

Built on a scalable microservices architecture, the system ingests data via real-time APIs and batch interfaces. It applies a multi-source RAG approach to validate enterprise data against risk patterns, followed by a quantitative screening layer to ensure all insights are grounded in hard metrics like cash flow and transaction behavior.

Cognitive Risk Engine & Decision Logic

The system incorporates critical reasoning to detect anomalies (e.g., abnormal revenue spikes) and simulate stress scenarios. It enforces strict policy compliance by flagging risks and escalating to human review when necessary, ensuring robust and trustworthy credit decisions.
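The flag-and-escalate behaviour described here could look roughly like the sketch below. The thresholds, flag names and the `review_decision` function are hypothetical, and a real policy layer would likely encode many more rules. The sketch only shows the routing idea: any raised flag sends the case to a human reviewer instead of automatic approval.

```python
def review_decision(spike_ratio, cv_value, pd_value, *,
                    max_spike_ratio=3.0, max_cv=1.0, max_auto_pd=0.20):
    """Return the raised policy flags and the resulting routing decision.

    All thresholds are illustrative defaults, not SOLmate's actual policy.
    """
    flags = []
    if spike_ratio > max_spike_ratio:
        flags.append("abnormal_revenue_spike")
    if cv_value > max_cv:
        flags.append("unstable_revenue")
    if pd_value > max_auto_pd:
        flags.append("pd_above_auto_approval_limit")
    # Any flag escalates the case; a clean case can be decided automatically.
    route = "human_review" if flags else "auto"
    return {"flags": flags, "route": route}


# Usage: a merchant whose revenue depends heavily on a few spike days.
decision = review_decision(spike_ratio=4.2, cv_value=0.4, pd_value=0.08)
```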

Dual-Report Output & Strategic Impact

A dual-report system generates both internal credit recommendations and external advisory insights for SMEs. While it enhances transparency and real-time decision-making, challenges include dependence on POS data integrity, sensitivity to black swan events, and integration with legacy banking systems. The result: SOHOs receive credit limits in seconds, while banks gain a centralized dashboard for Risk, SOHO and Retail departments to make informed, transparent lending decisions.

How we built it

To handle high concurrency and provide sub-second responses, we implemented a robust Polyglot Microservices architecture:

  • Core API & Worker (Go): Handles high-speed event streaming, API orchestration and fallback mechanisms for extreme resilience.
  • Event-Driven Data Pipeline (Kafka & Redis): Instead of querying heavy PostgreSQL logs at request time, we use Apache Kafka to continuously aggregate transaction features into Redis. The scoring API then reads precomputed features with a single O(1) key lookup (<1ms).
  • Embedded AI Inference (ONNX Runtime): The XGBoost model is converted to the .onnx format and embedded directly into the Go backend using C++ bindings, eliminating the latency of an external inference service and removing a network hop that could fail.
  • AI Agent Service (Python FastAPI & Qwen): A specialized service utilizing the powerful Qwen LLM. It acts as the "brain" that translates raw scores into actionable financial insights and text reports.
  • Real-time Communication: Results are pushed from the backend to the frontend via Server-Sent Events (SSE) and Redis Pub/Sub for a seamless, live-updating user experience.
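The idea behind the event-driven pipeline, aggregating on write so that scoring only needs a key lookup, can be illustrated without any infrastructure. In the sketch below a plain Python dict stands in for Redis and a simple loop stands in for the Kafka consumer. The class name `FeatureStore`, the event shape and the stored fields are assumptions, and the actual worker is written in Go.

```python
class FeatureStore:
    """In-memory stand-in for the Redis feature cache (illustrative only)."""

    def __init__(self):
        self._aggregates = {}

    def apply_event(self, merchant_id, amount):
        """Fold one transaction event into the merchant's running aggregates."""
        agg = self._aggregates.setdefault(
            merchant_id, {"txn_count": 0, "revenue_sum": 0.0})
        agg["txn_count"] += 1
        agg["revenue_sum"] += amount

    def get_features(self, merchant_id):
        """Constant-time lookup used by the scoring path; None if unknown."""
        return self._aggregates.get(merchant_id)


# Usage: the loop plays the role of a stream consumer.
store = FeatureStore()
events = [("m-001", 120.0), ("m-002", 35.5), ("m-001", 80.0)]
for merchant_id, amount in events:
    store.apply_event(merchant_id, amount)

features = store.get_features("m-001")
```

Because every event updates the aggregate as it arrives, the read path never scans transaction history, which is what keeps the lookup cost independent of log size.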

Challenges we ran into

The biggest challenge was the scarcity of B2B training data. Publicly available datasets are mostly consumer-based (B2C) and do not reflect the specific risk profiles of businesses.

Solution: We developed a Synthetic Data Generation engine to simulate distinct business personas (e.g., Micro F&B, Medium Retailers and Fraud-prone merchants). This allowed our XGBoost model to learn critical correlations, such as the relationship between high refund rates and loan default risk, without exposing any real customer data.
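A persona-based generator of this kind might be sketched as follows. The persona parameters (mean daily revenue, volatility, spike probability and multiplier) are invented for illustration and are not the values used to train the model. The function name `simulate_daily_revenue` is also hypothetical.

```python
import random

# Hypothetical persona parameters; revenue figures are arbitrary units.
PERSONAS = {
    "micro_fnb": {"mean": 2_000.0, "cv": 0.25, "spike_prob": 0.02, "spike_mult": 2.0},
    "medium_retail": {"mean": 15_000.0, "cv": 0.15, "spike_prob": 0.01, "spike_mult": 1.5},
    "fraud_prone": {"mean": 3_000.0, "cv": 0.80, "spike_prob": 0.15, "spike_mult": 6.0},
}


def simulate_daily_revenue(persona, days=90, seed=0):
    """Generate a reproducible daily revenue series for one persona."""
    params = PERSONAS[persona]
    rng = random.Random(seed)  # local generator keeps runs reproducible
    series = []
    for _ in range(days):
        # Gaussian noise around the persona mean, floored at zero.
        value = max(0.0, rng.gauss(params["mean"], params["mean"] * params["cv"]))
        # Occasionally inject an abnormal spike day.
        if rng.random() < params["spike_prob"]:
            value *= params["spike_mult"]
        series.append(value)
    return series


# Usage: one series per persona, each with its own seed.
samples = {name: simulate_daily_revenue(name, seed=index)
           for index, name in enumerate(PERSONAS)}
```

Seeding each series separately makes a generated training set reproducible, which helps when comparing model versions.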

Accomplishments that we're proud of

  • Hybrid AI Pipeline: Successfully integrated traditional Machine Learning (XGBoost) for high-speed quantitative scoring with Generative AI (Qwen) for qualitative reasoning within a single Hackathon sprint.
  • High-Resilience Architecture: Designed a fault-tolerant Go backend that gracefully handles AI or Database component failures via smart fallbacks, so the API keeps responding under pressure.
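The fallback pattern mentioned above can be summarised in a few lines. The sketch is in Python for brevity, although the backend itself is written in Go. The function names and the rule-based fallback are assumptions made for the example.

```python
def score_with_fallback(primary, fallback, features):
    """Try the primary scorer; on any failure, answer from the fallback.

    The response records which path produced the score so that downstream
    consumers can treat degraded answers more cautiously.
    """
    try:
        return {"score": primary(features), "source": "model"}
    except Exception:
        return {"score": fallback(features), "source": "fallback"}


def broken_model(features):
    """Simulates an unavailable inference component."""
    raise RuntimeError("inference unavailable")


def conservative_rule(features):
    """Hypothetical rule-based fallback returning a cautious fixed score."""
    return 450


# Usage: the request still gets an answer when the model path fails.
result = score_with_fallback(broken_model, conservative_rule, {"Growth_score": 0.6})
```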

What we learned

  • Contract-First Development: Decoupling data preparation (Go) from mathematical transformation (ONNX) and text generation (Python) allowed our backend and data science teams to work in parallel without bottlenecks.
  • Financial Prompt Engineering: We learned how to carefully structure the prompt context (mapping specific financial metrics into natural language) for Qwen to prevent "hallucinations" when dealing with sensitive risk assessment data.
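To make the prompt-structuring point concrete, the sketch below turns a metrics dictionary into an explicit, labelled context block and tells the model to use only those figures. The wording, field names and the `build_risk_prompt` helper are illustrative assumptions, not the prompt used in SOLmate, and no Qwen API call is shown.

```python
def build_risk_prompt(merchant_name, metrics):
    """Render quantitative metrics as natural-language context for an LLM."""
    context_lines = [
        f"- CIC credit score: {metrics['cic_score']} (scale 150-750)",
        f"- Probability of default: {metrics['pd']:.1%}",
        f"- 30-day vs 90-day revenue growth: {metrics['growth_value']:+.1%}",
        f"- Revenue volatility (coefficient of variation): {metrics['cv_value']:.2f}",
    ]
    return "\n".join([
        "You are a credit underwriter assistant.",
        f"Merchant: {merchant_name}",
        "Verified metrics:",
        *context_lines,
        "Write a short risk summary using only the metrics listed above.",
        "If a figure is not listed, state that it is unavailable.",
    ])


# Usage with made-up example values.
prompt = build_risk_prompt(
    "Example Cafe",
    {"cic_score": 612, "pd": 0.043, "growth_value": 0.12, "cv_value": 0.31},
)
```

Passing every number with its unit and scale, and forbidding figures outside the list, is one common way to reduce the chance of invented values in the generated report.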

What's next for SOLmate

  • Open API Integration: Connecting directly with major Vietnamese POS platforms (e.g., KiotViet, Sapo) and Shinhan Bank’s card terminal systems for live data ingestion.
  • Enhanced eKYC: Upgrading the identity verification module with biometric authentication and automated business license verification.
