Inspiration

We built SME Loan Approval Agent assisted by AI/ML/RL fraud detection at the Financial Agents Hackathon to solve a hard, real-world problem: small & medium enterprise (SME) lending is slow, opaque, and vulnerable to fraud — yet it's also a powerful lever for sustainable development. We wanted a product that:

  • speeds underwriting with AI,
  • catches tampered invoices visually and semantically,
  • provides a compliance copilot that explains decisions and creates tamper-proof audit trails, and
  • can be extended to drive sustainable finance outcomes (SDG-aware credit scoring, green portfolio recommendations, and inclusion-focused credit rails).

What it does

Untitled is an AI-powered SME loan approval assistant augmented with a Fraud & Compliance Copilot. Key capabilities:

  • Real-time SME health scoring (financial & non-financial signals).
  • Visual fraud detection on invoices (bounding boxes + explainability).
  • Semantic fraud detection across extracted fields (vendor mismatch, amount anomalies, duplicate/embedding checks).
  • Compliance Copilot (LLM + rules engine) that: summarizes risk, cites evidence and rules, recommends next steps, and generates reusable audit events.
  • Chaos Button to inject tampering scenarios for testing and model hardening.
  • Audit trail export (signed JSON + PDF embed) for regulator review and demonstrable chain-of-decision.
  • SDG & sustainability hooks: scoring extensions that incorporate SDG indicators (carbon intensity, social practices) to bias/augment credit risk and recommend green opportunities. ### SDG Integration

sdg_model.py links its scores directly to official SDG targets, making methodology transparent and defensible.

Current Metric Aligned SDG Refinement & Target Link
carbon_emissions_tons Goal 13: Climate Action Target 13.2: Integrated climate change measures into policies.
Action: Instead of a linear score, calculate a Climate Risk Modifier. High emissions relative to industry peers apply a negative weight to the final credit score.
supply_chain_score Goal 12: Responsible Consumption & Production Target 12.6: Encourage companies to adopt sustainable practices.
Action: Sub-metric on supplier diversity or circular economy practices (e.g., percentage of recycled material used).
diversity_index Goal 5: Gender Equality & Goal 10: Reduced Inequalities Target 5.5: Ensure full participation of women in leadership.
Action: Index specifically measures women in management and pay equity across demographics.

"Green Credit" Incentive Mechanism

Direct financial reward for sustainable behavior.

  1. Green Tiers:

    • Green Leader: SME's sdg_score > 75. Benefit: Reduce their final interest rate by 0.5% or increase their credit limit by 10%.
    • Improver: SME's sdg_score is low, but they complete a plan to improve (e.g., reducing emissions by 15% in 1 year). Benefit: Offer a "Green Development Loan" with favorable terms, conditional on meeting milestones.
    • Laggard: SME's sdg_score < 30 AND financial score is weak. Benefit: None. The system flags them for mandatory sustainability consultation before loan approval.
  2. "Green Credit Policy Agent" (green_policy_agent.py):

    • agent takes the credit_score and the sdg_score as inputs.
    • Its sole function is to apply the rules above and output a final_credit_offer (e.g., {"approved": true, "interest_rate": 4.2, "credit_limit": 150000, "green_bonus_applied": true}).
  3. UI Shows the Impact:

    • In the results panel, a clear callout: "🌿 Green Credit Bonus Applied! Your strong SDG performance (Score: 82) has reduced your proposed interest rate by 0.5%."

"SDG Portfolio Impact Score"

  • A financial institution builds a loan or investment portfolio of multiple green projects.

    • User selects 5-10 different SME applications from a list.
    • Your AI calculates not just the financial risk/return of the portfolio, but its combined SDG Impact Score.
    • Allows user to balance "Financial Return" vs. "SDG Impact."
  • Inclusion rails (extension path): non-traditional data ingestion (mobile payments, local economic signals) to service underserved borrowers.

How we built it

High-level components and flow:

  1. Frontend (React + Tailwind)
  • Document upload, invoice viewer with fraud overlay, Copilot chat panel, Chaos Button, audit export UI.
  1. API & Orchestration (FastAPI / Node)
  • Endpoints: /upload, /ocr, /fraud/visual, /fraud/semantic, /score, /copilot/explain, /audit/export.
  1. Processing & ML services
  • OCR → structured invoice JSON (Tesseract / Google Vision / LandingAI).
  • Visual Detector → bounding boxes + confidence (YOLOv8 or Detectron2).
  • Semantic Detector → anomaly models (LightGBM / Isolation Forest) using engineered features.
  • SME Health Scorer → ensemble model combining financial ratios, cashflow forecasts, bank feed patterns, and SDG indicators.
  • Compliance Engine → rule-driven checks (KYB/KYC, sanctions, documentation).
  • LLM Copilot → grounded explanations and action templates (system prompt + structured context).
  1. Storage & logs
  • Documents in S3, structured in Postgres, vector embeddings in Pinecone/Milvus, append-only audit entries (signed with KMS).
  1. Testing & Monitoring
  • Chaos automated tests, fraud detection metrics (precision/recall, mAP), model drift detectors.

SDG & Portfolio integration

  • Sustainable credit/risk analytics: the SME scorer takes traditional features plus SDG-linked indicators (carbon intensity estimates, sector sustainability score, vendor ESG profile) to produce a dual score: credit_score and sustainability_score. Underwriting decisions can be conditioned on both (e.g., prefer low-carbon vendors, apply green loan premium).
  • Portfolio Advisors: a connected module can recommend green portfolios to investors using SME pipelines and market signals, optimizing for returns + SDG impact.
  • Inclusion rails: alternate scoring pipeline ingests non-traditional signals (mobile payments, agent network activity) and outputs microloan suitability and tailored repayment plans.

Challenges we ran into

  • Noisy/heterogeneous documents: invoices come in many layouts and languages → required hybrid OCR + layout ML and heavy post-processing.
  • False positives in fraud detection: early detectors flagged benign anomalies (format differences) as fraud → needed hybrid visual+semantic ensemble plus human-in-the-loop labeling.
  • Explainability vs speed: detailed, regulatory-grade explanations take compute and data; balancing latency and richness of audit entries was hard.
  • Data privacy & PII: logs and audits must be tamper-proof but privacy-preserving; we had to design selective redaction and secure signing.
  • Grounding LLMs reliably: preventing hallucinations required strict prompt design and passing only structured facts to the LLM.

Accomplishments that we're proud of

  • Produced a working end-to-end demo in 8 hours: upload → OCR → fraud overlay → Copilot explanation → audit JSON/PDF export.
  • Implemented a Chaos Button that automatically generates 6 tampering scenarios (swapped totals, forged signature, removed line items) and visualizes detection sensitivity.
  • Built an append-only signed audit format (JSON + HMAC signature) that exports to a PDF with embedded JSON for regulators.
  • Prototyped SDG-aware scoring hooks that demonstrate how sustainability indicators can re-rank loan offers and identify green financing opportunities.
  • Created UI flows where underwriters can accept/reject Copilot recommendations and their decisions feed back into model retraining (human-in-loop).

What we learned

  • Hybrid detection works best: visual + semantic + historical similarity (vector DB) reduces false positives vs single-signal detectors.
  • Ground LLMs with structured facts — feed only the extracted flags, rule triggers, and model confidences; the LLM should synthesize, not infer new facts.
  • UX matters for trust: underwriters trust the system more when they can click a bounding box, see the reason, and view the exact evidence used in the Copilot explanation.
  • Sustainability signals are noisy — many SDG proxies are estimated (e.g., sector-level carbon), but they still add meaningful signal for portfolio-level decisions when used conservatively.
  • Inclusion needs alternate data — rural micro-enterprises can be scored with mobile/payment/agent data, but privacy and explainability are essential to avoid harm.

What's next for SME Loan Approval Agent assisted by AI/ML/RL

  • Ship SDG Credit Scoring v1: integrate public SDG datasets, carbon-intensity estimators, and vendor ESG scoring to produce sustainability_score and show trade-offs alongside credit_score.
  • Portfolio Advisor MVP: create an investor-facing UI that builds portfolios balancing expected return vs SDG impact, with visual trade-off curves and rebalancing suggestions.
  • Inclusion rails pilot: ingest non-traditional data for a small rural SME cohort (consent-driven), test microloan product outcomes.
  • Hardening for production: implement model governance (versioning, CI for model retraining), scale visual detector, and add RBAC + secure KMS signing.
  • Regulatory collaboration: work with compliance teams to formalize audit formats and produce an independent verification tool to validate signed audit exports.

File structure

Below is a recommended repo structure (concise and pragmatic) that supports development, testing, and extension into SDG & inclusion features.

untitled/
├── README.md
├── LICENSE
├── infra/
│   ├── terraform/                       # infra-as-code for S3, DB, KMS, lambda (optional)
│   └── k8s/                             # k8s manifests if deploying on cluster
├── api/                                 # backend services (FastAPI or Node/Express)
│   ├── app/
│   │   ├── main.py
│   │   ├── routes/
│   │   │   ├── upload.py
│   │   │   ├── ocr.py
│   │   │   ├── fraud_visual.py
│   │   │   ├── fraud_semantic.py
│   │   │   ├── score.py
│   │   │   └── copilot.py
│   │   ├── services/
│   │   │   ├── ocr_service.py
│   │   │   ├── visual_detector.py
│   │   │   ├── semantic_detector.py
│   │   │   ├── sme_scorer.py
│   │   │   └── compliance_engine.py
│   │   ├── models/                       # Pydantic models / schemas
│   │   └── utils/
│   │       ├── hashing.py
│   │       ├── kms_sign.py
│   │       └── pdf_export.py
│   └── requirements.txt
├── ui/                                  # React + Tailwind app
│   ├── package.json
│   ├── src/
│   │   ├── components/
│   │   │   ├── InvoiceViewer.jsx
│   │   │   ├── FraudOverlay.jsx
│   │   │   ├── CopilotPanel.jsx
│   │   │   └── ChaosButton.jsx
│   │   ├── pages/
│   │   │   ├── Dashboard.jsx
│   │   │   └── ApplicationPage.jsx
│   │   └── services/api.js
│   └── tailwind.config.js
├── ml/
│   ├── visual_detector/
│   │   ├── train.py
│   │   └── models/
│   ├── semantic_detector/
│   │   ├── features.py
│   │   └── train.py
│   ├── sme_scorer/
│   │   ├── feature_pipeline.py
│   │   ├── train.py
│   │   └── sdg_features.py                # SDG data ingestion & featurization
│   └── tests/
│       └── chaos_scenarios.py
├── scripts/
│   ├── generate_tampered_invoices.py     # Chaos Button scenarios generator
│   ├── export_audit_pdf.py                # converts JSON audit → signed PDF
│   └── synthesize_sdg_data.py
├── examples/
│   ├── invoices/                          # sample invoices (anonymized)
│   └── audits/                            # sample audit JSON + exported PDFs
├── docs/
│   ├── architecture.md
│   ├── compliance_checklist.md
│   └── sdg_integration.md
└── tests/
    ├── api_tests/
    └── integration_tests/

Key files explained

  • api/app/services/compliance_engine.py — implements rule checks (KYB, sanctions) and rule-to-LLM grounding.
  • ml/sme_scorer/sdg_features.py — module to map SDG indicators into scoreable features (sector carbon proxy, community impact metrics).
  • scripts/generate_tampered_invoices.py — creates the chaos scenarios used by the UI button and by tests.
  • api/app/utils/kms_sign.py — signs audit JSON with HMAC or KMS-protected keys for tamper-proof export.
  • ui/src/components/FraudOverlay.jsx — renders bounding boxes & metadata; click to reveal evidence used by the Copilot.
  • docs/sdg_integration.md — documents the methodology for mapping SDG datasets to loan decision logic.

How the repository ties to the three SDG-focused directions

  • Sustainable credit/risk analytics

    • ml/sme_scorer/sdg_features.py + docs/sdg_integration.md: ingest SDG indicators and produce sustainability_score.
    • api/app/services/sme_scorer.py returns both credit_score and sustainability_score in /score response; copilot.py references both when making recommendations (e.g., green-lending incentives).
  • Portfolio Advisors

    • Add api/app/routes/portfolio.py and a UI PortfolioAdvisor.jsx that take investor_preferences, risk_tolerance, and impact_goals and run an optimizer that trades off expected return vs SDG impact (use vector DB for investable SMEs).
  • Inclusion-oriented payments/credit rails

    • api/app/services/semantic_detector.py and ml/semantic_detector/features.py can be extended to accept non-traditional data inputs (mobile-payment logs sample schema). Add inclusion_pipeline.py to ML pipeline to train on such features.

Built With

  • fastapi
  • landingai
Share this project:

Updates