Inspiration

In Peru, a wealth advisor's morning looks like this: a WhatsApp photo of a CAVALI brokerage statement, a voice note from a client asking about their returns, and a stack of PDFs from three different custodians. None of it is structured. None of these sources talk to each other. And somewhere in that mess is a compliance violation or a revenue leak that nobody will catch until it's too late.

We built Valia because wealth management in Latin America still runs on manual data entry, and that costs firms real money.

What it does

Valia takes any financial artifact you throw at it and turns it into an actionable, auditable portfolio record. Upload a PDF statement from Credicorp, snap a photo of a CAVALI report, or drop in a voice recording of a client call. Valia extracts every holding, position, and transaction, then validates it, flags what's wrong, and tells the advisor exactly what to do next.

The full pipeline:

  1. Ingest anything - PDFs, scanned images, phone photos, screenshots, audio files, CSVs. The system classifies each artifact automatically and routes it to the right extraction path.

  2. Extract with evidence - Amazon Nova handles document understanding and OCR with word-level bounding box extraction, so every single data point traces back to its exact location in the source document. Page number, pixel coordinates, confidence score. This is critical for regulatory audits.

  3. Validate and reconcile - Locale-aware normalization handles Peru's number formatting (periods for thousands, commas for decimals). Duplicate detection, missing page detection, cross-account reconciliation. Every artifact gets a PASS, WARN, or BLOCK status.

  4. Generate revenue-at-risk signals - The system automatically detects portfolio drift vs. mandate, idle cash persistence, single-name concentration spikes, and stale data. Each signal is ranked by a weighted scoring algorithm: 50% materiality, 30% confidence, 20% freshness.

  5. Run compliance checks - A YAML-based policy engine evaluates 18 rules across portfolio constraints, process compliance, and sanctions screening (PEP, OFAC, UN, SBS). The output is a hard APPROVE, WARN, or BLOCK decision that gates the entire workflow.

  6. Draft client communications - Amazon Nova generates impact reports and client messages in multiple tones (formal, friendly, concise), then critiques every draft for compliance issues, clarity, and tone before an advisor approves it.

  7. Full audit trail - Every action from upload to approval is logged as an append-only event. Exportable audit packets satisfy SMV (Peru's securities regulator) requirements.
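A minimal sketch of what an append-only event log for step 7 could look like. The event fields, the AuditLog class, and the action names are hypothetical and chosen for illustration; they are not Valia's actual schema.

```typescript
// Hypothetical event shape; field names are illustrative only.
type AuditEvent = {
  readonly seq: number;        // position in the log, starting at 0
  readonly at: string;         // ISO-8601 timestamp
  readonly actor: string;      // user or system component
  readonly action: string;     // e.g. "artifact.uploaded", "draft.approved"
  readonly artifactId: string;
};

class AuditLog {
  private readonly events: AuditEvent[] = [];

  // Appending is the only mutation; there is no update or delete method.
  append(
    input: Omit<AuditEvent, "seq" | "at">,
    now: Date = new Date(),
  ): AuditEvent {
    const event: AuditEvent = Object.freeze({
      ...input,
      seq: this.events.length,
      at: now.toISOString(),
    });
    this.events.push(event);
    return event;
  }

  // Return every event recorded for one artifact, in insertion order.
  exportPacket(artifactId: string): AuditEvent[] {
    return this.events.filter((e) => e.artifactId === artifactId);
  }
}
```

Freezing each event and exposing no update path keeps the in-memory log append-only; a real deployment would also need the storage layer to enforce the same guarantee.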

On top of all that: portfolio analytics (Sharpe ratio, VaR, alpha/beta, benchmark delta), a bilingual omnibox that accepts natural language commands in English and Spanish, statement Q&A grounded in extracted data, and call transcript summarization with intent classification.

How we built it

Frontend: Next.js 16 with React 19, TypeScript, and Tailwind CSS 4. Clerk handles authentication.

Backend: Convex powers the entire backend, handling real-time state sync, server-side actions, scheduled jobs, and database operations. It replaced our earlier prototype stack and gave us instant reactivity across the advisor dashboard without managing infrastructure.

AI Layer: Amazon Nova via Bedrock powers the entire intelligence layer. The callNova() and callNovaJSON() helpers provide structured output with temperature control for reliable JSON extraction. Nova handles multimodal document understanding, portfolio extraction from complex financial PDFs and images, word-level OCR with bounding boxes, draft generation and compliance critique, statement Q&A, and call transcript summarization with intent classification.
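The signatures of callNova() and callNovaJSON() are not shown here, so the following is only a sketch of the general pattern: request JSON at temperature 0, pull the JSON object out of the reply, validate its shape, and retry on malformed output. All names below (ModelCall, extractJSONObject, parseModelJSON, callForJSON) are hypothetical, and the model call is injected rather than tied to a specific Bedrock request.

```typescript
// Hypothetical names throughout; not the project's callNova()/callNovaJSON().
// A model call is anything that takes a prompt and resolves to raw text.
type ModelCall = (prompt: string, temperature: number) => Promise<string>;

// Keep only the outermost {...} span, dropping any prose around it.
function extractJSONObject(raw: string): string {
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("no JSON object found in model output");
  }
  return raw.slice(start, end + 1);
}

// Parse the reply and check it against a caller-supplied type guard.
function parseModelJSON<T>(raw: string, isValid: (v: unknown) => v is T): T {
  const parsed: unknown = JSON.parse(extractJSONObject(raw));
  if (!isValid(parsed)) {
    throw new Error("model output did not match the expected shape");
  }
  return parsed;
}

// Call the model at temperature 0 and retry if the reply cannot be parsed.
async function callForJSON<T>(
  call: ModelCall,
  prompt: string,
  isValid: (v: unknown) => v is T,
  maxAttempts = 2,
): Promise<T> {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const raw = await call(prompt, 0);
    try {
      return parseModelJSON(raw, isValid);
    } catch (err) {
      lastError = err; // remember why this attempt failed, then retry
    }
  }
  throw lastError;
}
```

Injecting the model call keeps the parsing and validation logic testable without network access.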

Infrastructure: Docker Compose for local development. GitHub Actions CI. S3-compatible storage layer.

Challenges we ran into

Latin American financial documents are a different beast. CAVALI statements from Peru have no consistent layout standard. The same data appears in different positions across custodians, number formatting flips between US and European conventions mid-document, and column headers mix Spanish and English. We had to build a locale-safe normalization layer that handles all of this before any AI extraction happens.
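As a rough illustration of locale-safe number parsing, the sketch below infers the decimal separator per value instead of assuming one convention for the whole document. The function name parseAmount, the Convention type, and the fallback rule for ambiguous values such as "1.234" are assumptions made for this example, not a description of Valia's actual normalization layer.

```typescript
// "eu": 1.234,56 (period for thousands, comma for decimals)
// "us": 1,234.56
type Convention = "eu" | "us";

function parseAmount(raw: string, fallback: Convention = "eu"): number {
  // Drop currency symbols, spaces, and letters; keep digits, separators, sign.
  const cleaned = raw.replace(/[^\d.,-]/g, "");
  if (!/\d/.test(cleaned)) throw new Error("unparseable amount: " + raw);

  const hasDot = cleaned.includes(".");
  const hasComma = cleaned.includes(",");
  let decimalSep: string | null = null;

  if (hasDot && hasComma) {
    // Both present: whichever appears last is the decimal separator.
    decimalSep =
      cleaned.lastIndexOf(".") > cleaned.lastIndexOf(",") ? "." : ",";
  } else if (hasDot || hasComma) {
    const sep = hasDot ? "." : ",";
    const groups = cleaned.split(sep).slice(1);
    const allTriples = groups.every((g) => g.length === 3);
    if (!allTriples) {
      decimalSep = sep; // "12,5" or "0.25" can only be a decimal
    } else if (groups.length === 1) {
      // "1.234" is ambiguous; defer to the expected convention.
      const fallbackDecimal = fallback === "eu" ? "," : ".";
      decimalSep = sep === fallbackDecimal ? sep : null;
    }
    // Several three-digit groups ("1.234.567") can only be thousands.
  }

  let normalized: string;
  if (decimalSep === null) {
    normalized = cleaned.replace(/[.,]/g, "");
  } else {
    const i = cleaned.lastIndexOf(decimalSep);
    normalized =
      cleaned.slice(0, i).replace(/[.,]/g, "") + "." + cleaned.slice(i + 1);
  }

  const value = Number(normalized);
  if (!Number.isFinite(value)) throw new Error("unparseable amount: " + raw);
  return value;
}
```

With this approach, "S/ 1.234,56" and "1,234.56" both parse to the same value, and only the truly ambiguous single-separator case depends on the fallback convention.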

The evidence reference system was the hardest engineering challenge. Getting word-level bounding boxes from Nova's OCR output and mapping them back to extracted portfolio fields required building a proportional character width calculator and coordinate transformation pipeline. But it's the feature that makes Valia actually usable in a regulated environment, because auditors need to verify every number back to source.
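One way to picture the proportional character width idea: given the bounding box of a whole OCR line, estimate the box of a substring by distributing the line's width across characters according to rough relative glyph widths. The weights, the BBox shape, and the names charWeight and subBox are assumptions for illustration; the actual calculator may work differently.

```typescript
type BBox = { x: number; y: number; width: number; height: number };

// Rough relative glyph widths. These values are assumed, not measured
// from any particular font.
function charWeight(ch: string): number {
  if (ch === " ") return 0.5;
  if (".,:;'!|il".includes(ch)) return 0.5;
  if ("MWmw".includes(ch)) return 1.5;
  return 1;
}

// Estimate the box covering lineText[start, end) inside the line's box.
function subBox(
  lineText: string,
  lineBox: BBox,
  start: number,
  end: number,
): BBox {
  if (start < 0 || end > lineText.length || start >= end) {
    throw new RangeError("invalid substring range");
  }
  let total = 0;
  let before = 0;
  let inside = 0;
  for (let i = 0; i < lineText.length; i++) {
    const w = charWeight(lineText.charAt(i));
    total += w;
    if (i < start) before += w;
    else if (i < end) inside += w;
  }
  const unit = lineBox.width / total; // width of one weight unit
  return {
    x: lineBox.x + before * unit,
    y: lineBox.y,
    width: inside * unit,
    height: lineBox.height,
  };
}
```

For the line "AB 12" in a 450-unit-wide box starting at x = 100, subBox(lineText, lineBox, 3, 5) places "12" at x = 350 with a width of 200.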

Getting the policy engine right was also tricky. Different firms have different compliance rules, so we built it as a YAML configuration that can define parametric constraints (like $max_equity_pct pulled from the client's risk profile) without code changes.
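To make the parametric constraint idea concrete, here is a small sketch of how a rule referencing $max_equity_pct might be evaluated once the YAML has been parsed into objects. The rule schema (id, metric, max, onViolation), the metric names, and the evaluate function are hypothetical; only the $max_equity_pct parameter and the APPROVE/WARN/BLOCK outcomes come from the description above.

```typescript
type Decision = "APPROVE" | "WARN" | "BLOCK";

// Hypothetical rule shape, as it might look after parsing YAML such as:
//   - id: equity-cap
//     metric: equity_pct
//     max: $max_equity_pct
//     onViolation: BLOCK
type Rule = {
  id: string;
  metric: string;           // key into the portfolio metrics
  max: number | string;     // literal, or "$name" resolved from the profile
  onViolation: "WARN" | "BLOCK";
};

type Context = {
  metrics: Record<string, number>;
  profile: Record<string, number>;
};

// Resolve "$param" references against the client's risk profile.
function resolveLimit(
  max: number | string,
  profile: Record<string, number>,
): number {
  if (typeof max === "number") return max;
  const key = max.startsWith("$") ? max.slice(1) : max;
  const value = profile[key];
  if (value === undefined) throw new Error("unknown parameter: " + max);
  return value;
}

// The strictest violated rule decides the overall outcome.
function evaluate(
  rules: Rule[],
  ctx: Context,
): { decision: Decision; violations: string[] } {
  let decision: Decision = "APPROVE";
  const violations: string[] = [];
  for (const rule of rules) {
    const actual = ctx.metrics[rule.metric];
    if (actual === undefined) throw new Error("missing metric: " + rule.metric);
    if (actual > resolveLimit(rule.max, ctx.profile)) {
      violations.push(rule.id);
      if (rule.onViolation === "BLOCK") decision = "BLOCK";
      else if (decision === "APPROVE") decision = "WARN";
    }
  }
  return { decision, violations };
}
```

Because the limit is looked up at evaluation time, the same rule file can serve clients with different risk profiles.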

Accomplishments that we're proud of

The evidence reference system is the standout. Every extracted field carries a {doc_hash, page, bbox, snippet, confidence} reference. An advisor can click any number in the portfolio view and see exactly where it came from in the original document, down to the pixel. No other wealth management tool we've seen does this.
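The reference shape described above could be typed roughly as follows. The five field names come from the text; everything else is an assumption for this sketch, in particular that bbox is stored in normalized page coordinates (0 to 1, origin at top-left) and converted to pixels when the highlight is drawn. The toPixelRect helper and the PixelRect type are hypothetical.

```typescript
// Assumed convention: normalized coordinates in [0, 1], origin at top-left.
type NormalizedBBox = { x: number; y: number; width: number; height: number };

type EvidenceRef = {
  doc_hash: string;   // hash of the source document
  page: number;       // page the value was read from
  bbox: NormalizedBBox;
  snippet: string;    // the text found at that location
  confidence: number; // extraction confidence
};

type PixelRect = { left: number; top: number; width: number; height: number };

// Convert a reference's box into pixel coordinates for a rendered page,
// e.g. to draw a highlight when an advisor clicks a value.
function toPixelRect(
  ref: EvidenceRef,
  pageWidthPx: number,
  pageHeightPx: number,
): PixelRect {
  const { x, y, width, height } = ref.bbox;
  return {
    left: Math.round(x * pageWidthPx),
    top: Math.round(y * pageHeightPx),
    width: Math.round(width * pageWidthPx),
    height: Math.round(height * pageHeightPx),
  };
}
```

Storing normalized coordinates would let the same reference be rendered at any zoom level; if the boxes are stored in source pixels instead, the conversion becomes a simple scale by the render ratio.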

The YAML policy engine is another: 18 rules covering portfolio constraints, process compliance, and sanctions screening, all configurable per firm without touching code.

We're also proud of the signal ranking algorithm, which weights materiality (50%), confidence (30%), and freshness (20%) to surface the most actionable revenue-at-risk items first.
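The weighted ranking can be sketched in a few lines. The 50/30/20 weights are the ones stated for the pipeline; the Signal shape and the assumption that each input is already normalized to the range 0 to 1 are illustrative choices for this example.

```typescript
// Each component is assumed to be pre-normalized to [0, 1].
type Signal = {
  id: string;
  materiality: number;
  confidence: number;
  freshness: number;
};

const WEIGHTS = { materiality: 0.5, confidence: 0.3, freshness: 0.2 } as const;

// Weighted sum of the three components.
function score(s: Signal): number {
  return (
    WEIGHTS.materiality * s.materiality +
    WEIGHTS.confidence * s.confidence +
    WEIGHTS.freshness * s.freshness
  );
}

// Highest score first; the input array is left untouched.
function rankSignals(signals: Signal[]): Signal[] {
  return [...signals].sort((a, b) => score(b) - score(a));
}
```

For example, a drift signal with materiality 0.9, confidence 0.8, and freshness 0.5 scores about 0.79 and outranks an idle-cash signal with 0.4, 0.9, and 1.0, which scores about 0.67.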

And the end-to-end pipeline actually works: upload a blurry phone photo of a brokerage statement, and 30 seconds later you have a validated portfolio with ranked signals, a policy decision, and a draft client communication ready for review.

What we learned

The extraction is maybe 30% of the problem. Validation, normalization, reconciliation, and building trust through evidence references are where the real engineering lives. Anyone can call an LLM to parse a PDF. The hard part is making the output reliable enough that a compliance officer will sign off on it.

Amazon Nova's multimodal capabilities turned out to be particularly strong for table-heavy financial documents. The bounding box output gives us spatial awareness that pure text extraction misses entirely, and the structured JSON output mode made it possible to build a reliable extraction pipeline without fragile post-processing.

What's next for Valia

Real-time voice coaching using Amazon Nova Sonic for live advisor-client calls. Expansion to Colombia, Chile, and Mexico with their respective regulatory frameworks. Automated trade execution recommendations based on detected drift signals. And a mobile app so advisors can snap documents and get instant portfolio intelligence on the go.

