CognitoBIZ AI: The Governed AI Chief of Staff for Startups

What We Built:- Most startups don't fail because founders lack ambition. They fail because nobody was watching the numbers closely enough, early enough. CognitoBIZ is an AI Chief of Staff that does exactly that: it observes your financial reality, advises on decisions, executes approved actions, and manages vendor relationships, all with a complete immutable audit trail and enterprise-grade AI governance built in.

It's not a chatbot on top of a spreadsheet. It's a full agentic system where every AI agent has a cryptographically issued identity, every consequential action requires human approval, and every payment is settled on-chain.

What Inspired Us:- We kept seeing the same pattern: founders drowning in financial noise, missing the signal. An AWS bill quietly tripling after a launch. A vendor contract renewing with an auto-renewal clause nobody noticed. A freelancer getting paid before their work was actually verified. These aren't catastrophic failures — they're death by a thousand small decisions made without context.

We wanted to build something that sits between the chaos of raw financial data and the clarity a great CFO would bring — but governed, auditable, and affordable for a 6-person seed-stage team.

How We Built It:- The architecture has five layers working together:

Financial Reality lives in MongoDB Atlas as the live operational store: transactions, contracts, approvals, audit logs. Snowflake holds the analytics layer, joining our company data against Cybersyn economic indicators and BLS salary data from the Snowflake Marketplace for peer benchmarking.

Intelligence is powered by Gemma 4, which handles four distinct jobs: anomaly detection on incoming transactions, peer benchmark analysis, runway simulation narratives, and vision-based document extraction from invoices, contracts, and proposals. Every Gemma call includes a constitutional prompt — hard-coded constraints that prevent the model from recommending metric manipulation, data deletion to improve numbers, or unilateral financial action.
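A minimal sketch of how a constitutional prompt could be attached to every model call. The rule wording, the `CONSTITUTION` tuple, and the `build_prompt` helper are illustrative assumptions, not the project's actual prompt or code; the point is only that the constraints are fixed in code and prepended to every request rather than supplied by the caller.

```python
# Hypothetical rule text, paraphrasing the three constraints described above.
CONSTITUTION = (
    "Never recommend manipulating or misreporting a metric.",
    "Never recommend deleting data to improve reported numbers.",
    "Never take or propose unilateral financial action; defer to human approval.",
)


def build_prompt(task: str, context: str) -> str:
    """Prepend the fixed constraints to a task so no call can omit them."""
    rules = "\n".join(
        f"{number}. {rule}" for number, rule in enumerate(CONSTITUTION, start=1)
    )
    return (
        "You must follow these rules without exception:\n"
        f"{rules}\n\n"
        f"Context:\n{context}\n\n"
        f"Task:\n{task}"
    )


if __name__ == "__main__":
    print(build_prompt("Flag unusual transactions.", "AWS spend tripled this week."))
```

Routing every request through one builder like this means a new feature cannot skip the constraints by accident.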

Identity and governance run through Auth0. Every human user, every vendor, and every AI agent has a scoped identity. The CFO agent can read financials and draft reports but cannot execute payments. The payment agent can execute payments only after a human approval has been recorded. A human-in-the-loop (HITL) engine enforces this: it intercepts every Tier 3 action, meaning anything with real-world financial consequences, before it runs.
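The two checks described here (scope first, then recorded human approval) can be sketched as below. This is an in-memory illustration only: the scope strings, `AgentIdentity`, and `HitlGate` are hypothetical names, and no Auth0 API is called. In the real system the scopes would presumably come from the identity provider's issued token.

```python
from dataclasses import dataclass, field


class ApprovalRequired(Exception):
    """Raised when a consequential action has no recorded human approval."""


@dataclass(frozen=True)
class AgentIdentity:
    name: str
    scopes: frozenset  # hypothetical scope strings, e.g. "payments:execute"


@dataclass
class HitlGate:
    approved: set = field(default_factory=set)

    def record_approval(self, action_id: str) -> None:
        """Called when a human approves a specific pending action."""
        self.approved.add(action_id)

    def execute_payment(self, agent: AgentIdentity, action_id: str, amount: float) -> str:
        # Check 1: the agent's identity must carry the payment scope.
        if "payments:execute" not in agent.scopes:
            raise PermissionError(f"{agent.name} lacks the payments:execute scope")
        # Check 2: a human approval must already be on record for this action.
        if action_id not in self.approved:
            raise ApprovalRequired(f"{action_id} has no recorded approval")
        self.approved.discard(action_id)  # approvals are single-use in this sketch
        return f"paid {amount:.2f} for {action_id}"


cfo_agent = AgentIdentity("cfo", frozenset({"financials:read", "reports:draft"}))
payment_agent = AgentIdentity("payments", frozenset({"payments:execute"}))
```

Making approvals single-use is a design choice of this sketch, so that one approval cannot authorize a repeated payment.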

Payments and audit are settled on Solana Devnet. WorkContracts lock the full project value in escrow on creation. Each milestone approval triggers a cryptographically signed release to the vendor's wallet. Every action — approval, block, payment, guardrail trigger — also writes a memo transaction to Solana, creating an immutable audit chain that lives outside our own database.
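To make the escrow and audit-chain ideas concrete, here is a local simulation. It does not touch Solana: `Escrow` is a plain ledger in integer cents, and `AuditChain` is a SHA-256 hash chain standing in for the on-chain memo transactions. Both classes are assumptions made for illustration. The property shown is the one the paragraph relies on: editing a past entry is detectable.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass
class Escrow:
    locked_cents: int       # full project value, locked at contract creation
    released_cents: int = 0

    def release(self, amount_cents: int) -> int:
        """Release part of the locked value; return what remains in escrow."""
        remaining = self.locked_cents - self.released_cents
        if amount_cents <= 0 or amount_cents > remaining:
            raise ValueError("release must be positive and within the locked balance")
        self.released_cents += amount_cents
        return remaining - amount_cents


class AuditChain:
    """Append-only log where each entry commits to the hash of the previous one."""

    GENESIS = "0" * 64

    def __init__(self) -> None:
        self.entries = []

    @staticmethod
    def _digest(prev_hash: str, event: dict) -> str:
        payload = json.dumps(event, sort_keys=True)
        return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

    def append(self, event: dict) -> str:
        prev_hash = self.entries[-1]["hash"] if self.entries else self.GENESIS
        entry_hash = self._digest(prev_hash, event)
        self.entries.append({"event": event, "prev": prev_hash, "hash": entry_hash})
        return entry_hash

    def verify(self) -> bool:
        """Recompute every hash; any edited entry breaks the chain."""
        prev_hash = self.GENESIS
        for entry in self.entries:
            if entry["prev"] != prev_hash:
                return False
            if self._digest(prev_hash, entry["event"]) != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True
```

Writing the same events to a public ledger adds what a local chain cannot: the record sits outside the operator's own database, which is the property the paragraph above emphasizes.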

Voice is powered by ElevenLabs. Each morning, Gemma 4 writes a 60-second briefing script from the previous day's data — conversational, not robotic — and ElevenLabs speaks it. Founders can also ask questions by voice and hear answers spoken back, with full financial context passed to Gemma on every query.

The WorkContracts Workflow:- This is the flagship feature. A founder describes a task in plain English. Gemma 4 generates a complete structured contract — milestones, values, timelines, evidence requirements — and cross-references the budget against BLS market rate data. The founder edits and approves. Value is locked in Solana escrow. The vendor gets a scoped portal showing only their contract. When they submit milestone evidence, Gemma reviews it against the original requirements and produces a verification report. The founder approves. Payment releases on-chain. Every step is logged in both MongoDB and Solana. At completion, a full audit report is generated automatically.
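The workflow above reads naturally as a state machine. The sketch below encodes the steps in the order the text gives them; the stage and event names are hypothetical labels chosen for this example, not identifiers from the project.

```python
from enum import Enum


class Stage(Enum):
    DRAFTED = "drafted"                        # model generated the contract
    APPROVED = "approved"                      # founder edited and approved it
    ESCROWED = "escrowed"                      # full value locked in escrow
    EVIDENCE_SUBMITTED = "evidence_submitted"  # vendor uploaded milestone evidence
    EVIDENCE_REVIEWED = "evidence_reviewed"    # model produced a verification report
    MILESTONE_APPROVED = "milestone_approved"  # founder approved the milestone
    PAID = "paid"                              # milestone payment released
    COMPLETED = "completed"                    # final audit report generated


TRANSITIONS = {
    (Stage.DRAFTED, "founder_approves_contract"): Stage.APPROVED,
    (Stage.APPROVED, "lock_escrow"): Stage.ESCROWED,
    (Stage.ESCROWED, "vendor_submits_evidence"): Stage.EVIDENCE_SUBMITTED,
    (Stage.EVIDENCE_SUBMITTED, "model_reviews_evidence"): Stage.EVIDENCE_REVIEWED,
    (Stage.EVIDENCE_REVIEWED, "founder_approves_milestone"): Stage.MILESTONE_APPROVED,
    (Stage.MILESTONE_APPROVED, "release_payment"): Stage.PAID,
    (Stage.PAID, "vendor_submits_evidence"): Stage.EVIDENCE_SUBMITTED,  # next milestone
    (Stage.PAID, "close_contract"): Stage.COMPLETED,
}


def advance(stage: Stage, event: str) -> Stage:
    """Return the next stage, or raise if the event is not allowed here."""
    try:
        return TRANSITIONS[(stage, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not allowed in stage {stage.value!r}") from None
```

With an explicit transition table, "release payment before the founder approves" is not a code path that needs guarding; it simply does not exist.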

The Guardrail System:- This is what separates CognitoBIZ from an AI that just does what it's told. We implemented a Goodhart's Law detector — a secondary Gemma pass that checks every cost-reduction recommendation to see if it improves a metric by eliminating what's being measured. "Disable CloudWatch monitoring to reduce AWS costs" would reduce the AWS line item by removing visibility, not by reducing actual spend. The detector catches this, suppresses the recommendation, and logs a guardrail event. The live guardrail feed shows blocked actions, flagged recommendations, and HITL gates in real time — the governance layer is visible, not hidden.
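The shape of the second-pass filter can be sketched as follows. The detector described above is a model call; here a crude keyword check (`keyword_judge`) stands in for it so the example runs offline. The term lists, function names, and event format are all assumptions for illustration, and a keyword match is far weaker than the model-based check the text describes.

```python
from typing import Callable, Iterable

# Hypothetical vocabulary for the stand-in judge.
OBSERVABILITY_TERMS = ("monitoring", "logging", "alerting", "audit", "cloudwatch")
REMOVAL_VERBS = ("disable", "remove", "delete", "turn off")


def keyword_judge(recommendation: str) -> bool:
    """Stand-in for the model pass: flag advice that removes measurement itself."""
    text = recommendation.lower()
    removes = any(verb in text for verb in REMOVAL_VERBS)
    targets_visibility = any(term in text for term in OBSERVABILITY_TERMS)
    return removes and targets_visibility


def filter_recommendations(
    recommendations: Iterable[str],
    judge: Callable[[str], bool] = keyword_judge,
):
    """Split recommendations into allowed ones and logged guardrail events."""
    allowed, guardrail_events = [], []
    for recommendation in recommendations:
        if judge(recommendation):
            guardrail_events.append(
                {"type": "goodhart_block", "recommendation": recommendation}
            )
        else:
            allowed.append(recommendation)
    return allowed, guardrail_events
```

Because `judge` is a parameter, the keyword stand-in can be swapped for a function that calls the model without changing the suppression and logging logic.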

Challenges We Faced:- Merge conflicts ate hours we didn't have. At one point our Gemma service, WorkContract generator, and voice assistant were all simultaneously broken — one teammate had replaced live API calls with hardcoded mock data without resolving conflicts on his branch. Debugging three independently broken features at 2am while tracking down port mismatches between the Next.js proxy layer and the FastAPI backend was a special kind of fun.

The Snowflake Marketplace data took longer to understand than expected. The free-tier FINANCIAL_ECONOMIC_INDICATORS dataset contains macroeconomic signals — CPI, labor costs, GDP — not per-company spend data. We had to rearchitect the benchmarking layer to use these macro signals as scaling factors against seed-stage baseline numbers rather than trying to extract company financials that don't exist in that dataset.
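One simple way to use a macro signal as a scaling factor is to multiply a baseline figure by the ratio of the index now to the index when the baseline was set. The formula, the function name, and the numbers below are illustrative assumptions; the write-up does not specify the exact adjustment used.

```python
def scale_baseline(baseline: float, index_now: float, index_at_baseline: float) -> float:
    """Adjust a baseline figure by the relative change in a macro index."""
    if index_at_baseline <= 0:
        raise ValueError("index_at_baseline must be positive")
    return baseline * (index_now / index_at_baseline)


if __name__ == "__main__":
    # Made-up values: a monthly spend baseline and a labor-cost index
    # that has risen from 100 to 125 since the baseline was set.
    adjusted = scale_baseline(baseline=10_000.0, index_now=125.0, index_at_baseline=100.0)
    print(f"Adjusted peer benchmark: {adjusted:,.2f}")
```

A company's actual spend can then be compared against the adjusted figure rather than against a stale baseline.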

Getting Gemma 4 to reliably return structured JSON — not prose with JSON embedded in it — required careful prompt engineering and a REST fallback path when the SDK version didn't support the model string.
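Alongside prompt engineering, a defensive parser can recover a JSON object when the model wraps it in prose or a code fence. The helper below is a generic sketch using only the Python standard library, not the project's code, and `extract_json` is a hypothetical name. It tries a strict parse first, then scans for the first position where a valid JSON object begins.

```python
import json


def extract_json(text: str):
    """Return parsed JSON from a model reply, tolerating surrounding prose."""
    # Fast path: the reply is already pure JSON.
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass

    # Slow path: try to decode an object starting at each opening brace.
    decoder = json.JSONDecoder()
    for start, char in enumerate(text):
        if char != "{":
            continue
        try:
            parsed, _end = decoder.raw_decode(text, start)
            return parsed
        except json.JSONDecodeError:
            continue
    raise ValueError("no JSON object found in model output")
```

A caller would typically validate the returned object against the expected schema before using it, since a syntactically valid object can still be missing fields.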

What We Learned:- Governance isn't a feature you bolt on at the end. Building the HITL engine and constitutional constraints first made every subsequent AI feature cleaner to implement — the guardrails were already there. When you design the safety layer as infrastructure rather than afterthought, the rest of the system builds around it naturally.

We also learned that the most impressive demos aren't the ones with the most features. They're the ones where every feature actually works and connects to every other feature. A payment released on Solana that simultaneously updates the MongoDB audit log, triggers an ElevenLabs notification, and appears in the guardrail feed — that coherence is harder to build than any individual feature, and it's what makes the system feel real.
