Inspiration

“Financial analysis is still manual, fragmented, and slow — despite the data being digital.”

Financial analysts are expected to make fast, high-stakes investment decisions, but most of their time is spent collecting and cleaning data from filings, PDFs, spreadsheets, and news instead of generating insights. Analysts spend up to 70% of their time on manual data work, leading to:

Delayed investment decisions

Focus on low-value tasks instead of strategic thinking

Lack of institutional-grade research access for smaller firms and individual investors

As a result: Investment decisions are delayed Analysts focus on low-value tasks instead of strategic thinking Smaller firms and individual investors lack access to institutional-grade research capabilities There is a clear need for an AI-powered system that automates financial data gathering, analysis, and reporting, enabling analysts to focus on insights rather than information hunting.

Solution: An AI-powered system that automates financial data gathering, analysis, and reporting, allowing analysts to focus on insights rather than information hunting.

What it does

FinSight Co-Pilot is an AI-powered multi-agent platform that delivers institutional-grade financial analysis in seconds. Built with Google AI Studio and Gemini 3, it automates complex workflows, turning raw financial data into structured, decision-ready insights through a simple chat interface.

Specialized Agents

Document Intelligence Agent: Reads 10-Ks, earnings transcripts, and financial reports to extract metrics, risks, and disclosures

Market Analysis Agent: Pulls real-time prices, fundamentals, and financial indicators, explaining their significance

Investment Thesis Agent: Generates complete buy/sell reports with bull/bear cases, catalysts, and price targets

Risk Assessment Agent: Quantifies company risk and flags hidden red signals across multiple categories

Sentiment Analysis Agent: Analyzes news, earnings calls, and text to detect market sentiment shifts

Orchestrator Agent: Coordinates all agents, routes queries intelligently, and combines outputs into a unified analysis

Impact: Reduces hours of manual work into seconds and makes professional-grade research accessible to everyone.

How we built it

FinSight Co-Pilot uses a multi-agent architecture on Google Cloud, orchestrated by Gemini 3 for institutional-grade insights.

Tech Stack

Frontend: Streamlit + Plotly (interactive dashboards & charts)

AI/ML: Gemini 3 + specialized agents (market, document, sentiment, risk, report)

Data Sources: Yahoo Finance, SEC EDGAR, financial news

Cloud: Firestore (chat/session), BigQuery (analytics), Cloud Storage (PDFs), Cloud Pub/Sub (notifications), Cloud Run (deployment)

Backend: Python 3.11+, Pandas, PyPDF2, BeautifulSoup, lxml Key Features:

Natural Language Queries: Ticker extraction, intent classification, context-aware responses

Real-Time Market Data: Live prices, historical charts, peer comparisons, analyst consensus

SEC Filing Intelligence: Automated retrieval, XBRL parsing, multi-year trend analysis

Interactive Dashboard: Watchlist tracking, visual metrics, responsive layout

Cloud-Native Persistence: Firestore, BigQuery, Cloud Storage, Pub/Sub notifications

Professional Reports: Bull/bear case analysis, investment thesis, Markdown export

Development Approach:

Modular design with clear agent responsibilities

API abstraction and caching to reduce redundant calls

Error handling with retries & exponential backoff

Responsive UI with progress indicators

Docker-based containerization for consistent deployment

Agentic Flow: A user query (e.g., "Generate investment thesis for Tesla") follows a 6-step process:

User Query Extract Tickers (Gemini LLM call) Classify Intent (Gemini LLM call) Route to Agent (e.g., ReportAgent) Fetch Data (e.g., Yahoo Finance, Yahoo News) AI Analysis (Gemini generates structured thesis)

Challenges we ran into

Several technical challenges were overcome during development:

Rate Limiting: Gemini's API occasionally returned 429 errors. This was mitigated using a resilient Singleton GeminiClient with exponential backoff (4 retries) and a PyPDF2 text extraction fallback for document analysis, preventing complete application failures.

Agent Personas: Achieving high-quality, consistent output required carefully crafted system instructions for each agent, establishing specialized personas (e.g., "You are an expert risk analyst") to dramatically improve results over generic prompts.

Real Data Complexity: Integrating live data from Yahoo Finance and SEC EDGAR introduced complexity. Yahoo Finance sometimes returned error dictionaries; this was handled with 3x retry logic and meaningful-key validation to manage transient API failures.

Streamlit State Management: Navigating between pages in Streamlit required careful session state handling. A pre-render target_page pattern was implemented to resolve issues with modifying widget keys after instantiation.

Context Window Strategy: Dealing with massive SEC XBRL data required a strategic approach to context windows. Only the 13 most important GAAP metrics, limited to 3 years of history, are extracted.

PDF fallbacks truncate content to 60K characters, prioritizing context quality over quantity. Structured Prompts: It was found that providing real data in a structured format and requesting structured output (e.g., numbered sections, tables) produced significantly better results than open-ended prompts.

API Rate Limiting & Reliability Challenge: Both Yahoo Finance and SEC EDGAR APIs have strict rate limits and occasional transient failures. Solution: Implemented exponential backoff retry logic (4 retries with increasing delays) in the Gemini client and data providers. Added caching mechanisms to reduce redundant calls.

XBRL Data Complexity Challenge: SEC XBRL data is highly nested and inconsistent across companies. Different companies use different GAAP tags, and data formats vary (USD, USD/shares, shares). Solution: Built a robust XBRL parser that handles multiple unit types, filters for annual vs. quarterly data, and gracefully degrades when standard tags are missing.

Multimodal PDF Processing Challenge: Gemini's file API requires specific handling for PDFs, and large documents can exceed token limits. Solution: Implemented dual-path processing: use Gemini's native file upload API first, fall back to PyPDF2 text extraction with truncation (60K chars) if needed. This ensures reliable document analysis regardless of file size.

GCP Service Initialization Challenge: Local development requires gcloud auth application-default login, while Cloud Run deployment uses service accounts. Firestore database naming and BigQuery dataset creation needed automated setup. Solution: Created a comprehensive GCP client initialization system that:

Auto-creates BigQuery datasets and tables with proper schemas Handles Firestore database selection Sets up Pub/Sub topics and subscriptions Provides graceful degradation when services are unavailable

Agent Orchestration Complexity Challenge: Coordinating multiple agents to answer complex queries (e.g., "Compare NVDA and AMD on growth, then assess risks") required sophisticated routing logic. Solution: Developed an intent classification system using Gemini that maps queries to agent workflows. Implemented callback mechanisms for real-time status updates during multi-agent processing.

State Management in Streamlit Challenge: Streamlit's rerun model made it difficult to maintain analysis state between user interactions. Solution: Leveraged st.session_state extensively with prefixed keys (e.g., ca_metrics, ca_ticker) to persist data across reruns. Used conditional rendering to avoid re-running expensive API calls.

UI/UX for Financial Professionals Challenge: Creating a design that feels professional and institutional-grade while using Streamlit's widget constraints. Solution: Extensive custom CSS with gradient metric cards, color-coded agent status indicators, and responsive layouts. Implemented light-themed sidebar for readability.

BigQuery Schema Evolution Challenge: As features evolved, new columns needed to be added to BigQuery tables without breaking existing data. Solution: Implemented schema migration logic that detects missing columns and updates table schemas dynamically while preserving existing data.

Error Messaging & Debugging Challenge: Financial data APIs return inconsistent error formats (sometimes nested dicts, sometimes error keys within valid responses). Solution: Added defensive checks throughout data providers to detect both explicit errors and empty/invalid data. Provide user-friendly error messages while logging technical details.

Deployment Configuration Challenge: Environment variables needed for both local development (.env) and Cloud Run deployment. Solution: Used python-dotenv for local development and Cloud Run environment variable injection for production. Created comprehensive setup documentation with required GCP APIs and permissions.

Accomplishments that we're proud of

Accomplishments We Are Proud Of End-to-End Multi-Agent AI System We built a fully functional orchestrator + 5 specialized agents architecture that rivals institutional trading platforms. The agents collaborate seamlessly to answer complex financial queries.

Real SEC Filing Analysis Successfully integrated SEC EDGAR's XBRL API to extract structured financial facts going back multiple years. This provides audited, official financial data rather than relying solely on third-party aggregators.

Professional-Grade UI Created a Streamlit application that doesn't look like Streamlit - institutional design with custom CSS, gradient cards, responsive layouts, and smooth user flows. The UI can compete with commercial financial platforms.

Cloud-Native Architecture Fully integrated 4 GCP services (Firestore, BigQuery, Cloud Storage, Pub/Sub) with automated infrastructure setup, proper error handling, and seamless deployment to Cloud Run.

Intelligent Query Understanding The orchestrator successfully extracts tickers from natural language ("Is Apple overvalued?" → AAPL) and classifies intent to route queries appropriately. This creates a ChatGPT-like experience for financial analysis.

Comprehensive Financial Coverage Market Data: 30+ metrics per company (valuation, profitability, growth, returns) Document Intelligence: 10-K, 10-Q, 8-K parsing with AI-powered Q&A Risk Assessment: Multi-dimensional risk scoring across 6 categories Sentiment Analysis: News aggregation with bull/bear factor extraction Investment Reports: Institutional-grade theses with price targets

Robust Error Handling Built a resilient system with retry logic, graceful degradation, and informative error messages. The app continues functioning even when individual data sources fail.

Real-Time User Feedback Implemented agent status callbacks that show users exactly what's happening ("Market Agent: Fetching data for AAPL...") during long-running analysis tasks.

Peer Comparison Features Developed side-by-side comparison tables and charts for multiple companies across 15+ metrics, enabling quick competitive analysis.

Production-Ready Deployment Created a Dockerfile and Cloud Run configuration that makes the app deployable with a single command. Included comprehensive README with setup instructions.

Quantified Impact:

10x Faster Research 60% Time Saved 6 Specialized AI Agents 41 Financial Metrics Why it stands out:

Multi-agent architecture, not a single-prompt chatbot. Uses "Real data from SEC EDGAR + Yahoo Finance (not hallucinated)". Intelligent routing via Gemini. Resilient design with retry logic and fallbacks. Professional, institutional-quality output. Multimodal processing of PDFs (charts and tables). "8 interactive pages -- full-featured app, not just a chat window". Gemini 3.0 Flash Usage:

Ticker extraction and intent classification. Financial analysis with real-time data context. Multimodal PDF document understanding. "6 specialized agent personas via system instructions".

What we learned

Adopting the A2A protocol allowed for a cleaner separation of concerns. The Orchestrator transitioned into a "Root Agent," while specialized entities (Document, Market, Sentiment, Report) became independent services. What We Learned Technical Learnings Gemini API Capabilities

Gemini 3 Flash is extremely fast and cost-effective for financial analysis tasks The multimodal file API handles PDFs natively, eliminating need for complex preprocessing System instructions are crucial for maintaining agent personalities and output consistency Temperature settings matter: 0.0-0.3 for factual analysis, 0.5-0.7 for creative report writing Financial Data API Challenges

Yahoo Finance is unreliable - needs retry logic and error handling for production use SEC EDGAR XBRL data is powerful but complex - different companies use different tags User-Agent headers are mandatory for SEC API compliance CIK (Central Index Key) lookups add latency - caching is essential Streamlit Advanced Patterns

st.session_state is essential for complex apps but requires careful key management Custom CSS can transform Streamlit's default look completely Lazy loading with @st.cache_resource dramatically improves performance Conditional rendering prevents unnecessary API calls on reruns

  1. GCP Service Integration

Firestore in Native mode requires database selection - default database may not exist BigQuery schemas should be created programmatically to avoid manual setup Pub/Sub subscriptions need BigQuery write permissions and proper service accounts Cloud Storage requires bucket creation and IAM permissions for service accounts Multi-Agent Orchestration

Intent classification is harder than it seems - requires clear category definitions Agent collaboration needs well-defined interfaces and data contracts Callback functions provide essential user feedback for long operations Context management is crucial - agents need to share relevant data efficiently Product & Design Learnings Financial Professional UX Expectations

Speed matters - users expect sub-3-second responses for market data Data provenance is critical - always cite sources (Yahoo Finance, SEC, Gemini analysis) Professional design builds trust - gradients, clean layouts, and consistent color schemes Error transparency - show what went wrong rather than generic "something failed" messages Information Hierarchy

Metrics should be scannable - use cards, color coding, and clear labels Progressive disclosure - show summary first, details on demand Comparison is king - financial analysts always want peer comparisons Visual > Tables - charts communicate trends faster than raw numbers AI in Finance Use Cases

Document Q&A is the killer feature - saves hours of manual SEC filing review Sentiment analysis provides value when combined with quantitative data Risk assessment benefits from AI's ability to synthesize multiple data sources Investment theses need structured templates - pure LLM output is too variable Process Learnings Iterative Development

Start with one agent working well before adding complexity Frequent testing with real queries exposes edge cases faster than unit tests UI polish should come last - get functionality right first API-First Design

Abstract data sources early - switching between APIs becomes trivial Mock data for development avoids hitting rate limits Error contracts should be consistent across all data providers Documentation Importance

Good README is essential for GCP projects - many setup steps required Code comments for financial logic - formulas and ratios need context Environment variable examples prevent deployment headaches Cloud Development Workflow

Test locally with gcloud auth application-default login before deploying Cloud Run logs are essential for debugging production issues Incremental cloud testing - test each GCP service individually before integrating

🚀 What's Next for FinSight Co-Pilot Immediate Enhancements (Next 2-4 Weeks) Enhanced Data Sources

Bloomberg API Integration: Add institutional-grade data for enterprise customers Alpha Vantage: Forex, crypto, and international market data Earnings Call Transcripts: Automatic ingestion and sentiment analysis Social Media Sentiment: Reddit, Twitter/X sentiment tracking for meme stocks Advanced Analytics

Portfolio Optimization: Modern Portfolio Theory (MPT) implementations Monte Carlo Simulations: Risk modeling and scenario analysis Backtesting Engine: Test investment strategies against historical data Correlation Analysis: Identify portfolio diversification opportunities Collaboration Features

Team Workspaces: Shared watchlists and analysis reports Commenting System: Annotate reports and charts Report Versioning: Track analysis changes over time Slack/Teams Integration: Push alerts to communication platforms Medium-Term Features (1-3 Months) Custom Agent Builder

Agent Marketplace: Users can create and share custom analysis agents Workflow Automation: Chain multiple agents for complex analysis pipelines Custom Data Connectors: Import proprietary datasets Agent Fine-Tuning: Customize agent behavior with user-provided examples Real-Time Capabilities

Live Price Streaming: WebSocket integration for real-time quotes Alert System: Price targets, news mentions, SEC filing notifications Earnings Calendar: Automatic analysis on earnings release dates Market Hours Dashboard: Pre-market and after-hours activity tracking Expanded Document Intelligence

Quarterly Earnings Analysis: Automatic 10-Q comparisons quarter-over-quarter Conference Call Analysis: Transcribe and analyze earnings calls Credit Reports: Moody's/S&P rating integration Regulatory Filings: 13F, 13D/G tracking for institutional investor moves Long-Term Vision (3-6 Months) Institutional Features

Private Deployment: On-premise or VPC deployment for compliance Audit Trails: Complete activity logging for regulatory compliance Role-Based Access Control (RBAC): Team permission management API Access: Programmatic access for algorithmic trading systems White-Label Solution: Customizable branding for financial institutions AI Advancements Predictive Analytics: ML models for price prediction (with appropriate disclaimers) Anomaly Detection: Automatic red flag identification in financial statements Global Expansion International Markets: Support for LSE, TSE, SSE, NSE stock exchanges Multi-Currency: Real-time forex conversion and reporting Regional Compliance: GDPR, SOC 2, FINRA compliance certifications Localization: Multi-language support for global users

Monetization & Scale Freemium Model: 10 queries/day free, unlimited with subscription Professional Tier: Advanced analytics, real-time data, team features ($99/mo) Enterprise Tier: Custom deployment, SLA guarantees, dedicated support ($999/mo) API Pricing: Pay-per-query model for programmatic access Research & Innovation Experimental Features

Quantum-Resistant Security: Future-proof encryption for sensitive financial data Decentralized Data: Blockchain-based audit trails for analysis provenance Voice Interface: Natural language voice queries for hands-free analysis

Emotional Intelligence: Detect analyst bias in reports and news Community & Ecosystem

Open-source contributions: agent framework, financial data connectors, pre-built templates, tutorials & certifications

Mission: FinSight Co-Pilot democratizes institutional-grade financial analysis, giving individual investors, analysts, and small firms access to AI-powered insights previously reserved for Wall Street.

Built With

Share this project:

Updates