💡 Inspiration
Retail investors are drowning in data but starving for insights. We noticed that most traders juggle four different tabs: a charting platform, a news terminal, a social feed, and an AI chat. This fragmentation leads to "analysis paralysis". We wanted to build a single, unified interface where a "Live Agent" actually looks at your screen, listens to your voice, and talks back to you like a real human analyst.
🚀 What it does
FinAgent is a multimodal "Live Financial Analyst" that leverages Gemini 2.0 Flash's low-latency capabilities.
- Vision: It can "see" and analyse technical stock charts instantly, reading indicators and patterns (RSI, MACD, support/resistance levels).
- Audio: It features bi-directional, real-time audio interaction. You don't type; you just talk to the market.
- Intelligence: It parses real-time financial news through NewsAPI and connects the dots between global events and local market price action.
🛠️ How we built it
The project is built with a modern, scalable stack:
- Backend: Python with FastAPI, utilising WebSockets for real-time data streaming.
- Frontend: React and TypeScript with a premium, high-performance UI.
- AI Core: Google Gemini 2.0 Flash Multimodal Live API for synchronised vision and audio.
- Infrastructure: Containerised using Docker and deployed on Google Cloud Platform (Cloud Run) for high availability.
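The heart of the backend is a relay loop that forwards audio/vision chunks between the browser WebSocket and the model session. As a rough illustration of that pattern (not our production code — the queue-based relay and the `b"echo:"` stand-in for the model reply are placeholders for the FastAPI WebSocket and the Live API session), here is a self-contained asyncio sketch:

```python
import asyncio


async def relay(client_in: asyncio.Queue, model_out: asyncio.Queue) -> None:
    """Forward chunks from the client to the model until the client closes.

    In the real service, each chunk would be sent to the Live API session
    and the model's streamed replies pushed back over the WebSocket.
    """
    while (chunk := await client_in.get()) is not None:
        await model_out.put(b"echo:" + chunk)  # placeholder for a model reply
    await model_out.put(None)  # propagate end-of-stream


async def demo() -> list:
    client_in: asyncio.Queue = asyncio.Queue()
    model_out: asyncio.Queue = asyncio.Queue()
    for chunk in (b"\x00\x01", b"\x02\x03"):
        client_in.put_nowait(chunk)
    client_in.put_nowait(None)  # client hangs up
    await relay(client_in, model_out)
    replies = []
    while (reply := await model_out.get()) is not None:
        replies.append(reply)
    return replies


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In production this loop runs as two concurrent tasks (client→model and model→client) so neither direction blocks the other.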
🧠 Challenges we faced
Syncing real-time bi-directional audio while maintaining low latency was the biggest hurdle. We had to optimise the WebSocket buffers to ensure the agent’s responses felt instantaneous. Mapping visual tokens from complex financial charts to accurate reasoning also required significant prompt engineering.
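Much of that buffer tuning came down to capping how much audio we accumulate before flushing a WebSocket message. A toy version of the framing logic is below — the 640-byte default is an illustrative figure (~20 ms of 16 kHz, 16-bit mono PCM), not necessarily the value we shipped:

```python
def frame_audio(pcm: bytes, frame_bytes: int = 640) -> list:
    """Split a PCM byte stream into fixed-size frames.

    640 bytes is roughly 20 ms of 16 kHz, 16-bit mono audio. Smaller
    frames lower perceived latency but cost more messages per second;
    the last frame may be shorter than frame_bytes.
    """
    return [pcm[i:i + frame_bytes] for i in range(0, len(pcm), frame_bytes)]
```

For example, 1600 bytes of audio yields two full 640-byte frames plus a 320-byte tail.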
🏆 Accomplishments that we're proud of
We successfully built a system where the AI can "think" while "listening". The transition from seeing a chart to speaking about it happens in under 500ms, making it feel like a truly live conversation.
📖 What we learned
We learnt the immense power of the Multimodal Live API. It’s not just about chat; it’s about grounding AI in visual and auditory reality. We also gained deep insights into optimising GCP deployment for real-time agentic workflows.
⏩ What's next for FinAgent
Next, we plan to integrate direct portfolio execution APIs and expand the agent’s vision to monitor multiple screens simultaneously, providing a true "cockpit" experience for retail traders.
Built With
- cloud-run
- docker
- fastapi
- gemini-2.0-flash
- google-cloud
- multimodal-live-api
- newsapi
- python
- react
- typescript
- websockets