Inspiration
High-frequency trading firms have long dominated markets with algorithmic speed, but AI-powered sentiment analysis has remained largely out of reach for them, because LLM inference is too slow to fit inside a sub-millisecond trading loop. At the same time, breaking news can move a stock by double digits in seconds, and technical analysis alone cannot see that coming. We wanted to bridge that gap: build a system that continuously processes live financial news in the background, so that by the time a trade decision needs to be made, the sentiment signal is already computed and ready. Logian is designed as a drop-in sentiment layer for a larger trading engine. It is not a replacement for that engine, but a way to let AI-driven news analysis run alongside HFT strategies without sitting in the critical path.
What it does
Logian is a real-time stock sentiment analysis engine. It continuously scrapes breaking financial news from Yahoo Finance, generates semantic embeddings, and runs each headline through FinBERT (a BERT model fine-tuned on financial text), producing a confidence-weighted sentiment score from -1.0 to +1.0 per ticker. Those scores are always fresh and always ready, so a trading engine can query the current sentiment signal for any ticker at any time without waiting for inference to happen. It also ships a live React dashboard for human-readable monitoring (real-time score cards, BUY/HOLD/SELL labels, historical score charts), a CLI for one-off lookups, and a semantic search tool that lets you query stored articles by natural language across any ticker.
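To make the idea of a confidence-weighted score concrete, here is a minimal sketch of one way per-headline class probabilities could be rolled up into a single number per ticker. The formula is an assumption for illustration: the `HeadlineProbs` container, the use of the winning-class probability as the confidence weight, and the 0.2 BUY/SELL cutoff are all hypothetical and are not taken from Logian's code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HeadlineProbs:
    """Class probabilities for one headline (hypothetical container)."""
    positive: float
    negative: float
    neutral: float


def headline_score(p: HeadlineProbs) -> float:
    # Signed polarity in [-1, 1]; neutral mass pulls the score toward 0.
    return p.positive - p.negative


def ticker_score(headlines: List[HeadlineProbs]) -> float:
    """Confidence-weighted mean of headline scores, clamped to [-1, 1]."""
    if not headlines:
        return 0.0
    # Assumed confidence: the probability of the winning class.
    weights = [max(p.positive, p.negative, p.neutral) for p in headlines]
    total = sum(weights)
    if total == 0:
        return 0.0
    score = sum(w * headline_score(p) for w, p in zip(weights, headlines)) / total
    return max(-1.0, min(1.0, score))


def signal(score: float, threshold: float = 0.2) -> str:
    # The 0.2 cutoff is an illustrative assumption, not Logian's actual value.
    if score >= threshold:
        return "BUY"
    if score <= -threshold:
        return "SELL"
    return "HOLD"
```

With this weighting, a confident headline moves the ticker score more than an ambiguous one, so a single uncertain classification does not flip the label on its own.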
How we built it
The backend is a FastAPI server with asyncio background workers that run a continuous scrape → embed → store → score pipeline, decoupled from the request/response cycle entirely. Sentence embeddings (all-MiniLM-L6-v2, 384-dim) are stored in Actian VectorAI DB over gRPC for fast vector similarity search. FinBERT handles sentiment classification and runs in thread pool executors, so inference never blocks the event loop. The frontend is React + TypeScript + Vite, connected via WebSockets for live score push updates, with Recharts for bar charts and sentiment timeline graphs. The architecture is explicitly designed so that the sentiment layer runs independently — a trading engine could poll /api/scores and get sub-millisecond responses because the heavy AI work already happened in the background.
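The decoupling described above can be sketched with the standard library alone. This is not Logian's actual code: `slow_inference` is a stand-in for a blocking FinBERT call, and `scores`, `score_worker`, and `get_score` are hypothetical names. In a FastAPI app the worker would presumably be started from the lifespan handler, and the read path would sit behind the scores endpoint.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Optional

# Shared in-memory state: written by the worker, read by request handlers.
scores: Dict[str, float] = {}


def slow_inference(headline: str) -> float:
    """Stand-in for a blocking model call (hypothetical toy rule)."""
    time.sleep(0.05)
    return 1.0 if "beats" in headline else -1.0


async def score_worker(queue: asyncio.Queue, executor: ThreadPoolExecutor) -> None:
    loop = asyncio.get_running_loop()
    while True:
        ticker, headline = await queue.get()
        try:
            # The blocking call runs in a thread, so the event loop stays free.
            scores[ticker] = await loop.run_in_executor(
                executor, slow_inference, headline
            )
        finally:
            queue.task_done()


def get_score(ticker: str) -> Optional[float]:
    # Read path: a plain dict lookup, with no inference involved.
    return scores.get(ticker)


async def demo() -> Dict[str, float]:
    queue: asyncio.Queue = asyncio.Queue()
    with ThreadPoolExecutor(max_workers=2) as executor:
        worker = asyncio.create_task(score_worker(queue, executor))
        await queue.put(("AAPL", "Apple beats earnings estimates"))
        await queue.put(("XYZ", "XYZ misses guidance"))
        await queue.join()  # wait until both headlines are scored
        worker.cancel()
        await asyncio.gather(worker, return_exceptions=True)
    return dict(scores)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The design choice worth noting is that readers never wait on the model: a caller that asks for a ticker before it has been scored simply gets `None` and can treat that as "no signal yet".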
Challenges we ran into
- Keeping inference off the critical path: The whole point of Logian is that AI doesn't slow down trading decisions. Structuring the pipeline so that embedding and FinBERT inference happen asynchronously in background workers, never blocking an API response, required careful use of FastAPI's lifespan, thread pool executors, and shared in-memory state.
- Rate limiting and scraper fragility: Yahoo Finance has no public API, so we built an RSS + HTML fallback scraper with retry logic and exponential backoff. Pages change structure without warning, and responses can be throttled under load.
- Integrating Actian VectorAI DB: The gRPC client was new to all of us. Getting batch upserts, collection health checks, and cosine similarity queries working correctly took significant trial and error.
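The retry-with-backoff and fallback approach mentioned for the scraper can be sketched roughly as follows. This is a generic illustration, not the project's scraper: `with_backoff`, `fetch_with_fallback`, and the delay values are hypothetical, and the fetch functions are passed in as plain callables so the sketch stays independent of any HTTP library.

```python
import random
import time
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def with_backoff(
    fetch: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(), retrying on any exception with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fetch()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of retries: surface the last error
            # Delay doubles each attempt (0.5s, 1s, 2s, ...), capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Up to 10% jitter so parallel workers do not retry in lockstep.
            sleep(delay + random.uniform(0, 0.1 * delay))
    raise AssertionError("unreachable")


def fetch_with_fallback(sources: Sequence[Callable[[], T]], **kwargs) -> T:
    """Try each source in order (e.g. RSS first, then HTML), with retries."""
    last_error: Optional[Exception] = None
    for source in sources:
        try:
            return with_backoff(source, **kwargs)
        except Exception as exc:
            last_error = exc
    raise RuntimeError("all sources failed") from last_error
```

Passing `sleep` as a parameter is only there to make the function easy to test without real waiting; production code would leave it at the default.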
Accomplishments that we're proud of
- Building an architecture where AI inference is genuinely decoupled from the query path: sentiment scores are precomputed and served instantly, the way a real trading engine would need them.
- A fully end-to-end working pipeline, from raw Yahoo Finance HTML to a live BUY/SELL/HOLD signal, built in a single weekend.
- Semantic search across all stored articles for any ticker, powered by vector similarity in Actian VectorAI DB rather than keyword matching alone.
- A clean async backend that stays responsive even while FinBERT is running inference on dozens of articles in the background.
What we learned
- How domain-specific NLP models like FinBERT dramatically outperform generic sentiment models on financial text, and why that specificity matters for a trading signal.
- How vector databases work: embedding space, cosine similarity, L2 normalization, and when semantic search is more useful than keyword search.
- FastAPI's lifespan pattern for loading heavy ML models once at startup, and how to safely share state between background workers and WebSocket broadcast loops.
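As a small illustration of the vector-search concepts listed above, the sketch below shows why L2 normalization matters: once vectors have unit length, cosine similarity reduces to a plain dot product. It uses pure Python and tiny hand-written vectors. The `top_k` helper is hypothetical and only mimics what a vector database does at much larger scale.

```python
import math
from typing import Dict, List, Sequence, Tuple


def l2_normalize(v: Sequence[float]) -> List[float]:
    """Scale v to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        return list(v)
    return [x / norm for x in v]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    # For unit vectors, the dot product is exactly the cosine of the angle.
    return dot(l2_normalize(a), l2_normalize(b))


def top_k(
    query: Sequence[float], docs: Dict[str, Sequence[float]], k: int = 3
) -> List[Tuple[str, float]]:
    """Brute-force nearest neighbours by cosine similarity (hypothetical helper)."""
    q = l2_normalize(query)
    scored = [(doc_id, dot(q, l2_normalize(vec))) for doc_id, vec in docs.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]
```

Because direction rather than magnitude decides the ranking, two articles that phrase the same event differently can still land close together, which is what keyword search cannot offer.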
What's next for Logian
- Trading engine integration: Expose a lightweight, low-latency REST or gRPC endpoint purpose-built for consumption by an HFT or algorithmic trading system, returning the current sentiment vector for any ticker in under a millisecond.
- More data sources: SEC filings, Reddit (r/wallstreetbets), Twitter/X, and earnings call transcripts to enrich the signal beyond headlines.
- Price correlation: Overlay sentiment scores against historical price data to validate and calibrate signal quality before live use.
- Alerting on sentiment shifts: Detect when a ticker's score crosses a threshold or moves sharply within a short window, the kind of event that often precedes a large price move.
- Persistent time-series storage: Replace in-memory score history with a proper time-series DB so signals survive restarts and can be backtested.
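The alerting idea above could look something like the following sketch. Since this feature is only planned, everything here is an assumption: the `ShiftDetector` class, its alert names, and the default threshold, move size, and window length are hypothetical placeholders.

```python
from collections import deque
from typing import Deque, List, Tuple


class ShiftDetector:
    """Flags threshold crossings and sharp moves for one ticker (hypothetical)."""

    def __init__(
        self,
        threshold: float = 0.5,
        max_move: float = 0.4,
        window_seconds: float = 60.0,
    ) -> None:
        self.threshold = threshold
        self.max_move = max_move
        self.window_seconds = window_seconds
        self._history: Deque[Tuple[float, float]] = deque()  # (timestamp, score)

    def update(self, timestamp: float, score: float) -> List[str]:
        alerts: List[str] = []
        previous = self._history[-1][1] if self._history else None
        self._history.append((timestamp, score))
        # Drop readings older than the window; the newest one always remains.
        while timestamp - self._history[0][0] > self.window_seconds:
            self._history.popleft()
        # Crossing: magnitude was below the threshold and is now at or above it.
        # Note the previous reading is used even if it fell outside the window.
        if previous is not None and abs(previous) < self.threshold <= abs(score):
            alerts.append("threshold_crossed")
        # Sharp move: large change versus the oldest reading still in the window.
        if abs(score - self._history[0][1]) >= self.max_move:
            alerts.append("sharp_move")
        return alerts
```

A caller would keep one detector per ticker and feed it each new score together with its timestamp, for example `detector.update(time.time(), score)`.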
Built With
- actian
- css
- html
- javascript
- python
- react
- typescript
- vite