Our video got cropped at the last minute, so we would be grateful if you could take a look at our website: https://inferencetimecompute.vercel.app/
Inspiration
Finance has stayed locked behind a wall of credentialed humans trading on minute, decaying edges. Every other domain has fallen to models that out-think and out-scale the people who used to dominate it. Quantitative research is next.
What it does
Hidden Variables is an artificial research org, not a chatbot. A goal goes in. A society of agents plans, writes code, backtests, critiques itself, and converges on a thesis over long horizons measured in hours. Each agent runs its own headless model instance. A real-time steering interface lets a human drop into any agent mid-task, redirect it, and watch the rest of the society adjust. The output is a research process you can audit step by step, not a single reply.
How we built it
We fine-tuned our own model on financial reasoning using QDoRA instead of standard LoRA, taking a 7B Qwen base past the performance of models twice its size on hard quant tasks. We built our own benchmark to prove it: a held-out set of financial reasoning problems scored for correctness and reasoning quality, not just the final answer. That model is what every agent in the society runs.
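To make "scored for correctness and reasoning quality" concrete, here is a minimal sketch of one way such a blended score could be computed. The `Graded` schema, the 50/50 default weighting, and the function names are illustrative assumptions, not the benchmark's actual scoring code.

```python
from dataclasses import dataclass


@dataclass
class Graded:
    """One graded benchmark problem (hypothetical schema)."""
    answer_correct: bool    # did the final answer match the reference?
    reasoning_score: float  # grader's rating of the reasoning, in [0, 1]


def problem_score(item: Graded, reasoning_weight: float = 0.5) -> float:
    """Blend final-answer correctness with reasoning quality (assumed weighting)."""
    if not 0.0 <= item.reasoning_score <= 1.0:
        raise ValueError("reasoning_score must be in [0, 1]")
    correctness = 1.0 if item.answer_correct else 0.0
    return (1.0 - reasoning_weight) * correctness + reasoning_weight * item.reasoning_score


def benchmark_score(items: list[Graded], reasoning_weight: float = 0.5) -> float:
    """Average the per-problem scores over the held-out set."""
    if not items:
        raise ValueError("no graded problems")
    return sum(problem_score(i, reasoning_weight) for i in items) / len(items)
```

With this weighting, a model that lands the right answer through weak reasoning still loses points, which is the property the benchmark description asks for.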
Each agent executes headless inside a sandboxed, containerized environment, so it can write and run real code against real market data without touching the host. We treat compute as close to infinite as we can make it: agents spawn sub-agents, run in parallel, and burn tokens on exploration most systems would consider wasteful, because on long horizons that exploration is where the edge shows up.
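As a rough illustration of the sandboxing idea, the sketch below builds a locked-down `docker run` command for one agent job. The function name, the image, the resource limits, and the assumption that market data is mounted as read-only files (so the container needs no network) are all hypothetical choices for the example, not a description of the actual setup.

```python
def sandbox_command(image: str, workdir: str, script: str,
                    memory: str = "1g", cpus: float = 1.0) -> list[str]:
    """Build a `docker run` argument list for one isolated agent job."""
    return [
        "docker", "run",
        "--rm",                       # remove the container when the job exits
        "--network", "none",          # no network access from inside the sandbox
        "--memory", memory,           # cap memory so one agent cannot starve others
        "--cpus", str(cpus),          # cap CPU share per agent
        "--read-only",                # root filesystem is read-only
        "--tmpfs", "/tmp",            # scratch space that never touches the host
        "-v", f"{workdir}:/work:ro",  # agent code and data, mounted read-only
        "-w", "/work",
        image,
        "python", script,
    ]


cmd = sandbox_command("python:3.12-slim", "/data/run1", "backtest.py")
```

The resulting list can be handed to `subprocess.run(cmd, capture_output=True, timeout=...)`, and because each job is an independent container, many of them can be launched in parallel.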
We layered Algo Traders on top for execution and backtesting, and built a full traceability system so every plan, tool call, and revision is logged and replayable. A long-horizon run isn't a black box. It's a tree you can walk. STEER, our real-time interface, lets a human inject guidance into an agent's reasoning while it's still mid-task instead of waiting for it to finish.
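The "tree you can walk" can be pictured with a small sketch like the one below. `TraceNode`, its fields, and the event kinds are assumptions made for illustration; the real trace schema is custom and is not reproduced here.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class TraceNode:
    """One logged event in a run (hypothetical schema)."""
    agent: str
    kind: str    # e.g. "plan", "tool_call", "revision"
    detail: str
    children: list[TraceNode] = field(default_factory=list)

    def add(self, agent: str, kind: str, detail: str) -> TraceNode:
        """Record a child event and return it so deeper events can hang off it."""
        child = TraceNode(agent, kind, detail)
        self.children.append(child)
        return child

    def walk(self, depth: int = 0) -> Iterator[tuple[int, TraceNode]]:
        """Replay the run depth-first, in the order events were recorded."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


root = TraceNode("lead", "plan", "test momentum thesis")
run = root.add("coder", "tool_call", "run backtest.py")
run.add("critic", "revision", "widen lookback window")
root.add("coder", "tool_call", "plot returns")

replay = [(depth, node.kind) for depth, node in root.walk()]
```

Walking the tree yields every plan, tool call, and revision together with its depth, which is enough to render an indented, step-by-step replay of a run.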
Challenges we ran into
Sandboxing was the first wall. Getting Docker to run headless, parallel code execution safely in the cloud, fast enough not to bottleneck the whole society, took real iteration. Traceability got harder once dozens of agents were running long-horizon tasks in parallel. We built our own trace schema instead of bolting on an existing one. QDoRA fine-tuning on Qwen 7B took several failed runs before a configuration beat larger baselines on our financial benchmark specifically, not general ones. Wiring STEER into a model already mid-inference, without breaking its state, was the hardest integration of the build.
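One simplified way to think about steering without breaking state is a guidance queue that the agent drains between steps, appending to its context rather than resetting it. The sketch below assumes that design; `SteerableAgent` and its methods are hypothetical, the model call is a placeholder, and injecting between steps is a simplification of what STEER does on a live agent.

```python
import queue


class SteerableAgent:
    """Toy agent that accepts human guidance between steps (illustrative only)."""

    def __init__(self, goal: str) -> None:
        self.context: list[str] = [f"goal: {goal}"]
        self.guidance = queue.Queue()  # thread-safe inbox of guidance strings

    def steer(self, message: str) -> None:
        """Called from the human-facing side; safe to call from another thread."""
        self.guidance.put(message)

    def _drain_guidance(self) -> None:
        """Fold any pending guidance into the context without discarding state."""
        while True:
            try:
                message = self.guidance.get_nowait()
            except queue.Empty:
                return
            self.context.append(f"human guidance: {message}")

    def step(self, i: int) -> None:
        self._drain_guidance()
        # Placeholder for a real model call that would read self.context.
        self.context.append(f"step {i}")


agent = SteerableAgent("evaluate a momentum strategy")
agent.step(1)
agent.steer("control for transaction costs")
agent.step(2)
```

Because guidance is appended rather than substituted, everything the agent did before the intervention stays in its context, and the next step sees both the history and the new instruction.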
Accomplishments that we're proud of
None of it is faked. The sandboxed execution runs real code. The traces are real traces. The QDoRA model beats baselines twice its parameter count on our own benchmark. STEER works on a live, running agent, not a scripted demo.
What we learned
Inference is the real bottleneck, not intelligence. The model was rarely the limiting factor. Orchestrating it across a society, over long horizons, with a human able to steer mid-flight, was. Finance punishes shortcuts faster than almost any other domain, which made it a brutal but honest test of whether multi-step agent systems hold up under real constraints instead of toy ones.
What's next for Hidden Variables
We're open-sourcing the core so anyone can run it with their own provider keys, whether to power a personal trading bot or get real investment research done without a desk of analysts. From there we're developing proprietary techniques on our own foundation models and packaging the enterprise version for institutions that still think this kind of research requires a building full of people.
Built With
- agent-sdk
- devin
- dora
- h100
- nextjs
- python
- qlora
- typescript

