Inspiration
Most AI assistants answer from memory. We wanted to build something that actually goes out and gets the answer: it opens real browsers, reads real pages, and returns structured data you can act on.
The specific problem: comparing products across multiple retailers is tedious. You open five tabs, scan prices and reviews, and try to remember what you saw two tabs ago. We wanted to reduce that to a single voice command.
What it does
Evoco takes a natural language command, typed or spoken, and dispatches a fleet of parallel browser agents to execute it. Say "find the best laptop under $800 from Amazon, Best Buy, and Newegg" and Evoco:
- Uses Nova 2 Lite to decompose that into a dependency graph of browser steps
- Launches parallel Nova Act agents, one per site, running concurrently
- Extracts structured JSON from each page using schema-validated act_get() calls
- Feeds all results to a Nova 2 Lite reasoning step that compares, ranks, and summarizes
- Streams every step live to the UI via WebSocket, so you watch the graph execute in real time
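The live streaming in the last step can be pictured as a small event bus: each completed step becomes a JSON message that is fanned out to every connected client. The sketch below is a stdlib-only illustration, not Evoco's actual code. `EventBus`, the message fields, and the step name are all hypothetical, and the WebSocket send itself is left out (in a FastAPI handler, each subscriber queue would be drained into one socket).

```python
import asyncio
import json


class EventBus:
    """Fans step events out to every subscriber (hypothetical helper)."""

    def __init__(self) -> None:
        self._subscribers: set = set()

    def subscribe(self) -> asyncio.Queue:
        # One queue per connected client; a WebSocket handler would drain it.
        queue: asyncio.Queue = asyncio.Queue()
        self._subscribers.add(queue)
        return queue

    def unsubscribe(self, queue: asyncio.Queue) -> None:
        self._subscribers.discard(queue)

    def publish(self, step_id: str, status: str, **data: object) -> None:
        # Serialize once, then hand the same message to every client.
        message = json.dumps({"step": step_id, "status": status, **data})
        for queue in self._subscribers:
            queue.put_nowait(message)
```

Keeping the bus separate from the transport means the executor only calls `publish()` and never needs to know how many clients are watching.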
A 3-site query that would take roughly 10 minutes sequentially completes in around 3.5 minutes, close to a 3x speedup.
How we built it
The backend is a FastAPI async server orchestrating three Nova models. Nova Sonic handles voice input with streaming transcription. Nova 2 Lite does all the reasoning: planning the task graph, re-planning when steps fail, and repairing malformed browser output. Nova Act drives real browsers, navigating, searching, and extracting structured data.
The execution engine is a DAG executor built on asyncio. Independent site branches run as concurrent tasks; steps within a branch are sequential. Browser sessions are managed by a semaphore-bounded pool.
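A minimal sketch of this kind of executor, under stated assumptions: `run_dag`, the `steps`/`deps` dictionaries, and `max_sessions` are illustrative names, not Evoco's real interface. Each step becomes an asyncio task that first awaits its upstream tasks, then takes a slot from a semaphore that stands in for the browser session pool.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical step type: an async callable that receives its dependencies' results.
Step = Callable[[dict], Awaitable[object]]


async def run_dag(
    steps: dict[str, Step],
    deps: dict[str, list[str]],
    max_sessions: int = 3,
) -> dict[str, object]:
    """Run each step once all of its dependencies have finished."""
    pool = asyncio.Semaphore(max_sessions)  # caps concurrently running steps
    tasks: dict[str, asyncio.Task] = {}

    async def run(name: str) -> object:
        # Wait for upstream steps before taking a pool slot, so a waiting
        # step never holds a slot that its own dependency needs.
        inputs = {dep: await tasks[dep] for dep in deps.get(name, [])}
        async with pool:
            return await steps[name](inputs)

    # All tasks are created before any of them starts running, because this
    # loop never yields to the event loop.
    for name in steps:
        tasks[name] = asyncio.create_task(run(name))

    values = await asyncio.gather(*tasks.values())
    return dict(zip(tasks, values))
```

Steps with no dependency path between them run concurrently up to the pool limit, while a chain of dependencies runs in order. The sketch assumes the graph is acyclic; a cycle would leave the tasks waiting on each other forever.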
The frontend is React and TypeScript with a live three-panel layout: command input on the left, a DAG visualization in the center that updates node by node as steps complete, and structured results on the right.
Challenges
Async/sync boundary. Nova Act drives the browser through Playwright using synchronous, blocking calls. Running those calls directly inside an async FastAPI handler blocks the event loop. We solved this by wrapping all Nova Act calls in asyncio.to_thread(), keeping the server responsive while browsers run in thread pool workers.
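The pattern looks roughly like this. `blocking_browser_call` is a hypothetical stand-in for a synchronous Nova Act call (here it just sleeps), and `run_sites` is an illustrative name; only `asyncio.to_thread()` and `asyncio.gather()` are real APIs.

```python
import asyncio
import time


def blocking_browser_call(site: str) -> str:
    """Hypothetical stand-in for a synchronous, blocking browser call."""
    time.sleep(0.05)  # simulates blocking browser work
    return f"done:{site}"


async def run_sites(sites: list[str]) -> list[str]:
    # Each blocking call runs in a worker thread, so the event loop stays
    # free to serve other requests while the "browsers" work.
    return await asyncio.gather(
        *(asyncio.to_thread(blocking_browser_call, site) for site in sites)
    )
```

`asyncio.gather()` returns results in the order the calls were passed in, regardless of which thread finishes first.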
Result reliability. Browser agents return free text that may or may not be valid JSON. We built a 4-strategy fallback parser: schema validation, then json.loads, then regex extraction, then LLM repair via Nova 2 Lite. Even a partially garbled browser response gets recovered rather than dropped.
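One way to read that fallback chain, as a stdlib-only sketch. Everything here is an assumption for illustration: `REQUIRED_KEYS` is a made-up schema, schema validation is reduced to a required-keys check, and `repair` is a hypothetical callable standing in for the Nova 2 Lite repair call.

```python
import json
import re
from typing import Callable, Optional

REQUIRED_KEYS = {"name", "price"}  # hypothetical schema for one product


def _try_json(text: str) -> Optional[dict]:
    """Return the parsed dict, or None if text is not a JSON object."""
    try:
        obj = json.loads(text)
    except json.JSONDecodeError:
        return None
    return obj if isinstance(obj, dict) else None


def parse_result(
    raw: str,
    repair: Optional[Callable[[str], str]] = None,
) -> Optional[tuple[str, dict]]:
    """Return (strategy, data) from the first strategy that succeeds."""
    whole = _try_json(raw)
    # 1. Schema validation: the whole response parses and has every field.
    if whole is not None and REQUIRED_KEYS <= whole.keys():
        return "schema", whole
    # 2. Plain json.loads: valid JSON object, but fields may be missing.
    if whole is not None:
        return "json", whole
    # 3. Regex extraction: pull the outermost {...} span out of free text.
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match:
        embedded = _try_json(match.group(0))
        if embedded is not None:
            return "regex", embedded
    # 4. Repair: ask the (hypothetical) LLM callable to rewrite it as JSON.
    if repair is not None:
        repaired = _try_json(repair(raw))
        if repaired is not None:
            return "repair", repaired
    return None
```

Returning the strategy name alongside the data makes it easy to log how often each fallback fires, which is a cheap signal for how reliable a given site's output is.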
Adaptive re-planning. When a branch fails, re-running the same plan against the same broken site is pointless. Nova 2 Lite receives the failure context and generates an alternative approach with different sites or different search strategies.
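The control flow can be sketched as a bounded retry loop in which the plan changes between attempts. `run_with_replanning`, `execute`, and `replan` are hypothetical names; in Evoco the re-planner would be a Nova 2 Lite call, which is represented here by a plain callable.

```python
from typing import Callable


def run_with_replanning(
    plan: dict,
    execute: Callable[[dict], object],
    replan: Callable[[dict, list], dict],
    max_attempts: int = 3,
) -> object:
    """Execute a plan; on failure, ask the planner for a different plan."""
    failures: list[dict] = []
    for attempt in range(1, max_attempts + 1):
        try:
            return execute(plan)
        except Exception as exc:
            # Keep the full failure history so the planner can avoid
            # repeating an approach that has already failed.
            failures.append({"plan": plan, "error": str(exc)})
            if attempt < max_attempts:
                plan = replan(plan, failures)
    raise RuntimeError(
        f"all {max_attempts} attempts failed; last error: {failures[-1]['error']}"
    )
```

The attempt cap matters: without it, a planner that keeps proposing unreachable sites would loop indefinitely.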
What we learned
Orchestrating multiple AI models with real dependencies between them is an architecture problem as much as an AI problem. The DAG abstraction turned out to be the right primitive. It gives you parallelism, dependency enforcement, and a natural structure to visualize, all from a single design decision.
Built With
- Amazon Nova Act (browser automation)
- Amazon Nova 2 Lite via AWS Bedrock (planning + reasoning)
- Amazon Nova Sonic (voice transcription)
- asyncio
- Docker
- FastAPI
- Grafana
- nginx
- Prometheus
- PyJWT
- Python
- React 19
- React Flow (DAG visualization)
- Redis (task store, result cache, user store)
- Tailwind CSS
- TypeScript
- Vite
- WebSocket