OmniLoop: The Autonomous SWE Interviewer
OmniLoop is a real-time AI interviewer built around a simple belief: interviews are not chat.
They’re a live, high‑bandwidth evaluation loop — voice, interruptions, screens, code, diagrams, tradeoffs, and evidence.
So we built OmniLoop as a bi-directional streaming system, not a “prompt + response” app. The interviewer speaks first, listens in real time, and stays grounded in what the candidate is actually doing on screen. When the question demands it, OmniLoop opens the right surface — a coding IDE or a system design board — and it keeps the evaluation fair by default: probe and assess, don’t teach.
Main highlights (why this is different)
1) Audio BiDi streaming that feels like a real round
Most “voice agents” still behave like turn-based chat. OmniLoop doesn’t.
- continuous microphone PCM streaming → backend → live interviewer
- streaming audio back → browser playback
- interruption handling, so the loop stays natural
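A minimal sketch of the interruption idea, assuming the playback side keeps a queue of interviewer audio chunks: when the candidate starts talking, pending audio is dropped and late chunks from the interrupted response are ignored. `PlaybackBuffer` and its `generation` counter are hypothetical names for illustration, not OmniLoop's actual code.

```python
from collections import deque
from typing import Optional


class PlaybackBuffer:
    """Interviewer audio chunks waiting to be played (hypothetical helper)."""

    def __init__(self) -> None:
        self._chunks = deque()
        # Bumped on every interruption so stale audio can be recognized.
        self.generation = 0

    def enqueue(self, chunk: bytes, generation: int) -> bool:
        """Queue a chunk unless it belongs to a response that was interrupted."""
        if generation != self.generation:
            return False
        self._chunks.append(chunk)
        return True

    def interrupt(self) -> int:
        """Candidate barged in: drop pending audio and return how many chunks were dropped."""
        dropped = len(self._chunks)
        self._chunks.clear()
        self.generation += 1
        return dropped

    def next_chunk(self) -> Optional[bytes]:
        """Return the next chunk to play, or None if nothing is queued."""
        return self._chunks.popleft() if self._chunks else None


buf = PlaybackBuffer()
response_gen = buf.generation
buf.enqueue(b"chunk-1", response_gen)
buf.enqueue(b"chunk-2", response_gen)
first = buf.next_chunk()           # playback starts
dropped = buf.interrupt()          # candidate starts speaking
late = buf.enqueue(b"chunk-3", response_gen)  # stale audio arrives afterwards
```

Tagging audio with a generation number is one simple way to keep late-arriving chunks from resuming a response the candidate already talked over.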
2) Vision streaming (screen share) that adds signal, not noise
OmniLoop sends periodic screenshots from the shared screen as vision input. The key is restraint: no narrating frames, no “we see a cursor moving.” Instead, the interviewer uses vision as background context and brings it up only at natural breaks:
- “We can see you’re in the repo — open the entrypoint and the main service boundary.”
- “Let’s look at how you deploy this — show the config and walk through failure modes.”
It’s closer to how a human interviewer actually uses a shared screen.
3) Resume + role-aware interviews via an auditable RAG pipeline
Generic questions aren’t fair to candidates and aren’t useful for evaluation. OmniLoop ingests:
- resume (PDF → parse → chunk → embed)
- job description (chunk → embed)
Then it exposes explicit tools:
- resume_search(query)
- role_search(query)
When these tools are used, the UI shows citations. That single design choice changes the trust profile:
- claims become checkable
- hallucinations become easier to spot
- “in-context” feels real, not implied
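To make the chunk, embed, search, cite flow concrete, here is a small sketch. It assumes a toy bag-of-words "embedding" in place of a real embedding model, and the citation format (`source#chunk_index`) is a hypothetical choice; only the tool name `resume_search` comes from the description above.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str  # "resume" or "role"
    index: int   # position of the chunk within its source
    text: str


def chunk_text(source: str, text: str, size: int = 8) -> list:
    """Split text into fixed-size word windows (a deliberately naive chunker)."""
    words = text.split()
    return [
        Chunk(source, i // size, " ".join(words[i:i + size]))
        for i in range(0, len(words), size)
    ]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase token counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(chunks: list, query: str, top_k: int = 2) -> list:
    """Return the best-matching chunks, each carrying a citation the UI can display."""
    q = embed(query)
    scored = sorted(
        ((cosine(q, embed(c.text)), c) for c in chunks),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return [
        {"citation": f"{c.source}#{c.index}", "score": round(s, 3), "text": c.text}
        for s, c in scored[:top_k]
        if s > 0
    ]


RESUME_CHUNKS = chunk_text(
    "resume",
    "Built a Redis Streams scoring worker in Python. "
    "Led migration of billing service to Postgres on Cloud SQL.",
)


def resume_search(query: str) -> list:
    return search(RESUME_CHUNKS, query)


hits = resume_search("redis streams")
```

Because every hit carries its own citation, a reviewer can check each grounded claim against the exact chunk it came from.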
4) Tool-driven “interview surfaces” (the right workspace at the right time)
Coding
When a coding question is asked, OmniLoop opens an in‑browser Python IDE (Monaco + Pyodide). Candidates run code locally, see structured pass/fail results, and send code + tests back as evidence.
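The "structured pass/fail results" could look roughly like the sketch below. It is plain Python, so it should also run under Pyodide, but the function name `run_tests`, the result fields, and the use of an unsandboxed `exec` are assumptions made for illustration.

```python
def run_tests(candidate_source: str, func_name: str, cases: list) -> dict:
    """Run candidate code against (args, expected) cases and return structured results."""
    namespace = {}
    try:
        # Not sandboxed: acceptable only because this sketch runs in the candidate's own browser.
        exec(candidate_source, namespace)
    except Exception as exc:
        return {"passed": 0, "failed": len(cases),
                "error": f"{type(exc).__name__}: {exc}", "results": []}

    func = namespace.get(func_name)
    if not callable(func):
        return {"passed": 0, "failed": len(cases),
                "error": f"function {func_name!r} not defined", "results": []}

    results = []
    for args, expected in cases:
        entry = {"args": list(args), "expected": expected}
        try:
            entry["actual"] = func(*args)
            entry["passed"] = entry["actual"] == expected
        except Exception as exc:
            # A crash counts as a failed case, with the error kept as evidence.
            entry["error"] = f"{type(exc).__name__}: {exc}"
            entry["passed"] = False
        results.append(entry)

    passed = sum(1 for r in results if r["passed"])
    return {"passed": passed, "failed": len(results) - passed,
            "error": None, "results": results}


source = "def add(a, b):\n    return a + b\n"
report = run_tests(source, "add", [((1, 2), 3), ((2, 2), 5), (("x", 1), None)])
broken = run_tests("def add(a, b) return a + b", "add", [((1, 2), 3)])
```

A report in this shape can be sent back alongside the code itself, so the interviewer sees which cases failed and why rather than a bare pass count.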
System design
For architecture questions, OmniLoop opens a drawing board (Excalidraw). It sends:
- a snapshot (so the model can see the diagram)
- a structured scene summary (nodes / flows / boundaries) so reasoning isn’t dependent on pixels
This makes system design legible to the model and reviewable for humans.
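A rough sketch of how a structured scene summary might be derived from board elements. The field names (`type`, `containerId`, `startBinding`, `endBinding`, `elementId`) follow Excalidraw's exported JSON as we understand it and should be treated as assumptions; `summarize_scene` and the output shape are hypothetical, and boundaries are left out to keep the example short.

```python
SHAPE_TYPES = {"rectangle", "ellipse", "diamond"}


def summarize_scene(elements: list) -> dict:
    """Reduce drawing elements to labeled nodes and directed flows."""
    # Text bound to a container (shape or arrow) acts as that element's label.
    labels = {
        e["containerId"]: e["text"]
        for e in elements
        if e.get("type") == "text" and e.get("containerId")
    }
    # Unlabeled shapes fall back to their element id.
    nodes = {
        e["id"]: labels.get(e["id"], e["id"])
        for e in elements
        if e.get("type") in SHAPE_TYPES
    }
    flows = []
    for e in elements:
        if e.get("type") != "arrow":
            continue
        start = (e.get("startBinding") or {}).get("elementId")
        end = (e.get("endBinding") or {}).get("elementId")
        # Skip arrows that are not attached to a shape at both ends.
        if start in nodes and end in nodes:
            flows.append({"from": nodes[start], "to": nodes[end],
                          "label": labels.get(e["id"], "")})
    return {"nodes": sorted(nodes.values()), "flows": flows}


scene = [
    {"id": "a", "type": "rectangle"},
    {"id": "b", "type": "ellipse"},
    {"id": "t1", "type": "text", "containerId": "a", "text": "API Gateway"},
    {"id": "t2", "type": "text", "containerId": "b", "text": "Postgres"},
    {"id": "f1", "type": "arrow",
     "startBinding": {"elementId": "a"}, "endBinding": {"elementId": "b"}},
    {"id": "t3", "type": "text", "containerId": "f1", "text": "writes"},
    {"id": "f2", "type": "arrow", "startBinding": None,
     "endBinding": {"elementId": "b"}},
]
summary = summarize_scene(scene)
```

A summary like this gives the model named components and connections to reason about even when the snapshot itself is cluttered or hard to read.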
5) A strict evaluator stance (fairness-first)
One of the biggest problems with AI interviewers is accidental teaching. OmniLoop is strict by default:
- ask one question at a time
- probe tradeoffs, correctness, and reasoning
- do not reveal solutions
- explain only if the candidate explicitly asks or clearly says “I don’t know”
That preserves evaluation signal, which is the whole point.
6) Async scoring pipeline with Redis Streams
The interview loop must stay low-latency. Scoring can be async. We separate them:
- live interviewer session (real-time)
- scoring worker (async) via Redis Streams
The worker returns strict JSON (rubric dimensions + gates + decision band), and the frontend displays a rolling decision panel in real time.
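One way to enforce "strict JSON" on the worker's output is to validate it before it reaches the decision panel. The sketch below is an assumption about what that could look like: the key names, the 1 to 5 scale, and the band labels are all hypothetical, and the Redis Streams transport is omitted so the example stays self-contained.

```python
import json

# Hypothetical band labels and payload keys; the real rubric may differ.
DECISION_BANDS = ("strong_no", "no", "lean_no", "lean_yes", "yes", "strong_yes")
REQUIRED_KEYS = {"dimensions", "gates", "decision_band"}


def parse_score(raw: str) -> dict:
    """Parse a worker message and raise ValueError on anything off-schema."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict) or set(data) != REQUIRED_KEYS:
        raise ValueError("expected exactly the keys: " + ", ".join(sorted(REQUIRED_KEYS)))

    dims = data["dimensions"]
    if not isinstance(dims, dict) or not dims:
        raise ValueError("dimensions must be a non-empty object")
    for name, value in dims.items():
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or not 1 <= value <= 5:
            raise ValueError(f"dimension {name!r} must be an integer from 1 to 5")

    gates = data["gates"]
    if not isinstance(gates, dict) or not all(isinstance(v, bool) for v in gates.values()):
        raise ValueError("gates must map names to booleans")

    if data["decision_band"] not in DECISION_BANDS:
        raise ValueError("unknown decision band")
    return data


def is_valid_score(raw: str) -> bool:
    """Convenience wrapper for callers that only need a yes/no answer."""
    try:
        parse_score(raw)
        return True
    except ValueError:
        return False


good = ('{"dimensions": {"correctness": 4, "communication": 3}, '
        '"gates": {"ran_tests": true}, "decision_band": "lean_yes"}')
score = parse_score(good)
```

Rejecting off-schema messages at this boundary keeps a malformed model response from silently changing the rolling decision state.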
7) “Internal dataset” trail: events + artifacts you can learn from
If you want enterprise-grade systems, you need more than a transcript. OmniLoop records:
- transcript events
- tool calls / grounding events
- board updates (summary + optional scene)
- code submissions + test results
- screenshots/artifacts
- scores + decisions
This becomes the foundation for continuous improvement: calibration, rubric tuning, evaluation replay, and building internal datasets without guessing what happened.
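As an illustration of such a trail, the sketch below models an append-only event log that can be exported and replayed. The class names, event kinds, and JSONL export are hypothetical; the actual system persists to its own storage, which is not shown here.

```python
import json
import time
from dataclasses import asdict, dataclass, field


@dataclass
class InterviewEvent:
    session_id: str
    seq: int       # monotonically increasing position within the session
    kind: str      # e.g. "transcript", "tool_call", "code_submission", "score"
    payload: dict
    ts: float = field(default_factory=time.time)


class EventLog:
    """In-memory, append-only log for one interview session."""

    def __init__(self, session_id: str) -> None:
        self.session_id = session_id
        self._events = []

    def append(self, kind: str, payload: dict) -> InterviewEvent:
        event = InterviewEvent(self.session_id, len(self._events), kind, payload)
        self._events.append(event)
        return event

    def replay(self, kinds=None) -> list:
        """Return events in original order, optionally filtered by kind."""
        return [e for e in self._events if kinds is None or e.kind in kinds]

    def to_jsonl(self) -> str:
        """Serialize one JSON object per line, suitable for building a dataset."""
        return "\n".join(json.dumps(asdict(e), sort_keys=True) for e in self._events)


log = EventLog("session-1")
log.append("transcript", {"speaker": "interviewer", "text": "Walk me through your design."})
log.append("code_submission", {"passed": 3, "failed": 1})
log.append("score", {"decision_band": "lean_yes"})
lines = log.to_jsonl().splitlines()
```

Keeping a sequence number on every event is what makes evaluation replay possible: the same session can be fed back through a revised rubric in the original order.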
8) Enterprise report output
At the end of an interview, OmniLoop generates a PDF report and can email it to an admin address — not as a gimmick, but as a closing loop: the interview produces something reviewable.
End-to-end flow (what the user experiences)
Landing → Setup
Configure role/level/focus/company. Upload resume + job description/context.
Interview
Live audio BiDi stream + screen share vision stream. Interviewer initiates and runs a strict, question-driven loop.
Auto-surfaces
Coding question → IDE opens.
System design question → architecture board opens.
Scoring + decision panel
Worker scores answers asynchronously; UI shows rolling decision state.
Finalize
Persist artifacts/events + generate PDF report + optional email.
What we learned building it
The model is powerful, but the product is in the orchestration: low-latency streaming, interruption handling, clean tool flows, auditable grounding, and a persistence trail that turns “a conversation” into “an evaluable system.”
OmniLoop is our bet that the future of interview prep (and eventually evaluation) is not a smarter chat box — it’s a live, multimodal loop with evidence.
Built With
- canvas-api
- cloud-sql-(postgres)
- docker
- docker-compose
- excalidraw
- fastapi
- google-cloud-run
- google-cloud-storage-(gcs)
- google-cloud-vertex-ai-(gemini-live-api)
- google-genai-sdk
- javascript
- monaco-editor
- psycopg
- pyodide
- python
- react
- react-router
- redis-streams
- reportlab
- smtp
- sqlalchemy
- uvicorn
- vite
- web-audio-api
- webrtc
- websockets