Blink (Terminal)

Canvas view terminal
Integrated browser view
Shell view with ChatGPT
Money view
Requirements view
Integrated memory
Integrated research
Requirements in team view
Memory in team view

Inspiration

Modern development means juggling a dozen terminal tabs, context-switching between AI coding tools, and manually stitching together the work they produce. We were inspired by infinite-canvas tools like CNVS and by the rise of CLI coding agents (Codex, Claude Code): what if the terminal stopped being a flat list of tabs and became a living workspace you could talk to — where named agents work side by side and their output integrates itself?

What it does

Blink is an intelligent desktop terminal for coders and "vibecoders." Instead of managing terminals, you manage intent.

Infinite canvas of dark-glass terminal windows, each inhabited by a named coding agent (e.g. "Jarvis," "Friday") running a real, fully interactive xterm.js session.
Natural-language command bar: you describe what you want, and an orchestrator translates each command into a single action.
Invisible multi-agent integration: agents work in parallel on separate git worktrees and their work auto-merges, with a dedicated integrator agent resolving conflicts. No more babysitting branches.
Built-in sections: Requirements (cards on the canvas + AI document import), Screenshots (automatic watcher on the macOS screenshot folder), live app Preview, agent monitoring, and a Research section.
Hands-free control: webcam + MediaPipe gesture recognition — move the cursor, pinch to click, swipe between sections.
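
As an illustration of the "pinch to click" gesture, here is a minimal sketch of pinch detection over hand landmarks. It assumes each hand arrives as 21 points with normalized x/y coordinates, where index 4 is the thumb tip and index 8 is the index fingertip (the MediaPipe hand landmark convention). The PinchDetector class and both thresholds are hypothetical and would need tuning; this is not Blink's actual code.

```typescript
// Assumption: x and y are normalized to [0, 1], as MediaPipe reports them.
interface Landmark {
  x: number;
  y: number;
}

const THUMB_TIP = 4; // MediaPipe hand landmark index for the thumb tip
const INDEX_TIP = 8; // MediaPipe hand landmark index for the index fingertip

// Hypothetical thresholds. Two values give hysteresis, so a hand hovering
// near the boundary does not fire a burst of clicks.
const PINCH_START = 0.05;
const PINCH_END = 0.08;

class PinchDetector {
  private pinching = false;

  // Returns "click" on the frame where a pinch begins, otherwise null.
  update(hand: Landmark[]): "click" | null {
    const thumb = hand[THUMB_TIP];
    const index = hand[INDEX_TIP];
    if (!thumb || !index) return null; // incomplete detection, ignore frame

    const distance = Math.hypot(thumb.x - index.x, thumb.y - index.y);

    if (!this.pinching && distance < PINCH_START) {
      this.pinching = true;
      return "click";
    }
    if (this.pinching && distance > PINCH_END) {
      this.pinching = false; // fingers separated, so the next pinch can click
    }
    return null;
  }
}
```

Calling update once per video frame yields one click per pinch, however long the fingers stay together.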

How we built it

Blink is an Electron + React + TypeScript desktop app (Vite, Zustand, Tailwind), macOS-first. The architecture cleanly separates the renderer (canvas, command bar, dock, sections) from the Node main process through typed IPC.
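
One way to picture typed IPC is a single contract type shared by both processes. The sketch below uses an in-memory stand-in so that it runs on its own. In an Electron app, handle would typically wrap ipcMain.handle and invoke would wrap ipcRenderer.invoke behind a preload bridge. The channel names, payload shapes, and the TypedBus class are all hypothetical, not taken from Blink's source.

```typescript
// Hypothetical channels: each maps a request shape to a response shape.
type IpcContract = {
  "agent:spawn": {
    request: { name: string; cwd: string };
    response: { sessionId: string };
  };
  "agent:status": {
    request: { sessionId: string };
    response: { state: "idle" | "working" | "done" };
  };
};

type Channel = keyof IpcContract;
type Req<C extends Channel> = IpcContract[C]["request"];
type Res<C extends Channel> = IpcContract[C]["response"];
type Handler<C extends Channel> = (req: Req<C>) => Promise<Res<C>>;
type AnyHandler = (req: unknown) => Promise<unknown>;

// In-memory stand-in for the renderer/main boundary.
class TypedBus {
  private handlers = new Map<Channel, AnyHandler>();

  // Main-process side: register the implementation for one channel.
  handle<C extends Channel>(channel: C, handler: Handler<C>): void {
    this.handlers.set(channel, handler as unknown as AnyHandler);
  }

  // Renderer side: the compiler checks the payload against the contract.
  async invoke<C extends Channel>(channel: C, req: Req<C>): Promise<Res<C>> {
    const handler = this.handlers.get(channel);
    if (!handler) throw new Error(`No handler registered for ${channel}`);
    return (await handler(req)) as Res<C>;
  }
}
```

With this shape, a call such as invoke("agent:spawn", { name: "Jarvis" }) fails to compile because cwd is missing, so contract drift between renderer and main shows up at build time rather than at runtime.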

The heart is an isolated Agent Engine (PtyManager, SessionMonitor, WorktreeManager) that spawns Codex CLI sessions inside embedded PTYs via node-pty. Agent status is read non-invasively by polling the session files Codex writes to disk, never by intercepting the PTY. The orchestrator calls the OpenAI API with function calling. It never executes anything itself; it only turns language into structured calls that the app runs. Project state (requirements, canvas layout, assignments, usage) persists as lightweight JSON.
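
The polling approach can be sketched as an incremental reader that remembers how far it has read and only parses what was appended since the last poll. This sketch assumes the session file is append-only, newline-delimited JSON, which is an assumption about the format rather than something stated here. The SessionTail class, the SessionEvent shape, and the injected readFrom function are hypothetical. In a real app readFrom would wrap a file read starting at a byte offset.

```typescript
// Hypothetical record shape, not a documented Codex format.
interface SessionEvent {
  type: string;
  [key: string]: unknown;
}

// Returns everything in the file at or after `offset`.
type ReadFrom = (offset: number) => string;

class SessionTail {
  private readFrom: ReadFrom;
  private offset = 0; // how much of the file has been consumed so far
  private partial = ""; // trailing text whose newline has not arrived yet

  constructor(readFrom: ReadFrom) {
    this.readFrom = readFrom;
  }

  // Call on each tick of the polling loop; returns only newly completed events.
  poll(): SessionEvent[] {
    const chunk = this.readFrom(this.offset);
    this.offset += chunk.length;

    const lines = (this.partial + chunk).split("\n");
    // The last piece is incomplete until the writer finishes the line.
    this.partial = lines.pop() ?? "";

    const events: SessionEvent[] = [];
    for (const line of lines) {
      if (line.trim() === "") continue;
      try {
        events.push(JSON.parse(line) as SessionEvent);
      } catch {
        // Skip a malformed line instead of crashing the monitor.
      }
    }
    return events;
  }
}
```

Buffering the unfinished last line is what keeps this safe when a poll lands in the middle of a write.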

Challenges we ran into

Embedding real PTYs in Electron (node-pty native rebuilds across Electron versions) and keeping terminals responsive on a free-floating canvas.
Making multi-agent integration truly "invisible" — orchestrating git worktrees and automatic merges without agents stepping on each other.
Tracking agent state reliably without hijacking the terminal, by reading growing session files on a polling loop.
Wiring webcam gesture control (MediaPipe + local WASM/model assets) into the UI smoothly enough to actually be usable.
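
To make the worktree orchestration above concrete, here is a rough sketch of the git invocations that a one-worktree-per-agent setup implies. It only builds argument lists and leaves execution to the caller (for example via Node's child_process.execFile). The branch naming scheme, the .worktrees directory, and all function names are assumptions for illustration, not Blink's actual layout.

```typescript
// Hypothetical layout: one branch and one directory per agent.
interface AgentWorkspace {
  agent: string;
  branch: string;
  path: string;
}

type GitArgs = string[]; // arguments to pass to the `git` executable

function planWorkspace(repoRoot: string, agent: string): AgentWorkspace {
  const slug = agent.toLowerCase().replace(/[^a-z0-9]+/g, "-");
  return {
    agent,
    branch: `agent/${slug}`,
    path: `${repoRoot}/.worktrees/${slug}`,
  };
}

// `git worktree add -b <branch> <path> <base>` creates a new branch and checks
// it out in its own directory, so agents never share a working tree.
function createWorktreeArgs(ws: AgentWorkspace, base: string): GitArgs {
  return ["worktree", "add", "-b", ws.branch, ws.path, base];
}

// Run from the integration checkout. A non-zero exit signals a conflict,
// which is where an integrator step would take over.
function mergeArgs(ws: AgentWorkspace): GitArgs {
  return ["merge", "--no-ff", "--no-edit", ws.branch];
}

// Once the branch is merged, the agent's worktree can be cleaned up.
function removeWorktreeArgs(ws: AgentWorkspace): GitArgs {
  return ["worktree", "remove", ws.path];
}
```

Keeping the planning pure makes it straightforward to test the naming and ordering without touching a real repository.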

Accomplishments that we're proud of

A working infinite canvas with live, interactive agent terminals.
A natural-language command bar backed by a clean "one command = one action" orchestration model.
Automatic, conflict-aware multi-agent merging that mostly stays out of your way.
Hands-free gesture navigation as a genuine, fun control mode — not just a demo.

What we learned

How to architect an Electron app for clean isolation (typed IPC, a self-contained agent engine that could later become a standalone daemon), how to monitor CLI agents without interfering with them, and how to design AI orchestration where the model decides what to do while the app stays firmly in control of execution.
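
The "model decides, app executes" split can be sketched as a small gate between the model's output and the app. With function calling, the model returns a function name plus its arguments as a JSON string. The app then checks the name against an allow-list and validates the arguments before anything runs. The action names, the Action type, and toAction below are hypothetical and only illustrate the pattern.

```typescript
// One function call as returned by the model: a name plus JSON-encoded arguments.
interface ToolCall {
  name: string;
  arguments: string;
}

// Hypothetical app-side actions that the renderer or main process would run.
type Action =
  | { kind: "spawnAgent"; name: string }
  | { kind: "openSection"; section: string };

type Validator = (args: Record<string, unknown>) => Action | null;

// Allow-list: any function name missing from this map is rejected.
const validators = new Map<string, Validator>([
  [
    "spawn_agent",
    (args) => {
      const name = args["name"];
      return typeof name === "string" && name.length > 0
        ? { kind: "spawnAgent", name }
        : null;
    },
  ],
  [
    "open_section",
    (args) => {
      const section = args["section"];
      return typeof section === "string"
        ? { kind: "openSection", section }
        : null;
    },
  ],
]);

// Turns one model-proposed call into one validated action, or null.
function toAction(call: ToolCall): Action | null {
  const validate = validators.get(call.name);
  if (!validate) return null; // unknown function: never executed

  let parsed: unknown;
  try {
    parsed = JSON.parse(call.arguments);
  } catch {
    return null; // the model produced invalid JSON
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return null;
  }
  return validate(parsed as Record<string, unknown>);
}
```

Because toAction returns data rather than performing side effects, the app keeps the final say on whether and how each action runs.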