Asist0 - Agentic Workspace

A voice-first AI assistant built with Google ADK (Agent Development Kit) and the Gemini Live API. It provides real-time bidirectional voice streaming, long-term memory across sessions, a file manager with floating editor windows, user-extensible agent skills, image generation and editing, and Firebase-backed storage, all deployed on Google Cloud.

Architecture

Browser (React)                        Cloud Run (FastAPI + ADK)
+---------------------------+          +------------------------------------+
| File Manager (SVAR)       |          |  WebSocket Handler                 |
| Floating Windows (WinBox) |          |  ADK Runner (run_live)             |
| CodeMirror 6 Editor       |          |                                    |
| PDF Viewer (react-pdf)    |          |  Voice Agent                       |
| WebGL Orb (72px, center)  |--WS----->|  gemini-live-2.5-flash-native-audio|
| Audio I/O (16kHz/24kHz)   |<-WS------|  + PreloadMemoryTool (long-term)   |
| Firestore Realtime Sync   |          |  + SkillToolset (user skills)      |
+---------------------------+          |  + File Tools (13 functions)       |
        |                              |  + Image Tools (generate + edit)   |
        | REST (server fns)            |  + AgentEngineSandboxCodeExecutor  |
        v                              +-------------|----------------------+
+---------------------------+                        |
| TanStack Start (Nitro)    |          +-------------|----------------------+
| Server Functions (proxy)  |--REST--->| Firebase Storage    | Firestore   |
+---------------------------+          | gs://bucket/users/  | users/{uid} |
                                       | {uid}/{path...}     | /files/{id} |
                                       +------------------------------------+

See ARCHITECTURE.md for the full system design.
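
The storage layout in the diagram (`gs://bucket/users/{uid}/{path...}` for blobs, `users/{uid}/files/{id}` for metadata) can be illustrated with a small path-building sketch. The function names and the `bucket` placeholder are hypothetical and are not taken from `storage_ops.py`; the normalization step is an assumption about how one might keep a path inside the user's prefix.

```python
from posixpath import normpath

BUCKET = "bucket"  # placeholder bucket name, as written in the diagram


def storage_uri(uid: str, path: str, bucket: str = BUCKET) -> str:
    """Build the gs:// URI for a user's file (hypothetical helper).

    The path is normalized against a virtual root so that ".." segments
    cannot climb out of the users/{uid}/ prefix.
    """
    clean = normpath("/" + path).lstrip("/")
    if not clean:
        raise ValueError("path must name a file or folder")
    return f"gs://{bucket}/users/{uid}/{clean}"


def firestore_doc_path(uid: str, file_id: str) -> str:
    """Build the Firestore document path for a file's metadata (hypothetical helper)."""
    return f"users/{uid}/files/{file_id}"
```

For example, `storage_uri("u1", "notes/a.md")` yields `gs://bucket/users/u1/notes/a.md`, and a path such as `../u2/secret.txt` stays under `users/u1/`.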

ADK Features Used

Feature	How It's Used
Gemini Live API (run_live)	Bidirectional audio streaming with native voice model
PreloadMemoryTool	Loads relevant memories from past sessions at the start of each turn
VertexAiMemoryBankService	Persists session conversations to long-term memory as turns complete
SkillToolset	User-defined skills loaded from Firebase Storage at session start
AgentEngineSandboxCodeExecutor	Sandboxed execution of skill scripts (.py, .sh)
FunctionTool (closures)	13 file and image tools, including list, read, write, create, delete, rename, move, copy, search, info, generate_image, and edit_image
VertexAiSessionService	Per-connection sessions backed by Agent Engine
SessionResumptionConfig	ADK transparently handles the ~10-minute Live API connection timeouts
ContextWindowCompressionConfig	Allows unlimited session duration (compression triggers at 100k tokens and trims context to 80k)
RunConfig	BIDI streaming, audio modality, transcription, resumption, compression
Event filtering	Backend strips tool internals, forwards only user-facing data
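
The compression thresholds in the table (trigger at 100k tokens, compress to 80k) follow a sliding-window pattern. The sketch below is only an illustration of that idea in plain Python: the actual compression is performed by the Live API, and the `compress` function and the `(turn_id, token_count)` representation are assumptions made for this example.

```python
from collections import deque

TRIGGER_TOKENS = 100_000  # start compressing once the context exceeds this
TARGET_TOKENS = 80_000    # drop oldest turns until the context fits in this


def compress(
    turns: deque[tuple[str, int]],
    trigger: int = TRIGGER_TOKENS,
    target: int = TARGET_TOKENS,
) -> int:
    """Trim the oldest turns in place and return the remaining token count.

    `turns` holds (turn_id, token_count) pairs, oldest first. Nothing is
    removed until the total exceeds `trigger`; after that, turns are dropped
    from the front until the total is at or below `target`.
    """
    total = sum(tokens for _, tokens in turns)
    if total <= trigger:
        return total
    while turns and total > target:
        _, tokens = turns.popleft()
        total -= tokens
    return total
```

The gap between the trigger and the target means compression does not fire again on the very next turn.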

Tech Stack

Layer	Technology
Agent Framework	Google ADK (Python)
Voice Model	`gemini-live-2.5-flash-native-audio` (Vertex AI, `us-central1` only)
Image Model	`gemini-2.5-flash-image` (Vertex AI, `us-central1`)
Backend	FastAPI + Uvicorn
Frontend	TanStack Start (React 19) + Bun + Tailwind v4
File Manager	`@svar-ui/react-filemanager` + WillowDark theme
Editor Windows	WinBox.js (custom React wrapper)
Code Editor	CodeMirror 6 + oneDark theme
PDF Viewer	react-pdf
Auth	Firebase Authentication (Google Sign-In)
Storage	Firebase Storage (blobs) + Firestore (metadata, realtime)
Sessions	Vertex AI Agent Engine (Session Service + Memory Bank)
Infrastructure	Pulumi (Python, local state), Docker, Cloud Run
Domains	`asist0.com`, `api.asist0.com`

Project Structure

asisto/
├── asisto_agent/                    # ADK agent package
│   ├── agent.py                     # Voice agent + create_agent() factory with PreloadMemoryTool + SkillToolset
│   └── __init__.py
├── app/                             # Frontend (TanStack Start + Bun)
│   ├── src/
│   │   ├── lib/
│   │   │   ├── api.ts              # Server functions (file CRUD, workspace, download)
│   │   │   ├── firebase.ts         # Firebase init + Firestore export
│   │   │   ├── useAuth.tsx         # AuthProvider + useAuth hook
│   │   │   ├── useFiles.ts         # Firestore realtime subscription (onSnapshot)
│   │   │   ├── useWorkspace.ts     # Workspace layout save/restore via server fns
│   │   │   ├── useAgentSocket.ts   # Always-on WebSocket (auto-reconnect, exp backoff)
│   │   │   ├── useAudioCapture.ts  # Mic capture (AudioWorklet, 16kHz PCM)
│   │   │   └── useAudioPlayback.ts # Speaker playback (pcm-player, 24kHz)
│   │   ├── components/
│   │   │   ├── Window.tsx          # WinBox.js React wrapper (dynamic import for SSR)
│   │   │   ├── FileViewer.tsx      # CodeMirror + markdown preview + image + PDF viewer
│   │   │   └── Orb.tsx            # WebGL orb (ogl, 72px bottom-center)
│   │   ├── routes/
│   │   │   ├── index.tsx          # Landing / sign-in
│   │   │   └── app/
│   │   │       └── index.tsx      # Main workspace: file manager + floating windows
│   │   └── styles.css             # SVAR vars, Tailwind fixes, WinBox dark theme
│   └── public/
│       └── capture-processor.js   # AudioWorklet (float32 → int16)
├── infra/                          # Pulumi IaC (Python)
│   └── __main__.py                # Cloud Run, Artifact Registry, IAM, domain mappings
├── main.py                         # FastAPI — WebSocket, file REST, workspace REST, auth
├── storage_ops.py                  # Firebase Storage + Firestore CRUD + workspace layout
├── skill_loader.py                 # Reads user skills from Storage, parses SKILL.md
├── agent_tools.py                  # 13 tool closures: file ops + image generation/editing
├── config.yaml                     # Central config (project, region, bucket, engine ID)
├── firebase.json                   # Points to firestore.rules + storage.rules
├── firestore.rules                 # users/{userId}/files + workspace — auth.uid == userId
├── storage.rules                   # users/{userId}/{allPaths=**} — auth.uid == userId
├── Dockerfile                      # Backend container
├── Makefile                        # Build, deploy, dev commands
├── pyproject.toml                  # Python dependencies (uv)
├── ARCHITECTURE.md                 # System design deep dive
├── DEPLOYMENT.md                   # Full deployment guide
└── LICENSE                         # MIT
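
The tree lists `capture-processor.js` as an AudioWorklet that converts float32 samples to int16 for the 16kHz PCM capture path. As a rough illustration of that conversion, here is a minimal Python sketch. The function name, the little-endian byte order, and the asymmetric scaling are assumptions for this example, not a transcription of the worklet's code.

```python
import struct
from typing import Iterable


def float32_to_int16_pcm(samples: Iterable[float]) -> bytes:
    """Convert float samples in [-1.0, 1.0] to little-endian 16-bit PCM bytes.

    Out-of-range samples are clamped first. Negative values scale by 32768
    and non-negative values by 32767 so both ends of the int16 range are
    reachable without overflow.
    """
    out = bytearray()
    for sample in samples:
        clamped = max(-1.0, min(1.0, sample))
        scale = 32768 if clamped < 0 else 32767
        out += struct.pack("<h", int(clamped * scale))
    return bytes(out)
```

Each input sample becomes two bytes, so one second of 16kHz mono audio produces 32,000 bytes.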

Quick Start

Prerequisites

Python 3.13+ with uv
Bun
gcloud CLI (authenticated)
Google Cloud project with Vertex AI + Firebase enabled

Local Development

# Install dependencies
uv sync
cd app && bun install && cd ..

# Configure
cp config.yaml.example config.yaml  # Edit with your project details
cp app/.env.example app/.env

# Deploy Firebase security rules
make deploy-rules

# Run backend (8080) + frontend (3000)
make dev

Make Commands

make dev             Run backend + frontend concurrently
make dev-api         Backend only
make dev-app         Frontend only
make deploy-agent    Deploy agent to Agent Engine
make deploy-infra    Deploy to Cloud Run via Pulumi
make deploy-rules    Deploy Firebase security rules
make deploy-all      Deploy agent + infrastructure
make logs            View backend logs
make status          Check deployment status