Visor — Your AI Desktop Guide

Inspiration

Modern computers are powerful—but navigating them isn’t.
Many people, especially beginners, older adults, and non-technical users, struggle with essential tasks like:

Creating accounts
Navigating system settings
Installing software
Managing files

Traditional chatbots only provide text answers.
Tutorial videos aren’t interactive.
And none of these solutions respond to your actual screen.

We wanted something fundamentally different:

An AI that sees your screen, understands your intent, and literally points to what you need to click—in real time.

That became Visor.

What It Does

Visor is a desktop assistant that visually guides users through tasks on their computer.
It analyzes screenshots, infers what the user is trying to do, and draws arrows, circles, and tooltips directly on the screen.

Visor has four core features:

1. Real-time Visual Guidance

Visor:

Captures a live desktop screenshot
Sends it (plus the user’s goal) to an LLM via OpenRouter
Receives structured instructions
Draws an on-screen overlay (circle/arrow/box) to highlight what to click

The user follows the guidance, then presses Done to move to the next step.
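
The request and parse steps above could look roughly like the sketch below. It is not Visor's actual code: it assumes OpenRouter's OpenAI-compatible chat completions endpoint with the screenshot passed as a base64 data URL, and the `OverlayInstruction` shape, the prompt wording, the function names, and the model name are all hypothetical.

```typescript
// Hypothetical shape of one guidance step; the real schema is not given in this write-up.
type Shape = "circle" | "arrow" | "box";

interface OverlayInstruction {
  shape: Shape;
  x: number; // target center in screen pixels
  y: number;
  label: string; // tooltip text
}

// OpenRouter exposes an OpenAI-compatible chat completions endpoint.
const OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions";

// Build the URL and JSON body for one guidance request (the model name is a placeholder).
function buildGuidanceRequest(goal: string, screenshotPngBase64: string, model = "openai/gpt-4o") {
  return {
    url: OPENROUTER_URL,
    body: {
      model,
      messages: [
        {
          role: "system",
          content: 'Reply with JSON only: {"shape":"circle"|"arrow"|"box","x":number,"y":number,"label":string}',
        },
        {
          role: "user",
          content: [
            { type: "text", text: `Goal: ${goal}` },
            { type: "image_url", image_url: { url: `data:image/png;base64,${screenshotPngBase64}` } },
          ],
        },
      ],
    },
  };
}

// Validate the model's reply before drawing anything; returns null if it is malformed.
function parseInstruction(raw: string): OverlayInstruction | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch (_err) {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const d = data as Record<string, unknown>;
  const shapes: Shape[] = ["circle", "arrow", "box"];
  if (typeof d.shape !== "string" || shapes.indexOf(d.shape as Shape) === -1) return null;
  if (typeof d.x !== "number" || typeof d.y !== "number" || typeof d.label !== "string") return null;
  return { shape: d.shape as Shape, x: d.x, y: d.y, label: d.label };
}
```

The body would be sent as a POST with an `Authorization: Bearer <key>` header. Validating the reply first means a malformed answer can be retried instead of producing a misplaced highlight.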

2. Conversational AI That Understands Tasks

Users can describe tasks naturally, such as:

“Help me find my CompArch folder.”
“How do I change my display resolution?”
“Guide me through creating a Google account.”

Visor interprets the goal and generates a step-by-step workflow dynamically.

3. Automatic Multi-Step Progression

After each user action:

Visor detects UI changes via screenshot differences
Determines the next step automatically
Continues guiding until the task is complete

No manual setup. No pre-scripted workflows.
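
One simple way to detect a UI change from screenshot differences is to count how many pixels changed between two frames. The sketch below assumes both frames are raw buffers of the same size with 4 bytes per pixel; the function names, the per-pixel tolerance, and the 1% cutoff are illustrative choices, not values from Visor.

```typescript
// Fraction of pixels whose first three channels differ by more than `tolerance`
// (summed absolute difference) between two same-sized 4-byte-per-pixel frames.
function changedFraction(prev: Uint8Array, next: Uint8Array, tolerance = 16): number {
  if (prev.length !== next.length || prev.length % 4 !== 0) {
    throw new Error("frames must be 4-byte-per-pixel buffers of equal size");
  }
  const pixels = prev.length / 4;
  if (pixels === 0) return 0;
  let changed = 0;
  for (let i = 0; i < prev.length; i += 4) {
    const delta =
      Math.abs(prev[i] - next[i]) +
      Math.abs(prev[i + 1] - next[i + 1]) +
      Math.abs(prev[i + 2] - next[i + 2]);
    if (delta > tolerance) changed++;
  }
  return changed / pixels;
}

// Treat the UI as changed once more than `minFraction` of pixels differ (arbitrary cutoff).
function uiChanged(prev: Uint8Array, next: Uint8Array, minFraction = 0.01): boolean {
  return changedFraction(prev, next) > minFraction;
}
```

A small tolerance keeps cursor blinks and anti-aliasing noise from triggering a new step. A change that passes the cutoff would prompt a fresh screenshot and request for the next instruction.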

4. Visual Overlay Engine

A cross-platform floating overlay that:

Sits above all applications
Renders transparent, click-through arrows and highlights
Updates based on screen changes
Never blocks user interactions
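
In Electron, an overlay with these properties is typically a frameless, transparent, always-on-top window that ignores mouse events. The sketch below only builds the options object so it can run without Electron; the helper name `overlayWindowOptions` is hypothetical, while the option keys are standard `BrowserWindow` constructor options.

```typescript
interface Bounds {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Options for a hypothetical full-display overlay window.
function overlayWindowOptions(display: Bounds) {
  return {
    x: display.x,
    y: display.y,
    width: display.width, // cover the whole display
    height: display.height,
    transparent: true, // see-through background
    frame: false, // no title bar or borders
    alwaysOnTop: true, // float above other windows
    skipTaskbar: true, // keep it out of the taskbar where supported
    focusable: false, // never steal keyboard focus
    resizable: false,
    hasShadow: false,
  };
}

// In the Electron main process (not run here), this would be used roughly as:
//   const win = new BrowserWindow(overlayWindowOptions(screen.getPrimaryDisplay().bounds));
//   win.setIgnoreMouseEvents(true); // clicks pass through to the app underneath
```

`setIgnoreMouseEvents(true)` is what makes the window click-through, which is how the overlay avoids blocking user interactions.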

How We Built It

Frontend & Overlay

Electron handles desktop packaging, global hotkeys, and multi-window rendering
React + TypeScript power the chatbox and UI
HTML Canvas draws precise shapes and highlights over the screen
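
As an illustration of the Canvas drawing, here is a minimal sketch of an arrow and a circle highlight. The `PathContext` interface is a hypothetical subset of `CanvasRenderingContext2D` so the code can run without a DOM; the function names and default sizes are also assumptions.

```typescript
interface Point {
  x: number;
  y: number;
}

// The subset of CanvasRenderingContext2D methods used below.
interface PathContext {
  beginPath(): void;
  moveTo(x: number, y: number): void;
  lineTo(x: number, y: number): void;
  arc(x: number, y: number, radius: number, startAngle: number, endAngle: number): void;
  stroke(): void;
}

// Two base points of an arrowhead whose tip is at `to`.
function arrowHead(from: Point, to: Point, size = 12, spread = Math.PI / 6): [Point, Point] {
  const angle = Math.atan2(to.y - from.y, to.x - from.x);
  return [
    { x: to.x - size * Math.cos(angle - spread), y: to.y - size * Math.sin(angle - spread) },
    { x: to.x - size * Math.cos(angle + spread), y: to.y - size * Math.sin(angle + spread) },
  ];
}

// Stroke a shaft from `from` to `to`, then a V-shaped head at `to`.
function drawArrow(ctx: PathContext, from: Point, to: Point): void {
  const [wingA, wingB] = arrowHead(from, to);
  ctx.beginPath();
  ctx.moveTo(from.x, from.y);
  ctx.lineTo(to.x, to.y); // shaft
  ctx.moveTo(wingA.x, wingA.y);
  ctx.lineTo(to.x, to.y); // one side of the head
  ctx.lineTo(wingB.x, wingB.y); // other side of the head
  ctx.stroke();
}

// Stroke a full circle around the target.
function drawCircle(ctx: PathContext, center: Point, radius: number): void {
  ctx.beginPath();
  ctx.arc(center.x, center.y, radius, 0, 2 * Math.PI);
  ctx.stroke();
}
```

In the browser, the value returned by `canvas.getContext("2d")` can be passed directly as `ctx`, with `strokeStyle` and `lineWidth` set beforehand.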

Backend Logic

Node.js + Electron IPC handle screenshot capture, window control, and message routing
A custom high-resolution screenshot service optimized for minimal latency
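
The message routing could be modeled as a typed union of messages dispatched to handlers. The sketch below is an assumption about structure only: the message kinds, `Handlers` type, and `route` function are hypothetical, and in a real app each handler would sit behind Electron's `ipcMain.handle` and be called from the renderer with `ipcRenderer.invoke`.

```typescript
// Hypothetical message kinds; the real channel names are not given in this write-up.
type Message =
  | { kind: "capture-screenshot" }
  | { kind: "show-overlay"; x: number; y: number }
  | { kind: "hide-overlay" };

// One handler per message kind, each receiving the correctly narrowed message type.
type Handlers = {
  [K in Message["kind"]]: (msg: Extract<Message, { kind: K }>) => string;
};

// Dispatch a message to its handler; the switch is exhaustive over `Message["kind"]`.
function route(handlers: Handlers, msg: Message): string {
  switch (msg.kind) {
    case "capture-screenshot":
      return handlers["capture-screenshot"](msg);
    case "show-overlay":
      return handlers["show-overlay"](msg);
    case "hide-overlay":
      return handlers["hide-overlay"](msg);
  }
}

// Example handlers that only describe what they would do.
const demoHandlers: Handlers = {
  "capture-screenshot": () => "captured",
  "show-overlay": (msg) => `overlay at ${msg.x},${msg.y}`,
  "hide-overlay": () => "hidden",
};
```

With a discriminated union, adding a new message kind becomes a compile error until both a handler and a `switch` case exist for it.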