NOOX: An Open-Source AI-Driven Desktop Automation Device
Inspiration
NOOX began with a simple question: LLMs can plan and reason—but why can’t they directly act on our computers?
I wanted a small, open, portable device that could:
- Control a computer via HID
- Run cross-platform shell commands
- Talk to LLMs
- Perform multi-step automation autonomously
This became the foundation for NOOX, an ESP32-S3–based intelligent hardware platform merging peripherals, desktop automation, and AI planning.
What I Built
NOOX integrates:
1. Peripheral Hardware
- USB HID keyboard/mouse emulation
- USB CDC bidirectional channel
- OLED screen + 3 buttons
- GPIO + RGB LED
2. Desktop Automation
A Go-based Host Agent (auto-downloaded via HID) runs shell commands on Windows/Linux/macOS and communicates using JSON over USB CDC.
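A minimal sketch of how a host agent could handle JSON requests arriving over the CDC channel, assuming one JSON object per line. The `Request` and `Response` field names, the `Runner` type, and the `Handle` and `Serve` functions are illustrative assumptions, not the actual NOOX protocol.

```go
package main

import (
	"bufio"
	"encoding/json"
	"io"
)

// Request and Response are hypothetical message shapes; the real NOOX
// protocol fields are not documented in this write-up.
type Request struct {
	ID      int    `json:"id"`
	Command string `json:"command"`
}

type Response struct {
	ID       int    `json:"id"`
	Output   string `json:"output"`
	ExitCode int    `json:"exit_code"`
	Error    string `json:"error,omitempty"`
}

// Runner executes a command string and reports its output and exit code.
// Injecting it keeps the protocol handling separate from process spawning.
type Runner func(command string) (output string, exitCode int, err error)

// Handle decodes one request line, runs it, and builds the response.
// Malformed input produces an error response instead of being executed.
func Handle(line []byte, run Runner) Response {
	var req Request
	if err := json.Unmarshal(line, &req); err != nil {
		return Response{ExitCode: -1, Error: "invalid request: " + err.Error()}
	}
	out, code, err := run(req.Command)
	resp := Response{ID: req.ID, Output: out, ExitCode: code}
	if err != nil {
		resp.Error = err.Error()
	}
	return resp
}

// Serve reads newline-delimited requests from in and writes one JSON
// response per line to out. In practice both would be the serial port.
func Serve(in io.Reader, out io.Writer, run Runner) error {
	scanner := bufio.NewScanner(in)
	scanner.Buffer(make([]byte, 0, 4096), 1<<20) // allow lines up to 1 MiB
	enc := json.NewEncoder(out)                  // Encode appends a newline
	for scanner.Scan() {
		if len(scanner.Bytes()) == 0 {
			continue // skip blank lines
		}
		if err := enc.Encode(Handle(scanner.Bytes(), run)); err != nil {
			return err
		}
	}
	return scanner.Err()
}
```

Echoing the request `id` in the response lets the device match replies to requests even if a slow host answers late.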
3. AI Autonomous Planning
With OpenAI/DeepSeek/OpenRouter support, the LLM can:
- Plan multi-step tasks
- Call tools like run_command, hid_keyboard_type, gpio_set
- Iterate until a goal is complete
The system can perform tasks like:
“Create a file, populate it, compress it, and open the folder.”
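The plan, call, iterate cycle described above can be sketched as a small loop. This is a Go illustration of the control flow only, not the firmware code: `Planner` stands in for the LLM, and `ToolCall`, `Tool`, `RunAgent`, and the `maxSteps` cap are hypothetical names. Only the tool names (`run_command`, `hid_keyboard_type`, `gpio_set`) come from the project.

```go
package main

import "fmt"

// ToolCall is one action requested by the planner, e.g. run_command.
type ToolCall struct {
	Name string
	Args map[string]string
}

// Tool executes one call and returns a textual result for the planner.
type Tool func(args map[string]string) (string, error)

// Planner stands in for the LLM: given the results so far, it returns
// the next call, or done=true once it considers the goal complete.
type Planner func(history []string) (call ToolCall, done bool)

// RunAgent alternates between planning and tool execution. Failures are
// recorded in the history so the planner can react to them, and maxSteps
// bounds the loop so a confused planner cannot run forever.
func RunAgent(plan Planner, tools map[string]Tool, maxSteps int) ([]string, error) {
	var history []string
	for step := 0; step < maxSteps; step++ {
		call, done := plan(history)
		if done {
			return history, nil
		}
		tool, ok := tools[call.Name]
		if !ok {
			history = append(history, fmt.Sprintf("error: unknown tool %q", call.Name))
			continue
		}
		result, err := tool(call.Args)
		if err != nil {
			history = append(history, fmt.Sprintf("%s failed: %v", call.Name, err))
			continue
		}
		history = append(history, fmt.Sprintf("%s: %s", call.Name, result))
	}
	return history, fmt.Errorf("stopped after %d steps without reaching the goal", maxSteps)
}
```

Restricting the planner to a fixed tool table and a step limit is one simple way to keep autonomous behavior predictable.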
How I Built It
- Firmware: Arduino framework + ESP-IDF + FreeRTOS
- Storage: LittleFS for configs, web UI, and host agent
- UI: WebSocket-driven dark-theme console (Chat / Advanced Mode)
- Memory: Large JSON and LLM response buffers kept in the 8MB PSRAM
- Host Agent: Cross-platform binary served from the ESP32 and launched automatically
Hardware:
- ESP32-S3-WROOM-1
- SSD1315 OLED (I²C)
- PMOS USB/battery auto-switching
- Buttons + LEDs + WS2812
What I Learned
- How to manage embedded memory to avoid fragmentation
- Designing reliable USB HID sequences and timing
- Crafting a CDC JSON protocol resilient to truncation and slow hosts
- Building a WebSocket-driven UI with real-time streaming
- Structuring LLM tool-calling for predictable autonomous behavior
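One way to make a line-based JSON channel tolerate truncation and slow hosts is to buffer incoming chunks and release only complete, valid messages. The sketch below assumes newline-delimited JSON; the `Framer` type and its size limit are hypothetical and not taken from the NOOX source.

```go
package main

import (
	"bytes"
	"encoding/json"
)

// Framer reassembles newline-delimited JSON from arbitrarily sized chunks.
type Framer struct {
	buf []byte
	max int // largest partial frame to keep before giving up on it
}

func NewFramer(maxFrame int) *Framer { return &Framer{max: maxFrame} }

// Feed appends a chunk and returns every complete message now available.
// Lines that are not valid JSON (for example truncated ones) are dropped.
func (f *Framer) Feed(chunk []byte) [][]byte {
	f.buf = append(f.buf, chunk...)
	var msgs [][]byte
	for {
		i := bytes.IndexByte(f.buf, '\n')
		if i < 0 {
			break // no complete line yet; wait for more data
		}
		line := bytes.TrimSpace(f.buf[:i])
		f.buf = f.buf[i+1:]
		if len(line) > 0 && json.Valid(line) {
			// Copy so the caller's slice survives later buffer reuse.
			msgs = append(msgs, append([]byte(nil), line...))
		}
	}
	if len(f.buf) > f.max {
		// A partial frame grew past the limit: discard it. The rest of
		// that frame will arrive later and normally fail validation.
		f.buf = f.buf[:0]
	}
	return msgs
}
```

Because `Feed` never blocks and keeps partial data between calls, a host that delivers a message in several slow pieces is handled the same way as one that sends it at once.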
Key Challenges
- Getting HID scripts to reliably bootstrap PowerShell across systems
- Parsing large LLM JSON responses within ESP32 memory limits
- Handling differences between PowerShell, pwsh, cmd, bash, and sh
- Avoiding WebSocket race conditions with multiple clients
- Ensuring autonomous planning doesn't become unsafe
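The shell differences listed above mostly come down to which executable is launched and how the command string is passed to it. A hedged sketch of that mapping follows; `DefaultShell`, `ShellArgv`, and `Run` are hypothetical helpers, and the actual Host Agent may choose shells differently.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
)

// DefaultShell picks a shell name from a GOOS value such as runtime.GOOS.
func DefaultShell(goos string) string {
	if goos == "windows" {
		return "powershell"
	}
	return "sh"
}

// ShellArgv builds the argument vector that passes command to shell.
func ShellArgv(shell, command string) ([]string, error) {
	switch shell {
	case "powershell", "pwsh":
		// Skip profile scripts and interactive prompts for repeatable runs.
		return []string{shell, "-NoProfile", "-NonInteractive", "-Command", command}, nil
	case "cmd":
		return []string{"cmd", "/C", command}, nil
	case "bash", "sh":
		return []string{shell, "-c", command}, nil
	default:
		return nil, fmt.Errorf("unsupported shell %q", shell)
	}
}

// Run executes command in the given shell and returns combined
// stdout and stderr. The context lets the caller enforce a timeout.
func Run(ctx context.Context, shell, command string) (string, error) {
	argv, err := ShellArgv(shell, command)
	if err != nil {
		return "", err
	}
	out, err := exec.CommandContext(ctx, argv[0], argv[1:]...).CombinedOutput()
	return string(out), err
}
```

A caller would typically wrap `Run` in `context.WithTimeout` so a hung command cannot stall the agent. Quoting rules still differ between shells, especially for cmd, so commands with nested quotes need per-shell testing.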
Security Notes
Because the LLM can run shell commands and simulate keyboard input, NOOX should never be run with admin/root privileges on the host. Use a VM or sandbox for experimentation.
Final Thoughts
NOOX was an exploration of what happens when you combine embedded hardware, desktop automation, and modern LLMs into a single portable device. I learned a great deal across hardware, firmware, software, and AI orchestration, and the result is a platform that lets AI act directly on your computer.
Built With
- arduino
- esp32
- freertos
- platformio