Inspiration
It was the night before my exam, and I was deep in debugging hell. My workflow looked like this:
- Screenshot VS Code terminal
- Upload to ChatGPT
- Get advice to try something I'd already done
- Run out of tokens
- Repeat until sunrise
I realized the problem wasn't the AI; it was the process. What if the AI could just see my screen instead of me explaining it?
After my exams, I spent 4 months grinding on this idea. Testing combinations. Breaking things. Rebuilding. The result? Octate, an AI that saves me time by actually seeing what I'm working on.
No more screenshots. No more context-pasting. Just press a hotkey and ask.
What It Does
Octate is an AI assistant powered by Gemini 2.5 Flash with custom prompting tuned for:
- Debugging: sees your errors, suggests fixes
- Coding help: explains code visually
- Learning: understands your workflow context
- Roasting: optional brutally honest feedback (because bland is boring)
It lives as an invisible overlay on your screen, captures context when you ask questions, and disappears when you need privacy.
Core features:
- Screenshot-based AI assistance with full visual context
- Ghost mode for instant invisibility during screen shares
- Hotkey-driven workflow (zero mouse dependency)
- Free forever (bring your own Gemini API key)
How We Built It
Tech Stack:
- Electron.js: Cross-platform desktop app framework
- React.js: UI components and state management
- Tailwind CSS: Utility-first styling
- Supabase: Backend, database, and real-time features
- OAuth: GitHub and Google authentication for seamless login
- Gemini 2.5 Flash API: Vision-capable AI model
Architecture:
- Persistent overlay window with always-on-top functionality
- IPC (Inter-Process Communication) handlers for global keyboard shortcuts
- Screenshot capture pipeline with automatic context injection
- Click-through regions for non-intrusive user experience
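To make "automatic context injection" concrete, here is a minimal sketch of how a captured screenshot and a mode-specific prompt could be packed into one Gemini request. The request shape follows the public Gemini REST API; the helper names (`buildVisionRequest`, `toRawBase64`, `geminiEndpoint`) and the prompt texts are assumptions made for illustration, not Octate's actual code.

```typescript
// Sketch only: helper names and prompt texts are hypothetical.
type Mode = "debug" | "code" | "learn" | "roast";

// Placeholder prompts, one per mode listed under "What It Does".
const MODE_PROMPTS: Record<Mode, string> = {
  debug: "Find the error visible in the screenshot and suggest a fix.",
  code: "Explain the code visible in the screenshot.",
  learn: "Explain what the user is working on, step by step.",
  roast: "Give brutally honest feedback on what is on screen.",
};

interface TextPart { text: string }
interface InlineImagePart { inline_data: { mime_type: string; data: string } }
interface GenerateContentRequest {
  contents: { role: "user"; parts: [TextPart, InlineImagePart] }[];
}

// REST endpoint for a given model; the user's own key is sent separately.
function geminiEndpoint(model: string = "gemini-2.5-flash"): string {
  return `https://generativelanguage.googleapis.com/v1beta/models/${model}:generateContent`;
}

// Accept either raw base64 or a "data:image/png;base64,..." URL and return raw base64.
function toRawBase64(image: string): string {
  const comma = image.indexOf(",");
  return image.startsWith("data:") && comma !== -1 ? image.slice(comma + 1) : image;
}

// Combine the mode prompt, the user's question, and the screenshot into one request body.
function buildVisionRequest(mode: Mode, question: string, screenshot: string): GenerateContentRequest {
  return {
    contents: [
      {
        role: "user",
        parts: [
          { text: `${MODE_PROMPTS[mode]}\n\n${question}` },
          { inline_data: { mime_type: "image/png", data: toRawBase64(screenshot) } },
        ],
      },
    ],
  };
}
```

The body would then be sent as JSON with a POST to `geminiEndpoint()`, with the user's own API key in the `x-goog-api-key` header. Keeping the builder free of Electron imports makes it easy to unit test without a running app.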
Challenges We Ran Into
1. Overlay Engineering
Building a window that stays on top of everything while remaining responsive was harder than expected:
- Balancing always-on-top behavior with user control
- Implementing click-through buttons (some interactive, some passthrough)
- Preventing the overlay from interfering with other apps
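One way to get "some interactive, some passthrough" behavior is to hit-test the cursor against a list of interactive rectangles and let every other click fall through. The sketch below shows only that decision logic. The `Rect` type and function names are hypothetical, and wiring the result into Electron's `BrowserWindow.setIgnoreMouseEvents(ignore, { forward: true })` is an assumption about how such an overlay might work, not a description of Octate's implementation.

```typescript
// Sketch only: types and function names are hypothetical.
interface Point { x: number; y: number }
interface Rect { x: number; y: number; width: number; height: number }

// True when the point lies inside the rectangle (left/top inclusive, right/bottom exclusive).
function contains(rect: Rect, point: Point): boolean {
  return (
    point.x >= rect.x &&
    point.x < rect.x + rect.width &&
    point.y >= rect.y &&
    point.y < rect.y + rect.height
  );
}

// True when the cursor is over any interactive region (buttons, input box, ...).
function isOverInteractiveRegion(point: Point, regions: Rect[]): boolean {
  return regions.some((region) => contains(region, point));
}

// The overlay should ignore the mouse (pass clicks through) everywhere else.
// In an Electron main process this boolean could be handed to
// win.setIgnoreMouseEvents(ignore, { forward: true }).
function shouldIgnoreMouse(point: Point, regions: Rect[]): boolean {
  return !isOverInteractiveRegion(point, regions);
}
```

Forwarding mouse-move events while ignoring clicks is what lets the overlay notice when the cursor re-enters a button and switch back to interactive mode.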
2. Keyboard Shortcut Registration
The IPC handler system for global shortcuts required extensive testing:
- Handling edge cases (app minimized, focus stolen, multiple monitors)
- Preventing shortcut conflicts with other applications
- Ensuring shortcuts work consistently across Windows/Mac/Linux
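Shortcut conflicts can be handled by trying a list of candidate accelerators and keeping the first one the operating system accepts. The sketch below assumes a register function with the same shape as Electron's `globalShortcut.register(accelerator, callback)`, which returns `false` when registration fails. The function name `registerWithFallback` and the candidate list are hypothetical.

```typescript
// Sketch only: registerWithFallback and the candidate list are hypothetical.
// RegisterFn mirrors the shape of Electron's globalShortcut.register.
type RegisterFn = (accelerator: string, callback: () => void) => boolean;

// Try each accelerator in order; return the one that registered, or null if all are taken.
function registerWithFallback(
  register: RegisterFn,
  candidates: string[],
  onTrigger: () => void
): string | null {
  for (const accelerator of candidates) {
    if (register(accelerator, onTrigger)) {
      return accelerator;
    }
  }
  return null;
}

// Example candidates using Electron accelerator syntax.
// "CommandOrControl" resolves to Cmd on macOS and Ctrl on Windows/Linux.
const CANDIDATE_SHORTCUTS: string[] = [
  "CommandOrControl+Shift+Space",
  "CommandOrControl+Alt+O",
  "Alt+Space",
];
```

In an Electron main process the first argument could be `(acc, cb) => globalShortcut.register(acc, cb)`, called after the app's `ready` event. Returning the chosen accelerator lets the UI show the user which hotkey is actually active.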
3. Screenshot Timing
Capturing the screen before the overlay appears (to avoid capturing itself) required precise timing coordination between the main and renderer processes.
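The ordering constraint (capture first, show second) can be modeled as a small state machine, so the overlay is only shown once the capture has finished. This is a sketch under assumed state, event, and action names; it does not reflect Octate's actual coordination between main and renderer processes.

```typescript
// Sketch only: state, event, and action names are hypothetical.
type OverlayState = "hidden" | "capturing" | "visible";
type OverlayEvent = "hotkey" | "captureDone" | "dismiss";
type Action = "captureScreen" | "showOverlay" | "hideOverlay";

interface Transition {
  state: OverlayState;
  actions: Action[];
}

// Pure transition function: given the current state and an event,
// return the next state and the side effects the main process should run.
function transition(state: OverlayState, event: OverlayEvent): Transition {
  switch (state) {
    case "hidden":
      // Hotkey pressed while hidden: capture first, do NOT show yet.
      if (event === "hotkey") return { state: "capturing", actions: ["captureScreen"] };
      break;
    case "capturing":
      // Only once the capture has finished is it safe to show the overlay.
      if (event === "captureDone") return { state: "visible", actions: ["showOverlay"] };
      break;
    case "visible":
      // A second hotkey press or an explicit dismiss hides the overlay again.
      if (event === "hotkey" || event === "dismiss") return { state: "hidden", actions: ["hideOverlay"] };
      break;
  }
  // Any other combination (e.g. a repeated hotkey while capturing) is ignored.
  return { state, actions: [] };
}
```

Because a hotkey press during `capturing` produces no actions, rapid double presses cannot trigger a second capture or show the overlay before the first screenshot is ready.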
Accomplishments We're Proud Of
- Octate exists and works: 80% of the original vision is now reality
- The roast feature: adds personality and makes debugging less painful
- Ghost mode execution: instant invisibility actually works flawlessly
- Free and accessible: no paywalls, no subscriptions, no gatekeeping
- Didn't give up: navigated complex technical challenges without compromising the core vision
The biggest win? Watching Octate solve in 10 seconds what used to take me 10 minutes of screenshot-upload-explain cycles.
What We Learned
Technical:
- Electron's IPC patterns for complex desktop apps
- Balancing overlay UX with system-level window management
- Integrating vision-capable AI models with real-time workflows
Personal:
- Persistence matters more than perfection
- Complex stacks and edge cases don't define you; your solutions do
- Sometimes the best features come from your own frustrations
Philosophy: Your project reflects who you are. When the code gets complicated and the bugs feel endless, the choice to keep building anyway is what separates ideas from shipped products.
What's Next for Octate
Voice Recognition & Context Memory
The next major feature: voice-driven assistance with conversational context.
How it works:
- Transcribe conversations happening on-screen and off-screen
- Understand spoken debugging sessions during pair programming
- Store conversational context in the database
- Self-train on user patterns for increasingly personalized assistance
Use cases:
- "Hey Octate, what did we just discuss about that API endpoint?"
- Transcribe whiteboard sessions during team calls
- Build a searchable knowledge base from your own debugging conversations
Other Planned Features
- Plugin system for custom workflows
- Offline mode with local AI models
- Multi-language support for international codebases
- Context memory across sessions
Built With
- electronjs
- gemini
- react
- tailwindcss
- typescript