Inspiration
I wanted to dive deep into a graphics API at my first hackathon: something that would push me technically rather than just stringing APIs together, and that would show me what day-to-day work is like for developers who use one. The idea of a persistent AI companion that lives on your screen felt like the perfect excuse to build a real-time 3D rendering engine from scratch.
What it does
Summon is a floating macOS overlay: a 3D robot that sits on top of whatever you're working on. Speak to it and it listens via Apple Speech. It can also see your screen using OCR, so it understands the context of what you're doing. Responses are spoken back using ElevenLabs voice synthesis, with the robot's eyes pulsing green in sync with its speech. There are three modes: Reactive (responds when you speak), Proactive (chimes in on its own), and Hybrid.
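To make the mode behavior concrete, here is a rough sketch of how one turn might be gated and run. Every name in it (`CompanionMode`, `Trigger`, `TurnCoordinator`, and the three protocols standing in for OCR, Claude, and ElevenLabs) is hypothetical and not taken from Summon's source. Hybrid is assumed to mean "both triggers enabled", which the writeup does not spell out.

```swift
import Foundation

// Hypothetical names throughout; this is an illustration, not Summon's code.
enum CompanionMode {
    case reactive   // responds when the user speaks
    case proactive  // chimes in on its own
    case hybrid     // assumed here: both of the above
}

enum Trigger {
    case userSpoke(String)  // transcript from speech recognition
    case timerFired         // periodic check used for unprompted comments
}

protocol ScreenReader { func currentScreenText() -> String }
protocol Assistant { func reply(to prompt: String) -> String }
protocol Speaker { func speak(_ text: String) }

struct TurnCoordinator {
    var mode: CompanionMode
    let screen: any ScreenReader
    let assistant: any Assistant
    let speaker: any Speaker

    // Decide whether this trigger should produce a response in the current mode.
    func shouldHandle(_ trigger: Trigger) -> Bool {
        switch (mode, trigger) {
        case (.reactive, .userSpoke), (.proactive, .timerFired), (.hybrid, _):
            return true
        default:
            return false
        }
    }

    // Run one turn: gather screen context, ask the model, speak the answer.
    // Returns false when the trigger is ignored in the current mode.
    @discardableResult
    func handle(_ trigger: Trigger) -> Bool {
        guard shouldHandle(trigger) else { return false }
        let context = screen.currentScreenText()
        let prompt: String
        switch trigger {
        case .userSpoke(let transcript):
            prompt = "Screen: \(context)\nUser said: \(transcript)"
        case .timerFired:
            prompt = "Screen: \(context)\nOffer one brief, useful comment."
        }
        speaker.speak(assistant.reply(to: prompt))
        return true
    }
}
```

Keeping the three services behind small protocols means the gating logic can be exercised with stubs, without a microphone, a screen capture permission, or network access.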
How I built it
Custom Metal rendering pipeline (no SceneKit, no RealityKit) with ModelIO for USDZ mesh extraction. The transparent always-on-top window is AppKit. Voice input uses Apple's Speech framework, screen context comes from ScreenCaptureKit + Vision OCR, the AI brain is Claude, and voice output is ElevenLabs. All wired together through a central VoiceCompanionCoordinator.
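For readers curious about the transparent always-on-top window, a minimal AppKit sketch looks roughly like this. The helper name `makeOverlayWindow` and the specific choices (borderless style, `.floating` level, visible on all Spaces) are assumptions for illustration; the actual project may configure its window differently.

```swift
import AppKit

// Minimal sketch of a borderless, see-through window that floats above other apps.
// `makeOverlayWindow` is a hypothetical helper, not taken from Summon's source.
func makeOverlayWindow(frame: NSRect) -> NSWindow {
    let window = NSWindow(
        contentRect: frame,
        styleMask: [.borderless],
        backing: .buffered,
        defer: false
    )
    window.isOpaque = false                 // let the desktop show through
    window.backgroundColor = .clear         // no window background fill
    window.hasShadow = false                // avoid a rectangular shadow around the model
    window.level = .floating                // stay above normal app windows
    window.collectionBehavior = [.canJoinAllSpaces, .fullScreenAuxiliary]
    window.isMovableByWindowBackground = true  // drag the robot to reposition it
    return window
}
```

For the 3D content to composite over the desktop, the Metal view hosted in this window would also need a non-opaque layer and a clear color with zero alpha.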
Challenges I ran into
The Metal pipeline consumed most of the 36 hours. Getting vertex descriptors, buffer layouts, UV coordinates, and PBR shaders all working together was brutal; at one point the model was rendering as an exploded foil ball. I spent several hours isolating whether the bug was in the mesh loader, the shader, or the texture extraction from the USDZ zip format. I ended up switching character models entirely (cat → robot) to escape a dead-end geometry issue.
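As an illustration of the vertex descriptor and buffer layout problem, here is a sketch of keeping the Metal and Model I/O descriptors in agreement. The interleaved layout (position, normal, UV in one buffer) and the function names are assumptions, not the project's real layout. A mismatch between offsets or stride here and what the shader expects is one common way to get scrambled geometry, though the writeup does not say that was the cause in this case.

```swift
import Metal
import MetalKit
import ModelIO

// Assumed layout for illustration: position (float3), normal (float3), uv (float2),
// interleaved in a single vertex buffer at index 0.
func makeVertexDescriptors() -> (metal: MTLVertexDescriptor, modelIO: MDLVertexDescriptor) {
    let floatSize = MemoryLayout<Float>.stride  // 4 bytes

    let metal = MTLVertexDescriptor()
    metal.attributes[0].format = .float3        // position
    metal.attributes[0].offset = 0
    metal.attributes[0].bufferIndex = 0
    metal.attributes[1].format = .float3        // normal
    metal.attributes[1].offset = floatSize * 3
    metal.attributes[1].bufferIndex = 0
    metal.attributes[2].format = .float2        // texture coordinate
    metal.attributes[2].offset = floatSize * 6
    metal.attributes[2].bufferIndex = 0
    metal.layouts[0].stride = floatSize * 8     // 32 bytes per vertex

    // Derive the Model I/O descriptor from the Metal one so the two cannot drift apart,
    // then name each attribute so Model I/O knows which mesh data to place in each slot.
    let modelIO = MTKModelIOVertexDescriptorFromMetal(metal)
    (modelIO.attributes[0] as? MDLVertexAttribute)?.name = MDLVertexAttributePosition
    (modelIO.attributes[1] as? MDLVertexAttribute)?.name = MDLVertexAttributeNormal
    (modelIO.attributes[2] as? MDLVertexAttribute)?.name = MDLVertexAttributeTextureCoordinate
    return (metal, modelIO)
}

// Load every mesh in a USDZ file, repacked into the layout above.
func loadMeshes(from url: URL, device: MTLDevice) throws -> [MTKMesh] {
    let descriptors = makeVertexDescriptors()
    let allocator = MTKMeshBufferAllocator(device: device)
    let asset = MDLAsset(url: url,
                         vertexDescriptor: descriptors.modelIO,
                         bufferAllocator: allocator)
    return try MTKMesh.newMeshes(asset: asset, device: device).metalKitMeshes
}
```

The same `MTLVertexDescriptor` would then be assigned to the render pipeline descriptor, so the loader, the pipeline, and the shader's `[[attribute(n)]]` indices all describe one layout.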
Accomplishments that I'm proud of
Shipped a working real-time 3D rendering engine in a weekend, with no game engine and no SceneKit or RealityKit. The PBR shader, the speech-synced eye glow, and the screen-aware AI context all came together in the last few hours.
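One plausible way to drive a speech-synced glow is to follow the loudness of the audio being played. The sketch below assumes that approach; the `EyeGlow` type, its parameters, and the idea of feeding it raw sample buffers are all hypothetical and may not match how Summon does it.

```swift
import Foundation

// Hypothetical helper: turns audio loudness into a smoothed glow intensity in 0...1.
struct EyeGlow {
    var intensity: Float = 0    // value a renderer could pass to the shader as a uniform
    var attack: Float = 0.6     // fraction of the gap closed per update when getting louder
    var release: Float = 0.15   // fraction of the gap closed per update when getting quieter
    var gain: Float = 4.0       // scales typical speech levels up toward 1 (tuning assumption)

    // Root-mean-square level of a buffer of samples in -1...1.
    static func rms(_ samples: [Float]) -> Float {
        guard !samples.isEmpty else { return 0 }
        let sumOfSquares = samples.reduce(Float(0)) { $0 + $1 * $1 }
        return (sumOfSquares / Float(samples.count)).squareRoot()
    }

    // Call once per audio buffer (or per frame) with the latest samples.
    @discardableResult
    mutating func update(with samples: [Float]) -> Float {
        let target = min(1, EyeGlow.rms(samples) * gain)
        let rate = target > intensity ? attack : release
        intensity += (target - intensity) * rate
        return intensity
    }
}
```

The fast attack and slow release make the eyes brighten quickly on a syllable and fade gently between words, which tends to read as a pulse rather than a flicker.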
What I learned
How Metal actually works at the buffer level. How USDZ packages textures and why silent fallbacks make PBR debugging painful. When to cut your losses and switch assets instead of fighting a geometry problem for hours.
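On the silent fallback point, one way to make texture problems visible is to throw instead of quietly substituting a default. This is a sketch under assumptions: `MaterialError` and `loadTexture` are hypothetical names, and the real project may structure its material loading differently.

```swift
import Metal
import MetalKit
import ModelIO

// Hypothetical error type so a missing texture is reported instead of hidden.
enum MaterialError: Error {
    case missingProperty(MDLMaterialSemantic)  // the material has no such slot
    case notATexture(MDLMaterialSemantic)      // the slot holds a constant or an unresolved texture
}

// Fetch the texture bound to one material slot (for example .baseColor), failing loudly.
func loadTexture(for semantic: MDLMaterialSemantic,
                 from material: MDLMaterial,
                 using loader: MTKTextureLoader) throws -> MTLTexture {
    guard let property = material.property(with: semantic) else {
        throw MaterialError.missingProperty(semantic)
    }
    guard property.type == .texture,
          let source = property.textureSamplerValue?.texture else {
        throw MaterialError.notATexture(semantic)
    }
    return try loader.newTexture(texture: source, options: nil)
}
```

A caller can still choose a fallback color in its `catch` block, but it has to do so explicitly and can log which slot failed, which narrows down whether a wrong-looking surface comes from the shader or from the asset.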
What's next for Summon
Skinned animation so the robot moves, not just glows. A plugin system so it can take actions (open apps, write code, draft messages). Possibly a menu-bar mode for less screen real estate.