Problem

It’s heartbreaking that in 2025, Search and Rescue still depends on a tired human staring at a drone feed, praying they catch the one clue that leads to a missing person. When a child wanders off in Central Park or a flood traps families on rooftops, time doesn’t just matter; it feels like it’s slipping away. Yet the entire operation hinges on a single operator trying to fly, look, think, and react all at once. The drones can handle the chaos; the humans can’t. And those tiny moments of hesitation, the frames we miss, the decisions we make too slowly… those can cost lives we never get back.

Solution

Grok steps in where humans simply can’t keep up. You describe the missing person, share a photo, and Grok turns that into an autonomous mission: the drone launches itself, navigates unpredictable terrain calmly, scans every angle at once, and alerts you the moment it finds someone.

And during large-scale disasters, Grok can sweep entire areas, identify every human it detects, and build a live rescue map that responders can trust. It shifts Search and Rescue from a race against human limits to a system where we finally have technology that can see everything.

Demo Video

https://youtu.be/HTVBVcilBCI

Tech Stack

Frontend

  • Next.js
  • TypeScript
  • Tailwind CSS

Backend

  • Flask 3.0
  • Python
  • xAI SDK (Speech-to-Text)
  • OpenCV (video streaming)
  • WebSockets
  • Pydantic
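As a sketch of how the backend pieces above could fit together, a "person found" event might travel from the Flask backend to the live rescue map over the WebSocket as a small JSON message. This is a stdlib-only illustration; the field names and message shape are our assumptions, not the project's actual schema:

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class DetectionAlert:
    """One 'person found' event pushed to the frontend over the WebSocket.
    Field names are illustrative assumptions, not the project's schema."""
    mission_id: str
    lat: float
    lon: float
    confidence: float  # detector confidence, 0.0-1.0
    frame_ts: float    # capture time of the source frame (unix seconds)

def encode_alert(alert: DetectionAlert) -> str:
    # WebSocket text frames carry JSON; keep the payload flat and small
    # so the map can update without any extra parsing work.
    return json.dumps({"type": "detection", "data": asdict(alert)})

msg = encode_alert(DetectionAlert("m-1", 40.7829, -73.9654, 0.91, 1700000000.0))
print(msg)
```

In the real stack, a Pydantic model would replace the dataclass and add validation on the receiving side.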

Technical Difficulty

  • Hardest challenge: getting permission to use the drone inside the xAI office
  • Grok Text-to-Speech (TTS) has higher latency than in-browser TTS, so spoken narration often plays well after the drone has finished performing the action
  • Video stream delays with Grok tool calls: frames can only be sent to Grok so fast, yet Grok needs the most up-to-date frame to make the right drone tool calls.
  • We ran into an issue where the drone connection blocked the computer from using Wi-Fi, but we solved it by USB-tethering a phone hotspot.
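One way to handle the frame-staleness problem above is a single-slot buffer: the camera thread always overwrites the stale frame, so the Grok tool-call loop only ever reads the newest one. A minimal stdlib sketch under that assumption (class and method names are ours, not the project's):

```python
import queue

class LatestFrameBuffer:
    """Single-slot frame buffer for one producer and one consumer.
    The producer drops the stale frame before inserting the new one,
    so the consumer never acts on outdated video."""

    def __init__(self):
        self._q = queue.Queue(maxsize=1)

    def push(self, frame):
        # Discard the stale frame, if any, then insert the new one.
        # Safe for a single producer thread (e.g. the capture loop).
        try:
            self._q.get_nowait()
        except queue.Empty:
            pass
        self._q.put_nowait(frame)

    def latest(self, timeout=None):
        # Block until at least one frame is available, then return it.
        return self._q.get(timeout=timeout)

buf = LatestFrameBuffer()
for i in range(5):        # capture loop pushes frames faster than Grok consumes
    buf.push(f"frame-{i}")
print(buf.latest())       # → frame-4 (only the newest frame survives)
```

The trade-off is deliberate: every skipped frame is lost, but the tool-call loop never reasons over a view of the world that is seconds old.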
