About the project
Inspiration
The inspiration for SkySentinel came from observing the rapid growth of autonomous drone delivery and the inherent chaos that comes with scaling it. We realized that current solutions are either too rigid (pre-planned routes that can't adapt) or too reactive (only avoiding immediate collisions).
We asked ourselves: Can we build an Air Traffic Controller that thinks like a human, but reacts with the speed of a machine?
We wanted to move beyond simple tracking to create an agentic system—one that autonomously senses danger, negotiates complex airspace conflicts in real-time, and communicates its decisions clearly to human operators.
How we built it: The Hybrid Cognitive Architecture
Our key insight was that Large Language Models are incredible at strategic reasoning but too slow for life-critical, sub-second safety loops. To solve this, we built a Hybrid Architecture that splits responsibility into two distinct layers.
1. The "Reflex" Layer (Sub-Second Safety) This is our high-speed, deterministic layer running in a Python backend. It processes the real-time telemetry stream from Confluent Cloud Kafka. In every cycle, it calculates the Haversine distance between all pairs of drones to detect imminent collisions.
The distance d between two drones at latitudes φ1, φ2 and longitudes λ1, λ2 is calculated using the haversine formula:

a = sin²(Δφ/2) + cos φ1 · cos φ2 · sin²(Δλ/2)
c = 2 · atan2(√a, √(1 − a))
d = R · c

where Δφ = φ2 − φ1, Δλ = λ2 − λ1, and R ≈ 6,371 km is Earth's mean radius.
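A minimal sketch of that computation in plain Python (the helper name and meter-based units are our choices here, not a library function):

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return EARTH_RADIUS_M * 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
```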
If d < SAFETY_RADIUS, the reflex triggers immediately, issuing a "HOVER" command via WebSocket to halt the threatened drones before any damage can occur.
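Roughly, the reflex loop ties the Kafka consumer to that check and to the Socket.IO broadcast. This is a simplified sketch: the topic name, telemetry schema, event name, and 50 m radius are placeholders, not our exact configuration:

```python
import itertools
import json

from confluent_kafka import Consumer
from flask import Flask
from flask_socketio import SocketIO

SAFETY_RADIUS_M = 50  # placeholder threshold

app = Flask(__name__)
socketio = SocketIO(app)

consumer = Consumer({
    "bootstrap.servers": "<confluent-cloud-bootstrap>",  # plus SASL credentials
    "group.id": "reflex-layer",
    "auto.offset.reset": "latest",
})
consumer.subscribe(["drone-telemetry"])  # placeholder topic name

positions = {}  # drone_id -> (lat, lon), latest fix per drone

def reflex_loop():
    while True:
        msg = consumer.poll(0.1)
        if msg is None or msg.error():
            continue
        t = json.loads(msg.value())  # placeholder schema: {"id", "lat", "lon"}
        positions[t["id"]] = (t["lat"], t["lon"])
        # O(n^2) pairwise sweep using haversine_m from the sketch above
        for (a, pa), (b, pb) in itertools.combinations(positions.items(), 2):
            if haversine_m(*pa, *pb) < SAFETY_RADIUS_M:
                # Deterministic reflex: freeze both drones, no LLM in this loop
                socketio.emit("command", {"drones": [a, b], "action": "HOVER"})
```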
2. The "Cognitive Brain" Layer (Strategic Optimization) Once the immediate danger is paused, or when a new priority order disrupts existing paths, the "Brain" takes over. We use Google Vertex AI (Gemini Pro) as our omniscient dispatcher.
We feed Gemini a structured snapshot of the entire fleet's state—positions, batteries, destinations, and wind conditions—along with the conflict details. We prompt it to act as a supreme air traffic controller to negotiate a new, non-conflicting set of routes.
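Concretely, the dispatch call looks roughly like this (a sketch using the Vertex AI Python SDK; the snapshot field names and prompt wording are simplified from our actual versions):

```python
import json

import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="<gcp-project-id>", location="us-central1")
model = GenerativeModel("gemini-pro")

def negotiate_routes(fleet_state: dict, conflict: dict) -> dict:
    """Ask Gemini, acting as dispatcher, for a new non-conflicting route set."""
    prompt = (
        "You are a supreme air traffic controller for a delivery-drone fleet.\n"
        f"Fleet state (positions, batteries, destinations, wind): {json.dumps(fleet_state)}\n"
        f"Conflict details: {json.dumps(conflict)}\n"
        "Return ONLY a JSON object mapping each drone id to its new waypoint list."
    )
    response = model.generate_content(prompt)
    return json.loads(response.text)
```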
The AI's structured JSON output is then parsed to update flight paths dynamically on our Google Maps dashboard, while simultaneously generating a clear, urgent voice broadcast via ElevenLabs.
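Two small pieces of glue make that work: Gemini occasionally wraps its answer in markdown fences, so we strip those before parsing, and the voice broadcast is a single POST to ElevenLabs' text-to-speech endpoint. A sketch (voice id and error handling simplified):

```python
import json
import re

import requests

def parse_plan(raw: str) -> dict:
    """Strip any markdown code fences Gemini adds, then parse the JSON plan."""
    cleaned = re.sub(r"^```(?:json)?\s*|```\s*$", "", raw.strip(), flags=re.MULTILINE)
    return json.loads(cleaned)

def broadcast_alert(text: str, voice_id: str, api_key: str) -> bytes:
    """Synthesize an urgent voice broadcast via the ElevenLabs TTS API."""
    resp = requests.post(
        f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}",
        headers={"xi-api-key": api_key},
        json={"text": text},
    )
    resp.raise_for_status()
    return resp.content  # MP3 bytes, streamed to the operator dashboard
```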
Challenges we faced
Let's be real, this was hard.
- The Latency Trap: Our initial attempt relied too much on the AI for real-time decisions, leading to simulated crashes while waiting for API responses. This forced us to pivot to the hybrid "Reflex/Brain" model, which was a significant engineering breakthrough.
- Prompt Engineering for JSON: Getting a creative LLM like Gemini to consistently output rigid, parseable JSON for multi-drone flight paths was a massive challenge that required extensive few-shot prompting and constraints (a sketch of the prompt shape follows this list).
- State Synchronization: Maintaining a perfectly synchronized state between the high-velocity Kafka stream, the Python backend, and the frontend map visualization was a complex exercise in event-driven architecture.
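For the JSON challenge above, the prompt we converged on pairs hard output rules with a worked example. A condensed sketch, with illustrative schema and coordinates:

```python
SYSTEM_RULES = (
    "Respond with a single JSON object and nothing else: "
    "no prose, no markdown fences, no trailing commas.\n"
    'Schema: {"routes": {"<drone_id>": [[lat, lon], ...]}}'
)

FEW_SHOT_EXAMPLE = (
    'Input: {"conflict": ["drone-1", "drone-2"], "fleet": "..."}\n'
    'Output: {"routes": {"drone-1": [[40.71, -74.00], [40.72, -74.01]], '
    '"drone-2": [[40.70, -73.99], [40.69, -73.98]]}}'
)

def build_prompt(snapshot_json: str) -> str:
    """Assemble the constrained, few-shot prompt sent to Gemini."""
    return f"{SYSTEM_RULES}\n\n{FEW_SHOT_EXAMPLE}\n\nInput: {snapshot_json}\nOutput:"
```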
What we learned
We proved that the future of autonomous systems isn't about choosing between deterministic code and AI, but about combining them. We learned how to use Confluent as a central nervous system to decouple speed from intelligence, and how to leverage Google Cloud's ecosystem to build a production-ready, scalable microservice architecture in a single weekend.
Built With
- confluent-cloud-kafka
- docker
- elevenlabs-api
- flask
- gemini-pro
- google-cloud-run
- google-maps
- google-vertex-ai
- python
- socket.io