Inspiration
When a disaster such as a structural fire or building collapse strikes, every second counts. First responders often have to enter these zones with no idea of what is inside. I was inspired to build an autonomous system that could act as a "first set of eyes and ears" on the ground, using AI to quickly survey disaster zones and communicate with anyone trapped inside before human rescue teams arrive.
What it does
Echo-Guardian is an autonomous threat-detection and warning system. When drone or security camera footage is uploaded, the system routes the media through a computer vision pipeline that uses a YOLOv8 neural network to scan each frame for survivors, vehicles, and hazards. Once a threat is detected, the backend generates an emergency text script and triggers a localized, authoritative voice warning instructing personnel to evacuate the immediate zone.
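As a rough illustration of the "detection to text script" step, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the `Detection` type, the label sets, the confidence threshold, and the wording of the message are hypothetical and are not taken from the project's code.

```python
from dataclasses import dataclass

# Hypothetical label groups; the project's actual class list is not stated in this write-up.
HAZARD_LABELS = {"fire", "smoke"}
PERSON_LABELS = {"person"}


@dataclass
class Detection:
    label: str         # class name reported by the detector
    confidence: float  # score in the range [0, 1]


def build_warning_script(detections, min_confidence=0.5):
    """Return an evacuation message, or None if no hazard crosses the threshold."""
    confident = [d for d in detections if d.confidence >= min_confidence]
    hazards = sorted({d.label for d in confident if d.label in HAZARD_LABELS})
    people = sum(1 for d in confident if d.label in PERSON_LABELS)
    if not hazards:
        return None
    parts = [f"Warning: {' and '.join(hazards)} detected."]
    if people:
        parts.append(f"{people} person(s) in the zone.")
    parts.append("Evacuate the area immediately.")
    return " ".join(parts)
```

In a setup like this, the returned string would be what gets handed to the voice engine, and a `None` result would mean no warning is triggered for that frame.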
How I built it
I built the entire full-stack application solo. The backend is a Python FastAPI server running in an Ubuntu WSL environment and using my RTX 4060 GPU. For the computer vision component, I integrated Ultralytics YOLOv8 and OpenCV to process raw image bytes and video frames. For the audio response engine, I integrated the ElevenLabs API to generate realistic, multilingual voice warnings, which are encoded as Base64 and sent to the frontend. The frontend is a clean, dark-mode, terminal-style UI built from scratch with HTML, CSS, and vanilla JavaScript.
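The Base64 step can be sketched with the standard library alone. This is only an illustration of the general technique: the function name `build_audio_payload`, the field names, and the `audio/mpeg` MIME type are assumptions, not the project's actual response format.

```python
import base64
import json


def build_audio_payload(audio_bytes, script, mime_type="audio/mpeg"):
    """Wrap raw audio bytes in a JSON-serializable dict (field names are hypothetical)."""
    return {
        "script": script,
        "mime_type": mime_type,
        # Base64 turns arbitrary bytes into ASCII text that is safe to embed in JSON.
        "audio_b64": base64.b64encode(audio_bytes).decode("ascii"),
    }


# Placeholder bytes standing in for the audio returned by a text-to-speech service.
payload = build_audio_payload(b"\x00\x01fake-mp3-bytes", "Evacuate the area immediately.")
body = json.dumps(payload)  # what an HTTP response body could carry
```

On the browser side, a payload shaped like this could be played by building a `data:audio/mpeg;base64,...` URL and passing it to an `Audio` element. Base64 adds roughly a third to the payload size, which is usually acceptable for short voice clips.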
Challenges I ran into
As a solo developer, I ran into classic hackathon "dependency hell": severe version conflicts between the newly released PyTorch 2.6, NumPy 2.0, and OpenCV. At the eleventh hour, I also ran into authentication and rate-limit errors with the ElevenLabs API. Instead of giving up, I built an "Unbreakable API" shield using a custom try...except fallback system. If the external audio API fails, the backend catches the error gracefully, and the JavaScript frontend falls back to the browser's native text-to-speech engine so that the warning is always broadcast.
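The backend half of a fallback like this might look roughly as follows. This is a sketch under stated assumptions, not the project's code: `warning_response`, `TTSError`, and the `fallback` field are hypothetical names, and the `synthesize` callable stands in for whatever function wraps the external voice API.

```python
import base64


class TTSError(Exception):
    """Hypothetical error type standing in for any failure of the external TTS call."""


def warning_response(script, synthesize):
    """Try the external TTS; on any failure, tell the client to speak the script itself."""
    try:
        audio = synthesize(script)
    except Exception as exc:  # deliberately broad: every failure should trigger the fallback
        return {
            "script": script,
            "audio_b64": None,
            "fallback": "browser_tts",
            "error": str(exc),
        }
    return {
        "script": script,
        "audio_b64": base64.b64encode(audio).decode("ascii"),
        "fallback": None,
    }


def failing_tts(text):
    """Simulates the external service rejecting the request."""
    raise TTSError("rate limit exceeded")
```

With a response shaped like this, the frontend can check the `fallback` field and, when no audio is present, read `script` aloud with the browser's speech engine (presumably the Web Speech API's `speechSynthesis`, though the write-up does not name it). The script text is sent in both cases, so the warning never depends on the external service being reachable.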
Accomplishments that I'm proud of
I am incredibly proud of building a fully functioning, end-to-end AI pipeline from scratch in such a short time. Writing a secure FastAPI backend that can decode Base64 images, run them through a heavy neural network, draw bounding boxes in real time, and return a dynamic audio payload to a custom UI is a big step forward for my engineering portfolio.
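For the image-decoding step, a defensive decoder could be sketched like this. It is an illustration only: the function name, the optional data-URL handling, and the 10 MiB size cap are assumptions rather than details from the project.

```python
import base64
import binascii

MAX_IMAGE_BYTES = 10 * 1024 * 1024  # hypothetical 10 MiB upload cap


def decode_image_field(value):
    """Decode a Base64 string (optionally a data URL) into raw image bytes."""
    if value.startswith("data:"):
        # A data URL looks like "data:image/png;base64,<payload>".
        header, sep, value = value.partition(",")
        if not sep or not header.endswith(";base64"):
            raise ValueError("expected a Base64 data URL")
    try:
        # validate=True rejects characters outside the Base64 alphabet.
        raw = base64.b64decode(value, validate=True)
    except binascii.Error as exc:
        raise ValueError("invalid Base64 image data") from exc
    if not raw:
        raise ValueError("empty image")
    if len(raw) > MAX_IMAGE_BYTES:
        raise ValueError("image too large")
    return raw
```

The resulting bytes could then be handed to OpenCV, for example with `cv2.imdecode(np.frombuffer(raw, np.uint8), cv2.IMREAD_COLOR)`, before being passed to the detector. Rejecting malformed or oversized input early keeps bad uploads away from the GPU-bound part of the pipeline.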
What I learned
I leveled up my backend engineering skills significantly. I learned how to manage complex Python environments on Linux (WSL), implement CORS middleware in FastAPI, handle asynchronous file uploads, and encode raw binary video and audio data as Base64 for web browsers. I also learned a lot about fallback architecture and writing resilient code that survives external API failures.
What's next for Echo-Guardian
The immediate next step is dedicated model training. I am preparing to train a custom YOLOv8 model from scratch on a large, specialized Roboflow dataset for 200 epochs, to optimize the system for detecting structural fires and smoke in high-altitude drone footage.
Built With
- css
- elevenlabs
- fastapi
- html
- javascript
- opencv
- python
- uvicorn
- wsl
- yolov8
