Inspiration

The LA wildfires opened our eyes to a serious challenge facing firefighters: 77% of firefighter fatalities occur from disorientation during rapid breaches of enclosed structures. When these first responders enter smoke-filled buildings, they often can't see more than a few feet ahead, making an already dangerous job even riskier. That's what inspired us to create Theia, a technology that lets firefighters detect people through walls, even in zero-visibility conditions. By giving them this "sixth sense," we're helping ensure both firefighters and the people they're trying to save make it home safely. The project is named after Theia, the Greek goddess of sight and vision.

The Vision

Our aim was to develop an extremely low-cost, real-time human detection system that uses commodity ESP32 microcontrollers and cutting-edge AI to provide firefighters with "through-wall" vision via a mixed-reality headset. Imagine navigating a burning building and seeing which walls have trapped individuals behind them and which walls are in front of empty rooms, all overlaid onto your vision. That's Theia.

How It Works: From ESP32 to the Mixed Reality headset

Our first task was to interpret Channel State Information (CSI) data, the fingerprint of WiFi signals. We transformed inexpensive ESP32s into a rudimentary radio-frequency sensing system:

  • WiFi Radar (ESP32s): Two ESP32s act as a radar system. One (CSI-TX) transmits WiFi packets, and the other (CSI-RX) captures the resulting CSI data, which contains unique signal reflections caused by objects (or humans), even through walls. We pushed the ESP32 CSI Tool framework to its limits, optimizing sampling configurations for real-time data.

  • Real-time Data Pipeline: We implemented a custom data pipeline to stream the raw CSI data from the ESP32 to an NVIDIA Jetson Nano (2GB). The system was configured to capture CSI measurements across 30 subcarriers at a sampling rate of 100 Hz (10ms intervals), with each capture window containing 50 packets. While this configuration allowed for detailed channel state monitoring, it presented challenges related to baud rate limits and data integrity, particularly given the volume of data being transferred (30 subcarriers × 100 packets per second, grouped into 50-packet windows).

  • AI-Powered Human Detection (Jetson Nano): Once the data was read, we deployed a custom-built, three-block Convolutional Neural Network (CNN) trained on an NVIDIA Jetson Nano. This CNN processes the complex CSI patterns in real time to detect human presence (binary classification), provide relative (x, y) coordinates of detected individuals, and remain robust despite noise and interference (via dropout and batch normalization).

  • Real-time Processing Pipeline (Jetson Nano): The Jetson Nano connects to our Windows laptop via USB and runs a lightweight WebSocket server to stream processed detection data. This server pushes coordinates and confidence scores from our CNN model to Unity in real-time, maintaining a persistent connection for immediate updates.

  • Mixed Reality Visualization (Unity & Meta Quest): Our Unity application connects to the local WebSocket server to receive continuous updates, allowing us to transform this data into a spatial overlay. Using the Meta Quest's inside-out tracking, the application renders detected human positions as simple green indicators in 3D space, allowing firefighters to maintain awareness of detected individuals through walls and obstacles while minimizing other potential UX distractions in high-stress scenarios.
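The first two stages above (reading CSI from the ESP32 and grouping it into 50-packet windows) can be sketched roughly as follows. This is a minimal illustration, not our exact pipeline: the bracketed, space-separated field format and the helper names (parse_csi_field, csi_amplitudes, WindowBuffer) are assumptions made for the example. The pair order follows the imaginary-then-real layout that ESP-IDF documents for raw CSI, although the amplitude does not depend on it.

```python
import math
from collections import deque


def parse_csi_field(field):
    """Parse a bracketed, space-separated CSI field such as "[3 4 0 5]" into ints.

    The field format is an assumption for this sketch.
    """
    return [int(tok) for tok in field.strip().strip("[]").split()]


def csi_amplitudes(raw_values):
    """Turn interleaved (imaginary, real) pairs into one amplitude per subcarrier."""
    pairs = zip(raw_values[0::2], raw_values[1::2])
    return [math.hypot(imag, real) for imag, real in pairs]


class WindowBuffer:
    """Collect per-packet amplitude rows into fixed-size, non-overlapping windows."""

    def __init__(self, size=50):
        self.size = size
        self.rows = deque()

    def push(self, row):
        """Add one packet's amplitudes; return a full window or None."""
        self.rows.append(row)
        if len(self.rows) < self.size:
            return None
        window = list(self.rows)
        self.rows.clear()
        return window
```

With size=50 and 30 subcarriers per packet, each returned window is a 50 × 30 grid of amplitudes covering half a second of signal at 100 Hz, which is the kind of input a CNN can treat like a small image.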

Challenges We Faced

This was a relatively novel concept. Previous implementations of the idea relied on sensors that cost over $400, while we used ESP32 microcontrollers that cost a little under $5 each.

In terms of CSI reliability, our initial testing showed a lot of noise and inconsistency in the ESP32's CSI readings. Through lots of testing, we tuned the sampling configuration to balance sampling rate, packet window size, and subcarrier count, maintaining signal quality while meeting the real-time demands.
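To see why the serial link becomes a bottleneck, a back-of-the-envelope estimate helps. The sketch below assumes 8N1 serial framing (10 bits on the wire per byte) and a hypothetical text line size per CSI packet. The 400-byte figure is an illustration only, not a measured value from our setup.

```python
def required_baud(packets_per_second, bytes_per_line, bits_per_byte=10):
    """Bit rate needed to keep up with the stream.

    bits_per_byte=10 models 8N1 framing: 1 start bit + 8 data bits + 1 stop bit.
    """
    return packets_per_second * bytes_per_line * bits_per_byte


def headroom(baud, packets_per_second, bytes_per_line):
    """Ratio of available to required bit rate; below 1.0 means dropped data."""
    return baud / required_baud(packets_per_second, bytes_per_line)


if __name__ == "__main__":
    # Hypothetical: 100 packets/s, ~400 bytes of text per packet line.
    for baud in (115200, 921600):
        print(baud, round(headroom(baud, 100, 400), 2))
```

Under these assumed numbers, a link running at 115200 baud cannot keep up, while 921600 baud leaves some margin. Lowering the sampling rate or the number of subcarriers printed per line shrinks the requirement proportionally, which is the trade-off we were tuning.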

One of the biggest challenges was figuring out how to interpret the CSI data, which streamed into the terminal at blazing speed and was logged to the CSV files that fed our CNN. Deploying our CNN model on the constrained 2GB Jetson Nano also required us to optimize heavily. We implemented batch normalization and dropout layers to handle the noisy CSI data, and achieved a detection latency under 100ms.
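One simple way to check a latency budget like this is to time the inference call repeatedly and look at the median and the tail. The helper below is a generic sketch using only the standard library. The name measure_latency_ms and the stand-in function passed to it are hypothetical, and a real model's predict call would take the place of the lambda.

```python
import statistics
import time


def measure_latency_ms(fn, sample, runs=50, warmup=5):
    """Time fn(sample) over several runs and summarize the latency in milliseconds."""
    for _ in range(warmup):
        fn(sample)  # warm-up calls are not timed (caches, lazy initialization)
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(sample)
        timings.append((time.perf_counter() - start) * 1000.0)
    timings.sort()
    return {
        "median_ms": statistics.median(timings),
        "p95_ms": timings[int(0.95 * (len(timings) - 1))],
        "max_ms": timings[-1],
    }


if __name__ == "__main__":
    # Stand-in for model inference: sum a 50 x 30 window of amplitudes.
    window = [[1.0] * 30 for _ in range(50)]
    print(measure_latency_ms(lambda w: sum(map(sum, w)), window))
```

Looking at the 95th percentile rather than only the average matters on a small device, since occasional slow frames are what a headset user would actually notice.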

We also struggled to get Unity to directly consume the detection data stream from the Jetson Nano. Instead of wasting hours wrestling with Unity's native data handling limitations, we implemented a complete WebSocket server on the Jetson Nano. Although this added some architectural complexity, it allowed us to finally receive real-time updates through established WebSocket protocols.
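Whatever transport carries them, the detection updates need a small, stable message shape that both the Jetson side and the Unity side agree on. The sketch below shows one possible shape as JSON text. The field names (present, x, y, confidence) and the use of JSON are assumptions for illustration, not necessarily the format we shipped.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Detection:
    """One detection update; field names are hypothetical."""
    present: bool      # binary presence prediction
    x: float           # relative x coordinate
    y: float           # relative y coordinate
    confidence: float  # model confidence in [0, 1]


def encode(detection):
    """Serialize a detection to a JSON string suitable for a text frame."""
    return json.dumps(asdict(detection))


def decode(text):
    """Parse a JSON string back into a Detection, coercing field types."""
    data = json.loads(text)
    return Detection(
        present=bool(data["present"]),
        x=float(data["x"]),
        y=float(data["y"]),
        confidence=float(data["confidence"]),
    )
```

Keeping the payload flat and explicitly typed like this makes it straightforward to mirror with a matching C# class on the Unity side.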

Finally, collecting accurate training data that reflected a good range of situations proved to be quite difficult, as the settings we tested and gathered data in were relatively stable and quiet environments.

Accomplishments We're Proud Of

  • Our teammates. We worked tirelessly (especially River, since he stayed up for 36 hours straight) to make this project come together. We knew this was going to be a huge task if we wanted all the pieces to come together, but it happened.
  • Real-time Through-Wall Detection. We are so proud that we are able to detect human presence through solid barriers using cheap hardware and advanced AI. Even better, we achieved ~90% model detection accuracy within our 36-hour period.
  • Rapid Prototyping Strategy. Being able to deliver a working prototype in 2.5 days required grit, planning, and lots of caffeine :)
  • End-to-End VR Integration. Building a complete VR/AR visualization system that transforms complex CSI data into an intuitive spatial interface for first responders.
  • Pushing the capabilities of ESP32. Optimizing the ESP32 CSI Tool framework beyond typical use cases, and being able to achieve reliable real-time sampling across 30 subcarriers at 100Hz with 50-packet windows was just mindblowing to us. We pat ourselves on the back.
  • Most importantly, tackling a real-world problem. We developed a proof-of-concept system that demonstrates how edge AI and mixed reality could revolutionize search-and-rescue operations, potentially saving lives while protecting first responders.

What We Learned

This project was incredibly complex, and we learned so much. We learned how to connect systems running different operating systems, and how much their minor differences can affect our code. We also learned how to troubleshoot and how to rein in our vast imaginations to fit what was achievable before the deadline. We learned how to work with CSI and the complexities of RF sensing. Building a reliable system for processing CSI data streams taught us crucial lessons about buffer management, sampling rates, and maintaining data integrity across hardware constraints. We learned how to build neural networks on resource-constrained devices like the Jetson Nano 2GB, and how to find the intricate balance between model complexity, inference speed, and detection accuracy. Finally, coordinating between ESP32s, the Jetson Nano, and the Meta Quest in cross-platform development taught us the importance of robust communication protocols.

What's Next for Theia

There are lots of things we can improve, including implementing more sophisticated signal processing techniques to improve detection range (though this would probably require us to buy the SDRs that cost around $400). We could definitely gather more diverse training data from real-world scenarios to improve our model's robustness. We would also like to improve how we store information and statistics, and to explore potential health monitoring systems. Finally, we would love to integrate the system into a smaller form factor that can be placed onto smaller-scale equipment for easier use and access.

Key Technologies Used

  • ESP32 Microcontrollers: Low-cost WiFi radio and signal processing

  • ESP-IDF (Espressif IoT Development Framework): For low-level control and configuration of the ESP32s.

  • Jupyter Notebook: Model training, rapid development, and experimentation on the Jetson Nano.

  • Python, TensorFlow/Keras: Training and deployment of the deep learning model. Utilizes CUDA libraries on the Jetson Nano for GPU acceleration.

  • CUDA Libraries: Acceleration of TensorFlow processing on the Jetson Nano.

  • NVIDIA Jetson Nano (2GB): Edge computing for real-time inference.

  • C# (Unity): VR/AR application development and scene creation.

  • Meta Quest (Target Deployment): Augmented Reality headset for first responders.

  • TCP/IP: Custom protocol for data transmission between Jetson, server, and Unity.
