USF RapidCam – Real-Time Fall and Emergency Detection

Built by a freshman at the University of South Florida (USF)

About the Project

Inspiration

Most safety camera systems report incidents only after they occur. I wanted to build a system that could evaluate emergencies as they happen, detect when someone needs help, and escalate automatically without waiting for human review. The goal was to explore how AI, computer vision, and voice interfaces can be combined to improve campus safety.

What I Learned

How to design and coordinate multiple ROS2 nodes for vision, audio, escalation logic, and system monitoring.

How to set up hardware on the Jetson Orin Nano, including CUDA acceleration and GStreamer CSI pipelines.

How to securely manage secret keys, avoid committing .env files, and work around GitHub push protection.

How AI systems can be used to provide real safety benefits.

How to debug real-time, multi-threaded systems where audio, video, and LLM inference must run together without delays.

How I Built It

Implemented pose detection with YOLO11s-pose running on the Jetson Orin Nano GPU.

Built a VoiceAlertNode that listens for phrases such as “help” or “I am fine” to confirm the situation.

Created an LLM Node using Gemini 2.5 to manage escalation dialogues and monitor user responsiveness.

Integrated Twilio for automated emergency phone calls triggered by ROS2 events.

Used ROS2 Humble to orchestrate communication between all nodes.

Implemented a live video stream using a GStreamer CSI pipeline.
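The CSI capture step above can be sketched as a pipeline-string builder for OpenCV's GStreamer backend. This is a minimal sketch: the resolution, framerate, and flip settings below are assumptions, not the project's actual configuration.

```python
def csi_pipeline(width=1280, height=720, fps=30, flip=0):
    """Build a GStreamer pipeline string for a Jetson CSI camera.

    Uses nvarguscamerasrc (the Jetson CSI source) and nvvidconv to move
    frames out of NVMM memory into BGR for OpenCV's appsink.
    """
    return (
        f"nvarguscamerasrc ! "
        f"video/x-raw(memory:NVMM),width={width},height={height},"
        f"framerate={fps}/1 ! "
        f"nvvidconv flip-method={flip} ! "
        f"video/x-raw,format=BGRx ! videoconvert ! "
        f"video/x-raw,format=BGR ! appsink drop=true"
    )

# On the Jetson, this string would be passed to
# cv2.VideoCapture(csi_pipeline(), cv2.CAP_GSTREAMER).
```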

Challenges I Faced

Audio device routing, microphone selection, and system configuration on the Jetson Orin Nano.

Synchronizing communication between multiple ROS2 nodes with different timing constraints.

Ensuring Twilio integration remained secure and compatible with push-protection requirements.

Balancing real-time performance across vision, audio, and LLM processes while keeping the system responsive.

How It Works

  1. Vision Detection

The camera feed runs through a GStreamer CSI pipeline into YOLO11s-pose on the Jetson Orin Nano. The pose model produces keypoints and detects abnormal posture patterns such as falls. When a fall is detected, the system publishes a rapidcam/person_down event to ROS2.
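As a rough illustration of the posture check, a minimal fall heuristic can compare the torso's horizontal spread to its vertical spread from the pose keypoints. The keypoint indices and threshold here are assumptions for the sketch; the project's actual YOLO11s-pose logic may differ.

```python
# Assumed COCO keypoint layout: 5/6 = shoulders, 11/12 = hips.
SHOULDERS, HIPS = (5, 6), (11, 12)

def is_fallen(keypoints, ratio=1.0):
    """Return True when the torso lies flatter than it stands.

    keypoints: list of (x, y) pairs in pixel coordinates.
    A standing torso has more vertical than horizontal spread;
    a fallen one is the reverse. On True, the vision node would
    publish a rapidcam/person_down event.
    """
    xs = [keypoints[i][0] for i in SHOULDERS + HIPS]
    ys = [keypoints[i][1] for i in SHOULDERS + HIPS]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    return width > ratio * height
```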

  2. Audio Response Verification

The VoiceAlertNode activates when a fall event is received. It asks the user if they need help. If the user says:

“Help” → the system immediately escalates.

“I am fine” → the alert is cancelled.

If there is no response, a countdown begins.
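The branching above can be sketched as a small response classifier. The function name and exact phrase matching are illustrative; the real VoiceAlertNode may recognize speech differently.

```python
def classify_response(transcript):
    """Map a heard phrase to an action: 'escalate', 'cancel', or 'countdown'.

    'countdown' covers silence or anything unrecognized, matching the
    no-response branch described above.
    """
    text = (transcript or "").strip().lower()
    if "help" in text:
        return "escalate"      # immediate escalation
    if "i am fine" in text or "i'm fine" in text:
        return "cancel"        # alert cancelled
    return "countdown"         # start the countdown timer
```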

  3. LLM Escalation Logic

The LLM Node uses Gemini 2.5 to check for verbal intent, analyze user responses, and handle escalation dialogue. If the user remains unresponsive or indicates distress, the node publishes rapidcam/escalate.
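The unresponsiveness check can be sketched as a countdown state machine. The class name and 30-second timeout are assumptions; in the real node, Gemini 2.5 handles the dialogue, and the clock is injected here only so the logic is testable.

```python
import time

class EscalationTimer:
    """Track whether the user has responded within a countdown window.

    If no reassuring response arrives before `timeout` seconds elapse,
    should_escalate() becomes True and the node would publish
    rapidcam/escalate.
    """
    def __init__(self, timeout=30.0, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock
        self.started_at = None
        self.cancelled = False

    def start(self):
        """Begin the countdown when a fall event arrives."""
        self.started_at = self.clock()
        self.cancelled = False

    def user_responded_ok(self):
        """Cancel the countdown after an 'I am fine' style response."""
        self.cancelled = True

    def should_escalate(self):
        """True once the window expires with no reassuring response."""
        if self.started_at is None or self.cancelled:
            return False
        return self.clock() - self.started_at >= self.timeout
```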

  4. Emergency Calling

The CallNode uses Twilio to automatically call a preset emergency contact number. All nodes operate independently but communicate through ROS2 topics to maintain reliability even if one node fails.
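A minimal sketch of the call trigger, assuming Twilio's standard `calls.create` API. The phone numbers and spoken message are placeholders, and the client is injected so the logic can be exercised with a stub instead of live credentials.

```python
def place_emergency_call(client, to_number, from_number,
                         message="A fall was detected and the person is unresponsive."):
    """Trigger an outbound voice call through a Twilio-style client.

    `client` is expected to expose client.calls.create(...), as the
    real twilio.rest.Client does; injecting it keeps this testable
    without network access or account credentials.
    """
    twiml = f"<Response><Say>{message}</Say></Response>"
    return client.calls.create(to=to_number, from_=from_number, twiml=twiml)
```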

  5. Live Monitoring

A lightweight Flask server exposes an MJPEG stream for monitoring the system in action. This can be used on a local dashboard or web interface.
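An MJPEG stream works by sending JPEG frames separated by a multipart boundary. Here is a sketch of the per-frame chunk a Flask generator would yield; the boundary name and function are illustrative, not the project's actual implementation.

```python
BOUNDARY = b"frame"  # must match the route's mimetype:
                     # multipart/x-mixed-replace; boundary=frame

def mjpeg_chunk(jpeg_bytes):
    """Wrap one JPEG-encoded frame as a multipart MJPEG chunk.

    A Flask view would yield these from a generator and return
    Response(gen(), mimetype="multipart/x-mixed-replace; boundary=frame"),
    so the browser replaces each frame as it arrives.
    """
    return (
        b"--" + BOUNDARY + b"\r\n"
        b"Content-Type: image/jpeg\r\n"
        b"Content-Length: " + str(len(jpeg_bytes)).encode() + b"\r\n\r\n"
        + jpeg_bytes + b"\r\n"
    )
```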

Technical Architecture

+------------------------------+
|       Jetson Orin Nano       |
+------------------------------+

     [Camera Input - CSI]
               |
               v
+------------------------------+
|    GStreamer CSI Pipeline    |
+------------------------------+
               |
               v
+------------------------------+
|     YOLO11s-pose (GPU)       |
| Publishes:                   |
|   rapidcam/person_down       |
+------------------------------+
               |
               v
+------------------------------+      +------------------------------+
|        VoiceAlertNode        | <--> |    LLM Node (Gemini 2.5)     |
|      Listens for speech      |      |       Escalation logic       |
|    "help" / "I am fine"      |      |     Dialogue + countdown     |
+------------------------------+      +------------------------------+
               |
               v
+------------------------------+
|      CallNode (Twilio)       |
| Publishes phone call alerts  |
+------------------------------+
               |
               v
+------------------------------+
|  Monitoring Server (Flask)   |
|   Live MJPEG video stream    |
+------------------------------+

Future Improvements

Improved Vision Model

Upgrade to higher-accuracy pose models or integrate temporal fall detection to reduce false positives.

Better Audio Understanding

Incorporate local speech-to-text or on-device keyword spotting to handle noisy environments.

Multi-Camera Support

Expand the system to handle multiple camera feeds through a single ROS2 network.

Secure Remote Dashboard

Add authenticated web controls, event logs, and live system status indicators.

Hardware Integration

Add LED indicators, sirens, or haptic feedback for accessibility.

Explore battery-powered or portable versions for wider use cases.

Built With

cuda, flask, gemini, gstreamer, jetson-orin-nano, python, ros2, twilio, yolo
