The Hardware Parts

AI-Powered Gimbal System for Speaker-Centric Video Recording

Inspiration

We were frustrated watching shaky, poorly framed conference videos where speakers moved out of focus. Professional camera operators are expensive, and existing automated solutions often rely on bulky equipment or cloud processing. We wanted to democratize high-quality video recording by creating an affordable, edge-based AI system that anyone could use.

What it does

Our AI-powered gimbal automatically detects and tracks a speaker's face in real time using computer vision, smoothly adjusting the camera's pan and tilt to keep the subject centered in the frame. It works offline, requires no expensive hardware, and delivers results comparable to professional setups.

How we built it

Hardware

Vision System: OV7670 camera module
Controller: Arduino Mega 2560
Actuation: MG996R servo motors with custom pan-tilt bracket
Power: 5V USB power bank (4+ hour runtime)

Software/AI

Face Detection: OpenCV with Haar Cascades (Python)
Motor Control: PID algorithm (C++/Arduino)
Communication: Serial protocol between Python and Arduino
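
As a rough illustration of how the Python side might hand a detection to the Arduino, the sketch below turns a face bounding box into a pixel offset from the frame center and packs it into a small binary frame. The box format `(x, y, w, h)` matches what OpenCV's `detectMultiScale` returns; the 640x480 resolution, the start byte, the int16 layout, and the XOR checksum are all assumptions made for this example, not the protocol the project actually used.

```python
import struct

FRAME_WIDTH, FRAME_HEIGHT = 640, 480  # assumed resolution, not stated in the write-up
START_BYTE = 0xFF                     # hypothetical frame marker


def face_offset(box, width=FRAME_WIDTH, height=FRAME_HEIGHT):
    """Return (dx, dy): pixel offset of the face center from the frame center."""
    x, y, w, h = box
    return (x + w // 2) - width // 2, (y + h // 2) - height // 2


def _xor_checksum(payload):
    """XOR of all payload bytes (a deliberately simple integrity check)."""
    checksum = 0
    for b in payload:
        checksum ^= b
    return checksum


def encode_packet(dx, dy):
    """Build a 6-byte frame: start byte, two little-endian int16, checksum."""
    payload = struct.pack("<hh", dx, dy)
    return bytes([START_BYTE]) + payload + bytes([_xor_checksum(payload)])


def decode_packet(packet):
    """Inverse of encode_packet; raises ValueError on a malformed frame."""
    if len(packet) != 6 or packet[0] != START_BYTE:
        raise ValueError("bad frame")
    payload = packet[1:5]
    if _xor_checksum(payload) != packet[5]:
        raise ValueError("bad checksum")
    return struct.unpack("<hh", payload)
```

With pySerial, a frame like this could be sent with `Serial.write(encode_packet(dx, dy))`; a fixed-length binary frame keeps per-update traffic at six bytes, which is one plausible way to keep serial latency low.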

Key Innovations

Edge-based processing (no cloud dependency)
Hybrid software/hardware PID control for ultra-smooth movements
Dynamic threshold adjustment for varying lighting conditions
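
The PID idea behind the motor control can be sketched in a few lines. The project's controller runs in C++ on the Arduino; the version below is written in Python purely for illustration, and the gains, output limit, and toy servo model are assumed values rather than the tuned parameters.

```python
class PID:
    """Textbook PID controller with a clamped output."""

    def __init__(self, kp, ki, kd, out_limit):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.out_limit = out_limit
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt):
        """Return the control output for the current error and time step."""
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0  # no history yet on the first call
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        out = self.kp * error + self.ki * self.integral + self.kd * derivative
        # Clamp so the commanded speed stays within what the servo can follow.
        return max(-self.out_limit, min(self.out_limit, out))


def simulate(steps=200, dt=0.05):
    """Toy loop: the target sits at 30 degrees and the servo starts at 0.

    The controller output is treated as an angular speed in degrees/second,
    which is an assumption of this sketch, not a model of the MG996R.
    """
    pid = PID(kp=2.0, ki=1.0, kd=0.05, out_limit=60.0)
    angle, target = 0.0, 30.0
    for _ in range(steps):
        angle += pid.update(target - angle, dt) * dt
    return angle
```

Calling `simulate()` drives the simulated pan angle toward the 30 degree target. In a real loop, the error fed to `update` would be the face's pixel offset from the frame center, with one controller per axis.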

Challenges we ran into

Challenge	Solution
Face detection in backlight	Implemented adaptive brightness normalization
Servo jitter	Added PID control with Kalman filtering
Latency > 300 ms	Optimized serial communication protocol
False positives	Added face-size validation and motion smoothing
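
Two of the fixes in the table lend themselves to a short sketch: rejecting implausibly sized detections, and smoothing the tracked position with a Kalman filter. The code below is a minimal version under stated assumptions. The size and aspect-ratio thresholds, the 640x480 frame, and the noise variances `q` and `r` are illustrative values, and the filter is a simple constant-position model for one coordinate, which may differ from the filter the project used.

```python
def is_plausible_face(box, frame_w=640, frame_h=480,
                      min_frac=0.05, max_frac=0.8):
    """Reject boxes that are too small, too large, or oddly shaped.

    box is (x, y, w, h) in pixels; all thresholds are assumed values.
    """
    _, _, w, h = box
    if w <= 0 or h <= 0:
        return False
    width_ok = min_frac * frame_w <= w <= max_frac * frame_w
    height_ok = h <= frame_h
    aspect_ok = 0.6 <= w / h <= 1.6
    return width_ok and height_ok and aspect_ok


class ScalarKalman:
    """Constant-position Kalman filter for a single coordinate."""

    def __init__(self, q=1.0, r=25.0):
        self.q = q      # process noise variance (assumed)
        self.r = r      # measurement noise variance (assumed)
        self.x = None   # state estimate
        self.p = 1.0    # estimate variance

    def update(self, z):
        """Fold measurement z into the estimate and return the new estimate."""
        if self.x is None:
            self.x = float(z)  # initialize from the first measurement
            return self.x
        self.p += self.q                 # predict: uncertainty grows
        k = self.p / (self.p + self.r)   # Kalman gain
        self.x += k * (z - self.x)       # correct toward the measurement
        self.p *= (1.0 - k)
        return self.x
```

A larger `r` relative to `q` makes the filter trust new detections less, which damps jitter at the cost of slower response to real speaker movement.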