🖱️ AI Virtual Mouse: Hand-Gesture Cursor Control
📖 Overview
AI Virtual Mouse is a Human-Computer Interaction (HCI) prototype that lets users control the Windows cursor precisely using hand gestures captured by a webcam.
Unlike standard gesture scripts that suffer from lag and jitter, this project focuses on high-performance stability. It utilizes Coordinate Interpolation for screen mapping and direct Win32 API calls (ctypes) for zero-latency cursor movement, bypassing the overhead of traditional automation libraries like PyAutoGUI.
🚀 Key Features
- ⚡ Zero-Latency Control: Leverages ctypes.windll.user32 to interface directly with the Windows OS for instant cursor response.
- 🎯 Active Region Mapping: Implements a "Virtual Trackpad" logic (Coordinate Interpolation) to map a smaller camera region to the full 16:9 screen, ensuring all screen corners are reachable comfortably.
- 🧠 Intelligent Smoothing: Uses an Exponential Moving Average (EMA) algorithm to filter out hand tremors and camera noise for a buttery-smooth experience.
- 🛡️ Focus Mode: Features a dynamic "Black Canvas" UI that isolates hand landmarks and removes visual distractions (privacy-focused).
- 🤏 Ghost Click: Intuitive "Pinch-to-Click" gesture detection using Euclidean distance calculation between index and thumb landmarks.
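
The pinch check described in the last feature can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `is_pinch` and the threshold value are hypothetical, and the fingertips are assumed to be (x, y) pairs in MediaPipe's normalized image coordinates (landmark 4 is the thumb tip, landmark 8 is the index fingertip).

```python
import math

# Hypothetical threshold in normalized image units; tune it for your camera.
PINCH_THRESHOLD = 0.05

def is_pinch(index_tip, thumb_tip, threshold=PINCH_THRESHOLD):
    """Return True when the two fingertips are closer than the threshold.

    Each tip is an (x, y) pair, e.g. taken from MediaPipe hand landmarks
    8 (index fingertip) and 4 (thumb tip).
    """
    # Euclidean distance between the two points
    distance = math.hypot(index_tip[0] - thumb_tip[0],
                          index_tip[1] - thumb_tip[1])
    return distance < threshold
```

A fixed threshold in normalized units shrinks or grows with the hand's distance from the camera, so a real implementation may need to scale it by hand size.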
🛠️ Tech Stack
- Language: Python 3.x
- Computer Vision: OpenCV (cv2) - Image processing and canvas rendering.
- ML Pipeline: Google MediaPipe - High-fidelity hand landmark detection (21 points).
- Math: NumPy - Vector operations and coordinate clamping/interpolation.
- OS Interface: Windows API (ctypes) - For DPI awareness and hardware-level mouse events.
- Development Environment: Cursor AI - Utilized for AI-assisted code optimization, debugging, and automated documentation.
⚙️ Installation & Setup
Clone the Repository

    git clone https://github.com/prakharsaxena230706-hub/AI-Virtual-Mouse.git
    cd AI-Virtual-Mouse

Install Dependencies

    pip install opencv-python mediapipe numpy

Run the Application

    python main.py
🎮 Controls
| Gesture | Action | Visual Feedback |
|---|---|---|
| Index Finger Moving | Move Cursor | 🟢 Green Pointer |
| Pinch (Index + Thumb) | Left Click | 🔴 Red Tips + "CLICK!" Text |
| ESC Key (keyboard, not a gesture) | Close App | |
🧠 Engineering Challenges Solved
1. The "Unreachable Corner" Problem
Issue: Mapping the camera's 4:3 aspect ratio directly to a 16:9 monitor makes the edges physically hard to reach.
Solution: Implemented np.interp (Linear Interpolation) to map a central "Active Zone" (Frame Margin) to the full screen resolution. This creates a virtual sensitivity multiplier.
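
As a rough sketch of this mapping (the function name, the 640x480 frame size, the 1920x1080 screen size, and the 100-pixel margin are all illustrative assumptions, not values taken from the project):

```python
import numpy as np

def map_to_screen(x, y, cam_w=640, cam_h=480,
                  screen_w=1920, screen_h=1080, margin=100):
    """Map a fingertip position in camera pixels to screen pixels.

    Only the central "active zone" (the frame minus `margin` on each side)
    is stretched over the whole screen. np.interp clamps inputs outside
    the zone to the nearest screen edge.
    """
    screen_x = np.interp(x, (margin, cam_w - margin), (0, screen_w - 1))
    screen_y = np.interp(y, (margin, cam_h - margin), (0, screen_h - 1))
    return int(screen_x), int(screen_y)
```

With these assumed numbers, a 440-pixel-wide zone drives 1920 screen pixels, so the hand moves the cursor about 4.4 times faster horizontally than a direct pixel-for-pixel mapping would.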
2. The Jitter Issue
Issue: Raw landmark data from ML models is noisy, causing the cursor to shake.
Solution: Applied a smoothing factor (damping) to the coordinates:
Current_Pos = Prev_Pos + (Target_Pos - Prev_Pos) / Smoothing_Factor
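
A minimal sketch of this update rule, assuming 2D positions stored as (x, y) tuples (the function name and the default factor of 5 are hypothetical). The formula is an exponential moving average with weight 1 / Smoothing_Factor on the newest sample:

```python
def smooth_point(prev_xy, target_xy, smoothing_factor=5.0):
    """Move the previous position a fraction of the way toward the target.

    Equivalent to an exponential moving average with
    alpha = 1 / smoothing_factor. Larger factors give a steadier but
    slower cursor; a factor of 1 disables smoothing.
    """
    px, py = prev_xy
    tx, ty = target_xy
    return (px + (tx - px) / smoothing_factor,
            py + (ty - py) / smoothing_factor)

# Usage: feed each new raw target through the filter, once per frame.
pos = (0.0, 0.0)
for _ in range(3):
    pos = smooth_point(pos, (100.0, 100.0))
# pos approaches the target geometrically: 20, then 36, then 48.8
```

The trade-off is responsiveness: the cursor always trails the hand slightly, and the lag grows with the smoothing factor.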
3. DPI Scaling Mismatch
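Issue: On Windows displays with DPI scaling enabled (for example, 125%), a process that is not DPI-aware receives scaled screen coordinates, so Python miscalculates the screen size and cursor position.
Solution: Called ctypes.windll.shcore.SetProcessDpiAwareness(1) at startup so the process is DPI-aware and screen-size queries return the true physical resolution of the monitor.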
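
The following sketch shows one way the DPI call and the direct Win32 cursor move could fit together. The helper names are hypothetical, and the platform guards are an addition so the module still imports on non-Windows systems; the Win32 functions themselves (SetProcessDpiAwareness, SetProcessDPIAware, GetSystemMetrics, SetCursorPos) are standard Windows APIs.

```python
import ctypes
import sys

def enable_dpi_awareness():
    """Mark the process as DPI-aware; return False when not on Windows."""
    if sys.platform != "win32":
        return False
    try:
        # 1 = PROCESS_SYSTEM_DPI_AWARE (shcore.dll, Windows 8.1 and later)
        ctypes.windll.shcore.SetProcessDpiAwareness(1)
    except (AttributeError, OSError):
        # Fallback for older Windows versions without shcore.dll
        ctypes.windll.user32.SetProcessDPIAware()
    return True

def screen_size():
    """Return (width, height) of the primary monitor, or None off Windows."""
    if sys.platform != "win32":
        return None
    user32 = ctypes.windll.user32
    # 0 = SM_CXSCREEN, 1 = SM_CYSCREEN
    return user32.GetSystemMetrics(0), user32.GetSystemMetrics(1)

def move_cursor(x, y):
    """Move the cursor with a direct Win32 call; False if it did not happen."""
    if sys.platform != "win32":
        return False
    return bool(ctypes.windll.user32.SetCursorPos(int(x), int(y)))
```

Calling `enable_dpi_awareness()` before `screen_size()` matters: the awareness setting should be applied early in the process, before any screen metrics are read.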
🤝 Contribution
Contributions are welcome! Feel free to open an issue or submit a pull request for features like "Right Click" gestures or "Scroll" functionality.
Developed by Prakhar Saxena