🖱️ AI Virtual Mouse: Hand-Gesture Cursor Control


📖 Overview

AI Virtual Mouse is a Human-Computer Interaction (HCI) prototype that lets users control the Windows cursor with fine-grained precision using hand gestures captured by a camera.

Unlike typical gesture scripts, which often suffer from lag and jitter, this project prioritizes stability and responsiveness. It uses coordinate interpolation for screen mapping and direct Win32 API calls (via ctypes) for low-latency cursor movement, avoiding the overhead of automation libraries such as PyAutoGUI.
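As a rough illustration of the direct Win32 approach, the sketch below moves the cursor and sends a left click through user32. The helper names move_cursor and left_click are hypothetical and not taken from the project's source. SetCursorPos and mouse_event are real user32 functions. The platform guard is an assumption added so the snippet imports cleanly on other systems.

```python
import ctypes
import sys

# Win32 mouse_event flags for the left button (values from the Windows headers).
MOUSEEVENTF_LEFTDOWN = 0x0002
MOUSEEVENTF_LEFTUP = 0x0004


def move_cursor(x: int, y: int) -> bool:
    """Move the cursor to screen pixel (x, y). Returns False when not on Windows."""
    if sys.platform != "win32":
        return False  # ctypes.windll only exists on Windows
    return bool(ctypes.windll.user32.SetCursorPos(int(x), int(y)))


def left_click() -> bool:
    """Send a left-button press and release at the current cursor position."""
    if sys.platform != "win32":
        return False
    user32 = ctypes.windll.user32
    user32.mouse_event(MOUSEEVENTF_LEFTDOWN, 0, 0, 0, 0)
    user32.mouse_event(MOUSEEVENTF_LEFTUP, 0, 0, 0, 0)
    return True
```

Calling move_cursor(960, 540) on a 1920x1080 display should place the pointer near the center of the screen.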

🚀 Key Features

  • ⚡ Low-Latency Control: Leverages ctypes.windll.user32 to interface directly with the Windows OS for near-instant cursor response.
  • 🎯 Active Region Mapping: Implements a "Virtual Trackpad" logic (Coordinate Interpolation) to map a smaller camera region to the full 16:9 screen, ensuring all screen corners are reachable comfortably.
  • 🧠 Intelligent Smoothing: Uses an Exponential Moving Average (EMA) algorithm to filter out hand tremors and camera noise for a buttery-smooth experience.
  • 🛡️ Focus Mode: Features a dynamic "Black Canvas" UI that isolates hand landmarks and removes visual distractions (privacy-focused).
  • 🤏 Ghost Click: Intuitive "Pinch-to-Click" gesture detection using Euclidean distance calculation between index and thumb landmarks.
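The pinch check described in the last feature can be sketched as follows. This is a minimal example, not the project's actual code: the function name is_pinch and the 30-pixel threshold are assumptions, and landmarks are taken to be a sequence of normalized (x, y) pairs. Indices 4 and 8 are MediaPipe's thumb-tip and index-fingertip landmarks.

```python
import math

# MediaPipe hand landmark indices.
THUMB_TIP = 4
INDEX_TIP = 8


def is_pinch(landmarks, frame_w, frame_h, threshold_px=30.0):
    """Return True when thumb tip and index tip are closer than threshold_px.

    landmarks: sequence of 21 (x, y) pairs normalized to [0, 1].
    threshold_px: assumed value; tune it for your camera and hand distance.
    """
    tx, ty = landmarks[THUMB_TIP]
    ix, iy = landmarks[INDEX_TIP]
    # Scale normalized offsets to pixels, then take the Euclidean distance.
    dist = math.hypot((ix - tx) * frame_w, (iy - ty) * frame_h)
    return dist < threshold_px
```

A fixed pixel threshold is the simplest choice, but it depends on how far the hand is from the camera. Normalizing by hand size would be one way to make it more robust.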

🛠️ Tech Stack

  • Language: Python 3.x
  • Computer Vision: OpenCV (cv2) - Image processing and canvas rendering.
  • ML Pipeline: Google MediaPipe - High-fidelity hand landmark detection (21 points).
  • Math: NumPy - Vector operations and coordinate clamping/interpolation.
  • OS Interface: Windows API (ctypes) - For DPI awareness and hardware-level mouse events.
  • Development Environment: Cursor AI - Utilized for AI-assisted code optimization, debugging, and automated documentation.

⚙️ Installation & Setup

  1. Clone the Repository

    git clone https://github.com/prakharsaxena230706-hub/AI-Virtual-Mouse.git
    cd AI-Virtual-Mouse
    
  2. Install Dependencies

    pip install opencv-python mediapipe numpy
    
  3. Run the Application

    python main.py
    

🎮 Controls

| Gesture | Action | Visual Feedback |
|---|---|---|
| Index Finger Moving | Move Cursor | 🟢 Green Pointer |
| Pinch (Index + Thumb) | Left Click | 🔴 Red Tips + "CLICK!" Text |
| Exit | Close App | Press ESC Key |

🧠 Engineering Challenges Solved

1. The "Unreachable Corner" Problem

Issue: Mapping the camera's 4:3 aspect ratio directly to a 16:9 monitor makes the edges physically hard to reach.

Solution: Implemented np.interp (Linear Interpolation) to map a central "Active Zone" (Frame Margin) to the full screen resolution. This creates a virtual sensitivity multiplier.
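A minimal sketch of the active-zone mapping with np.interp is shown here. The function name map_to_screen, the 640x480 frame size, the 1920x1080 screen size, and the 100-pixel margin are all illustrative assumptions, not values from the project.

```python
import numpy as np


def map_to_screen(x, y, cam_w=640, cam_h=480,
                  screen_w=1920, screen_h=1080, margin=100):
    """Map a camera-frame pixel inside the active zone to a screen pixel.

    The active zone is the frame inset by `margin` pixels on every side.
    np.interp clamps inputs outside the zone to the nearest screen edge.
    """
    sx = np.interp(x, (margin, cam_w - margin), (0, screen_w - 1))
    sy = np.interp(y, (margin, cam_h - margin), (0, screen_h - 1))
    return int(sx), int(sy)
```

With these defaults, a 440-pixel-wide zone drives 1920 screen pixels, so each camera pixel of hand travel moves the cursor a little over four screen pixels horizontally. A fingertip at the zone's edge, or beyond it, lands exactly on the screen border.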

2. The Jitter Issue

Issue: Raw landmark data from ML models is noisy, causing the cursor to shake.

Solution: Applied a smoothing factor (damping) to the coordinates:

    Current_Pos = Prev_Pos + (Target_Pos - Prev_Pos) / Smoothing_Factor
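The damping formula above can be written as a small helper. This is a sketch under assumptions: the name smooth and the default factor of 5 are illustrative. The update is equivalent to an exponential moving average with weight 1 / Smoothing_Factor on the new sample.

```python
def smooth(prev, target, smoothing_factor=5.0):
    """Move a fraction (1 / smoothing_factor) of the way from prev to target.

    prev, target: (x, y) tuples. A larger smoothing_factor gives a steadier
    but slower cursor; a factor of 1 disables smoothing entirely.
    """
    px, py = prev
    tx, ty = target
    return (px + (tx - px) / smoothing_factor,
            py + (ty - py) / smoothing_factor)
```

Applied once per frame, the cursor covers 20% of the remaining distance each step at a factor of 5, so small frame-to-frame noise is heavily attenuated while sustained movement is still followed.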

3. DPI Scaling Mismatch

Issue: On Windows High-DPI displays (e.g., 125% scaling), a process that is not DPI-aware receives scaled screen coordinates, so Python miscalculates the screen size.

Solution: Called ctypes.windll.shcore.SetProcessDpiAwareness(1) at startup so that screen-size queries return the true physical resolution of the monitor.
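One way to combine the DPI call with a screen-size query is sketched below. The function name get_physical_screen_size, the fallback value, and the non-Windows guard are assumptions for illustration. SetProcessDpiAwareness, SetProcessDPIAware, and GetSystemMetrics are real Win32 functions.

```python
import ctypes
import sys

SM_CXSCREEN = 0  # GetSystemMetrics index: primary screen width
SM_CYSCREEN = 1  # GetSystemMetrics index: primary screen height


def get_physical_screen_size(fallback=(1920, 1080)):
    """Return the primary monitor size in physical pixels (Windows only).

    On other platforms the hypothetical `fallback` size is returned unchanged.
    """
    if sys.platform != "win32":
        return fallback
    try:
        # 1 = PROCESS_SYSTEM_DPI_AWARE
        ctypes.windll.shcore.SetProcessDpiAwareness(1)
    except (AttributeError, OSError):
        # shcore.dll is not available before Windows 8.1; use the older call.
        ctypes.windll.user32.SetProcessDPIAware()
    user32 = ctypes.windll.user32
    return user32.GetSystemMetrics(SM_CXSCREEN), user32.GetSystemMetrics(SM_CYSCREEN)
```

The awareness call should run before any window is created or any screen metric is read, since Windows fixes a process's DPI awareness once it has been set.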

🤝 Contribution

Contributions are welcome! Feel free to open an issue or submit a pull request for features like "Right Click" gestures or "Scroll" functionality.


Developed by Prakhar Saxena
