VA - Visual Actions

Inspiration

Pittsburgh has always been a city of knowledge and culture — from Andrew Carnegie’s world-famous libraries and museums to the historic steel industry that powered innovation. But even today, interacting with information kiosks in libraries and museums can feel clunky, unhygienic, or inaccessible. We wanted to create a solution that honors Pittsburgh’s tradition of public knowledge by making those digital spaces easier and more inclusive for everyone.

What it does

VA – Visual Actions is a touch-free interface for public information screens. Using only a laptop camera:

Pinch + move your hand to scroll content, like a trackpad in the air.
Swipe with your index finger raised to switch between tabs or desktops.
Hold a thumbs-up to lock the screen.
A cursor tether shows where your hand is, making control intuitive.
No gloves, controllers, or special hardware required — just your hand and the camera.
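As an illustration of how the pinch gesture above can be read from hand landmarks, here is a minimal sketch, not our exact code. It assumes each frame gives a list of 21 (x, y) points ordered like the MediaPipe Hands model (0 = wrist, 4 = thumb tip, 8 = index tip, 9 = middle-finger knuckle). The function names and the 0.35 threshold are hypothetical placeholders.

```python
import math
from typing import Sequence, Tuple

# Landmark indices from the MediaPipe Hands model.
WRIST = 0
THUMB_TIP = 4
INDEX_TIP = 8
MIDDLE_MCP = 9

Point = Tuple[float, float]


def pinch_ratio(landmarks: Sequence[Point]) -> float:
    """Thumb-to-index distance divided by a palm-size reference.

    Dividing by the wrist-to-middle-knuckle distance keeps the ratio
    roughly stable whether the hand is close to the camera or far away.
    """
    palm = math.dist(landmarks[WRIST], landmarks[MIDDLE_MCP])
    if palm == 0:
        return float("inf")  # degenerate frame: treat as "not pinching"
    return math.dist(landmarks[THUMB_TIP], landmarks[INDEX_TIP]) / palm


def is_pinching(landmarks: Sequence[Point], threshold: float = 0.35) -> bool:
    # threshold is an assumed starting value, not a tuned constant
    return pinch_ratio(landmarks) < threshold
```

While `is_pinching` stays true, the frame-to-frame change in the hand's position can be used as the raw scroll input.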

How we built it

Used MediaPipe Hands to detect and track 21 hand landmarks in real time.
Implemented three core gestures: pinch for scroll, index-finger swipe for tab switching, and thumbs-up hold for lock.
Mapped gestures to macOS actions through Quartz and AppleScript.
Added smoothing, hysteresis, and momentum shaping so gestures feel stable and natural.
Designed a simple HUD overlay to show status, feedback, and cursor position.
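The smoothing, hysteresis, and momentum steps listed above can be sketched roughly as follows. This is a simplified illustration under assumptions, not the project's actual implementation: the class name `ScrollShaper`, the helper `post_scroll`, and every constant are hypothetical. The macOS part assumes the pyobjc Quartz bindings are installed and that the process has been granted Accessibility permission.

```python
class ScrollShaper:
    """Turns raw per-frame hand movement into a steadier scroll velocity."""

    def __init__(self, alpha=0.4, start=4.0, stop=1.5, friction=0.85):
        self.alpha = alpha        # EMA weight given to the newest sample
        self.start = start        # speed needed to begin scrolling
        self.stop = stop          # speed at or below which scrolling ends
        self.friction = friction  # per-frame decay after the pinch is released
        self.velocity = 0.0
        self.active = False

    def update(self, raw_delta: float, pinching: bool) -> float:
        if pinching:
            # Smoothing: exponential moving average of the raw movement.
            self.velocity += self.alpha * (raw_delta - self.velocity)
        else:
            # Momentum: coast on the last velocity, decaying every frame.
            self.velocity *= self.friction

        speed = abs(self.velocity)
        # Hysteresis: separate on/off thresholds, so jitter near a single
        # boundary cannot toggle scrolling on every frame.
        if not self.active and speed >= self.start:
            self.active = True
        elif self.active and speed <= self.stop:
            self.active = False
        return self.velocity if self.active else 0.0


def post_scroll(pixels: int) -> None:
    """Send one vertical scroll event on macOS."""
    import Quartz  # imported lazily so the shaping code runs on any OS

    event = Quartz.CGEventCreateScrollWheelEvent(
        None, Quartz.kCGScrollEventUnitPixel, 1, pixels
    )
    Quartz.CGEventPost(Quartz.kCGHIDEventTap, event)
```

In a capture loop, each frame would call `update` with the hand's vertical movement and pass any non-zero result, rounded to an integer, to `post_scroll`.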

Challenges we ran into

Making scrolling feel smooth and human-like required carefully tuned thresholds and smoothing math.
Handling false positives: early versions triggered accidentally on small hand jitters.
macOS requires strict accessibility permissions for synthetic input events, which slowed development.
We had to cut scope and focus on three reliable gestures instead of trying to support every possible one.
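One common way to suppress the accidental triggers described above is to require that a gesture be held before it fires, which is also how a "thumbs-up hold" behaves. The sketch below is a generic dwell timer under assumptions, not our exact logic: `HoldDetector`, `hold_s`, and `grace_s` are hypothetical names and values, and the per-frame `detected` flag is assumed to come from a separate gesture classifier.

```python
class HoldDetector:
    """Fires once after a gesture has been held for `hold_s` seconds."""

    def __init__(self, hold_s: float = 1.0, grace_s: float = 0.15):
        self.hold_s = hold_s
        self.grace_s = grace_s   # tolerated dropout before the hold resets
        self.started_at = None   # time the current hold began
        self.last_seen = None    # last time the gesture was detected
        self.fired = False       # True once this hold has already triggered

    def update(self, detected: bool, now: float) -> bool:
        if detected:
            if self.started_at is None:
                self.started_at = now
            self.last_seen = now
        elif self.last_seen is None or now - self.last_seen > self.grace_s:
            # Gesture has been gone longer than the grace window: reset.
            self.started_at = None
            self.last_seen = None
            self.fired = False
            return False

        held_long_enough = (
            detected
            and self.started_at is not None
            and now - self.started_at >= self.hold_s
        )
        if held_long_enough and not self.fired:
            self.fired = True  # fire only once per continuous hold
            return True
        return False
```

Calling `update(is_thumbs_up, time.monotonic())` on every frame returns True exactly once per hold, so a brief misdetection never locks the screen. The short grace window keeps a single dropped frame from restarting the timer.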

Accomplishments that we're proud of

Delivered a real-time, touch-free controller that actually feels usable.
Developed three gestures that map directly to real-world actions (scroll, switch, lock).
Integrated a cursor tether so users know exactly where they are on screen.
Created a polished demo in under 24 hours — something that feels like it could be deployed in a library or museum.

What we learned

In gesture design, simplicity wins — three reliable actions beat a dozen unreliable ones.
Small UX touches like cursor feedback and smoothing make the difference between “demo tech” and something people actually want to use.
OS integration and permissions can be as challenging as computer vision itself.
Hackathons reward polish, storytelling, and context just as much as technical depth.

What’s next for VA – Visual Actions

Expand to multi-gesture control: zoom gestures, dwell-to-select, and swipe-based navigation in apps.
Build a web-based kiosk interface so museums and libraries can run it directly in a browser.
Add accessibility presets for users with tremors or limited mobility.
Pilot in Pittsburgh’s libraries and museums as a hygienic, inclusive alternative to touchscreens.

⚡️ In short: VA – Visual Actions turns Pittsburgh’s spirit of knowledge and public access into a modern, touch-free experience — keeping libraries and museums both accessible and interactive.

Built With

mediapipe
python

Updates

Vladyslav Bordia started this project — Sep 21, 2025 10:08 AM EDT
