Inspiration
While working on our university coursework, after hours spent staring at broken camera feeds, we started wondering whether we could put the captured data to use in another way. The idea for AiRGestures grew out of this.
What it does
The platform consists of an iOS application and a macOS server, with a USB cable connecting the two devices. The user holds their hand over the phone and performs a set of in-air gestures to manipulate a virtual ball, which in turn triggers actions on their computer, such as navigating through slideshows or controlling music playback.
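As a rough sketch of the kind of command vocabulary this implies, the iOS app and the macOS server could share a small Codable type that the phone encodes when a gesture is recognised and the Mac decodes before acting. The names below are illustrative only, not the project's actual identifiers.

```swift
import Foundation

// Hypothetical shared command vocabulary; case names are illustrative only.
enum RemoteCommand: String, Codable {
    case nextSlide, previousSlide             // slideshow navigation
    case playPause, nextTrack, previousTrack  // music controls
}

// A small envelope so both sides agree on the wire format.
struct GestureMessage: Codable {
    let command: RemoteCommand
    let timestamp: Date
}

func encode(_ message: GestureMessage) throws -> Data {
    try JSONEncoder().encode(message)   // phone side: payload for the USB channel
}

func decode(_ payload: Data) throws -> GestureMessage {
    try JSONDecoder().decode(GestureMessage.self, from: payload)   // Mac side
}
```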
How we built it
Pretty much the entire codebase was written in Swift 4. We used Microsoft's Cognitive Services Custom Vision API to generate a CoreML model from our training data, then ran frames from the camera feed through the model to detect the gesture the user is making. We used the peertalk protocol to send data about the recognised gesture to the Mac, where it triggers the relevant OS functions. We defined a set of commands representative of a few of the many possible applications of our technology.
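A minimal sketch of the classification side of that pipeline is below, assuming the Custom Vision export is compiled into the app as a model class named GestureClassifier (the model name and confidence threshold are assumptions, not our actual values): each camera frame is run through Vision, the top label is kept if its confidence is high enough, and the result is mapped to one of the commands above before being handed to the USB channel.

```swift
import Vision
import CoreML

// Classifies camera frames against the exported Custom Vision / CoreML model.
// `GestureClassifier` is a placeholder for the Xcode-generated model class.
final class GestureRecognizer {
    private let request: VNCoreMLRequest

    init(onGesture: @escaping (String, Float) -> Void) throws {
        let model = try VNCoreMLModel(for: GestureClassifier().model)
        request = VNCoreMLRequest(model: model) { request, _ in
            // Keep only the highest-confidence label for this frame.
            guard let best = (request.results as? [VNClassificationObservation])?.first,
                  best.confidence > 0.8 else { return }
            onGesture(best.identifier, best.confidence)
        }
        request.imageCropAndScaleOption = .centerCrop
    }

    // Call this with each frame delivered by AVCaptureVideoDataOutput.
    func classify(_ pixelBuffer: CVPixelBuffer) {
        let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
        try? handler.perform([request])
    }
}
```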
Challenges we ran into
We struggled a lot with our CV model. More training data solved most of that, but we eventually hit the problem of doing object detection to track the user's hand across the screen. Unsatisfied with the libraries we found online, we decided to write our own object detection from scratch, by capturing a background image and subtracting it from an image containing the object. This actually worked, but it proved too processor-intensive to run effectively on the device.
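The idea itself is simple; a rough sketch of the technique on plain grayscale buffers is below (illustrative of the approach, not our exact code). The per-pixel scan at camera resolution, many times per second, is what made it too heavy for the phone.

```swift
// Background subtraction sketch: threshold the absolute difference between a
// captured background frame and the current frame, then take the centroid of
// the changed pixels as a rough position for the hand.
struct GrayImage {
    let width: Int
    let height: Int
    var pixels: [UInt8]   // row-major, width * height values
}

func detectObject(background: GrayImage,
                  frame: GrayImage,
                  threshold: Int = 40,
                  minChangedPixels: Int = 200) -> (x: Int, y: Int)? {
    precondition(background.width == frame.width && background.height == frame.height)
    var sumX = 0, sumY = 0, count = 0
    for y in 0..<frame.height {
        for x in 0..<frame.width {
            let i = y * frame.width + x
            if abs(Int(frame.pixels[i]) - Int(background.pixels[i])) > threshold {
                sumX += x; sumY += y; count += 1
            }
        }
    }
    guard count >= minChangedPixels else { return nil }   // not enough change to be an object
    return (sumX / count, sumY / count)
}
```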
Another struggle was getting the iOS and macOS devices to talk to each other. At first we used Apple's Multipeer Connectivity framework, but after facing connection issues we discovered peertalk, which gave us the speed and ease of use we wanted, with the added reliability of transmitting data over USB rather than wirelessly.
Accomplishments that we're proud of
- We created comprehensive machine learning models in less than 24 hours
- We built a beautiful UI that is intuitive and offers plenty of affordance.
What we learned
- Swift and iOS/macOS development (two of the team had never worked with them before)
- How to use the Cognitive Services Custom Vision API to generate ML models
- Cross-platform connectivity protocols
What's next for AiRGestures
The product could be extended to any number of applications by changing the OS functions called by the server and by adding more gestures to the app. We will be open-sourcing the project and providing an easy-to-use API so that anyone can add their own functionality.
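For example, the macOS server's command handling could be as small as the sketch below, which maps the hypothetical RemoteCommand values from earlier onto synthesised key events (one possible mapping, not the definitive implementation); supporting a new application would mean adding a case here and a gesture on the phone.

```swift
import Cocoa

// One possible server-side mapping from commands to OS actions: advance or
// rewind a slideshow by synthesising arrow-key presses.
func perform(_ command: RemoteCommand) {
    switch command {
    case .nextSlide:      postKeyPress(0x7C)   // kVK_RightArrow
    case .previousSlide:  postKeyPress(0x7B)   // kVK_LeftArrow
    case .playPause, .nextTrack, .previousTrack:
        break   // media keys need HID media-key events; omitted in this sketch
    }
}

private func postKeyPress(_ virtualKey: CGKeyCode) {
    guard let down = CGEvent(keyboardEventSource: nil, virtualKey: virtualKey, keyDown: true),
          let up = CGEvent(keyboardEventSource: nil, virtualKey: virtualKey, keyDown: false) else { return }
    down.post(tap: .cghidEventTap)
    up.post(tap: .cghidEventTap)
}
```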