What it does
Nametag AR detects and remembers the names of people you meet.
The app listens in the background for when people introduce themselves to you. When it detects an introduction (like "Hi, I'm Jake"), it saves the person's name and an image of their face.
Once you've met somebody, Nametag AR will remember them for you. Whenever you see them again in the future, the app will overlay their name near their face so you don't forget who they are.
How we built it
Nametag AR has four main pieces:
A speech processor built with Apple's Speech framework. It continuously listens for introduction phrases such as "Hi, I'm ___" (plus other common variations).
A face detector built with Apple's Vision framework. It analyzes an inbound video stream to identify individual faces and crop them into standalone images.
A facial match system built with Microsoft Azure's Face API and a Node.js server. It builds a repository of the faces you've seen so far and compares new faces against that repository.
An augmented reality scene which renders informational overlays (most importantly, the names of people on screen) on top of your inbound video stream.
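To illustrate the introduction-detection step, here is a minimal sketch of pulling a name out of a speech transcript. It is written in Python for brevity and is not the app's actual code (the app uses Apple's Speech framework). The pattern list and the `extract_name` helper are assumptions made for this example.

```python
import re

# Hypothetical introduction patterns; the app's real phrase list is not
# given in this write-up. Both straight and curly apostrophes are accepted
# because speech transcripts may contain either.
INTRO_PATTERN = re.compile(
    r"\b(?:i['’]m|i am|my name is|my name['’]s)\s+([A-Za-z][A-Za-z'-]*)",
    re.IGNORECASE,
)


def extract_name(transcript):
    """Return the first introduced name in the transcript, or None."""
    match = INTRO_PATTERN.search(transcript)
    if match is None:
        return None
    return match.group(1)


if __name__ == "__main__":
    print(extract_name("Hi, I'm Jake"))
    print(extract_name("what time is it"))
```

A plain pattern match like this also fires on phrases such as "I'm tired", so a real system would need extra filtering before saving a name.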
Challenges we ran into
We originally used the Python OpenFace tool to convert faces into high-dimensional vector representations. In theory, you can measure the similarity of two faces by taking the Euclidean distance between their vectors: the smaller the distance, the more alike the faces. In practice, though, we found this system unreliable and noisy. The tool also suffered from long processing times, which hampered the augmented reality experience.
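As a rough sketch of the distance-based comparison described above, the snippet below computes the Euclidean distance between two face vectors and applies a cutoff. The vectors are plain lists of floats standing in for real embeddings, and both the `is_same_person` helper and its `threshold` value are assumptions for illustration, not values we tuned.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def is_same_person(a, b, threshold=1.0):
    """Treat two faces as a match when their vectors are close.

    The threshold is a hypothetical placeholder; a usable value would
    have to be chosen empirically for the embedding model in use.
    """
    return euclidean_distance(a, b) < threshold


if __name__ == "__main__":
    print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))
    print(is_same_person([0.1, 0.2], [0.1, 0.25]))
```

Because the decision rests on a single cutoff, noisy embeddings translate directly into false matches and missed matches.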
With about eight hours left in the hackathon, we pivoted away from OpenFace and started using Microsoft Azure's Face API instead. We're really happy with this change, because it improved both the speed and the accuracy of our facial match system.