While phones are becoming more advanced and we seemingly have the world at our fingertips, they can still seem "clunky". Whether it's a problematic network connection or the sheer volume of files, a phone's small screen isn't enough to efficiently organize and parse all the information during big conferences or sales events. With InSights AR, we hope to integrate the information directly into one's line of sight, allowing us to connect to peers or customers with a new, informed perspective.

What it does

InSights AR uses OpenCV for facial recognition, applying it to real-time video streamed from the Magic Leap headset to recognize people and attach supplemental information to them.

How we built it

We used the Magic Leap AR headset and connected it to an Azure server. We used OpenCV and trained models with pictures of faces (obtained with the subjects' permission), then sent the classification results back to the headset to be overlaid on reality. Additionally, we added eye-tracking so the user only sees information for the people they are looking at. This prevents the user from being bombarded with information in a noisy environment. Finally, we integrated a filtering algorithm with a geometrically weighted majority vote system over a buffer history to prevent the classification from wavering frame-to-frame.
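The frame-to-frame filtering idea can be sketched roughly as follows. This is a minimal illustration, not our actual implementation: the class name `LabelSmoother`, the buffer length, and the decay factor are all assumptions chosen for the example. Each recent per-frame label casts a vote whose weight shrinks geometrically with its age, and the label with the largest total wins.

```python
from collections import defaultdict, deque


class LabelSmoother:
    """Geometrically weighted majority vote over recent per-frame labels.

    Hypothetical helper for illustration; one instance per tracked face.
    """

    def __init__(self, history=10, decay=0.8):
        # history and decay are illustrative values, not the project's settings.
        self.decay = decay
        self.buffer = deque(maxlen=history)  # oldest labels fall off automatically

    def update(self, label):
        """Record the newest per-frame label and return the smoothed label."""
        self.buffer.append(label)
        scores = defaultdict(float)
        weight = 1.0
        # Walk from newest to oldest; each step back multiplies the weight by decay.
        for past_label in reversed(self.buffer):
            scores[past_label] += weight
            weight *= self.decay
        return max(scores, key=scores.get)
```

With `history=5` and `decay=0.8`, a single stray "bob" frame after three "alice" frames still yields "alice", while a second consecutive "bob" frame flips the output. A smaller decay reacts faster to real changes but suppresses less flicker.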

Challenges we ran into

Synchronizing the Magic Leap and laptop for video streaming was a challenge because developing solely on Magic Leap is difficult and not collaboration-friendly. Additionally, we had difficulty balancing video quality and speed. We needed a resolution high enough to recognize faces under different lighting and angles, but low enough to stream and process without lag. Finally, we had difficulty keeping the classification consistent while streaming the video.

Accomplishments that we are proud of

We were able to recognize faces in real-time and label them with additional information, and we were able to keep the classification stable with multiple subjects moving around. We used eye-tracking to create an intuitive user interface that is parallel to reality.

What we learned

Augmented reality and networking are hard!! Fortunately, we were able to get these moving parts to work together, with additional features that would be applicable to scenarios of all kinds.

What's next for InSights AR

If we had another day, we would display more detailed, contextually appropriate information. For example, when two people stand next to each other, we would display things they have in common, such as interests, hobbies, or previous research projects. Currently, our database is limited to a small set of users. In the future, we hope to add profiles in real-time.

If we were given another 3 months, we would work on emotional analysis. Using Microsoft's existing library Video Insights and Video Indexer, we would be able to analyze the emotions of the subject. One application: if people didn't give photo permission, the headset would filter out their faces and display only their emotions, perhaps as an emoticon, so as not to hinder conversation.

Another AR application would be adding translated subtitles to facilitate smooth conversations for multicultural events. An overarching, bigger goal is to make the video streaming more efficient with a faster frame rate. It would be best if we didn't have to sacrifice resolution for portability.
