Inspiration
Social skills are often invisible: nobody teaches them, yet they shape nearly every opportunity in life. We were inspired to build ContextLens to support people who feel overwhelmed in social environments, especially introverts and neurodivergent individuals who may process social cues differently. Our goal is simple: shift the social side of life away from stress and towards learning!
What it does
ContextLens is a real-time conversation assistant that provides gentle and optional social-cue support through XR (demoed on Meta Quest 3). It detects changes in speech tone, body language, and environmental context, then surfaces subtle pop-up cues (not commands) to help users navigate conversations with more clarity and confidence.
How we built it
Our product is split into three main components:
- We use Deepgram for detailed speech transcription (and live translation support)
- We use Gemini Vision to read body language, facial cues, and environmental context
- We use Gemini as our main "thinker," leveraging its large context window to combine everything into usable insights

Finally, the project runs on a relatively light web stack (Next.js, React, and Vercel), with clean audio capture via a custom SoX pipeline.
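To illustrate how the three components come together, here is a minimal sketch of the fusion step: merging a Deepgram transcript segment with Gemini Vision cues into a single prompt for the main Gemini "thinker." The interfaces and field names below are illustrative assumptions, not our exact production types or the vendors' API shapes.

```typescript
// Assumed shape of one transcribed utterance (simplified from what
// a Deepgram live-transcription result might provide).
interface TranscriptSegment {
  speaker: string;
  text: string;
  tone?: string; // e.g. "flat", "excited" -- hypothetical enrichment field
}

// Assumed shape of the cues extracted by the vision pass.
interface VisionCues {
  bodyLanguage: string; // e.g. "arms crossed"
  facialCue: string;    // e.g. "furrowed brow"
  environment: string;  // e.g. "noisy cafe"
}

// Builds the combined context we would hand to Gemini's large context
// window, asking for a gentle hint rather than a command.
function buildCuePrompt(segment: TranscriptSegment, cues: VisionCues): string {
  return [
    `Speaker ${segment.speaker} said: "${segment.text}"`,
    segment.tone ? `Detected tone: ${segment.tone}` : null,
    `Body language: ${cues.bodyLanguage}`,
    `Facial cue: ${cues.facialCue}`,
    `Environment: ${cues.environment}`,
    `Suggest one gentle, optional social cue (a hint, not a command).`,
  ]
    .filter((line): line is string => line !== null)
    .join("\n");
}
```

In the real app, the resulting string is just one message in a longer rolling context, which is what lets Gemini reason over the whole conversation rather than a single moment.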
Challenges we ran into
Our biggest challenge was latency: getting real-time analysis fast enough to be useful in a live conversation. As the underlying models and hardware improve, we expect our vision to become faster and more practical. Another major challenge was configuring the Meta Quest 3 to meet our program's needs. We initially planned to build in Unity with Meta's SDK, but after running into SDK difficulties we pivoted to building a web app that we open in the Quest's browser instead. That pivot brought a plethora of UI-related issues our way, since we now had to build the entire interface ourselves rather than relying on Meta's SDK.
Accomplishments that we're proud of
We’re proud we built a working real-time system that stays subtle and supportive rather than overwhelming and demanding. Most importantly, we’re proud that ContextLens was built with empathy, designed for people who often feel left behind in high-pressure social settings. We love what we do, and we do what we do for a reason.
What we learned
Our main lesson was that emotion and social-cue tools must be designed responsibly, with uncertainty, humility, ethics, and user control in mind. There is an incredibly thin line between collecting this kind of personal data and actually using it for good, and addressing those concerns is just as important as wanting to push out a product. Making compromises, even when they cut into our features, seemed disappointing at first glance, but in hindsight it was essential for addressing these concerns and sticking to our goal of helping people!
What's next for Context Lens for Education
Long-term, our goal is to bring ContextLens to Meta glasses, where it can become a lightweight, everyday-use item! This form factor would be far more subtle and private, and, most importantly, optional and ethical.
Built With
- ar/vr
- deepgram
- gemini
- meta
- react
- sox
- typescript
- vercel