Inspiration

Conversations move fast, and a lot of communication happens through subtle reactions that are easy to miss: hesitation, confusion, discomfort, skepticism, or the moment someone wants to speak.

For many people, especially those who struggle to read facial cues in real time, that gap can make everyday conversations stressful and unpredictable. Missing those signals can lead to interrupting unintentionally, overexplaining, using the wrong tone, or not noticing when someone is uncomfortable.

We built Aura because we wanted to create something that actively supports conversation in the moment.

What it does

Aura is a real-time meeting coach that stays in the corner of your screen during a conversation.

It analyzes the other person’s visible reactions and gives live, practical coaching such as:

  • what to change in your delivery right now
  • what you should say next
  • when to simplify, pause, clarify, or lower pressure
  • a warning when a phrase or topic has previously caused visible discomfort (see the sketch after this list)
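
As a rough illustration of that output (the types and names here are our sketch, not Aura's actual code), a single coaching update could be modeled like this:

```swift
// Illustrative sketch only: one possible shape for a coaching update.
enum CoachingCue {
    case adjustDelivery(String)   // e.g. "slow down, simplify the point"
    case nextLine(String)         // an exact line the user could say next
    case pause
    case lowerPressure
    case phraseWarning(String)    // this phrasing caused discomfort before
}

struct CoachingUpdate {
    let cue: CoachingCue
    let confidence: Double        // 0...1, surfaced honestly in the UI
}
```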

Aura also includes an optional demo overlay that lets users see what the system is tracking on screen, including the selected face, landmarks, and live coaching output.

How we built it

We used Opennote to collaborate and brainstorm ideas.

We built Aura as a macOS floating overlay app using SwiftUI.
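
As a minimal sketch (an assumed structure, not our exact code), the floating overlay can be a non-activating NSPanel that stays above other windows and hosts the SwiftUI view:

```swift
import AppKit
import SwiftUI

// Sketch: a borderless, always-on-top panel hosting a SwiftUI root view.
final class OverlayPanel: NSPanel {
    init(rootView: some View) {
        super.init(
            contentRect: NSRect(x: 0, y: 0, width: 280, height: 160),
            styleMask: [.borderless, .nonactivatingPanel],
            backing: .buffered,
            defer: false
        )
        level = .floating   // stay above normal windows
        collectionBehavior = [.canJoinAllSpaces, .fullScreenAuxiliary]
        isOpaque = false
        backgroundColor = .clear
        contentView = NSHostingView(rootView: rootView)
    }
}
```

The non-activating style matters here: the panel stays visible without stealing keyboard focus from the meeting app.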

The system combines several layers (a few of them are sketched in code after the list):

  • live screen capture to detect faces during online meetings
  • Apple Vision landmarks to track eyes, brows, lips, gaze direction, head pose, and facial geometry
  • a local emotion classification model to add a fast affect prior on the selected face
  • temporal smoothing and cue fusion so the app does not overreact to single frames
  • live speech transcription so the advice is grounded in what the user is actually saying
  • DeepSeek through Featherless to generate exact next-line coaching in real time
  • a conversation memory layer that tracks phrasing which previously caused visible discomfort and warns before repeating it
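
For example, the landmark layer can be sketched with Apple's Vision framework; the input frame is assumed to come from the screen-capture layer:

```swift
import Vision

// Sketch of the landmark step: run face-landmark detection on one frame.
func detectFaces(in pixelBuffer: CVPixelBuffer) throws -> [VNFaceObservation] {
    let request = VNDetectFaceLandmarksRequest()
    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
    try handler.perform([request])
    return request.results ?? []
}
// Each VNFaceObservation exposes normalized landmark regions (eyes, brows,
// lips) plus roll/yaw estimates that feed the head-pose and gaze cues.
```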
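The temporal-smoothing idea is simple: a per-cue exponential moving average is one way to keep a single frame from flipping the signal (the constant below is illustrative, not our tuned value):

```swift
// Sketch: per-cue exponential moving average so one noisy frame
// cannot flip the advice.
struct CueSmoother {
    private var state: [String: Double] = [:]
    let alpha = 0.2   // lower = smoother, slower to react

    mutating func update(_ cue: String, raw: Double) -> Double {
        let smoothed = alpha * raw + (1 - alpha) * (state[cue] ?? raw)
        state[cue] = smoothed
        return smoothed
    }
}
```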
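And the coaching call itself, assuming Featherless's OpenAI-compatible chat-completions endpoint (the model id and prompt below are illustrative, not our exact values):

```swift
import Foundation

// Sketch: ask DeepSeek (via Featherless) for the next line, grounded in the
// live transcript and the fused visual cues. The endpoint shape assumes
// Featherless's OpenAI-compatible API.
func requestCoaching(transcript: String, cues: String) async throws -> Data {
    var request = URLRequest(url: URL(string: "https://api.featherless.ai/v1/chat/completions")!)
    request.httpMethod = "POST"
    let key = ProcessInfo.processInfo.environment["FEATHERLESS_API_KEY"] ?? ""
    request.setValue("Bearer \(key)", forHTTPHeaderField: "Authorization")
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    let body: [String: Any] = [
        "model": "deepseek-ai/DeepSeek-V3",   // illustrative model id
        "messages": [
            ["role": "system",
             "content": "You are a live meeting coach. Reply with one short, concrete next line."],
            ["role": "user",
             "content": "Transcript so far: \(transcript)\nObserved cues: \(cues)"]
        ]
    ]
    request.httpBody = try JSONSerialization.data(withJSONObject: body)
    let (data, _) = try await URLSession.shared.data(for: request)
    return data   // JSON response; the first choice's message is the advice
}
```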

Challenges we ran into

The hardest problem was accuracy.

Facial reaction detection in real meeting footage is messy:

  • faces are small
  • lighting is inconsistent
  • gallery view makes tracking harder
  • subtle expressions are easy to overread
  • single-frame emotion predictions can be noisy and misleading

We also ran into product challenges:

  • keeping the overlay unobtrusive while still making it impressive enough for a demo
  • making the AI advice useful instead of generic
  • preventing the system from changing its recommendation too quickly (see the sketch below)
  • avoiding unsafe design choices like inferring sensitive identity traits from reactions

A lot of the work ended up being calibration, smoothing, fallback behavior, and deciding when not to be overly confident.
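
One stabilizer we can sketch is a simple hysteresis gate: hold the current recommendation for a minimum time, and only switch when a new cue clearly wins (the hold duration and threshold below are illustrative, not our tuned values):

```swift
import Foundation

// Sketch: hysteresis so the displayed advice does not flicker. A new cue
// replaces the current one only after a minimum hold time and with a
// sufficiently strong score.
struct RecommendationGate {
    private var current: (cue: String, since: Date)?
    let minHold: TimeInterval = 3.0
    let switchThreshold = 0.65

    mutating func propose(_ cue: String, score: Double, now: Date = Date()) -> String {
        if let held = current, cue != held.cue,
           now.timeIntervalSince(held.since) < minHold || score < switchThreshold {
            return held.cue   // too soon or too weak: keep the old advice
        }
        if current?.cue != cue { current = (cue, now) }
        return cue
    }
}
```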

Accomplishments that we're proud of

We are proud that Aura goes beyond a chatbot and works in a real-world setting.

Some things we are especially proud of:

  • building a live floating meeting assistant instead of a static demo
  • combining vision, speech, local inference, and LLM-generated coaching into one system
  • making the core UI practical: the main output is what to change
  • adding conversation-memory warnings so Aura can learn what phrasing creates friction
  • using Featherless in a meaningful way for real-time next-line generation

Most importantly, we turned a vague idea into something that actually helps during a live interaction.

What we learned

We learned that building socially aware AI is much harder than just attaching an LLM to a webcam feed.

A few major takeaways:

  • raw emotion classification is not enough; practical coaching matters more
  • timing and stability are as important as model quality
  • subtle UI decisions determine whether the product feels helpful or distracting
  • confidence and uncertainty need to be shown honestly
  • safety matters a lot when building systems that interpret human behavior

What's next for Aura

Our next steps are focused on making Aura more accurate, more personalized, and more useful in real conversations.

We want to:

  • improve reaction accuracy with better local affect models and stronger cue fusion
  • support longer-term per-person communication memory
  • better understand both sides of the conversation, not just the user’s speech
  • make the coaching more personalized to different meeting contexts like interviews, classes, and team meetings
  • expand accessibility settings for different sensory and communication preferences
  • continue refining the unobtrusive mode so it feels natural to use every day

Long term, we see Aura as a real-time communication support layer that helps users navigate conversations with more confidence and less guesswork.

Built With

  • coreml
  • featherless
  • huggingface
  • opennote
  • swift