Inspiration
“Pictures are worth a thousand words, but videos are worth a lot more.” - someone really smart
It’s currently peak recruiting season, and interviews are coming left and right for positions starting next summer. While lots of preparation can be done by cramming Leetcode patterns, brushing up on resume points, and rehearsing behavioral questions, non-verbal communication skills are often overlooked, especially in our virtual interactions.
Many of our interviews today are conducted through video calls. They have replaced in-person meetings, and with that, there's been a surge in the importance of virtual communication skills. Cues like posture and eye contact are heavily emphasized in these virtual meetings because the confined frame of the camera hides the rest of someone's body language.
Imagine a candidate with impeccable technical skills, but during the video interview, they're slouched in their chair and their gaze is continually drifting. Even though they make 3D dynamic programming look like a piece of cake, they may come off as disinterested, distracted, or lacking confidence. In contrast, a candidate who maintains good posture and steady eye contact exudes confidence, attentiveness, and professionalism. It's a subtle difference, but it can play a pivotal role in leaving a lasting impression.
Recognizing this gap, IntervU was born. We noticed that while there are ample resources available for individuals to prepare for the content of interviews, very few tools focus on enhancing the non-verbal aspects of communication. IntervU aims to fix this. By providing real-time feedback on posture and eye contact during practice sessions, we enable users to be conscious of, and subsequently improve, these crucial soft skills. By leveraging computer vision and AI algorithms, the webapp detects deviations from optimal posture and gaze and alerts the user instantly.
What it does
IntervU helps users with their non-verbal cues during interviews. Our computer vision algorithms track your body’s posture as well as eye movements, making sure that you stay as attentive as possible.
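To give a feel for what an eye-contact check can look like in code, here is a rough illustration using OpenCV's bundled Haar cascades. This is not IntervU's actual gaze pipeline (which uses MoveNet and DeepFace); the cascade choice and the 20% centering band are assumptions made purely for the sketch.

```python
# Illustrative only: a crude "facing the camera" check with OpenCV Haar cascades.
# IntervU's real pipeline is different; the thresholds here are placeholders.
import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)
eye_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_eye.xml")

def looks_at_camera(frame_bgr) -> bool:
    """Return True if a frontal face with two visible eyes sits near the frame center."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return False  # no frontal face detected -> likely looking away
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])  # largest detected face
    eyes = eye_cascade.detectMultiScale(gray[y:y + h, x:x + w])
    face_center_x = x + w / 2
    frame_center_x = frame_bgr.shape[1] / 2
    # Arbitrary band: face center must lie within 20% of frame width from center.
    centered = abs(face_center_x - frame_center_x) < 0.2 * frame_bgr.shape[1]
    return len(eyes) >= 2 and centered

cap = cv2.VideoCapture(0)
ok, frame = cap.read()
if ok:
    print("eye contact:", looks_at_camera(frame))
cap.release()
```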
How we built it
We built it using TensorFlow, MoveNet, DeepFace, OpenCV, and lots of love <3.
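For the posture side, here's a minimal sketch of the kind of pipeline that stack enables: pull MoveNet from TensorFlow Hub, grab a webcam frame with OpenCV, and compute a toy shoulder-tilt metric. The keypoint indices follow MoveNet's documented output format; the 0.05 threshold and the metric itself are placeholders, not the values the app actually uses.

```python
# Minimal sketch: MoveNet (via TensorFlow Hub) + OpenCV webcam capture.
# The posture metric and its threshold are illustrative placeholders.
import cv2
import tensorflow as tf
import tensorflow_hub as hub

movenet = hub.load("https://tfhub.dev/google/movenet/singlepose/lightning/4")
infer = movenet.signatures["serving_default"]

LEFT_SHOULDER, RIGHT_SHOULDER = 5, 6  # MoveNet keypoint indices

def shoulder_tilt(frame_bgr) -> float:
    """Return the normalized vertical offset between shoulders (0 = perfectly level)."""
    rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    inp = tf.image.resize_with_pad(tf.expand_dims(rgb, 0), 192, 192)
    inp = tf.cast(inp, tf.int32)
    keypoints = infer(inp)["output_0"].numpy()[0, 0]  # shape (17, 3): y, x, score
    ls, rs = keypoints[LEFT_SHOULDER], keypoints[RIGHT_SHOULDER]
    if min(ls[2], rs[2]) < 0.3:              # low confidence -> treat as unknown
        return float("nan")
    return abs(float(ls[0]) - float(rs[0]))  # y coords are normalized to [0, 1]

cap = cv2.VideoCapture(0)
ok, frame = cap.read()
cap.release()
if ok:
    tilt = shoulder_tilt(frame)
    print("shoulder tilt:", tilt, "-> leaning?", tilt > 0.05)  # 0.05 is arbitrary
```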
Challenges we ran into
- We wanted to implement Whisper, but with OpenCV and TensorFlow already in the pipeline, incorporating both audio and video recordings at once was a bit too much. Balancing the two technologies required a deeper understanding of multi-modal processing, and synchronizing them in real time proved to be a significant challenge.
- Combining all of our computer vision models together to get a more cohesive product was a daunting task. Each model had its own set of requirements and intricacies, which sometimes conflicted with each other, making integration a meticulous process.
- Optimizing the system was difficult: our models took noticeable time to process each frame, which made delivering real-time feedback harder (one common mitigation is sketched below).
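A common way to ease that kind of latency, shown below as a generic sketch rather than our production code, is to run the heavy models only on every Nth frame and reuse the last result in between, so the preview stays responsive even when inference is slow.

```python
# Illustrative pattern, not IntervU's actual code: throttle expensive model calls
# by analyzing only every Nth webcam frame and reusing the previous result otherwise.
import cv2

ANALYZE_EVERY_N_FRAMES = 5  # arbitrary; tune to the model's latency

def run_models(frame):
    """Placeholder for the real (slow) posture / eye-contact inference."""
    return {"posture_ok": True, "eye_contact": True}

cap = cv2.VideoCapture(0)
frame_idx, last_result = 0, None
while True:
    ok, frame = cap.read()
    if not ok:
        break
    if frame_idx % ANALYZE_EVERY_N_FRAMES == 0:
        last_result = run_models(frame)  # heavy call runs only ~1/N of the time
    if last_result:
        label = f"posture ok: {last_result['posture_ok']}"
        cv2.putText(frame, label, (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
    cv2.imshow("IntervU preview", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
    frame_idx += 1
cap.release()
cv2.destroyAllWindows()
```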
Accomplishments that we're proud of
- Seeing our hard work displayed through the app!
- Producing a project using computer vision and machine learning despite starting with virtually no background knowledge in either subject
What we learned
- How to use OpenCV and better understand computer vision in general
- How to train and test a machine learning model with sample data as well as our own data
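As a generic illustration of that train/test workflow: the features, labels, and tiny model below are made up for the sketch and are not the data or architecture IntervU actually uses, but the hold-out split and evaluation mirror what we mean by testing on our own data.

```python
# Generic sketch of a "train on sample data, evaluate on held-out data" workflow.
# The features, labels, and model are fabricated for illustration.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
X = rng.random((500, 34)).astype("float32")   # e.g. 17 keypoints x (y, x)
y = (X[:, 0] > 0.5).astype("float32")         # fake "slouching" label

# Hold out 20% as a test set, mimicking evaluation on our own recordings.
split = int(0.8 * len(X))
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(34,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X_train, y_train, epochs=5, batch_size=32, verbose=0)

loss, acc = model.evaluate(X_test, y_test, verbose=0)
print(f"held-out accuracy: {acc:.2f}")
```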
What's next for IntervU
- Implementing Whisper and GPT-4 to create a platform more like Google's Interview Warmup, where IntervU becomes a one-stop shop for interview preparation. Users would be able to record their answers and get feedback, strengthening both their verbal and non-verbal communication skills for interviews.
- Implementing playback so users can better review their performance. While having these stats is vital, giving users the ability to replay the moments where their eye contact wandered or their posture started to deteriorate helps them understand why it happened.
- Adapting our posture models to cater to individual needs, moving away from the current one-size-fits-all hard-coded approach towards a more personalized system
  - Ideally this would be done by using people's recordings as training information (see the sketch below)
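One way that personalization could work, as a rough sketch rather than a committed design: record a short calibration clip, average a posture metric (like the shoulder tilt from the earlier sketch) into a personal baseline, and flag later frames that drift too far from it. The clip length and tolerance below are placeholder values.

```python
# Rough sketch of per-user calibration instead of hard-coded thresholds.
# `metric_fn` stands in for any posture metric; values below are placeholders.
import numpy as np

def calibrate_baseline(cap, metric_fn, n_frames: int = 60) -> float:
    """Average the posture metric over a short 'sit how you normally would' clip."""
    samples = []
    for _ in range(n_frames):
        ok, frame = cap.read()
        if not ok:
            break
        value = metric_fn(frame)
        if not np.isnan(value):
            samples.append(value)
    return float(np.mean(samples)) if samples else 0.0

def posture_ok(value: float, baseline: float, tolerance: float = 0.03) -> bool:
    """Flag frames that drift too far from the user's own baseline."""
    return abs(value - baseline) <= tolerance
```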