Inspiration
For this product, we were inspired by the interview preparation tools we have used throughout our recruiting experience. Tools like Yoodli and Google's Interview Warmup have been changing the game of interview prep, giving users real-time advice on their interview style. However, these tools are missing something: body language, facial expression, and posture analysis.
What it does
To fix this problem, we created a new tool: InterView AI. InterView AI combines the transcription and audio analysis of existing interview tools with powerful computer vision technology to analyze users' facial expressions, body language, and posture, providing more holistic and personalized feedback.
How we built it
We built InterView AI with a number of different technologies and libraries. We used Next.js for the frontend and FastAPI for the backend server. For the video analysis and computer vision logic, we used Google's MediaPipe and Twelve Labs' Pegasus models. And lastly, we used OpenAI's Whisper and ChatGPT models for audio transcription and feedback analysis.
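To give a sense of how these pieces fit together, here is a minimal sketch of the backend flow, assuming the `openai-whisper` and `openai` packages; the endpoint name, model choices, and prompt are illustrative, not our exact production code.

```python
# Minimal sketch: upload a video, transcribe it with Whisper,
# then ask ChatGPT for interview feedback on the transcript.
import tempfile

import whisper
from fastapi import FastAPI, File, UploadFile
from openai import OpenAI

app = FastAPI()
whisper_model = whisper.load_model("base")  # small model for quick turnaround
llm = OpenAI()  # reads OPENAI_API_KEY from the environment

@app.post("/analyze")  # illustrative route name
async def analyze(video: UploadFile = File(...)):
    # Whisper accepts video files directly; ffmpeg extracts the audio track.
    with tempfile.NamedTemporaryFile(suffix=".mp4") as tmp:
        tmp.write(await video.read())
        tmp.flush()
        transcript = whisper_model.transcribe(tmp.name)["text"]

    # Ask ChatGPT for structured feedback on the spoken answer.
    response = llm.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {"role": "system", "content": "You are an interview coach. "
             "Critique clarity, filler words, and answer structure."},
            {"role": "user", "content": transcript},
        ],
    )
    return {"transcript": transcript,
            "feedback": response.choices[0].message.content}
```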
Challenges we ran into
We had several issues with the computer vision technologies: we were trying to use the current Pegasus documentation with an outdated version of the model. After a few hours of struggling to debug this, we made the decision to switch to a different model. We are now dedicating our efforts to improving the lightweight MediaPipe model we pivoted to (see the sketch below), hoping to include features and functionality that our original approach could not support. Another challenge arose when we attempted to deploy our backend to AWS. We had experience with Google Cloud and Azure, but not with AWS, so there was definitely a learning curve. And as with all deployment tools, there are about a million error codes you don't understand, and everything works locally but never on the hosted platform. After hours of painful debugging, we finally got our hosted backend working, and we are excited to move on!
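For context on the pivot, here is a rough sketch of the kind of lightweight posture signal MediaPipe Pose makes easy to compute per frame; the tilt threshold is an assumption for illustration, not a value we have tuned.

```python
# Rough sketch of a per-frame posture check with MediaPipe Pose;
# the tilt threshold is an illustrative assumption, not a tuned value.
import cv2
import mediapipe as mp

mp_pose = mp.solutions.pose

def shoulder_tilt_ratio(video_path: str, max_tilt: float = 0.05) -> float:
    """Return the fraction of frames where the shoulders look noticeably tilted."""
    tilted = total = 0
    cap = cv2.VideoCapture(video_path)
    with mp_pose.Pose(static_image_mode=False) as pose:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            results = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            if not results.pose_landmarks:
                continue
            lm = results.pose_landmarks.landmark
            left = lm[mp_pose.PoseLandmark.LEFT_SHOULDER]
            right = lm[mp_pose.PoseLandmark.RIGHT_SHOULDER]
            # Landmark coordinates are normalized to [0, 1], so the vertical
            # gap between shoulders is a resolution-independent tilt signal.
            if abs(left.y - right.y) > max_tilt:
                tilted += 1
            total += 1
    cap.release()
    return tilted / total if total else 0.0
```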
Accomplishments that we're proud of
We got the transcription and analysis technology working very quickly, which gave us more time to debug our other issues and beautify our frontend. Shoutout to Trisha for getting that working so fast. We were also able to pivot our approach immediately when faced with intimidating bugs, ensuring we had a working project while continuing to debug the more robust model system. Shoutout to Neha and Trisha for adapting so quickly there!
What we learned
We learned so much through this experience, getting hands-on time with new technologies such as Next.js and with hosted models. We also built critical teamwork skills: delegating work for efficiency while still being able to come together quickly to face a significant problem as a team. We learned to document our work and understanding early, to communicate with teammates as soon as possible, and to maybe read some documentation before diving straight into code.
What's next for InterView AI
Currently, as stated earlier, we are working on making our computer vision system more robust. We are also planning to connect several more APIs, including an interview-question API, to pull more dynamic, adaptable data into our product. We also hope to include other models, such as some of Amazon Bedrock's LLMs, to further improve our interview feedback and guidance.
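For the Bedrock integration, something like boto3's Converse API is a likely starting point; this is an exploratory sketch, and the model ID and prompt below are placeholders, not decisions we have made.

```python
# Exploratory sketch of calling an Amazon Bedrock LLM for feedback
# via boto3's Converse API; model ID and prompt are placeholders.
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

def bedrock_feedback(transcript: str) -> str:
    response = bedrock.converse(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",  # placeholder model ID
        messages=[{
            "role": "user",
            "content": [{"text": f"Critique this interview answer:\n\n{transcript}"}],
        }],
    )
    return response["output"]["message"]["content"][0]["text"]
```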
Built With
- fastapi
- next.js
- python
- typescript

