Inspiration

The future of entertainment and UX will be personalized experiences highly tailored to the target user. Imagine a future where the show you watch "just gets it," or the experience of an app "just feels right all the time." To get there, we need a deep understanding of the emotional landscape of watching shows and using products. During user testing, however, it's tough to observe all the nuanced behaviors that participants exhibit. We were inspired to design software that not only helps take notes and synthesize them, but also captures the nuance of human emotion and attention by cross-comparing expressible behaviors (facial expressions, tone of voice, and length of pauses) against the content being studied.

What it does

Sentimetrix leverages AI across multiple vectors to provide richer user research data and more nuanced insights. When engagement metrics are flat, our tool helps businesses pinpoint why. By analyzing and aggregating that data in an organized way, we can improve the efficiency of sentiment gathering at scale. We designed a time series that maps facial expressions, tone of voice, and eye tracking at 10-second intervals. This multi-factor approach empowers streaming services, content creators, distributors, and film studios to gain a comprehensive understanding of user experiences and uncover the "why" behind engagement levels, bridging the gap between high and low engagement.
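A minimal sketch of one such 10-second window, with illustrative field names rather than our exact production schema:

```python
from dataclasses import dataclass

@dataclass
class SentimentWindow:
    t_start: float           # seconds from interview start
    t_end: float             # t_start + 10
    facial_mood: str         # CNN label for webcam frames, e.g. "frustrated"
    speech_sentiment: float  # RoBERTa polarity score for dialogue in the window
    gaze_on_content: float   # fraction of the window the eyes were on the stimulus

# One row of the time series covering the first 10 seconds of a session.
window = SentimentWindow(0.0, 10.0, "neutral", 0.12, 0.85)
```

Aggregating these windows over a session is what lets us line up emotional dips and spikes against the exact content that was on screen.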

How we built it

We used a Python Flask backend to manage and store user interview data, including time-series analysis over user events, facial expressions, and dialogue. Facial expressions were analyzed by a CNN (convolutional neural network) that classifies webcam images of participants into mood categories. We leveraged RoBERTa to perform sentiment analysis over user dialogue. Finally, to partition the interview timeline into designated tasks/events, we used GPT.
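As a flavor of how these pieces connect, here is a minimal sketch of the dialogue-sentiment path through Flask. It assumes the Hugging Face transformers pipeline with the cardiffnlp/twitter-roberta-base-sentiment-latest checkpoint, in-memory storage, and endpoint names of our own choosing; it is a sketch, not our exact production code.

```python
import uuid
from flask import Flask, request, jsonify
from transformers import pipeline

app = Flask(__name__)
interviews = {}  # in-memory stand-in for the persistence layer

# RoBERTa-based sentiment classifier for participant dialogue
# (checkpoint name is an assumption, not necessarily the one we shipped).
sentiment = pipeline(
    "sentiment-analysis",
    model="cardiffnlp/twitter-roberta-base-sentiment-latest",
)

@app.post("/interviews")
def start_interview():
    # Generate a new interview id and initialize its state.
    interview_id = str(uuid.uuid4())
    interviews[interview_id] = {"dialogue": []}
    return jsonify({"interview_id": interview_id})

@app.post("/interviews/<interview_id>/dialogue")
def add_dialogue(interview_id):
    # Score one utterance and append it to the interview timeline.
    text = request.json["text"]
    result = sentiment(text)[0]  # e.g. {"label": "negative", "score": 0.97}
    interviews[interview_id]["dialogue"].append({"text": text, **result})
    return jsonify(result)
```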

Challenges we ran into

Our dual screen-capture functionality: capturing screenshots of the participant's screen (including embedded iframes) while also capturing webcam photos of participants for eye-tracking and facial-expression analysis.

Designing a system to handle asynchronous processing of long-running machine-learning classification tasks proved far more complicated than originally expected, but it was a necessary hurdle to overcome for scale.
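For illustration, a minimal sketch of the pattern, assuming a single in-process queue and worker thread; classify_frame is a hypothetical stand-in for the CNN, and a production deployment would more likely reach for a dedicated task queue such as Celery.

```python
import queue
import threading

task_queue: "queue.Queue[dict]" = queue.Queue()
results: dict[str, str] = {}

def classify_frame(image: bytes) -> str:
    # Hypothetical stand-in for the long-running CNN mood classifier.
    return "neutral"

def worker() -> None:
    # Drain classification jobs off the request path so HTTP handlers
    # can return immediately instead of blocking on the model.
    while True:
        job = task_queue.get()
        results[job["id"]] = classify_frame(job["image"])
        task_queue.task_done()

threading.Thread(target=worker, daemon=True).start()

def submit(job_id: str, image: bytes) -> None:
    # Called from the upload endpoint; the client polls for the result later.
    task_queue.put({"id": job_id, "image": image})
```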

Accomplishments that we're proud of

A fully functioning backend that supports data persistence and can be fully hosted with all API endpoints working correctly. Endpoints include: starting a new interview and generating an interview ID, receiving image data from the client and processing it, ending an interview and updating its state, receiving VTT transcript data post-interview and processing it, and retrieving and parsing interview data from persistent storage. All integration with the AI and ML models is complete and working: facial sentiment analysis and classification on images is finished, we can use AI to partition transcripts and timestamps into relevant events, and we can run sentiment analysis on text transcripts.
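As an example of the post-interview transcript step, a minimal VTT-parsing sketch; it assumes plain HH:MM:SS.mmm cues and ignores WebVTT headers and metadata, so it is illustrative rather than the exact parser we shipped.

```python
import re

# Matches "00:00:01.000 --> 00:00:04.000" cue lines followed by their text.
CUE = re.compile(
    r"(\d{2}:\d{2}:\d{2}\.\d{3}) --> (\d{2}:\d{2}:\d{2}\.\d{3})\n(.+?)(?:\n\n|\Z)",
    re.S,
)

def parse_vtt(vtt_text: str) -> list[dict]:
    # Turn raw VTT into (start, end, text) cues ready for sentiment
    # analysis and for GPT-based partitioning into tasks/events.
    return [
        {"start": start, "end": end, "text": " ".join(body.split())}
        for start, end, body in CUE.findall(vtt_text)
    ]
```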

What we learned

Prioritization and planning of features was a big struggle for us throughout this hackathon. We had an ambitious MVP and even more ambitious extensions, without enough prior planning to gauge how much work implementing all of those features would actually take. On some vital components of the design, specifically capturing participant screens, team members made assumptions about how much functionality the browser would allow us, without enough research to back them up. Delegation was also a skill we worked on and improved throughout this process: with one codebase and multiple developers, it takes active effort to make sure we're getting the most productivity out of every team member.

What's next for Sentimetrix

Built With

Python, Flask, RoBERTa, GPT
