We have all been in meetings that were really constructive, as well as meetings that were frustrating and unbearable. No matter the outcome, there is always room for improvement next time. However, work meetings happen so often that we rarely get time to sit down and analyze their shortcomings. So, what if an application could integrate seamlessly into the daily conference workflow and provide an automated measurement of participants' attitudes, participation, and emotions in relation to the meeting topics?
What it does
ConvoBuddy provides a webinar system tailored for professional online meetings that promotes engagement and feedback by capturing live meeting data and generating real-time sentiment analysis.
How we built it
We built ConvoBuddy as a web application supported by Node.js, Express.js, and Angular.js, hosted on Google Cloud Engine. The first thing the user sees on the website is the in-conference recording system, where the app captures the participants through the webcam at a set interval. In parallel, it records the audio of the entire meeting via the microphone. We feed this collected data to Google's machine learning and prediction APIs, including Google Cloud Speech, Google Cloud Vision, and Google Natural Language Processing, to generate sentiment analysis results. First, the audio file is sent to our Node.js backend, which converts it to FLAC, a format compatible with Google's speech-to-text API (Google Cloud Speech). The resulting transcript is then fed into Google's Natural Language Processing API to generate an overall sentiment value and magnitude. Finally, we pass the list of images captured during the session to the Vision API, which returns emotion analysis results for the detected faces.
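The order of the three stages can be sketched as follows. This is only an illustration, written in Python for brevity even though the backend is Node.js. The names `MeetingAnalysis`, `analyze_meeting`, `transcribe`, `analyze_sentiment`, and `detect_faces` are hypothetical; the three callables stand in for whatever wrappers call the Speech, Natural Language, and Vision APIs, and they are injected so that no real API signature is assumed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass
class MeetingAnalysis:
    """Raw results collected from the three stages (hypothetical container)."""
    transcript: str
    sentiment_score: float        # overall sentiment of the transcript
    sentiment_magnitude: float    # overall emotional strength of the transcript
    faces: List[Dict[str, str]]   # one emotion-likelihood dict per detected face


def analyze_meeting(
    flac_audio: bytes,
    images: Sequence[bytes],
    transcribe: Callable[[bytes], str],
    analyze_sentiment: Callable[[str], Tuple[float, float]],
    detect_faces: Callable[[bytes], List[Dict[str, str]]],
) -> MeetingAnalysis:
    # 1. Speech-to-text on the FLAC-encoded recording.
    transcript = transcribe(flac_audio)
    # 2. Document-level sentiment (value and magnitude) on the transcript.
    score, magnitude = analyze_sentiment(transcript)
    # 3. Face emotion detection on every captured frame.
    faces: List[Dict[str, str]] = []
    for image in images:
        faces.extend(detect_faces(image))
    return MeetingAnalysis(transcript, score, magnitude, faces)
```

Passing the stages in as callables keeps the flow testable with stubs and leaves the actual API clients as an implementation detail.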
Next, we implemented our own algorithm to aggregate and analyze this raw data and generate a score-based report. The report highlights a holistic sentiment score on a scale of 0 to 10, as well as the predominant emotion across all of the participants' faces. It also provides a relative breakdown of the specific emotions expressed during the meeting, calculated using a weighted average formula.
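The exact formula is not given here, so the following is only one plausible sketch of such an aggregation, not ConvoBuddy's actual algorithm. It assumes the sentiment value lies in the range -1 to 1 (as the Natural Language API reports it) and rescales it linearly to 0 to 10. It also assumes each face is a dict of likelihood labels such as `VERY_LIKELY`, as returned by Vision face detection. The numeric weights, the dict keys, and all function names are hypothetical choices made for illustration.

```python
from typing import Dict, List, Optional

# Assumed weights for the likelihood labels; the real weighting is not documented.
LIKELIHOOD_WEIGHT = {
    "UNKNOWN": 0,
    "VERY_UNLIKELY": 0,
    "UNLIKELY": 1,
    "POSSIBLE": 2,
    "LIKELY": 3,
    "VERY_LIKELY": 4,
}
EMOTIONS = ("joy", "sorrow", "anger", "surprise")


def sentiment_to_ten_point(score: float) -> float:
    """Linearly rescale a sentiment value in [-1, 1] to a 0-10 score."""
    clamped = max(-1.0, min(1.0, score))
    return round((clamped + 1.0) * 5.0, 1)


def emotion_shares(faces: List[Dict[str, str]]) -> Dict[str, float]:
    """Weighted share of each emotion across all detected faces."""
    totals = {emotion: 0 for emotion in EMOTIONS}
    for face in faces:
        for emotion in EMOTIONS:
            label = face.get(emotion, "UNKNOWN")
            totals[emotion] += LIKELIHOOD_WEIGHT.get(label, 0)
    grand_total = sum(totals.values())
    if grand_total == 0:
        # No face showed any emotion signal at all.
        return {emotion: 0.0 for emotion in EMOTIONS}
    return {emotion: totals[emotion] / grand_total for emotion in EMOTIONS}


def predominant_emotion(shares: Dict[str, float]) -> Optional[str]:
    """Emotion with the largest share, or None if nothing was detected."""
    best = max(shares, key=lambda emotion: shares[emotion])
    return best if shares[best] > 0 else None
```

Under these assumptions a neutral transcript (value 0.0) maps to a score of 5.0, and the shares sum to 1 whenever at least one face carries a signal, which makes them easy to plot as a pie chart.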
All of these features are held together by a completely customized, clean, and modern UI that gives the report a clear visual hierarchy, using colors, bar graphs, and pie charts to visualize this valuable information. We also provide a live camera view for the user in-call, very similar to professional-grade webinar systems such as Skype and Cisco Webex.
Challenges we ran into
- Integrating with multiple Google APIs, performing analysis on the retrieved data, and generating a user-friendly report within a tight timeframe.
- Deploying the application on Google Cloud Engine for the first time, where we ran into issues with Docker and dependencies.
- Designing, prototyping, and testing a UI system completely from scratch that is tailored for this usage.
- Using the web as a platform for recording and converting audio and photos via browser-accessible microphones and webcams.
Accomplishments that we're proud of
- Distilling features based on user needs and fulfilling these needs accurately by chaining three of Google's machine learning APIs.
- Generating visually appealing reports supported by graphs and charts, based on raw data.
- Achieving a high degree of prediction accuracy by collecting raw audio and image data with system hardware.
- Learning multiple new APIs, libraries, and deployment environments.
What's next for ConvoBuddy
1. Live Feedback
We plan to call the Google APIs with live streaming data and provide live feedback for each participant in a meeting. This will be exceptionally useful for maintaining and monitoring in-meeting coherence and participation.
2. Individual Assessment Scores
Currently, our web app does not distinguish between individuals. We plan to map each user's voice and face to their corresponding profile in order to generate reports on a more granular level.
3. Topic-based Sentiment
Similarly, we can provide a more fine-grained analysis of the meeting by retrieving the meeting sentiment for specific topics. These topics can either be extracted from the meeting transcript or supplied manually by a meeting organizer.
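As a rough sketch of how topic-based sentiment might work with manually supplied topics, the snippet below averages per-sentence sentiment values over the sentences that mention each topic. This feature is not built yet, so everything here is an assumption: the function name `sentiment_by_topic` is hypothetical, the input is assumed to be (sentence text, sentiment value) pairs, and the naive case-insensitive substring match is only a placeholder for proper topic extraction.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def sentiment_by_topic(
    sentences: Iterable[Tuple[str, float]],
    topics: List[str],
) -> Dict[str, float]:
    """Average the sentiment of the sentences that mention each topic."""
    scores: Dict[str, List[float]] = defaultdict(list)
    for text, score in sentences:
        lowered = text.lower()
        for topic in topics:
            # Naive match: the topic keyword appears somewhere in the sentence.
            if topic.lower() in lowered:
                scores[topic].append(score)
    # Topics that were never mentioned are left out of the result.
    return {topic: sum(values) / len(values) for topic, values in scores.items()}
```

A sentence that mentions several topics contributes to each of them, which is a deliberate simplification of this sketch.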