Research shows that women and minorities talk less and are interrupted more often in meetings. Intel’s very own Beenish Zia and Nina Lane have presented on this (more here: http://blog.meeteor.com/blog/women-in-meetings/). The first step to solving this problem is to expose the data, which is what INTELigent is designed to do.
What it does
INTELigent records a meeting via a stationary camera and uses computer vision to detect when each person is talking. We then build a timeline that gives a first insight into how talking time is distributed.
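To make the idea of a talking-time distribution concrete, here is a minimal sketch of how a timeline could be turned into per-person shares. The timeline format (a list of speaker, start, end segments in seconds) and the function name talking_time_share are assumptions made for illustration, not the prototype's actual data model.

```python
from collections import defaultdict


def talking_time_share(timeline):
    """Return each speaker's fraction of the total talking time.

    `timeline` is assumed to be a list of (speaker, start_s, end_s)
    tuples; this layout is hypothetical.
    """
    totals = defaultdict(float)
    for speaker, start_s, end_s in timeline:
        # Ignore malformed segments that end before they start.
        totals[speaker] += max(0.0, end_s - start_s)

    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {speaker: t / grand_total for speaker, t in totals.items()}
```

Calling it with segments such as ("A", 0, 30), ("B", 30, 40), ("A", 40, 60) would give speaker A five sixths of the talking time and speaker B one sixth.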
How we built it
In our current prototype, we use an Android phone to record meetings via the back camera. The recorded data is sent to a Python web server, which uses a pipes-and-filters architecture to compute who is currently talking. We achieve this by analyzing the facial landmarks of each person. The result is sent back to the phone, where we aggregate the current data and present it to the user in an easy-to-digest interface.
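The server-side idea can be sketched roughly as follows. This is an illustration of a pipes-and-filters chain over facial landmarks, not our exact implementation: the landmark keys (upper_lip, lower_lip, mouth_left, mouth_right), the mouth-opening heuristic, and the window and min_variation values are all assumptions chosen for the example.

```python
import math
from collections import deque
from functools import reduce


def mouth_open_ratio(landmarks):
    """Vertical lip gap divided by mouth width (hypothetical landmark keys)."""
    width = math.dist(landmarks["mouth_left"], landmarks["mouth_right"])
    if width == 0:
        return 0.0
    return math.dist(landmarks["upper_lip"], landmarks["lower_lip"]) / width


def extract_ratios(frames):
    """Filter 1: per-frame landmarks -> per-person mouth-opening ratio."""
    for frame in frames:
        yield {person: mouth_open_ratio(lm) for person, lm in frame.items()}


def detect_speakers(ratio_frames, window=5, min_variation=0.05):
    """Filter 2: flag a person as talking when their mouth opening
    varies enough over the last `window` frames (assumed heuristic)."""
    history = {}
    for ratios in ratio_frames:
        talking = set()
        for person, ratio in ratios.items():
            recent = history.setdefault(person, deque(maxlen=window))
            recent.append(ratio)
            if len(recent) == window and max(recent) - min(recent) >= min_variation:
                talking.add(person)
        yield talking


def pipeline(source, *filters):
    """Pipe: feed the output stream of each filter into the next one."""
    return reduce(lambda stream, f: f(stream), filters, source)
```

Each filter is a generator, so frames flow through the chain one at a time and new stages (for example smoothing or face tracking) can be added without touching the existing ones. A call would look like pipeline(frames, extract_ratios, detect_speakers), where frames yields one dictionary of per-person landmarks per video frame.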
Challenges we ran into
Detecting whether a person is speaking from camera input alone is far from trivial. Setting up the required data pipeline was also quite challenging.
Accomplishments that we're proud of
We were initially skeptical about using live image analysis given the considerable workload it puts on the backend system. However, through a variety of optimizations, we were able to reduce the amount of data transmitted over the network to a minimum and achieve close-to-real-time performance.
What we learned
The importance of trimming an idea down to its essentials (did I hear “MVP”?). We came up with many more ideas for what we could learn from camera features, but we learned pretty quickly that to deliver something in 36 hours, you have to prioritize ruthlessly.
What's next for INTELigent
With the raw data that INTELigent records and analyzes, we can provide insights into many different aspects of inequality in meetings: how often women are cut off prematurely, whether the team lead is using the meeting mostly for self-promotion, or whether junior team members ever get the chance to say a word. This can be achieved not only by looking at the time spent talking, but also by analyzing things such as posture, body language and emotion. We believe that with this data, companies can become more open and collaborative environments for everyone.