Inspiration

Having been to several hackathons by now, we have become fairly used to presenting in front of a large audience. However, we are still far from professional presenters such as TED speakers. Of course, the best feedback you can get on a presentation is the audience's reaction. But how can you measure your impact on the audience, especially when you are busy managing the stress of giving the talk?

What it does

Using a camera, you record the audience while you are giving a presentation, and then upload that video to the Cognitive Crowds web dashboard. It extracts a frame every 2 seconds and analyses the audience's emotions in each one, plotting them against the presentation audio/video. It also calculates the audience's attention for each frame and plots it, so you can see exactly when in your talk attention dropped. You can also check whether people reacted to what you said in the way you wanted. For example, if you told a joke 20 seconds in, but happiness did not increase around that timestamp, there may be room for improvement.
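The two checks described above (spotting an attention drop, and checking whether happiness rose after a joke) can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the function names, the thresholds `min_drop` and `min_rise`, and the assumption that attention and happiness are per-frame scores between 0 and 1 are all hypothetical.

```python
SAMPLE_INTERVAL_S = 2  # one analysed frame every 2 seconds, as described above


def attention_drops(attention, min_drop=0.2):
    """Return timestamps (in seconds) where attention fell by at least
    min_drop compared with the previous sample. min_drop is an assumed
    threshold, not a value from the project."""
    drops = []
    for i in range(1, len(attention)):
        if attention[i - 1] - attention[i] >= min_drop:
            drops.append(i * SAMPLE_INTERVAL_S)
    return drops


def happiness_rose_after(happiness, event_s, window_s=6, min_rise=0.1):
    """Compare mean happiness in the window after event_s with the window
    before it. window_s and min_rise are assumed values for illustration."""
    idx = event_s // SAMPLE_INTERVAL_S
    n = window_s // SAMPLE_INTERVAL_S
    before = happiness[max(0, idx - n):idx]
    after = happiness[idx:idx + n]
    if not before or not after:
        return False  # not enough samples around the event to decide
    rise = sum(after) / len(after) - sum(before) / len(before)
    return rise >= min_rise


if __name__ == "__main__":
    attention = [0.9, 0.85, 0.5, 0.55, 0.2]
    print(attention_drops(attention))
    # A joke told 20 seconds in, with happiness jumping afterwards.
    happiness = [0.1] * 10 + [0.6] * 5
    print(happiness_rose_after(happiness, event_s=20))
```

Comparing a window before and after the event, rather than two single frames, makes the check less sensitive to one noisy frame.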

How we built it

We created a web dashboard in HTML, JavaScript and CSS, with a PHP backend. The backend processes the upload and then runs three Python scripts. The first splits the video into key frames and audio segments. The second calls Microsoft Cognitive Services to get the emotion analysis for every frame and the sentiment analysis for the text. The third formats everything into a JSON file, while also calculating the attention and other analytics. Back in the web dashboard, the JSON file is read by several JavaScript functions that plot the results.
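To make the pipeline more concrete, here is a rough sketch of two of its steps: choosing which frames to sample, and packing per-frame emotion scores into JSON for the dashboard. It is an illustration under assumptions, not the project's scripts. The function names and the JSON layout are hypothetical, the per-face input is assumed to be a dict mapping emotion names to scores, and the API calls themselves are left out. The attention formula is not described in this write-up, so it is omitted as well.

```python
import json


def frame_indices(fps, frame_count, every_s=2.0):
    """Return the indices of the frames to analyse, one every every_s seconds."""
    if fps <= 0 or every_s <= 0:
        raise ValueError("fps and every_s must be positive")
    indices = []
    k = 0
    while True:
        idx = int(round(k * every_s * fps))
        if idx >= frame_count:
            break
        indices.append(idx)
        k += 1
    return indices


def mean_emotions(faces):
    """Average each emotion score over all faces detected in one frame."""
    if not faces:
        return {}
    totals = {}
    for face in faces:
        for emotion, score in face.items():
            totals[emotion] = totals.get(emotion, 0.0) + score
    return {emotion: total / len(faces) for emotion, total in totals.items()}


def build_report(per_frame_faces, every_s=2.0):
    """Serialise the per-frame averages into a JSON string (assumed layout)."""
    frames = [
        {"time_s": i * every_s, "emotions": mean_emotions(faces)}
        for i, faces in enumerate(per_frame_faces)
    ]
    return json.dumps({"interval_s": every_s, "frames": frames})


if __name__ == "__main__":
    print(frame_indices(fps=25, frame_count=200))
    faces_per_frame = [
        [{"happiness": 0.2, "neutral": 0.8}, {"happiness": 0.6, "neutral": 0.4}],
        [{"happiness": 0.9, "neutral": 0.1}],
    ]
    print(build_report(faces_per_frame))
```

In a real pipeline the indices from `frame_indices` would be passed to a video reader such as OpenCV to grab the actual images before they are sent for emotion analysis.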

Challenges we ran into

The biggest challenge was getting OpenCV to run. Numerous dependencies were not available as binaries and had to be compiled from source, and it did not work on the first try. We also had trouble making it work on Windows, which made testing harder, because on our Macs we had problems getting the Apache server to work properly. Another big problem was the initial limit on API calls, which only allowed us to process a maximum of 20 seconds of video. In the morning we received an Azure API key from Microsoft, which allowed us to implement a working solution. Finally, customising the audio/video player and integrating it with the graphs, so that the data is shown at the right moments, was also difficult.

Accomplishments that we're proud of

We completed a fully working MVP that delivers value to the user even in its current form. Because there is a vast number of possible uses for Cognitive Crowds, we believe it has strong business potential. We think this is the most important part, because once you prove something can be done, half the work is done.

What we learned

We learned a variety of new things, such as image processing with OpenCV, building complex pipelines, integrating different technologies, data analysis and much more. Most importantly, we realised how useful it can be to analyse people's emotions and how many applications it has. Used in combination with other technologies such as text analysis, it becomes a very powerful tool that can provide accurate, constructive feedback.

What's next for Cognitive Crowds

The current implementation barely touches the huge potential of this idea. We plan to combine it with a variety of regression algorithms to correlate the key phrases of your speech with the audience's reactions, so it can automatically suggest which subjects to touch on in order to evoke a given emotion in your audience. As a business model, we can provide the software as a service to venues and event organisers, helping them attract better speakers and helping those speakers improve. It can also be offered to university lecturers, giving them constructive feedback about their classes that is as objective as possible (everyone knows students usually over-grade their lecturers in the end-of-year feedback forms).
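As a rough idea of what the simplest version of that correlation could look like, the sketch below fits a one-variable ordinary least squares line between "was this key phrase said in the segment" (0 or 1) and the average happiness measured in that segment. Everything here is hypothetical: the function names, the segment layout and the choice of plain least squares are assumptions for illustration, and nothing like this is implemented yet.

```python
def ols_slope_intercept(x, y):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def phrase_effect(segments, phrase, emotion="happiness"):
    """Estimate how much an emotion score changes when a phrase is present.

    segments is an assumed layout: a list of dicts with a "phrases" list
    and an "emotions" dict for each slice of the talk.
    """
    x = [1.0 if phrase in seg["phrases"] else 0.0 for seg in segments]
    y = [seg["emotions"][emotion] for seg in segments]
    slope, _ = ols_slope_intercept(x, y)
    return slope


if __name__ == "__main__":
    segments = [
        {"phrases": ["budget"], "emotions": {"happiness": 0.1}},
        {"phrases": ["roadmap"], "emotions": {"happiness": 0.3}},
        {"phrases": ["demo"], "emotions": {"happiness": 0.6}},
        {"phrases": ["demo", "roadmap"], "emotions": {"happiness": 0.8}},
    ]
    print(phrase_effect(segments, "demo"))
```

With a 0/1 indicator, the slope is simply the difference in mean emotion between segments that contain the phrase and those that do not. A real system would need many talks and several phrases at once before such numbers meant much.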

Built With

  • javascript
  • microsoft-bing-speech-recognition-api
  • microsoft-cognitive-services-api
  • microsoft-emotion-api
  • microsoft-text-analysis-api
  • opencv
  • php
  • python