Inspiration
Education has undoubtedly been affected by the pandemic. Around the globe, lessons and lectures are being moved online, leaving a difficult scenario for educators and students alike. Whilst both groups have adapted extremely well, there are issues with online lessons that could be improved.
In normal circumstances educators are able to pick up on non-verbal cues from their students, adjusting their pace and delivery accordingly. This important two-way communication helps students remain engaged whilst ensuring they are not confused or overwhelmed by difficult material. Unfortunately this breaks down when lessons and lectures are delivered remotely, with many students not wanting to share video and invite the class into their own home. The aim of this application is to restore some of this non-verbal communication to the world of online teaching, enabling effective education and ensuring that no student is left behind.
What it does
This web app takes video input from students and estimates their emotions. This is sent anonymously as feedback to educators, aggregated at the whole-class level. This allows teachers to identify when students are struggling with difficult material or, hopefully, to see that they are engaged and happy!
There is an option to manually override the estimated emotion, in case a student wishes to convey something other than the estimate provided, as well as the opportunity to send prompts to the educator for common scenarios.
How we built it
Front end
The front end is written using React, Bootstrap and SCSS. The interface has a minimal design so information can be taken in at a glance, as this is a secondary experience to either presenting or studying. Using React and SCSS with the BEM naming methodology helped keep components and styles reusable, improving performance and consistency.
Back end
The back end is a relatively simple Flask app that has two main roles. Firstly, it collates the students' individual emotions and turns them into useful statistics for the teacher. Perhaps more importantly, it analyses frames from the students' video to perform emotion analysis.
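As a rough illustration of the aggregation role, the sketch below shows how a Flask app could collect each student's latest emotion and serve an anonymous class-level summary. The endpoint names, payload fields and in-memory storage here are illustrative assumptions rather than our exact implementation.

```python
# Minimal sketch of the aggregation side of the backend.
# Endpoints and payloads are hypothetical, for illustration only.
from collections import Counter

from flask import Flask, jsonify, request

app = Flask(__name__)

# Latest reported emotion per (anonymous) student id, kept in memory.
latest_emotion = {}


@app.route("/emotion", methods=["POST"])
def report_emotion():
    """Students (or the analysis step) post their current emotion."""
    data = request.get_json()
    latest_emotion[data["student_id"]] = data["emotion"]
    return jsonify(status="ok")


@app.route("/class-summary", methods=["GET"])
def class_summary():
    """Teachers poll an aggregated, anonymous view of the class."""
    counts = Counter(latest_emotion.values())
    total = sum(counts.values()) or 1
    return jsonify({emotion: count / total for emotion, count in counts.items()})


if __name__ == "__main__":
    app.run(debug=True)
```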
We realised that building our own model from scratch would likely take too long to train to a standard that would be useful. Instead, we opted to use Microsoft Azure's Cognitive Services to recognise emotions displayed on students' faces. The output is a score for each emotion: anger, contempt, disgust, fear, happiness, neutral, sadness and surprise. This allowed an emotion to be assigned to the user by taking the maximal score.
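A minimal sketch of this step is shown below, calling the Face detection REST endpoint with the emotion attribute and taking the highest-scoring emotion. The endpoint and key values are placeholders, and error handling and retries are omitted; treat it as an illustration of the idea rather than our exact code.

```python
# Sketch: estimate the emotion in a single video frame via the Azure Face API.
# AZURE_ENDPOINT and AZURE_KEY are placeholders for your own resource values.
from typing import Optional

import requests

AZURE_ENDPOINT = "https://<your-region>.api.cognitive.microsoft.com"
AZURE_KEY = "<subscription-key>"


def estimate_emotion(frame_bytes: bytes) -> Optional[str]:
    response = requests.post(
        f"{AZURE_ENDPOINT}/face/v1.0/detect",
        params={"returnFaceAttributes": "emotion"},
        headers={
            "Ocp-Apim-Subscription-Key": AZURE_KEY,
            "Content-Type": "application/octet-stream",
        },
        data=frame_bytes,
    )
    faces = response.json()
    if not faces:
        return None  # no face detected in this frame
    scores = faces[0]["faceAttributes"]["emotion"]
    # Assign the emotion with the maximal score, e.g. "happiness"
    return max(scores, key=scores.get)
```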
The emotions provided, whilst commonly used amongst facial recognition studies, unfortunately aren't all very applicable to an education setting (we certainly hope students aren't fearful!). One emotion we were particularly keen to be able to pick up on is one that will have been experienced by most students at some point in their education: confusion. To extend the classification to include this, we trained a gradient-boosted decision tree model (BDT) to recognise the differences between faces that were confused and those that weren't, using the above emotional scores as features. We created our own datasets for this training by scraping Google Image search results for confused and non-confused faces.
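The sketch below shows how such a classifier can be trained on the eight emotion scores with scikit-learn. Our actual data comes from the scraped images and our hyperparameters differ; the random stand-in data and parameter values here are illustrative assumptions only.

```python
# Sketch: train a gradient-boosted decision tree to separate "confused"
# from "not confused" faces, using the eight Azure emotion scores as features.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

EMOTIONS = ["anger", "contempt", "disgust", "fear",
            "happiness", "neutral", "sadness", "surprise"]

# Stand-in for the real dataset: one row of eight emotion scores per scraped
# image, with label 1 for "confused" images and 0 otherwise.
rng = np.random.default_rng(0)
X = rng.random((500, len(EMOTIONS)))
y = rng.integers(0, 2, size=500)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

bdt = GradientBoostingClassifier(n_estimators=200, max_depth=3)
bdt.fit(X_train, y_train)
print("held-out accuracy:", bdt.score(X_test, y_test))
```

At inference time, the classifier's prediction can override the Azure label whenever it flags a face as confused, extending the original eight-class output with a ninth, education-specific class.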
Accomplishments that we're proud of
We're proud of creating an app that feels so solid and intuitive in such a short time span. It's been especially great sharing the app with our friends and family, seeing them use it and have fun with how it recognises their emotions.
What we learned
We went from knowing little about emotion recognition to creating an app around it. Along the way, we also learned a lot about cloud services, as even those of us who had used them before hadn't had much experience setting them up. We also learned a lot about building a full-stack application while distributed around the world, in completely different time zones.
What's next for ReadTheRoom
Ideally we hope that this app could be included in existing video conferencing software, so that users don't need to start two applications. We would also like to extend the app to store the emotional analysis, which could then be played back alongside a recorded lesson if required.
Currently, for the application to work, frames of the video feed have to be sent to Microsoft Azure's Cognitive Services, which could potentially lead to some privacy concerns. The problem with running the model directly on users' computers is that some less advantaged students might not have the hardware required to carry out this analysis smoothly. One option would be to transform the image locally into a privacy-preserving feature space with a simple model, which could then be sent to the cloud for analysis with a more complicated model.
Whilst we have targeted this tool at the education sector, we also think it would be very welcome across the board in more corporate settings. Big team meetings could gain an extra level of interactivity, ensuring that presenters or speakers could remain engaged with their audience. Voice recognition would be a natural extension here too, perhaps to pick up laughter in the homes of audience members - there's nothing worse than being unsure if your joke landed!