Photo glasses capture a picture, which the API then analyzes for emotion recognition
UI display of emotion AI recognition
Spy glasses provide photo capture for image analysis
Built with Windows IoT Core
The //Oneweek Hackathon is Microsoft’s global hackathon dedicated to bringing employees together to collaborate on projects that promote Microsoft’s mission to “empower every person and every organization on the planet to achieve more.” This project combined artificial intelligence with IoT technology to help children with autism recognize facial emotions. Our project was inspired by a team member whose son has autism and struggles to connect and hold conversations with others.
How Might We: Many children with autism find reading emotions difficult, which limits their social skills and ability to make friendships. Through the hackathon, our team considered the following question: “How might we use technology to empower, aid, and improve emotion recognition and conversation for children with autism?”
This project explored the use of real-time image recognition and processing with AI from Microsoft’s Cognitive Services Emotion API. With recognition results, we then provided auditory prompts for conversation based on a recognized emotional context.
Our team conducted research and feedback sessions with children and their parents as we developed ideas, created prototypes, and iterated with code. We also consulted with therapists and, as the project developed, shifted focus to developing a learning tool for children to use at home and in therapy sessions.
Research and Analysis
From user interviews with a small group, our team made important discoveries about the desire for portability and for technology that does not impede social interaction.
Conversations with therapists provided these insights: Children with autism who struggle with emotion recognition often show a lack of eye contact which could be tied to difficulty understanding emotion. Recognizing facial expressions can be difficult to separate from vocal intonation and body language. While children struggle with emotion recognition, they often have an enhanced ability for ‘systemizing’ which is the desire to analyze, build, and predict system behaviors.
Design goals: The solution should not be reliant on a mobile phone. Any image capture should be simple and not impede social engagement. Recognition in real-time can assist children in conversation. Feedback about recognition should be delivered in a sensitive way.
Outcome from research and analysis: The design goals led us to propose a system with a pocket-sized device running Windows IoT Core as the computer/processor. For image capture, a spy glasses camera was proposed. With the spy camera, there is no need to take out a phone or camera while in conversation. The spy camera also encourages the child to keep eye contact so that an image can be captured for recognition. Recognition would take place with the Microsoft Emotion Recognition API in the cloud.
Components and Prototyping
For each area of our project the following components were chosen:
Image Capture: A spy glasses camera is used for image capture.
Processing: A Mino board was chosen to run Windows IoT Core. Its size allows easy transportation; running Windows IoT Core on the Mino is like having a computer in your pocket.
Emotion API: The Emotion API, running in the cloud, processes the images. The API performs high-level “cognitive tasks,” drawing insight from data using trained ML models, and can be called from C# on any hardware platform. The API told us which emotion(s) were detected and the confidence level for each emotion recognized.
Output: Headphones are used to deliver audio feedback based on emotion.
When would images be captured? To avoid continuous image capture and recognition, the child decides when they want emotion feedback. They press a button to capture a photo of the person they are speaking with, and that image is then sent to the cloud for analysis.
How are users told what emotion is recognized and how is feedback delivered? With the emotion API, multiple emotions are recognized and ranked based on percent certainty. For this project, we can tell the child the top emotion recognized and deliver them a prompt based on that emotion.
The following are prototyped audio responses to help guide conversation based on the emotion recognized.
For example, if a child is engaged in conversation with someone and captures a photo, and the AI decides that the primary emotion is “happy,” the audio prompt may be: “Happiness. Ask, ‘How is your day going so far?’”
Similarly, if “sad” is recognized, the audio prompt may be, “Sadness. Ask, ‘Is everything ok?’” or if “angry” is recognized, the prompt may be, “Angry. Ask, ‘Are you upset?’”
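The step from ranked emotions to a spoken prompt can be sketched in a few lines. This is only an illustration, written in Python for brevity rather than the C# the project used. The shape of the score dictionary (emotion label mapped to a confidence between 0 and 1), the label names, and the `min_confidence` cutoff are all assumptions made for the example, not details taken from the project or the API.

```python
# Hypothetical prompt table; the three entries mirror the prototyped
# audio responses described above.
PROMPTS = {
    "happiness": 'Happiness. Ask, "How is your day going so far?"',
    "sadness": 'Sadness. Ask, "Is everything ok?"',
    "anger": 'Angry. Ask, "Are you upset?"',
}


def top_emotion(scores):
    """Return the (label, confidence) pair with the highest confidence."""
    if not scores:
        raise ValueError("no emotion scores to rank")
    label = max(scores, key=scores.get)
    return label, scores[label]


def audio_prompt(scores, min_confidence=0.5):
    """Return the prompt for the top emotion.

    Returns None when the top emotion is below min_confidence (an assumed
    cutoff) or when no prompt has been written for that emotion.
    """
    label, confidence = top_emotion(scores)
    if confidence < min_confidence:
        return None
    return PROMPTS.get(label)


# Example scores, invented for illustration.
example = {"happiness": 0.91, "sadness": 0.02, "anger": 0.01, "neutral": 0.06}
print(audio_prompt(example))
```

Returning `None` for an uncertain or unmapped emotion lets the device stay silent rather than suggest a response that may not fit the conversation.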
From these design considerations and prototyping, giving audio feedback proved to be especially important for users on the higher end of the autism spectrum who struggled with finding appropriate responses in conversations with others.
Technical Challenges: While the project was initially intended to provide real-time recognition, the nature of using a cloud-based API caused a time gap between when photos were captured and when suggestive dialog was provided. A possible solution would be to run the emotion recognition algorithm locally instead of relying on the cloud.
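One way to quantify that time gap is to time the recognition call itself. The sketch below is an assumption-laden illustration, not the project's code: `fake_cloud_recognizer` is a hypothetical stand-in that sleeps briefly to simulate a network round trip, and any real recognizer (cloud or local) could be passed in its place to compare the two.

```python
import time


def timed_recognition(recognize, image_bytes):
    """Run a recognizer and return (scores, seconds elapsed)."""
    start = time.perf_counter()
    scores = recognize(image_bytes)
    elapsed = time.perf_counter() - start
    return scores, elapsed


def fake_cloud_recognizer(image_bytes):
    """Hypothetical stand-in for a cloud call; sleeps to mimic network delay."""
    time.sleep(0.05)
    return {"happiness": 0.9, "neutral": 0.1}


scores, elapsed = timed_recognition(fake_cloud_recognizer, b"placeholder image")
print(f"recognition took {elapsed * 1000:.0f} ms")
```

Measuring both a cloud call and a local model with the same wrapper would show whether running recognition on the device actually closes the gap.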
User Interface (UI) design expanded the scope of the project:
Real-time recognition proved challenging. Additionally, children wearing spy glasses in public settings could infringe on the privacy of others, since not every image would be captured with the subject’s consent.
Pivoting our idea to be used for training sessions: Our team decided to pivot our idea to focus on children using the emotion recognition system as a training tool at home and in therapy sessions. I worked with a front-end designer to explore how audio prompts could enhance displayed information. Our app, “IoT Emotion Analyzer,” allows children and therapists to view captured images together. The app lends itself to discussion and can be used as a guessing game on emotion.
Outcome and Next Steps
At the end of the hackathon, we had children come and try our system. We observed that the children enjoyed turning emotion recognition into a game, visualizing results through the UI.
Some of those children, and others who have followed them, have been able to use our application as a learning tool at home and in therapy sessions. Our team received encouraging feedback suggesting that children have shown increased emotional awareness and confidence to engage in conversation.