Clara, one of our team members, commuted two hours each day last summer to her job in downtown Toronto. These long rush-hour drives, combined with her lack of sleep, left her extremely fatigued and inattentive behind the wheel, endangering both her own safety and that of the drivers and pedestrians around her. In the US, drowsy driving causes an estimated 100,000 crashes, 70,000 injuries, and 1,500 deaths annually. Our team realized that Clara, along with millions of others in similar situations, could benefit from a connected, in-car AI assistant that detects driver drowsiness and intervenes to keep drivers alert, ensure safety, and prevent accidents.
What it does
Iris is a voice-activated AI assistant for connected vehicles. It continuously monitors the driver's eye-aspect-ratio and facial expressions through a dashboard camera or smartphone front camera to detect drowsiness and gauge the driver's mood. When the driver appears drowsy, Iris suggests taking a break and recommends nearby cafes, restaurants, or hotels. It also supports hands-free voice calling and texting to help the driver stay alert, and can suggest and play Spotify songs matched to the driver's facial emotions and mood. Finally, to protect pedestrians and other drivers, Iris warns the driver about nearby pedestrian and school crossings and encourages extra caution.
How we built it
The Iris AI assistant was built in several stages. First, to monitor the driver's eye-aspect-ratio, we trained a custom ResNet-based facial keypoint estimation model on Google Cloud Platform using the open-source VGG Face dataset. The model is paired with OpenCV to continuously compute the driver's eye-aspect-ratio, and the detected facial keypoints are also streamed to Google's Cloud Vision API to monitor the driver's facial expressions and emotions.
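Once the eye landmarks are available, the eye-aspect-ratio itself is a simple geometric ratio. A minimal sketch, assuming six eye landmarks ordered as in the common 68-point facial landmark convention (an assumption, not necessarily the exact output format of our model):

```python
import math

def eye_aspect_ratio(eye):
    """Compute the eye-aspect-ratio (EAR) from six (x, y) eye landmarks,
    ordered p1..p6 around the eye contour as in the standard 68-point
    facial landmark convention. EAR approaches 0 as the eye closes."""
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])
    # vertical distances between upper and lower eyelid landmarks
    v1 = dist(eye[1], eye[5])
    v2 = dist(eye[2], eye[4])
    # horizontal distance across the eye corners
    h = dist(eye[0], eye[3])
    return (v1 + v2) / (2.0 * h)

# Illustrative landmark coordinates, not real detector output:
open_eye = [(0, 0), (1, 2), (3, 2), (4, 0), (3, -2), (1, -2)]
closed_eye = [(0, 0), (1, 0.1), (3, 0.1), (4, 0), (3, -0.1), (1, -0.1)]

print(eye_aspect_ratio(open_eye))    # 1.0 for this fully open example
print(eye_aspect_ratio(closed_eye))  # near 0: vertical gaps have collapsed
```

In practice the EAR is averaged over both eyes and compared against a threshold over a window of consecutive frames, so that a blink is not mistaken for drowsiness.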
Whenever the driver is determined to be drowsy (the eye-aspect-ratio approaches 0 as the eyes close), the Iris AI assistant is invoked to warn the driver. Iris runs on a custom NLP engine that identifies user intent, aided by Google's Speech-to-Text and Text-to-Speech APIs: it parses the driver's spoken input, determines the intent behind it, acts on that intent, and responds through voice output.
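The intent-identification step can be sketched as a keyword-scoring classifier over the speech-to-text transcript. This is a simplified illustration of the idea; the intent names and keywords below are made-up assumptions, not Iris's actual vocabulary:

```python
# Hypothetical keyword-based intent classifier sketching the shape of a
# custom NLP engine; the intents and keywords here are illustrative only.
INTENTS = {
    "find_rest_stop": {"cafe", "coffee", "restaurant", "hotel", "rest", "break"},
    "make_call":      {"call", "phone", "dial"},
    "play_music":     {"play", "song", "music", "spotify"},
}

def identify_intent(transcript: str) -> str:
    """Score each intent by keyword overlap with the transcript and
    return the best match, or 'unknown' if nothing matches."""
    words = set(transcript.lower().split())
    scores = {name: len(words & kws) for name, kws in INTENTS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"

print(identify_intent("find me a coffee shop so I can take a break"))
print(identify_intent("call my sister"))
```

A production engine would add slot extraction (which contact to call, what kind of music) on top of this classification step, but the parse-score-dispatch shape stays the same.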
Finally, we built multiple features on top of Iris using a variety of APIs, all aimed at helping a drowsy driver. Using the City of London's Open Data and the Google Maps API, Iris detects nearby pedestrian and school crossing zones and warns the driver to be cautious when approaching them. The driver can also ask Iris to search for nearby cafes, restaurants, hotels, or parking lots to find a safe place to rest. We integrated the Twilio API so Iris can access the driver's contacts to call or text a friend, a family member, or emergency services. Finally, using the Spotify API's recommendation engine, Iris can suggest and play songs that match the driver's current emotions and mood.
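At its core, the crossing-zone warning reduces to a distance check between the vehicle's position and the crossing coordinates from the open dataset. A sketch using the haversine formula; the coordinates and warning radius below are made up for illustration:

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def crossings_nearby(vehicle, crossings, radius_m=150.0):
    """Return the crossing zones within radius_m of the vehicle position."""
    return [c for c in crossings if haversine_m(*vehicle, *c) <= radius_m]

# Illustrative coordinates only (roughly downtown London, Ontario):
zones = [(42.9849, -81.2453), (42.9900, -81.2500)]
vehicle = (42.9850, -81.2455)
print(crossings_nearby(vehicle, zones))  # only the first zone is within range
```

In the real pipeline the vehicle position would come from Google Maps geolocation and the zone list from the open dataset; triggering a voice warning is then just a matter of checking this list on each position update.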
Challenges we ran into
We ran into a number of challenges while building Iris. First, training the facial keypoint estimation model was difficult given the limited compute available to us on GCP. Accurately estimating drowsiness from our custom model's output, and building a voice-command NLP engine that reliably parses and identifies user intent, were also challenging. Finally, integrating the Google Maps API with the City of London's Open Data to deliver location-based warnings and features to the driver took considerable effort.
Accomplishments that we're proud of
We’re proud of having built a fully functioning voice-activated AI assistant that helps detect drowsy drivers. Specifically, we’re proud of building a model that successfully detects driver drowsiness, writing our own custom NLP engine, and integrating Iris with external APIs such as Twilio and the City of London's Open Data.
We’re also proud of our teamwork abilities, as we leveraged each team member’s strengths in front-end, back-end, data science, and business case construction to build a complete and comprehensive solution.
What we learned
We learned a great deal while developing Iris: how to use computer vision for facial keypoint estimation, how the eye-aspect-ratio and facial expressions can signal driver drowsiness, how to build a voice-command AI agent from scratch (particularly intent parsing and recognition), and how to combine Google Maps' geolocation capabilities with the City of London's Open Data to provide accurate location insights and warnings for the driver.
What's next for Iris
We see Iris as an opportunity to improve the safety of millions of drivers worldwide, both in regions with high traffic density and in transportation industries such as trucking, one of the most dangerous professions. We want to explore integrating Iris with vehicle data (such as fuel and engine status) and with wearables like the Apple Watch and Fitbit to track heart rate and other indicators of drowsiness and wellbeing. With Iris, we believe we can work with governments and car manufacturers to achieve higher levels of road safety, improve the in-car driver experience, and ultimately save lives.