As society becomes paralyzed by the spread of COVID-19, more and more people find themselves staring at computer screens while working from home, and many will soon experience the harmful effects of prolonged screen time. We want to create an assistant that can recognize the signs of digital eye strain and alert the user.

What it does

Ocular Aid uses computer vision to detect symptoms of digital eye strain. Face detection also tracks the user's total screen time, and users can set periodic alerts that remind them to rest their eyes.
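The actual app is written in C#, but the screen-time bookkeeping behind face detection and periodic alerts can be sketched in a few lines of Python. This is a minimal, hypothetical sketch (the class and parameter names are ours, not from the project): each frame, the caller reports whether a face was detected, and the tracker accumulates on-screen time and signals when another rest-reminder interval has elapsed.

```python
class ScreenTimeTracker:
    """Accumulates screen time from per-frame face-detection results.

    Hypothetical helper illustrating the idea; the real Ocular Aid
    implementation is in C# and may differ.
    """

    def __init__(self, alert_interval_s=20 * 60):
        self.alert_interval_s = alert_interval_s  # e.g. remind every 20 min
        self.total_s = 0.0                        # accumulated screen time
        self._last_t = None                       # timestamp of last frame with a face
        self._next_alert_s = alert_interval_s     # next reminder threshold

    def update(self, face_present, t):
        """Call once per frame with the detection result and a timestamp (s).

        Returns True when a rest reminder should fire.
        """
        alert = False
        if face_present:
            if self._last_t is not None:
                # Only count time between consecutive frames with a face.
                self.total_s += t - self._last_t
            self._last_t = t
        else:
            # Face gone: stop accumulating until it reappears.
            self._last_t = None
        if self.total_s >= self._next_alert_s:
            self._next_alert_s += self.alert_interval_s
            alert = True
        return alert
```

Gaps where no face is visible (the user walked away) are simply not counted, which is why the tracker stores the last face-bearing timestamp rather than the last frame time.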

How we built it

Ocular Aid is centered around the fact that humans blink at a slower rate when fatigued or experiencing eye strain. Using OpenCV, eyes and their bounding boxes are detected with Haar cascade classifiers, and each detection is cropped to contain a single eye. Each crop is then classified as open or closed by a pre-trained DenseNet-121 feature extractor attached to a fully-connected classifier (trained to 96% accuracy on test data). Ocular Aid's desktop application was created using C# and its WPF UI framework.
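Once each frame's eye crop has been labeled open or closed, the remaining step is turning that per-frame sequence into a blink rate that can be compared against a fatigue threshold. A minimal Python sketch of that last step, under our own assumptions (the function name and the 1 = open / 0 = closed encoding are illustrative, not from the project):

```python
def blink_rate_per_min(labels, fps):
    """Estimate blinks per minute from per-frame eye-state labels.

    labels: sequence of per-frame predictions, 1 = open, 0 = closed
            (hypothetical encoding for this sketch).
    fps:    frames per second of the capture.

    A blink is counted as a closed -> open transition, so a run of
    consecutive closed frames counts as a single blink.
    """
    blinks = sum(1 for prev, cur in zip(labels, labels[1:])
                 if prev == 0 and cur == 1)
    minutes = len(labels) / fps / 60.0
    return blinks / minutes
```

Counting closed-to-open transitions (rather than raw closed frames) keeps a long blink spanning several frames from being counted more than once.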

Challenges we ran into

  1. A majority of our group had little to no experience with C# and WPF.
  2. The Haar cascade classifiers had some issues when the user was wearing glasses.

Accomplishments that we're proud of

We hacked out a full-blown Windows app with real-world applications in just 24 hours! Not only that, but we learned a lot along the way.

What we learned

  1. We gained lots of experience with C# programming.
  2. We learned about Haar cascade classifiers.
  3. We became much more experienced with OpenCV and image processing.

What's next for Ocular Aid

We'd like to have Ocular Aid take more factors than just blink frequency into account when detecting eye strain. There are also several possible optimizations. For example, we work with greyscale images of eyes while the convolutional neural network accepts 3-channel RGB inputs, so the network has more parameters than it actually needs.
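The mismatch described above is typically bridged by replicating the single greyscale channel three times before feeding the image to an RGB-only network, which is why the first-layer filters end up carrying redundant weights. A small NumPy illustration (assuming this replication approach is what the pipeline does):

```python
import numpy as np

# A tiny stand-in for a greyscale eye crop (H x W, single channel).
gray = np.arange(6, dtype=np.uint8).reshape(2, 3)

# Replicate the channel to produce an H x W x 3 "RGB" image that a
# 3-channel network like DenseNet-121 will accept. All three channels
# are identical, so any weights distinguishing them are wasted.
rgb = np.repeat(gray[:, :, np.newaxis], 3, axis=2)
```

Retraining the first convolutional layer to accept a single input channel would remove that redundancy, at the cost of not reusing the pre-trained first-layer weights directly.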