Inspiration
Facial analysis is an active topic in computer vision, and modern smartphones ship with specialized hardware that makes on-device (edge) inference much more efficient. We want to make facial analysis more accessible to the public, and the iOS platform felt like a good place to start. We therefore built an iOS app that detects human faces and provides analytics for each face.
What it does
Given an input image, Minitrue first runs face detection to determine how many faces are in the picture. If at least one face is present, Minitrue analyzes each detected face and outputs a list of per-face analytics.
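As a rough sketch (with simplified error handling, and assuming a `CGImage` input), the detection step can be driven by Vision's `VNDetectFaceRectanglesRequest`:

```swift
import Vision

// Detect faces in an image and hand back the observations.
// Hypothetical helper for illustration; errors are swallowed for brevity.
func detectFaces(in image: CGImage,
                 completion: @escaping ([VNFaceObservation]) -> Void) {
    let request = VNDetectFaceRectanglesRequest { request, _ in
        // Each VNFaceObservation carries a normalized bounding box.
        completion((request.results as? [VNFaceObservation]) ?? [])
    }
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    // Run off the main thread so the UI stays responsive.
    DispatchQueue.global(qos: .userInitiated).async {
        try? handler.perform([request])
    }
}
```

The count of returned observations decides whether the analytics step runs at all.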
How we built it
- Finding a suitable model. We explored many image-based machine learning projects to find pre-trained models that are both accurate and easy to integrate.
- Making a demo. We cloned our target project from GitHub and built an MVP version of the app to validate our idea.
- Implementing the front-end interface. We used SwiftUI components to make our MVP more accessible and better-looking. We also added face detection and face cropping powered by Apple's Vision framework.
- Testing. We used face images of different people to test the robustness of our app. We varied the race, gender, and age of the faces, as well as the number of faces in the picture, and checked that the app behaved correctly.
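The detection-and-cropping step mentioned above can be sketched as follows. Vision's `boundingBox` is normalized with a bottom-left origin, so it must be converted to top-left-origin pixel coordinates before `CGImage.cropping(to:)` is applied (a simplified sketch, not the app's exact code):

```swift
import Vision
import CoreGraphics

// Crop every detected face out of an image.
func cropFaces(from image: CGImage) throws -> [CGImage] {
    let request = VNDetectFaceRectanglesRequest()
    try VNImageRequestHandler(cgImage: image, options: [:]).perform([request])
    let width = CGFloat(image.width)
    let height = CGFloat(image.height)
    return (request.results ?? []).compactMap { face in
        let box = face.boundingBox  // normalized, origin at bottom-left
        // Flip the y-axis and scale up to pixel coordinates.
        let rect = CGRect(x: box.minX * width,
                          y: (1 - box.maxY) * height,
                          width: box.width * width,
                          height: box.height * height)
        return image.cropping(to: rect)
    }
}
```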
Challenges we ran into
### CoreML
- Finding existing CoreML models for implementation.
- Figuring out how to use pre-trained CoreML models for inference.

### SwiftUI
- Understanding the lifecycle of SwiftUI views
- Learning how to use multithreading to create a responsive user interface

### Vision Framework
- Utilizing the built-in face detection functionality of the Vision framework.
- Understanding CoreGraphics image structures and their coordinate systems.
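The coordinate-system issue is mechanical once spelled out: Vision returns normalized rectangles with the origin at the bottom-left, while UIKit and CGImage cropping use a top-left origin. A pure helper (hypothetical name) captures the conversion:

```swift
import Foundation

// Convert a Vision-style normalized rect (bottom-left origin) into
// pixel coordinates with a top-left origin, given the image size.
func pixelRect(fromNormalized box: CGRect,
               imageWidth: CGFloat,
               imageHeight: CGFloat) -> CGRect {
    CGRect(x: box.minX * imageWidth,
           y: (1 - box.maxY) * imageHeight,  // flip the y-axis
           width: box.width * imageWidth,
           height: box.height * imageHeight)
}

// e.g. a box covering the bottom-left quarter of a 100x200 image,
// (0, 0, 0.5, 0.5), maps to (0, 100, 50, 100) in top-left coordinates.
```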
Accomplishments that we're proud of
We successfully implemented the facial analysis machine learning model, which produces accurate results on roughly 90% of our test inputs. We also implemented face detection and cropping that finds all faces in pictures containing more than one person.
What we learned
We learned about
- how to use a pre-trained CoreML model
- SwiftUI components and usage
- iOS built-in Vision framework API
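Putting the first and third points together, running a pre-trained CoreML model through Vision might look like the sketch below. `FaceAnalyzer` is a hypothetical stand-in for the model class Xcode generates from a `.mlmodel` file:

```swift
import Vision
import CoreML

// Classify a cropped face with a bundled CoreML model.
// `FaceAnalyzer` is a placeholder for the Xcode-generated model class.
func analyze(face: CGImage, completion: @escaping (String?) -> Void) {
    guard
        let wrapped = try? FaceAnalyzer(configuration: MLModelConfiguration()),
        let model = try? VNCoreMLModel(for: wrapped.model)
    else {
        completion(nil)
        return
    }
    let request = VNCoreMLRequest(model: model) { request, _ in
        // Report the top classification label (e.g. an age bracket).
        let top = (request.results as? [VNClassificationObservation])?.first
        completion(top?.identifier)
    }
    let handler = VNImageRequestHandler(cgImage: face, options: [:])
    DispatchQueue.global(qos: .userInitiated).async {
        try? handler.perform([request])
    }
}
```

Wrapping the CoreML model in `VNCoreMLModel` lets Vision handle image resizing and pixel-format conversion for us.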
What's next for Minitrue
There is still plenty to do on this project. First, we want to create an interactive overlay on the original image for more fluid user interaction. Second, it would be nice to handle a real-time video stream directly from the camera. Finally, our UI design is functional but not elegant, and there is plenty of room for improvement.
We also found a great article on predicting a person's facial appearance from their voice (Speech2Face). We were not able to implement this model on iOS due to time constraints, but we hope to convert it to CoreML and bring it to iOS in the future.
Built With
- coreml
- swift
- swiftui
- vision