Inspiration
As busy students, we take lots of photos: of friends and food, mountains and beaches, but also of class notes and whiteboards. Come finals season, we desperately hunt for every note we can study from and get jealous of the photos other people are posting on social media. To find that one handout we were sure we photographed, or that one spring break picture taken by the friend who can actually take good pictures, we've scrolled through years of photos on our phones.
What it does
Photo Ally does three main things:
- Decides whether a photo you took (in our app) is a class note and, if so, puts it in a separate album
- Helps you tag the photo right after taking it
- Provides a search function for finding tagged photos
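As a rough illustration of the tagging and search features listed above, here is a minimal Python sketch of a tag index. The `TagIndex` class, its method names, and the photo identifiers are all hypothetical and are not taken from the app, which is written in Swift. The sketch assumes single-word tags and treats a multi-word query as "match every word".

```python
from collections import defaultdict


class TagIndex:
    """Hypothetical in-memory index mapping tags to photo identifiers."""

    def __init__(self):
        self._photos_by_tag = defaultdict(set)

    def tag(self, photo_id, tags):
        # Normalize so "Calculus " and "calculus" land in the same bucket.
        for raw in tags:
            key = raw.strip().lower()
            if key:
                self._photos_by_tag[key].add(photo_id)

    def search(self, query):
        # Return the photos that carry every word of the query as a tag.
        words = [w.lower() for w in query.split()]
        if not words:
            return []
        matches = [self._photos_by_tag.get(w, set()) for w in words]
        return sorted(set.intersection(*matches))


index = TagIndex()
index.tag("IMG_001", ["Calculus", "midterm"])
index.tag("IMG_002", ["calculus", "whiteboard"])
index.tag("IMG_003", ["beach"])
```

With this data, searching for "calculus" finds the first two photos, while "calculus midterm" narrows the result to the first one.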
How we built it
_ Building the Machine Learning Model _ We built the model using fastai. Starting from a pretrained ResNet-18, we trained it to distinguish photos of handwritten notes from ordinary images. We then converted the model to ONNX, an intermediate exchange format that makes it easier to move a trained model into production. Finally, we converted the ONNX model to a Core ML model, which we exported for use in our iOS application.
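The three steps above (fine-tune, export to ONNX, convert to Core ML) could look roughly like the following sketch. It is not our original training script. It assumes the fastai v2 API names, the `onnx-coreml` package for the last step, a 224x224 input size, three epochs, and a data folder with one subfolder per class. The function name, file names, and the "image" and "scores" tensor names are hypothetical. Input normalization for the Core ML model is deliberately left out and would need to match what was used in training.

```python
def train_and_export(data_dir, onnx_path="notes.onnx",
                     mlmodel_path="NoteClassifier.mlmodel"):
    """Fine-tune ResNet-18 on notes vs. other photos and export it for iOS.

    Assumes data_dir holds one subfolder per class, e.g. notes/ and other/.
    """
    # Imported lazily so this file loads even without the ML stack installed.
    import torch
    from fastai.vision.all import (
        ImageDataLoaders, Resize, accuracy, resnet18, vision_learner,
    )
    from onnx_coreml import convert

    # 1. Fine-tune a pretrained ResNet-18 on the two classes.
    dls = ImageDataLoaders.from_folder(
        data_dir, valid_pct=0.2, seed=42, item_tfms=Resize(224)
    )
    learn = vision_learner(dls, resnet18, metrics=accuracy)
    learn.fine_tune(3)  # epoch count is an assumption

    # 2. Export the underlying PyTorch model to ONNX using a dummy input.
    model = learn.model.cpu().eval()
    dummy = torch.randn(1, 3, 224, 224)
    torch.onnx.export(model, dummy, onnx_path,
                      input_names=["image"], output_names=["scores"])

    # 3. Convert the ONNX file to a Core ML classifier and save it.
    mlmodel = convert(model=onnx_path, mode="classifier",
                      image_input_names=["image"],
                      class_labels=list(dls.vocab))
    mlmodel.save(mlmodel_path)
    return mlmodel_path
```

Calling `train_and_export("data/")` would write both the intermediate ONNX file and the `.mlmodel` file that gets added to the Xcode project.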
_ Building the iOS Application _ We started by building a basic camera. Instead of saving each photo immediately, we pass it through our ML model. If the photo is recognized as a note, we save it to a separate album; otherwise, it is saved as usual. We then built two more screens, one for tagging and one for search results, along with the gestures and UI needed to make the process seamless for the user.
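The save-time decision described above can be sketched in a few lines. This is a Python illustration rather than the app's Swift code, and it assumes the classifier returns a probability that the photo is a note. The 0.5 threshold, the album names, and the function names are hypothetical.

```python
NOTE_THRESHOLD = 0.5  # hypothetical cutoff; the app's actual rule may differ


def choose_album(note_probability, threshold=NOTE_THRESHOLD):
    """Pick the destination album from the classifier's note probability."""
    if not 0.0 <= note_probability <= 1.0:
        raise ValueError("note_probability must be between 0 and 1")
    return "Notes" if note_probability >= threshold else "Camera Roll"


def save_photo(photo_id, note_probability, albums):
    """Append the photo to the chosen album and return the album's name."""
    album = choose_album(note_probability)
    albums.setdefault(album, []).append(photo_id)
    return album


albums = {}
save_photo("IMG_010", 0.93, albums)  # confidently a note
save_photo("IMG_011", 0.08, albums)  # an ordinary photo
```

Keeping the decision in one small function makes the threshold easy to tune if too many ordinary photos end up in the notes album.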
Challenges we ran into
- For our machine learning model, we started with ResNet-34, a more popular choice. However, it overfit our data, and new pictures of notes were not recognized as notes. Switching to a smaller model, ResNet-18, solved this.
- Unbelievably, writing myCoreMLModel.predict(image) doesn't always work (or we couldn't figure out how to use it properly). Instead, we wrapped the model in VNCoreMLModel and followed the more involved prediction process shown in Apple's online example.
Accomplishments that we're proud of and what we learned
- This was our first time building a full iOS application. While some of us had studied a little Swift and used Xcode beforehand, searching for how to do every little thing was both challenging and rewarding. We learned a lot.
What's next for Photo Ally
We have several ideas for features that can be expanded upon:
- Separating the notes by class, using the time the photo was taken and what is written on the notes as data
- Creating automatic tags using machine learning
- A more flexible search algorithm, e.g. one that also returns images whose tags have meanings similar to the search terms