Introduction
Inspiration
[Jonny] The idea came to me a while ago, when I discovered a spider while living in halls in the US. Unable to identify the spider, we had to evacuate the room (read: we ran out, screaming). Given the advancements in, and increased access to, ML and computer vision algorithms, I thought it should be possible to automate this process and at least give some guidance as to what species a spider might belong to.
What it does
The user uploads a picture of a spider or other venomous creature (currently snakes and scorpions), which is then fed through an image recognition algorithm to determine which species the animal belongs to. The precision of these classifiers ranges from 75% to 93%, depending on the creature, with recall rates of 65% to 85%. Although this makes the tool more of a heuristic than a definitive identification, it can provide some initial guidance towards identifying a spider. If the image matches two trained categories, both are shown on screen.
Development
How we built it
The user's uploaded image is first fed through a trained Custom Vision classifier, which determines the type of animal (i.e. scorpion, snake or spider). Depending on this result, the image is then fed into a specialised classifier which identifies the specific species of spider, scorpion or snake. It should be possible to use this technique to work around the limitations placed upon the Custom Vision API, as we could add further hierarchical classifiers (e.g. one for tarantulas) which feed into more specific species identifiers.
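As a rough illustration of this two-stage routing (not our exact code: the endpoint, project IDs, keys and tag names below are placeholders, and it assumes the Custom Vision v3.0 prediction REST endpoint):

```javascript
// Sketch of the two-stage classification: one classifier picks the animal
// type, then a species-level classifier for that type is queried.
// All endpoint/key/project values below are placeholders.
const PREDICTION_KEY = process.env.PREDICTION_KEY;
const BASE = 'https://<region>.api.cognitive.microsoft.com/customvision/v3.0/Prediction';

const ANIMAL_PROJECT = { id: '<animal-project-id>', iteration: 'Iteration1' };
const SPECIES_PROJECTS = {
  spider:   { id: '<spider-project-id>',   iteration: 'Iteration1' },
  snake:    { id: '<snake-project-id>',    iteration: 'Iteration1' },
  scorpion: { id: '<scorpion-project-id>', iteration: 'Iteration1' },
};

// Send raw image bytes to a Custom Vision classifier and return its
// predictions sorted by probability (highest first).
async function classify(imageBuffer, { id, iteration }) {
  const res = await fetch(`${BASE}/${id}/classify/iterations/${iteration}/image`, {
    method: 'POST',
    headers: {
      'Prediction-Key': PREDICTION_KEY,
      'Content-Type': 'application/octet-stream',
    },
    body: imageBuffer,
  });
  const { predictions } = await res.json();
  return predictions.sort((a, b) => b.probability - a.probability);
}

// Stage 1: what kind of animal is it? Stage 2: which species of that animal?
// Assumes the animal classifier's tags are named 'spider', 'snake', 'scorpion'.
async function identify(imageBuffer) {
  const [animal] = await classify(imageBuffer, ANIMAL_PROJECT);
  const speciesPredictions = await classify(imageBuffer, SPECIES_PROJECTS[animal.tagName]);
  return {
    animal: animal.tagName,
    // Surface every species tag that matches with reasonable confidence.
    species: speciesPredictions.filter(p => p.probability > 0.5),
  };
}
```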
After the species is identified, data is pulled from our own custom API, deployed using NodeJS, which lets us query our SQL database for further information on the risk if bitten and background information about the creature. The front end also pulls from Bing Image Search to show the user what a typical member of that species looks like.
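A minimal sketch of what such an info endpoint could look like, assuming an Express server, the `mssql` driver and a hypothetical `creatures` table (the real schema and route names may differ):

```javascript
// Minimal sketch of the info API: look up a species and return its
// risk/background data. Table and column names here are illustrative.
const express = require('express');
const sql = require('mssql');

const app = express();

app.get('/creature/:species', async (req, res) => {
  try {
    const pool = await sql.connect(process.env.SQL_CONNECTION_STRING);
    const result = await pool.request()
      .input('species', sql.NVarChar, req.params.species)
      .query('SELECT name, risk, description FROM creatures WHERE name = @species');

    if (result.recordset.length === 0) {
      return res.status(404).json({ error: 'species not found' });
    }
    res.json(result.recordset[0]);
  } catch (err) {
    res.status(500).json({ error: err.message });
  }
});

app.listen(process.env.PORT || 3000);
```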
Challenges we ran into
Backend
The Custom Vision API is not designed to pick up minor differences between similar objects, so when comparing two very similar species that are hard to tell apart visually (e.g. the black widow and the redback), the classifier's precision and recall take a significant hit. As these species are restricted to two very different parts of the world, it would be possible to build a classifier covering both and then advise the user which one it is more likely to be based on their location.
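A hypothetical sketch of that location-based hint, using the browser's geolocation and a deliberately crude bounding box for Australia (the function name and thresholds are made up for illustration):

```javascript
// Hypothetical: when the classifier returns the combined 'black widow / redback'
// tag, use the user's location to suggest which species is more likely.
// Redbacks are native to Australia; black widows are found mainly in the Americas.
function disambiguateWidow(latitude, longitude) {
  const inAustralia = latitude < -10 && latitude > -45 &&
                      longitude > 110 && longitude < 155;
  return inAustralia ? 'redback' : 'black widow';
}

navigator.geolocation.getCurrentPosition(pos => {
  const likely = disambiguateWidow(pos.coords.latitude, pos.coords.longitude);
  console.log(`Most likely species for your location: ${likely}`);
});
```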
We chose to keep the two separate classifiers as well as the combined one, as we felt it was important both to show the limitations of this technology and to see how good we could make the algorithm. Given how similar the two species are, the final results were surprisingly good, and were achieved using a carefully curated training image set for both spiders. The combined tag 'black widow / redback' was applied to both species and achieved a final precision of 94.1% and recall of 90.6%; the per-species results are shown below.
[Image credit: Bloomingdeadalus, Wikipedia] Final precision: 57.0%, final recall: 59.3%
[Image credit: Toby Hudson, Wikipedia] Final precision: 55.6%, final recall: 46.3%
Front End
Initially, when connecting our React front end to the Microsoft Custom Vision API, the upload component we used was base64-encoding the image before submission, a format the API would not accept. After extensive testing we swapped the component, which resolved the issue.
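A sketch of the working approach, sending the raw file from the upload input instead of a base64 string (the endpoint and key constants are placeholders):

```javascript
// Sketch of the fix: send the raw image bytes from the file input, rather
// than a base64 string, to the Custom Vision prediction endpoint.
const PREDICTION_URL = '<custom-vision-prediction-endpoint>'; // placeholder
const PREDICTION_KEY = '<prediction-key>';                    // placeholder

async function submitImage(file) {
  const res = await fetch(PREDICTION_URL, {
    method: 'POST',
    headers: {
      'Prediction-Key': PREDICTION_KEY,
      'Content-Type': 'application/octet-stream',
    },
    body: file, // the File/Blob from <input type="file">, sent as binary
  });
  return res.json();
}
```

In a React component this would typically be wired to the file input's change handler, e.g. `submitImage(event.target.files[0])`.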
Our front end team had not used promises in React before the hack, which took some getting used to; with a lot of perseverance, they eventually got there.
Embedding an image of the spider in the results page also posed a challenge, as a suitable image search API had to be found, something our front end devs had not worked with before. Once the API was chosen, manipulating the returned image and embedding it in React presented further challenges.
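A rough sketch of how the reference photo could be fetched and embedded, assuming the Bing Image Search v7 REST endpoint and a hypothetical `SpeciesPhoto` component (our actual implementation may differ):

```javascript
// Sketch: fetch one reference photo for the identified species from
// Bing Image Search and render it on the results page.
import React, { useEffect, useState } from 'react';

function SpeciesPhoto({ species }) {
  const [imageUrl, setImageUrl] = useState(null);

  useEffect(() => {
    const url = `https://api.bing.microsoft.com/v7.0/images/search?q=${encodeURIComponent(species)}&count=1`;
    fetch(url, { headers: { 'Ocp-Apim-Subscription-Key': process.env.REACT_APP_BING_KEY } })
      .then(res => res.json())
      .then(data => setImageUrl(data.value && data.value.length ? data.value[0].thumbnailUrl : null))
      .catch(() => setImageUrl(null));
  }, [species]);

  return imageUrl ? <img src={imageUrl} alt={species} /> : <p>Loading image…</p>;
}
```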
Accomplishments that we're proud of
Two of our team members had never entered a hack before, another two had minimal experience, and we had one experienced team member. Most of us had never worked together before the hack, and forming a team from scratch and working together has been very rewarding. Getting an app to a deployable stage in 24 hours is a massive achievement for all of us, and one of which we are very proud.
We have all learnt a lot over the course of the hack. None of us had ever touched the Azure Custom Vision API before, and having the opportunity to try such revolutionary software and explore its capabilities has been rewarding. Our front end team has also learnt a lot about React, although if you ask them to do anything with CORS they might kill you.
[Jonny] I attempted this idea at another hack a few years ago; at that stage, however, no APIs were mature enough to let us achieve what we have here. Seeing the idea finally come to fruition has been really satisfying.
What we learned
So much. Everything? Literally.
What's next for Spidentify
The great thing about this project is that it's so easy to expand. Custom Vision lets us easily extend it to further species and creatures. With time, it should be possible to build an even better recognition back end; it should even be possible to train an entirely custom back end that is better at recognising minute differences in a specific animal.