We think that Invelon proposed an interesting challenge. We especially liked the 3D aspect of it and its difficulty. So, we wanted to challenge ourselves by being as ambitious as possible to see how far we could get!
What it does
A worker standing at a 3D printer with a finished print can take a picture of it with their phone and upload it to our web app. Our server then finds the .stl file (from the ones we were provided) that best matches the picture.
How does it internally find the match?
- The image is cropped to a square and resized to a fixed resolution.
- The image background is removed with an open-source library written in Python and based on the U2-Net deep learning architecture. (We initially tried to code our own, but we gave up because of the difficulty and because it was not our main focus.)
- The processed image is compared to a set of renders of the .stl files (more details on this later). Two algorithms are used in combination to measure how similar the photo is to a render: the average hash and the perceptual hash. We return the .stl that scored highest in the similarity comparison.
- Finally, we gather the information related to that .stl (e.g. customer info) from our database. This complete information is returned to the frontend.
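The cropping and hash comparison steps above can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the project's actual code: images are assumed to be already background-removed and converted to 2D grids of grayscale values, all function names are hypothetical, and combining the two hashes with an equal-weight average is an assumption, since the write-up does not say how the scores are combined.

```python
import math
from statistics import median

def center_crop_square(pixels):
    """Crop a 2D grayscale grid (list of rows) to its centered square."""
    height, width = len(pixels), len(pixels[0])
    side = min(height, width)
    top, left = (height - side) // 2, (width - side) // 2
    return [row[left:left + side] for row in pixels[top:top + side]]

def resize_nearest(pixels, size):
    """Resize a square grid to size x size with nearest-neighbour sampling."""
    side = len(pixels)
    return [[pixels[y * side // size][x * side // size] for x in range(size)]
            for y in range(size)]

def average_hash(pixels, size=8):
    """One bit per pixel: 1 where the pixel is brighter than the mean."""
    small = resize_nearest(center_crop_square(pixels), size)
    flat = [v for row in small for v in row]
    mean = sum(flat) / len(flat)
    return [1 if v > mean else 0 for v in flat]

def dct_1d(values):
    """Unnormalised DCT-II of a sequence."""
    n = len(values)
    return [sum(v * math.cos(math.pi * (i + 0.5) * k / n)
                for i, v in enumerate(values))
            for k in range(n)]

def perceptual_hash(pixels, size=8, factor=4):
    """Threshold the low-frequency block of the 2D DCT against its median."""
    n = size * factor
    small = resize_nearest(center_crop_square(pixels), n)
    rows = [dct_1d(row) for row in small]                 # DCT along each row
    cols = [dct_1d([rows[y][x] for y in range(n)])        # then along each column
            for x in range(n)]
    low = [cols[x][y] for y in range(size) for x in range(size)]
    mid = median(low)
    return [1 if v > mid else 0 for v in low]

def similarity(hash_a, hash_b):
    """Fraction of matching bits (1.0 means identical hashes)."""
    return sum(a == b for a, b in zip(hash_a, hash_b)) / len(hash_a)

def best_match(photo, renders):
    """renders maps a model name to a list of render grids.

    Assumption: both hash similarities are weighted equally.
    """
    photo_avg, photo_per = average_hash(photo), perceptual_hash(photo)

    def score(render):
        return (similarity(photo_avg, average_hash(render))
                + similarity(photo_per, perceptual_hash(render))) / 2

    return max(renders, key=lambda name: max(score(r) for r in renders[name]))
```

For example, calling `best_match(photo, {"a.stl": [render_1, render_2], "b.stl": [render_3]})` returns the name of the model whose best render is closest to the photo. Taking the best render per model, rather than an average, means one good viewing angle is enough to produce a match.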
What about the “renders”?
To compare 3D models with photos, we evaluated two approaches:
- Use deep learning to find matches.
- Create renders of the 3D models from many angles and do simple image comparison.
We discarded the deep learning approach because we were worried that we wouldn’t be able to train the neural net properly and get satisfying results. The render-based approach also seemed more achievable because we could split the problem into smaller ones, divide the work better, and increase our chances of success.
The renderer internally uses Three.js running in TypeScript. It loops through the .stl files and renders 6 images for each one (top, bottom, left, right, front and back). The images are then saved in the server’s local storage, where other processes can access them. In a real deployment this would be a bucket in some cloud.
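The loop described above can be outlined as follows. The real renderer is written in TypeScript with Three.js; this Python sketch only illustrates the "six views per model" plan. The axis convention (+Y up, +Z front), the camera distance, the output file naming and the function name `plan_renders` are all assumptions made for illustration.

```python
from pathlib import Path

# Direction from the model centre towards the camera for each view.
# Assumed convention: +Y is up and +Z faces the front of the model.
VIEW_DIRECTIONS = {
    "top": (0, 1, 0),
    "bottom": (0, -1, 0),
    "left": (-1, 0, 0),
    "right": (1, 0, 0),
    "front": (0, 0, 1),
    "back": (0, 0, -1),
}

def plan_renders(stl_paths, output_dir, distance=100.0):
    """Return one render job per (model, view) pair: six jobs per .stl file."""
    jobs = []
    for stl in map(Path, stl_paths):
        for view, (dx, dy, dz) in VIEW_DIRECTIONS.items():
            jobs.append({
                "model": stl.name,
                "view": view,
                # Camera sits `distance` units away and looks at the origin.
                "camera_position": (dx * distance, dy * distance, dz * distance),
                # Hypothetical naming scheme: <model>_<view>.png
                "output": Path(output_dir) / f"{stl.stem}_{view}.png",
            })
    return jobs
```

Each job would then be handed to the actual renderer, which positions the camera, draws the model and writes the image to the given path.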
How we built it
For the frontend we used Vue.js with TypeScript, plus Three.js as the library to display the models. For the backend we used Python (with Flask) and OpenCV, with SQL for the database and everything integrated with Docker.
If you take a look at our GitHub, you will find a monorepo with all the small projects in it, each one in its own folder. You can find more information in the README of each folder, but basically we have:
- matcher-client, which contains the front-end where you can upload the 3D model.
- matcher-server, which matches the image with the models we have in the renderer.
- renderer-server, which stores the models and their renders to be used in the matcher.
You can see a picture of the planned architecture in the project media shown at the top.
Challenges we ran into
On a technical level:
- Colorless .stl files: the models carry no color information, so it is difficult to match them against pictures of prints in different colors.
- Background removal to improve accuracy: we could not do this properly until we used a deep-learning-based approach.
- Rendering .stl files in a headless browser: Three.js is a wonderful library for this kind of rendering, but it relies on having a browser window to display the scene. To create the renders programmatically, we had to use Puppeteer as a headless browser automation tool.
- We wanted a scalable solution, so a deep learning classification model was not the best fit. This is why we ended up trying old-school computer vision techniques such as image hashing or SIFT.
On top of that, we’d add lack of time and tiredness, the usual suspects! Also, one of our team members participated online, and it is harder to communicate and develop a project in so little time when the team is not together.
Accomplishments that we're proud of
We are proud of the project we developed and of getting it to work. We are also proud of following good coding practices during development, even though in the end we could not do all the testing we had planned because of the lack of time.
What we learned
During these 24 hours (a bit less, because we slept) we learned how to render STLs and how to compute approximate similarity between images. It was a big challenge: a 3D model can be photographed from many different angles, which makes matching really difficult, and when figures are similar it is hard to tell them apart.
What's next for Identificador de impressions 3D per imatge
Maybe add more models, since right now there are just a few. If there is enough data, we could also train a machine learning model to match pictures against the renders more effectively, so that more pictures can be uploaded.