Training dataset for the digit recognizer.
Training sample for the page recognizer.
You all probably know someone who’s a very good designer but not very good at CAD. CAD software takes a while to learn, and some people find it very hard to learn at all. Most of the time, these people can make pretty good sketches on paper (how else did you know they were a good designer?). CAD shouldn’t be a limiting factor for these people: there should be some way to take their good hand-drawing skills and translate them directly into CAD. Even for those who have strong CAD skills, a computer isn’t always handy: maybe a bar napkin is the closest thing you could find when you came up with your million-dollar idea. There should be a way to get these fast but precise drawings into CAD. This is already possible with tracing tools in software like Inkscape, but the tracing tools don’t know anything about mechanical CAD and produce parts made of lots of little lines. In addition, the traced parts have no scale, even if you write one on the drawing, so you have to figure out how to scale them yourself.
The objective of this hack was to automatically ingest hand-drawn informal drawings into CAD software.
Does it Work?
Yes (well enough). If you hand me a hand drawing with some dimensions on it, I’ll take a picture of it and have a scaled DXF drawing compatible with any CAD software in about ten seconds.
How does it work?
1) Run a custom-trained SSD deep learning object detector, which detects three classes of lines on the page: the object itself that you want in CAD, dimension lines, and the numbers for the dimensions.
2) Look at the detected object and assume that for most machine parts, lines are going to be up-and-down or left-to-right, the same assumption that CAD software makes (snap to horizontal/vertical). Using this assumption, I numerically optimize a cleaned-up version of the part made only of meaningful lines (no pointless short ones).
3) Look at each detected dimension line. Figure out which line on the part it goes to (look at the closest one).
4) Figure out which number goes with which dimension line.
5) Read the number. It turns out there aren’t good systems out there for this (it wasn’t supposed to be hard), so I made my own. It is another custom-trained SSD deep learning object detector.
6) Scale the part to match the dimensions.
7) Save the part as a DXF.
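As a rough illustration of steps 2, 3, and 6 (this is not the actual DeepCAD code), the geometry could be sketched in plain Python as below. It makes several simplifying assumptions: the part is already a list of line segments in pixel coordinates, each segment is snapped on its own rather than by the numerical optimization described above (so corners are not kept joined), and "closest" means closest midpoint. All function names are hypothetical.

```python
from math import hypot

def snap_segment(p, q):
    """Force a nearly axis-aligned segment to be exactly horizontal or vertical."""
    (x1, y1), (x2, y2) = p, q
    if abs(x2 - x1) >= abs(y2 - y1):   # mostly left-to-right: flatten y
        y = (y1 + y2) / 2
        return (x1, y), (x2, y)
    x = (x1 + x2) / 2                  # mostly up-and-down: flatten x
    return (x, y1), (x, y2)

def midpoint(seg):
    (x1, y1), (x2, y2) = seg
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def length(seg):
    (x1, y1), (x2, y2) = seg
    return hypot(x2 - x1, y2 - y1)

def closest_edge(dim_line, edges):
    """Step 3 (simplified): pick the edge whose midpoint is nearest the dimension line's midpoint."""
    mx, my = midpoint(dim_line)
    return min(edges, key=lambda e: hypot(midpoint(e)[0] - mx, midpoint(e)[1] - my))

def scale_part(edges, factor):
    """Step 6: multiply every coordinate by the same factor."""
    return [((x1 * factor, y1 * factor), (x2 * factor, y2 * factor))
            for (x1, y1), (x2, y2) in edges]

# A wobbly hand-drawn rectangle, roughly 200 x 100 pixels.
rough = [((0, 2), (200, -2)), ((200, -2), (198, 100)),
         ((198, 100), (2, 102)), ((2, 102), (0, 2))]
snapped = [snap_segment(p, q) for p, q in rough]

# A dimension line drawn just below the bottom edge, labeled "50".
dim_line = ((0, -20), (200, -20))
edge = closest_edge(dim_line, snapped)
factor = 50 / length(edge)             # real-world units per pixel
scaled = scale_part(snapped, factor)
```

Here the bottom edge is 200 pixels long and is labeled 50, so every coordinate is multiplied by 0.25. With more than one dimension on the drawing, the per-dimension factors would need to be reconciled somehow (for example by averaging); that choice is left open in this sketch.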
What did I make during the Hackathon?
I made two custom deep learning object detectors: one to detect the relevant parts of the hand drawing and one to read handwritten numbers. For both, I made the training data myself with colored pencils and paper (19 training drawings and 3 validation drawings), scanned it in, labeled it, trained the SSD object detector, and validated it. I then put together all the pieces described above using traditional computer vision with OpenCV in Python.
Anaconda Python did not have a sufficiently up-to-date version of OpenCV or MXNet, so I built those myself.
What didn't I make (pre-built stuff)?
The object detectors were pre-designed by the GluonCV people and the researchers who actually invented the SSD object detector. MXNet is open-source and primarily authored by Amazon. OpenCV is open-source and primarily authored (nowadays) by Intel. I reused some of my own public code on GitHub, written for an unrelated project, to generate example drawings for me to draw by hand. None of this code actually does the work of DeepCAD; it just let me make the training data in an unbiased way.
What's left to do?
The output object doesn’t have associative dimensions like models in SOLIDWORKS or Autodesk Fusion 360 do; the size is fixed when it’s saved as a DXF file. The dimension information could be saved to make an even more useful output file. The input drawing also has some restrictions: it has to be one shape with no holes, and it can’t have curves or arcs. Curves get split up into a bunch of little lines, which aren’t semantically meaningful.
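To make the "fixed size" point concrete, here is a minimal sketch of writing line segments to DXF text by hand. It is not the code used in this project, and the function name is hypothetical. It assumes a bare ENTITIES section with no header, which many (but not necessarily all) DXF readers accept. Each LINE entity stores only its endpoint coordinates (group codes 10/20 and 11/21) on layer "0" (group code 8), so nothing in the file records which dimension produced which length.

```python
def segments_to_dxf(segments):
    """Return minimal DXF text with one LINE entity per segment."""
    out = ["0", "SECTION", "2", "ENTITIES"]
    for (x1, y1), (x2, y2) in segments:
        out += ["0", "LINE",
                "8", "0",                 # layer name
                "10", f"{x1:.6f}",        # start x
                "20", f"{y1:.6f}",        # start y
                "11", f"{x2:.6f}",        # end x
                "21", f"{y2:.6f}"]        # end y
    out += ["0", "ENDSEC", "0", "EOF"]
    return "\n".join(out) + "\n"

# A 50 x 25 rectangle, already scaled to real-world units.
rectangle = [((0, 0), (50, 0)), ((50, 0), (50, 25)),
             ((50, 25), (0, 25)), ((0, 25), (0, 0))]
dxf_text = segments_to_dxf(rectangle)

# To save it: open("part.dxf", "w").write(dxf_text)
```

Keeping the dimension-to-edge associations would require either DXF DIMENSION entities or a different output format; this sketch does neither.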
The handwritten digit detector doesn't work that well. I'm really surprised I couldn't find a good handwriting detector out there in the wild to train from. This will take more work to make it accurate.
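One place where detector errors show up is in turning individual digit detections into a number. The sketch below is an assumption about how that post-processing could look, not a description of the actual code: it supposes the detector returns one box per digit with a label, a confidence score, and horizontal extents, and that numbers are written left to right. The function name, dictionary keys, and thresholds are all hypothetical.

```python
def read_number(detections, min_score=0.5):
    """Assemble per-digit detections into a string, reading left to right."""
    # Drop low-confidence boxes, then order the rest by their left edge.
    boxes = sorted((d for d in detections if d["score"] >= min_score),
                   key=lambda d: d["x_min"])
    kept = []
    for box in boxes:
        if kept:
            prev = kept[-1]
            overlap = min(prev["x_max"], box["x_max"]) - max(prev["x_min"], box["x_min"])
            narrower = min(prev["x_max"] - prev["x_min"], box["x_max"] - box["x_min"])
            # Two boxes covering mostly the same spot: keep the more confident one.
            if overlap > 0.5 * narrower:
                if box["score"] > prev["score"]:
                    kept[-1] = box
                continue
        kept.append(box)
    return "".join(b["label"] for b in kept)

detections = [
    {"label": "5", "score": 0.9, "x_min": 30, "x_max": 48},
    {"label": "2", "score": 0.8, "x_min": 10, "x_max": 28},
    {"label": "3", "score": 0.6, "x_min": 31, "x_max": 47},  # duplicate of the "5"
    {"label": "7", "score": 0.2, "x_min": 60, "x_max": 75},  # below the score cutoff
]
number_text = read_number(detections)
```

A single misread or missed digit changes the scale of the whole part, which is why the accuracy of this detector matters so much.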
This is a tech demo and doesn’t have a whole lot in the way of UI/UX. It’s good enough to write a research paper about, but not good enough to give to the target audience, which needs an especially easy-to-use product.
What am I proud of?
I’m amazed I could make an object detector from scratch in the amount of time I had. I didn’t even give it that much training data, and it generalizes better than I thought it would to out-of-domain drawings. It even runs at an okay speed on my laptop, which doesn’t have a fancy GPU, so you might be able to get this whole thing to run on a cell phone. Training time wasn’t that bad either.