Inspiration
In manufacturing, engineers constantly switch between 2D technical drawings (PDFs) and 3D CAD models to verify tolerances, annotations, and feature specifications. This manual cross-referencing is slow, error-prone, and a bottleneck in quality assurance workflows. DrawMind3D was born from the question: what if GenAI could automatically link every annotation on a drawing to the exact geometric feature it references in the 3D model?
What it does
DrawMind3D is a GenAI-powered tool that automatically extracts annotations from PDF technical drawings and matches them to cylindrical features in STEP/CAD 3D models. Upload a drawing and a CAD file, and the system produces a structured JSON mapping of which annotation belongs to which 3D feature — achieving 96.8% linking accuracy and a 73.9% F1 score across 21 standardized test cases. The interactive web viewer lets engineers visually verify matches in a side-by-side PDF and 3D view.
How we built it
The pipeline is a hybrid of classical geometry processing and GenAI vision. On the CAD side, OpenCASCADE (OCP) extracts cylindrical features with precise diameters, depths, and spatial coordinates from STEP files. On the drawing side, PyMuPDF and Tesseract OCR extract text annotations, while Gemini Flash Vision (via OpenRouter) interprets complex GD&T callouts and leader line targets that pure OCR misses. Adding the vision step raised F1 from 30.9% to 73.9%, a 139% relative improvement.
The matching engine scores every annotation-feature pair with a 6-factor weighted matrix (diameter 45%, type 22%, depth 18%, count 8%, uniqueness 4%, spatial 3%) and solves the resulting assignment problem with the Hungarian Algorithm, which gives a globally optimal and deterministic assignment. Consistency is further ensured through multi-layer validation: structured LLM prompts with explicit whitelists/blacklists for hole annotations, geometric cross-validation against the 3D model to filter false positives, and unit-aware parsing with automatic inch-to-mm conversion.
The frontend uses Three.js for interactive 3D visualization and PDF.js for synchronized drawing display. Everything is containerized with Docker.
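The scoring-and-assignment step can be sketched in a few lines of Python. This is a minimal illustration, not DrawMind3D's actual code: the weights are the ones listed above, but the helper names `combined_score` and `hungarian`, the per-factor similarity values, and the sample data are all assumptions made for the example. Each factor is assumed to be pre-normalized to the range 0 to 1, and the cost of a pair is taken as one minus its combined score.

```python
# Weights taken from the write-up; everything else here is illustrative.
WEIGHTS = {
    "diameter": 0.45, "type": 0.22, "depth": 0.18,
    "count": 0.08, "uniqueness": 0.04, "spatial": 0.03,
}


def combined_score(factors):
    """Weighted sum of per-factor similarities, each assumed to lie in [0, 1]."""
    return sum(w * factors.get(name, 0.0) for name, w in WEIGHTS.items())


def hungarian(cost):
    """Minimum-cost assignment of rows to columns (requires rows <= columns).

    Potentials-based Hungarian algorithm; returns a dict {row: column}.
    """
    n, m = len(cost), len(cost[0])
    if n > m:
        raise ValueError("expected at most as many rows as columns")
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials
    v = [0.0] * (m + 1)   # column potentials
    p = [0] * (m + 1)     # p[j] = row (1-based) matched to column j, 0 if free
    way = [0] * (m + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0 != 0:  # flip matches along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return {p[j] - 1: j - 1 for j in range(1, m + 1) if p[j] != 0}


# Hypothetical per-factor similarities for 2 annotations x 3 CAD features.
factor_table = [
    [{"diameter": 1.0, "type": 1.0, "depth": 1.0},
     {"diameter": 1.0, "type": 1.0, "depth": 0.5},
     {}],
    [{"diameter": 1.0, "type": 1.0, "depth": 0.8},
     {"diameter": 0.2, "type": 1.0},
     {}],
]
scores = [[combined_score(f) for f in row] for row in factor_table]
costs = [[1.0 - s for s in row] for row in scores]
assignment = hungarian(costs)  # maps annotation index -> feature index
```

In this sample, a greedy matcher would give annotation 0 its best-scoring feature (feature 0) and leave annotation 1 with a poor match. The assignment solver instead pairs annotation 0 with feature 1 and annotation 1 with feature 0, which maximizes the total score across both annotations.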
Challenges we ran into
Technical drawings are deceptively complex: leader lines cross, annotations reference features in different views, and GD&T symbols follow domain-specific conventions that plain OCR cannot parse. Achieving reliable extraction required combining classical text extraction with vision LLM interpretation, then carefully validating which source to trust. Geometric matching was another challenge — real-world drawings contain dozens of similar-diameter holes, and without the weighted multi-factor scoring approach, naive matching produced too many false positives.
Accomplishments that we're proud of
Integrating GenAI vision raised F1 from 30.9% to 73.9%, a 139% relative improvement, which supports the hybrid approach over pure classical or pure AI methods. Our best test cases reached 100% F1 (SYN-05) and 91.7% F1 (FTC-07 from the NIST test suite). The system was evaluated against 21 test cases across 4 categories (NIST CTC, FTC, D2MI, and synthetic) with standardized ground truth, achieving 72.6% precision, 84.2% recall, and 96.8% linking accuracy, which gives stronger evidence of generalization than ad-hoc testing would. Building a fully functional end-to-end pipeline as a solo developer within a hackathon timeframe, from PDF parsing through 3D feature extraction and optimal matching to interactive visualization, was a significant accomplishment.
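For readers checking the figures, the headline numbers can be reproduced with a few lines of arithmetic. The sketch below uses the standard definitions of F1 and relative improvement. The function names are illustrative, and the remark about averaging is an assumption, since the write-up does not say how the aggregate F1 was computed.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def relative_improvement(before, after):
    """Percentage change from `before` to `after`."""
    return (after - before) / before * 100.0


def macro_f1(per_case):
    """Mean of per-test-case F1 values; `per_case` holds (precision, recall) pairs."""
    return sum(f1_score(p, r) for p, r in per_case) / len(per_case)


gain = relative_improvement(30.9, 73.9)  # F1 before vs. after the vision step
pooled = f1_score(0.726, 0.842)          # F1 implied by the aggregate P and R
```

Here `gain` comes out at roughly 139%, matching the reported figure. `pooled` is roughly 0.78, somewhat above the reported 73.9% F1. That gap would be consistent with F1 being averaged per test case (as `macro_f1` does) rather than derived from the aggregate precision and recall.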
What we learned
GenAI vision models excel at understanding spatial relationships in technical drawings that rule-based systems struggle with, but they need structured prompting and validation to be reliable in manufacturing contexts. The biggest accuracy gains came not from the AI alone, but from the fusion of classical geometric analysis with LLM-based interpretation. We also learned that evaluation methodology matters enormously — using standardized NIST test cases alongside synthetic data gave much more confidence in real-world applicability than ad-hoc testing.
What's next for DrawMind3D
Next, we plan to expand beyond cylindrical features to support planar surfaces, slots, and complex freeform geometry. We also want to add full GD&T semantic parsing (datums, tolerance zones, material conditions) to enable automated inspection planning, and to integrate with PLM/MES systems so that the annotation-to-feature mapping can feed directly into quality control workflows. Long-term, DrawMind3D could become the bridge between design intent captured in drawings and automated manufacturing verification.
