Inspiration
Searching large CAD libraries is usually a manual, file‑name‑driven grind. We wanted a way to find “things shaped like this” even when naming is inconsistent, folders are messy, or the model is new. The goal was a fast, visual, geometry‑first search that feels natural for designers and engineers.
What it does
auto_cad_model_search indexes a folder of STL files and lets you search for similar CAD models by shape. You can pick a query model from the dataset or upload a new STL, then get the top‑K closest matches with previews. It supports a fast, offline local embedding pipeline and an optional Gemini‑powered embedding backend, plus adjustable distance metrics and weights to tune what “similar” means.
How we built it
We built a Streamlit app for the UI and a Python backend for geometry processing and search. The local embedding backend computes shape descriptors from meshes: radial histograms, D2 shape distributions, size features (area, volume, sphericity), extents, and topology counts. Meshes are normalized for translation/scale, with optional scale features and log‑scaled stats. We store embeddings, per‑feature statistics, and metadata on disk so indexing only happens once.
For the Gemini backend, we convert a mesh into a compact textual descriptor (extents, aspect ratios, sphericity, vertex/face counts, plus heuristic tags) and embed it via the API.
Search supports cosine similarity or L2 distance, with optional standardization and per‑feature weighting. Results are shown with fast PNG previews or optional 3D Plotly renders.
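To make the two histogram descriptors concrete, here is a minimal sketch, not the project's actual code. It assumes the mesh surface has already been sampled into a list of 3D points, and the function names (`d2_histogram`, `radial_histogram`), the bin count, and the pair count are illustrative choices. It uses only the standard library to stay self-contained; a real pipeline would more likely use a vectorized array library.

```python
import math
import random


def _normalized_histogram(values, bins, upper):
    """Bin values in [0, upper] and return fractions that sum to 1."""
    counts = [0] * bins
    for v in values:
        # Dividing by the largest value makes the histogram scale-invariant.
        idx = min(int(v / upper * bins), bins - 1) if upper > 0 else 0
        counts[idx] += 1
    return [c / len(values) for c in counts]


def d2_histogram(points, n_pairs=2000, bins=8, seed=0):
    """D2 shape distribution: histogram of distances between random point pairs."""
    rng = random.Random(seed)  # fixed seed so indexing is reproducible
    dists = []
    for _ in range(n_pairs):
        a, b = rng.sample(points, 2)
        dists.append(math.dist(a, b))
    return _normalized_histogram(dists, bins, max(dists))


def radial_histogram(points, bins=8):
    """Histogram of each point's distance from the centroid."""
    centroid = tuple(sum(c) / len(points) for c in zip(*points))
    radii = [math.dist(p, centroid) for p in points]
    return _normalized_histogram(radii, bins, max(radii))


def shape_embedding(points):
    """Concatenate the descriptors into one feature vector."""
    return d2_histogram(points) + radial_histogram(points)
```

Raising `n_pairs` gives a smoother D2 histogram at the cost of indexing time, which is the accuracy versus speed trade-off that point sampling introduces.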
Challenges we ran into
Normalization was tricky because we wanted scale‑invariant similarity by default but still needed an option to respect size. Another pain point was balancing accuracy vs. speed when sampling points for histograms and D2 metrics. We also had to handle inconsistent STL files (empty meshes, scenes, missing faces) gracefully while still keeping indexing fast and reliable.
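One way to get scale invariance by default while keeping size available as an option is to normalize the geometry and, on request, append the discarded scale as a separate log‑scaled feature. The sketch below is an assumption about how this could look, not the project's implementation; `normalize_points` and its `keep_scale` flag are hypothetical names. It also rejects empty or degenerate inputs up front, so a bad file can be skipped rather than crash indexing.

```python
import math


def normalize_points(points, keep_scale=False):
    """Center points on their centroid and scale the farthest one to radius 1.

    Returns (normalized_points, extra_features). With keep_scale=True the
    original size is kept as one log-scaled feature instead of being lost.
    """
    if not points:
        raise ValueError("empty mesh: no points to normalize")
    n = len(points)
    centroid = [sum(c) / n for c in zip(*points)]
    centered = [[x - c for x, c in zip(p, centroid)] for p in points]
    radius = max(math.hypot(*p) for p in centered)
    if radius == 0.0:
        raise ValueError("degenerate mesh: all points coincide")
    normalized = [[x / radius for x in p] for p in centered]
    # log1p compresses large size differences so one huge part
    # does not dominate the distance metric.
    extra = [math.log1p(radius)] if keep_scale else []
    return normalized, extra
```

An indexing loop could catch `ValueError` per file, log it, and continue with the next STL.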
Accomplishments that we’re proud of
We delivered a full end‑to‑end workflow: drop STL files into a folder, build the index once, and search instantly. The UI makes it easy to tune similarity weights without rebuilding, and the dual‑backend approach lets users choose between fast offline embeddings and higher‑cost API embeddings. It’s already useful as a practical CAD lookup tool.
What we learned
Lightweight geometric descriptors go a long way when paired with good normalization and weighting. Standardizing embeddings helps distance metrics behave consistently. We also learned that clear progress feedback and preview rendering are essential for a good user experience in geometry‑heavy workflows.
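The effect of standardization and weighting is easiest to see in a small search sketch. The code below is illustrative only and assumes embeddings are plain lists of floats; `fit_stats`, `transform`, and `top_k` are hypothetical names. Because the weights are applied at query time to the stored embeddings, changing them does not require rebuilding the index.

```python
import math


def fit_stats(embeddings):
    """Per-feature mean and standard deviation over the indexed embeddings."""
    n = len(embeddings)
    columns = list(zip(*embeddings))
    means = [sum(col) / n for col in columns]
    stds = []
    for col, m in zip(columns, means):
        var = sum((v - m) ** 2 for v in col) / n
        stds.append(math.sqrt(var) or 1.0)  # constant feature: avoid divide by zero
    return means, stds


def transform(vec, means, stds, weights):
    """Standardize each feature, then apply its user-chosen weight."""
    return [w * (v - m) / s for v, m, s, w in zip(vec, means, stds, weights)]


def top_k(query, embeddings, weights, k=3, metric="cosine"):
    """Return indices of the k closest embeddings under the chosen metric."""
    means, stds = fit_stats(embeddings)
    q = transform(query, means, stds, weights)
    scored = []
    for i, emb in enumerate(embeddings):
        v = transform(emb, means, stds, weights)
        if metric == "l2":
            d = math.dist(q, v)
        else:  # cosine distance = 1 - cosine similarity
            norm = math.hypot(*q) * math.hypot(*v)
            d = 1.0 - sum(a * b for a, b in zip(q, v)) / norm if norm else 1.0
        scored.append((d, i))
    scored.sort()
    return [i for _, i in scored[:k]]
```

Without standardization, a feature with a large numeric range (a raw face count, for example) would dominate L2 distance regardless of the weights chosen.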
What’s next for auto_cad_model_search
We want to add support for more file types (OBJ/STEP), approximate nearest‑neighbor indexing for very large libraries, richer metadata search, and better evaluation tools for similarity quality. We’re also exploring hybrid search that blends geometry descriptors with multimodal embeddings and part‑level segmentation.
Built With
- langraph
- llm
- streamlit