2-D Spinal Segmentation & Fracture Detection

Segmentation Outputs
Fracture Detection

Inspiration: After watching radiologists spend hours marking up spine X-rays, we felt that automation, particularly with AI, could save much of that time. We also saw how poorly existing models handle the complexity of spinal anatomy. The VerSe dataset gave us a suitable basis for tackling this clinically relevant problem, which feeds directly into patient care.

What it Does: Our tool automatically detects and segments individual vertebrae from CT scans in seconds, producing clean binary masks that radiologists can use for surgical planning, fracture detection, and disease monitoring. It takes an entire 3D volume as input and extracts 2D sagittal slices for the spinal analysis. A classification pipeline then labels each segmented region as fractured, normal, or not-a-vertebra. The API returns color-coded overlays on the axial plane of the vertebrae (red for fractured, green for normal) along with a gallery of individual vertebra crops and their associated scores, which in essence turns a highly involved radiology workflow into a few simple API calls.
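As a rough illustration of the color-coded overlay step, here is a minimal sketch that blends red or green onto a grayscale slice wherever a labeled mask is set. The function name `overlay_masks`, the `COLORS` table, the RGB channel order, and the `alpha` blending factor are all assumptions made for this example; the project's actual overlay code may differ.

```python
import numpy as np

# Hypothetical label-to-color table (RGB order assumed).
COLORS = {"fractured": (255, 0, 0), "normal": (0, 255, 0)}

def overlay_masks(gray, masks_with_labels, alpha=0.4):
    """Blend a class color onto a grayscale slice wherever each mask is set.

    gray: 2D uint8 array. masks_with_labels: iterable of (2D mask, label).
    Labels without a color (e.g. "not-a-vertebra") are left untouched.
    """
    rgb = np.stack([gray] * 3, axis=-1).astype(np.float32)
    for mask, label in masks_with_labels:
        color = COLORS.get(label)
        if color is None:
            continue
        m = mask.astype(bool)
        rgb[m] = (1.0 - alpha) * rgb[m] + alpha * np.array(color, dtype=np.float32)
    return np.clip(np.round(rgb), 0, 255).astype(np.uint8)
```

A caller would pass one `(mask, label)` pair per classified vertebra and then encode the returned RGB array for the response.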

How We Built It: We fine-tuned Meta's Segment Anything Model (SAM) with LoRA adapters on 1024x1024 spine images and built a preprocessing pipeline that applies bone windowing and morphological operations to isolate the vertebral column. The workflow reads NIfTI files, reorients them consistently, and automatically generates training pairs. For fracture detection, we started from a Flask skeleton, bolted on OpenCV for watershed segmentation, and integrated ResNet, EfficientNet, and MobileNet classifiers trained on a synthetic dataset. We wrote our own bounding-box logic and base64 encoding to stream results directly to the frontend without writing intermediate files to disk.
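Bone windowing generally means clipping CT intensities (in Hounsfield units) to a range where bone stands out and rescaling the result to an 8-bit image. The sketch below shows that idea only; the function name `apply_bone_window` and the default window center and width are illustrative assumptions, not the values used in this project.

```python
import numpy as np

def apply_bone_window(hu, center=400.0, width=1800.0):
    """Clip a CT array in Hounsfield units to a window and scale it to uint8.

    The default center/width are placeholder values for a bone-like window.
    """
    lo = center - width / 2.0
    hi = center + width / 2.0
    clipped = np.clip(np.asarray(hu, dtype=np.float32), lo, hi)
    scaled = (clipped - lo) / (hi - lo)          # now in [0, 1]
    return np.round(scaled * 255.0).astype(np.uint8)
```

Values below the window map to 0 and values above it map to 255, so soft tissue and air are largely suppressed before any morphological operations run.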

Challenges We Ran Into: SAM's multi-mask output was tricky to work with: we had to implement IoU-based mask selection to pick the best prediction from three candidates. Some subjects, such as sub-verse801, had unusual anatomy that initially tanked our metrics, forcing us to debug why certain vertebrae were being missed entirely. The watershed algorithm also had a nasty habit of over-segmenting, so we spent days tuning markers and fighting with morphological operations to stop it from splitting single vertebrae into fragments.
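One way to do IoU-based selection among candidate masks is to score each candidate against a reference mask and keep the highest-scoring one. The sketch below assumes the reference is a ground-truth mask available during training; the helper names `mask_iou` and `select_best_mask` are hypothetical, and the project may instead rely on a different reference or on SAM's own predicted quality scores.

```python
import numpy as np

def mask_iou(a, b):
    """Intersection over union of two binary masks of the same shape."""
    a = np.asarray(a).astype(bool)
    b = np.asarray(b).astype(bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as a perfect match
    return float(np.logical_and(a, b).sum() / union)

def select_best_mask(candidates, reference):
    """Return (index, iou) of the candidate that best overlaps the reference."""
    ious = [mask_iou(m, reference) for m in candidates]
    best = int(np.argmax(ious))
    return best, ious[best]
```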

Accomplishments We Are Proud Of: Hitting a 0.924 Dice score, which rivals published state-of-the-art methods, feels incredible, especially since we're just a small team with limited compute. We were also glad to see the model correctly segment challenging cases with fractures and anatomical variants.
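For readers unfamiliar with the metric, the Dice score of a predicted mask P against a target mask T is 2|P ∩ T| / (|P| + |T|). The function below is a minimal sketch of that standard definition for binary masks; the name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions, and the evaluation code behind the 0.924 figure may aggregate differently.

```python
import numpy as np

def dice_score(pred, target):
    """Dice coefficient between two binary masks of the same shape."""
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    total = pred.sum() + target.sum()
    if total == 0:
        return 1.0  # both empty: assumed convention
    intersection = np.logical_and(pred, target).sum()
    return float(2.0 * intersection / total)
```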

What We Learned: LoRA fine-tuning is ridiculously efficient: we achieved medical-grade performance with just eight trainable parameters per layer instead of millions. We also learned that, for this task, proper preprocessing (bone windowing + spine extraction) matters more than model complexity. Deploying Keras models is trickier than it looks: version mismatches between the training and inference environments can silently corrupt predictions.
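How many parameters a LoRA adapter trains depends on the adapter rank and the dimensions of the layer it wraps: a rank-r adapter on a d_out x d_in weight adds two small matrices with r * (d_in + d_out) entries in total. The sketch below just does that arithmetic; the function names and the 768-dimensional layer with rank 8 in the usage lines are illustrative assumptions, not this project's configuration.

```python
def lora_trainable_params(d_in, d_out, rank):
    """Trainable entries in the two LoRA matrices A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)

def full_finetune_params(d_in, d_out):
    """Trainable entries when the full weight matrix is updated (bias ignored)."""
    return d_in * d_out

# Illustrative numbers only: a 768 x 768 projection with a rank-8 adapter.
lora = lora_trainable_params(768, 768, 8)
full = full_finetune_params(768, 768)
print(f"LoRA: {lora} params, full: {full} params, ratio: {lora / full:.2%}")
```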