Inspiration
Lung cancer diagnosis often relies on medical imaging, but traditional statistical techniques struggle to capture complex structural changes. We were inspired to explore how Topological Data Analysis (TDA) could identify subtle yet meaningful patterns in lung CT scans, offering a robust, coordinate-free method to improve cancer detection.
What it does
This project uses TDA to analyze lung CT scans, extracting topological features such as loops and voids (cavities). Persistent homology, a key TDA tool, helps distinguish healthy lung tissue from three types of lung cancer: adenocarcinoma, large cell carcinoma, and squamous cell carcinoma. Machine learning models trained on these features then classify each scan; our neural network reached 84% accuracy.
How we built it
Data Collection:
Lung CT scans categorized into healthy lungs and three cancer types. Data split into training, testing, and validation sets.
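As a rough illustration of the split described above, here is a minimal sketch of a stratified split that keeps the four classes balanced across the three sets. The function name `stratified_split`, the 70/15/15 fractions, the label strings, and the 20-scans-per-class example are all assumptions made for illustration, not the values used in the project.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.7, 0.15, 0.15), seed=0):
    """Return (train, val, test) index lists with similar class proportions.

    `fractions` is a hypothetical train/validation/test ratio.
    """
    # Group sample indices by class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(len(indices) * fractions[0])
        n_val = round(len(indices) * fractions[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # whatever remains goes to test
    return train, val, test


# Hypothetical label list: 20 scans per class.
labels = (
    ["normal"] * 20
    + ["adenocarcinoma"] * 20
    + ["large_cell_carcinoma"] * 20
    + ["squamous_cell_carcinoma"] * 20
)
train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))
```

Splitting per class rather than over the whole dataset avoids a validation or test set that happens to miss one of the cancer types.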
Topological Data Analysis:
Extracted topological features using tools like Gudhi, Ripser, and scikit-TDA. Visualized persistent homology as persistence diagrams to capture variations in lung structure.
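To make the idea of a persistence diagram concrete, the sketch below computes 0-dimensional persistent homology (connected components) of a tiny grayscale grid using only the standard library. It is a simplified stand-in for what libraries such as Gudhi or Ripser do at scale, not the project's actual code: the function names `h0_persistence` and `summarize`, the 4-neighbour pixel connectivity, the summary statistics, and the `toy_scan` values are all assumptions chosen for illustration.

```python
import math


def h0_persistence(image):
    """Return sorted (birth, death) pairs of connected components for the
    sublevel-set filtration of a 2D grid of intensities (4-neighbour pixels)."""
    rows, cols = len(image), len(image[0])
    # Add pixels in order of increasing intensity.
    order = sorted((image[r][c], r, c) for r in range(rows) for c in range(cols))
    parent = {}  # union-find forest over pixels added so far
    birth = {}   # intensity at which each pixel entered the filtration

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    pairs = []
    for value, r, c in order:
        p = (r, c)
        parent[p] = p
        birth[p] = value
        for q in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if q not in parent:  # outside the grid or not added yet
                continue
            elder, younger = find(p), find(q)
            if elder == younger:
                continue
            if birth[elder] > birth[younger]:
                elder, younger = younger, elder
            # Elder rule: the component born later dies at this merge.
            if birth[younger] < value:  # skip zero-persistence pairs
                pairs.append((birth[younger], value))
            parent[younger] = elder
    # The component containing the global minimum never dies.
    pairs.append((order[0][0], math.inf))
    return sorted(pairs)


def summarize(pairs):
    """Turn a diagram into a small fixed-length feature dictionary."""
    lifetimes = [death - b for b, death in pairs if death != math.inf]
    return {
        "n_finite": len(lifetimes),
        "total_persistence": sum(lifetimes),
        "max_persistence": max(lifetimes, default=0),
    }


# Two dark regions separated by a bright ridge (hypothetical intensities).
toy_scan = [
    [1, 1, 9, 3, 3],
    [1, 0, 9, 2, 3],
    [1, 1, 5, 3, 3],
]
diagram = h0_persistence(toy_scan)
print(diagram)  # [(0, inf), (2, 5)]
print(summarize(diagram))
```

The second dark region is born at intensity 2 and merges into the first at intensity 5, so it appears as the point (2, 5) in the diagram. Summary numbers like these are one simple way to turn a diagram into input for a classifier; loops and voids would need higher-dimensional homology, which is where a dedicated library becomes necessary.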
Machine Learning:
Trained a Support Vector Machine (SVM) and a Neural Network (NN) for classification. The NN achieved a promising accuracy of 84%, showing its potential for automated cancer diagnosis.
Challenges we ran into
- Difficulty finding high-dimensional datasets compatible with TDA.
- Complexities in differentiating cancer types due to overlapping features.
- Computational demands of persistent homology for image datasets.
Accomplishments that we're proud of
- Implemented TDA to reveal structural differences in lung CT scans.
- Reached 84% classification accuracy using a simple NN model.
- Demonstrated the viability of combining TDA with machine learning for cancer diagnostics.
What we learned
- TDA is a powerful tool for analyzing noisy and complex data like medical images.
- Persistent homology and Mapper are effective in identifying structural patterns.
- Combining TDA with ML enhances diagnostic precision in medical applications.
What's next for this project
- Enhance feature extraction by integrating ensemble learning methods.
- Improve model accuracy by including more consistent and robust features.
- Collaborate with medical professionals to further refine the approach and make it clinically viable.
Built With
- gudhi
- matplotlib
- python
- topology