LungOC – Lung Cancer Risk Prediction System

An AI-powered lung cancer malignancy risk assessment tool that combines deep learning with clinical risk factors. UCSC BioHacks project.

Try out my project at https://lung-oc.vercel.app/#. For test images, use the Kaggle link under Publications & Useful Information below, or go straight to https://www.kaggle.com/datasets/akashnath29/lung-cancer-dataset/discussion?sort=hotness

(JPG or JPEG images work.)

Inspiration

Lung cancer is the leading cause of cancer-related deaths worldwide, largely because it is often detected too late. Most AI tools focus either on imaging or on patient history, but rarely on both together. In real clinical practice, physicians combine radiological evidence with patient risk factors before making decisions.

I wanted to build a system that reflects that hybrid reasoning process. LungOC was inspired by the idea that AI should assist clinicians, not replace them, especially in early screening and triage scenarios where timely risk assessment can make a difference.

What it does

LungOC is a hybrid AI-powered risk assessment system that:

Analyzes CT scan images using a convolutional neural network
Incorporates patient demographics and clinical risk factors
Produces a malignancy probability score
Outputs a final risk score categorized as Low, Moderate, or High
Provides recommendation guidance based on risk level

The system combines:

Deep Learning: A CNN-based image classifier using a modified ResNet18 trained on lung CT scans.

Clinical Risk Assessment: The risk factors included are:

Age
Smoking history (pack-years)
Family history of lung cancer

Risk Stratification

final_risk = 0.7 × image_malignancy_prob + 0.3 × clinical_score

clinical_score = 0.01 × age + 0.02 × smoking_pack_years + (0.1 if family_history)
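The two formulas above can be sketched in Python as follows. The weights come directly from the formulas; the `Patient` class, the function names, and especially the `LOW_CUTOFF` and `HIGH_CUTOFF` thresholds are assumptions made for illustration, since the cutoffs between Low, Moderate, and High are not stated here.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    """Clinical inputs (hypothetical container, not from the project code)."""
    age: float
    smoking_pack_years: float
    family_history: bool


def clinical_score(p: Patient) -> float:
    # Weights taken from the clinical_score formula above; no clamping applied.
    return 0.01 * p.age + 0.02 * p.smoking_pack_years + (0.1 if p.family_history else 0.0)


def final_risk(image_malignancy_prob: float, p: Patient) -> float:
    # 70% image model output, 30% clinical score, as in the formula above.
    return 0.7 * image_malignancy_prob + 0.3 * clinical_score(p)


# ASSUMPTION: illustrative cutoffs only; the real thresholds are not documented.
LOW_CUTOFF = 0.3
HIGH_CUTOFF = 0.6


def risk_level(risk: float) -> str:
    if risk < LOW_CUTOFF:
        return "Low"
    if risk < HIGH_CUTOFF:
        return "Moderate"
    return "High"


patient = Patient(age=50, smoking_pack_years=10, family_history=True)
risk = final_risk(0.5, patient)
print(round(risk, 2), risk_level(risk))
```

Note that the clinical score as written is not bounded by 1: an age of 70 with 30 pack-years already gives 0.7 + 0.6 = 1.3. Whether the deployed backend clamps this value is not stated, so the sketch leaves it unclamped.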

Example API response:

{ "prediction": "Malignant cases", "image_probability": 0.873, "final_risk": 0.742, "risk_level": "High" }
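As a quick consistency check, the example response can be parsed and run backwards through the Risk Stratification formula to see what clinical score it implies. This is only a sketch: the variable names are made up for illustration, and it assumes the response was produced with the 0.7 and 0.3 weights given above.

```python
import json

# The example API response from above, as a JSON string.
response_text = (
    '{ "prediction": "Malignant cases", "image_probability": 0.873, '
    '"final_risk": 0.742, "risk_level": "High" }'
)
result = json.loads(response_text)

# final_risk = 0.7 * image_probability + 0.3 * clinical_score
image_term = 0.7 * result["image_probability"]
implied_clinical_score = (result["final_risk"] - image_term) / 0.3

print(round(image_term, 4))              # 0.6111
print(round(implied_clinical_score, 3))  # 0.436
```

So in this example the image model contributes about 0.611 of the 0.742 final risk, and the remaining 0.131 corresponds to a clinical score of roughly 0.436.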

How I built it

Backend: I built the backend with FastAPI for the API layer and PyTorch for model inference. The model is a ResNet18 whose final layer is modified to classify three classes:

Benign
Malignant
Normal

The backend exposes:

GET /
GET /health
POST /predict

I used Supabase for authentication, to help protect patient data in line with HIPAA.

The /predict endpoint accepts:

CT scan image (JPG/PNG)
Age
Smoking pack-years
Family history

Frontend: I built the frontend using:

React
TypeScript
Vite
Tailwind CSS

The interface allows users to upload CT images, input clinical data, and instantly receive structured risk outputs.

Alternative Interface: I also created a Streamlit version for rapid prototyping and demonstration.

Deployment

Backend deployed on Render (free tier)
Frontend deployed on Vercel or Netlify

Architecture

FastAPI + PyTorch backend
ResNet18 model (224×224 RGB input)
REST API integration
React frontend

Challenges I ran into

Model generalization: CT datasets vary significantly in resolution, contrast, and labeling quality. I had to refine the preprocessing pipeline to keep inference consistent across them.

Balancing image and clinical risk: Choosing the weighting between image predictions and clinical risk required experimentation. Too much emphasis on the image model made clinical input irrelevant; too much emphasis on clinical factors reduced the power of the CNN.

Deployment issues: Handling image uploads and model loading in cloud environments introduced memory and cold-start issues. I optimized model loading and ensured efficient API responses.
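One common way to reduce repeated model-loading cost is to load the model once per process and reuse it for every request. The sketch below shows that pattern with `functools.lru_cache`; it is not necessarily the optimization used in LungOC. `load_model_from_disk`, `get_model`, and the `"model.pth"` path are hypothetical, and the loader is a stand-in that returns a plain dictionary instead of a real PyTorch model.

```python
from functools import lru_cache

load_calls = 0  # counts how often the expensive load actually runs


def load_model_from_disk(weights_path: str) -> dict:
    """Hypothetical stand-in for an expensive load (e.g. reading model weights)."""
    global load_calls
    load_calls += 1
    return {"weights_path": weights_path, "classes": ["Benign", "Malignant", "Normal"]}


@lru_cache(maxsize=1)
def get_model(weights_path: str = "model.pth") -> dict:
    # The first call pays the loading cost; later calls return the cached object.
    return load_model_from_disk(weights_path)


first = get_model()
second = get_model()
print(first is second, load_calls)  # True 1
```

Caching in this way keeps a single copy of the model in memory, which matters on small free-tier instances, but it does not remove the cold-start delay of the very first request after the service wakes up.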

Frontend-backend integration: Configuring CORS and properly formatting file uploads for POST requests required debugging during integration.

Interpretability: Medical AI systems require trust. I focused on presenting outputs clearly rather than just returning raw probabilities.

Accomplishments that I’m proud of

Successfully integrated deep learning and structured clinical scoring into one working system
Built and deployed a full-stack AI application
Designed a clean, intuitive interface
Implemented risk stratification instead of simple classification
Created a functional REST API with real-time inference

I’m especially proud that LungOC functions as a hybrid decision-support system rather than a black-box classifier.

What I learned

Through this project, I learned:

How to deploy PyTorch models with FastAPI in production-like environments
How frontend-backend architecture affects performance and usability
How to calibrate and combine multi-modal inputs
The importance of interpretability and disclaimers in medical AI
How to design systems that balance technical accuracy with user clarity

I also learned that even strong CNN architectures require careful preprocessing and calibration when applied to medical imaging tasks.

What’s next for LungOC

Improve model performance: Train on larger and more diverse CT datasets and explore medical imaging-specific pretrained models.
Add explainability: Integrate Grad-CAM visualizations to highlight suspicious regions in CT scans.
Expand clinical inputs: Include additional risk factors such as occupational exposure, previous respiratory disease, and genetic markers.
Add uncertainty estimation: Provide confidence intervals alongside predictions.
Build a clinician dashboard: Develop a more advanced dashboard for patient tracking and longitudinal risk analysis.
Pursue validation: Explore IRB-backed validation studies and clinical benchmarking.

⚠️ Disclaimer: This tool is a research prototype and is not intended for clinical diagnosis. Always consult qualified healthcare professionals for medical decisions.

Publications & Useful Information:

1. https://www.cms.gov/medicare-coverage-database/view/ncacal-decision-memo.aspx?proposed=N&ncaid=304
2. https://www.kaggle.com/datasets/akashnath29/lung-cancer-dataset/discussion?sort=hotness
3. https://my.clevelandclinic.org/health/diagnostics/15031-lung-cancer-screening
4. https://my.clevelandclinic.org/health/diseases/14799-pulmonary-nodules

Built With

python

Updates

Heli Kadakia started this project — Mar 01, 2026 02:30 PM EST
