Pneumonia Detection Using CNN + LSTM Hybrid Model

This project implements a hybrid deep learning model to detect pneumonia from pediatric chest X-ray images. By combining Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks, the model captures both local spatial features and global contextual dependencies across lung regions. Built using TensorFlow/Keras, it includes visualizations, performance metrics, and a user-friendly interface for real-world deployment.

Dataset

We use the Pediatric Chest X-ray Dataset from Kaggle, which includes labeled chest X-ray images categorized as:

NORMAL
PNEUMONIA

Folder Structure:

Pediatric Chest X-ray Pneumonia/
├── train/
│   ├── NORMAL/
│   └── PNEUMONIA/
├── test/
│   ├── NORMAL/
│   └── PNEUMONIA/
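
Before training, it can help to confirm that the folders are laid out as shown and to see how balanced the two classes are. The snippet below is a minimal sketch using only the Python standard library; the function name `count_images` and the accepted file extensions are assumptions for illustration, not part of the project's code.

```python
from pathlib import Path

# Assumed image extensions; adjust to match the files in your copy of the dataset.
IMAGE_SUFFIXES = {".jpeg", ".jpg", ".png"}


def count_images(dataset_root):
    """Return {split: {class_name: image_count}} for a train/test layout."""
    counts = {}
    for split_dir in sorted(Path(dataset_root).iterdir()):
        if not split_dir.is_dir():
            continue
        counts[split_dir.name] = {
            class_dir.name: sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
            for class_dir in sorted(split_dir.iterdir())
            if class_dir.is_dir()
        }
    return counts
```

Calling `count_images("Pediatric Chest X-ray Pneumonia")` returns one nested dictionary per split, which makes a class imbalance between NORMAL and PNEUMONIA easy to spot.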

Deep Learning Architecture

Convolutional Neural Networks (CNNs)

CNNs extract spatial features such as edges, textures, and shapes from X-ray images.

Key Components:

Conv2D: Learns visual patterns
MaxPooling2D: Reduces dimensionality
ReLU: Adds non-linearity
He Initialization: Stabilizes gradient flow
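
To make the components above concrete, here is a small plain-Python sketch of what ReLU and 2x2 max pooling do to a feature map, plus the standard deviation that He-normal initialization draws weights from (sqrt(2 / fan_in)). The helper names are illustrative assumptions; in the project itself these operations would come from the Keras layers listed above.

```python
import math


def relu(feature_map):
    """Replace negative activations with zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2; assumes even height and width."""
    pooled = []
    for r in range(0, len(feature_map), 2):
        row = []
        for c in range(0, len(feature_map[0]), 2):
            row.append(max(
                feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1],
            ))
        pooled.append(row)
    return pooled


def he_normal_std(kernel_h, kernel_w, in_channels):
    """Standard deviation used by He-normal initialization: sqrt(2 / fan_in)."""
    fan_in = kernel_h * kernel_w * in_channels
    return math.sqrt(2.0 / fan_in)
```

Pooling halves each spatial dimension, so a 4x4 map becomes 2x2. For a 3x3 kernel over a 3-channel input, fan_in is 27.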

Why CNN? CNNs are ideal for medical imaging tasks due to their ability to learn hierarchical spatial features.

Long Short-Term Memory (LSTM)

LSTMs capture long-range dependencies across reshaped image sequences.

Key Components:

Memory Cell: Retains historical context
Gates: Control information flow
LSTM Layer: Processes feature sequences
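
The interaction between the memory cell and the gates can be seen in a single time step of the standard LSTM update. The sketch below uses scalar inputs and states for readability; the function name `lstm_step` and the `params` layout are assumptions for illustration, and a real Keras LSTM layer applies the same equations to vectors.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step with scalar input, hidden state, and cell state.

    params maps each gate name to a (w_x, w_h, bias) tuple (hypothetical layout).
    """
    def pre_activation(name):
        w_x, w_h, b = params[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre_activation("forget"))       # how much old memory to keep
    i = sigmoid(pre_activation("input"))        # how much new content to write
    o = sigmoid(pre_activation("output"))       # how much memory to expose
    g = math.tanh(pre_activation("candidate"))  # proposed new content

    c = f * c_prev + i * g    # memory cell update
    h = o * math.tanh(c)      # hidden state passed to the next step
    return h, c
```

When the forget gate is close to 1 and the input gate is close to 0, the cell state passes through almost unchanged, which is how context is retained over many steps.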

Why LSTM? By reshaping CNN outputs into sequences, LSTMs can learn spatial relationships across lung regions, which is critical for detecting pneumonia spread.

CNN + LSTM Hybrid Model

Architecture Flow:

Input Image (224x224x3)
↓
CNN Layers → Feature Maps
↓
Reshape → Sequence Format
↓
LSTM Layer → Contextual Understanding
↓
Dense Layers → Classification
↓
Softmax → Output: NORMAL or PNEUMONIA
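
The flow above can be sketched in Keras roughly as follows. This is not the project's actual model: the number of convolution blocks, the filter counts, the LSTM and Dense sizes, and the choice to treat each row of the feature map as one time step are all assumptions made for illustration. The compile call uses the loss, optimizer, and metric listed under Training Details.

```python
from tensorflow.keras import layers, models


def build_cnn_lstm(input_shape=(224, 224, 3), num_classes=2):
    """Hypothetical CNN + LSTM classifier following the flow above."""
    conv_filters = (32, 64, 128)  # assumed filter counts

    inputs = layers.Input(shape=input_shape)
    x = inputs
    for filters in conv_filters:
        x = layers.Conv2D(filters, 3, padding="same", activation="relu",
                          kernel_initializer="he_normal")(x)
        x = layers.MaxPooling2D(2)(x)

    # Three 2x2 poolings shrink each spatial side by 8 (224 -> 28).
    rows = input_shape[0] // 8
    cols = input_shape[1] // 8
    # Assumption: each feature-map row becomes one time step for the LSTM.
    x = layers.Reshape((rows, cols * conv_filters[-1]))(x)

    x = layers.LSTM(64)(x)
    x = layers.Dense(64, activation="relu")(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)

    model = models.Model(inputs, outputs)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

With image arrays scaled to [0, 1] and integer labels (0 for NORMAL, 1 for PNEUMONIA), training would then look like `model.fit(images, labels, epochs=20, validation_split=0.2)`.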

Advantages:

CNN: Captures local lung features
LSTM: Understands global spatial dependencies

Model Compilation & Training

Training Details:

Loss Function: sparse_categorical_crossentropy
Optimizer: Adam
Metric: accuracy
Normalization: Pixel values scaled to [0, 1]
Epochs: 20
Validation Split: 20%

Evaluation Metrics

Metrics Used:

Accuracy: Overall prediction correctness
Confusion Matrix: TP, FP, TN, FN breakdown
Classification Report: Precision, Recall, F1-score
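
As a reference for how these metrics relate to one another, the following plain-Python sketch derives them from the four confusion-matrix counts for a binary problem. It is an illustration only; the function name `binary_report` is hypothetical, and the label 1 is assumed to mean PNEUMONIA.

```python
def binary_report(y_true, y_pred, positive=1):
    """Compute confusion-matrix counts and the metrics derived from them."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

For pneumonia screening, recall on the PNEUMONIA class deserves particular attention, because a false negative means a missed case.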

Visualization

Tools Used:

Matplotlib: For plotting training curves
Seaborn: For heatmaps and confusion matrix

Grad-CAM Integration:

Highlights lung regions influencing predictions
Builds trust and interpretability for clinicians

Summary of Key Concepts

Concept	Purpose
CNN	Extract spatial features from X-ray images
LSTM	Understand spatial dependencies across image regions
Reshape Layer	Convert CNN output to sequence format for LSTM
Dense + Softmax	Final classification into NORMAL/PNEUMONIA
Image Normalization	Scale image pixels to [0,1] for stable training
Model Evaluation	Accuracy, confusion matrix, precision, recall
Visualization	Helps interpret model performance visually
Grad-CAM	Visual explanation of model predictions
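
For the Grad-CAM entry in the table above, the core computation is short once the last convolutional layer's activations and the gradients of the class score with respect to them are available. The NumPy sketch below shows only that step, under the assumption that both arrays have already been extracted from the model (for example with a gradient tape); the function name `grad_cam_heatmap` is hypothetical.

```python
import numpy as np


def grad_cam_heatmap(activations, gradients):
    """Combine conv activations and class-score gradients into a heatmap.

    Both inputs are (H, W, K) arrays for one image, where K is the number
    of channels in the chosen convolutional layer.
    """
    # Channel importance: gradients averaged over the spatial dimensions.
    weights = gradients.mean(axis=(0, 1))                # shape (K,)
    # Weighted sum of channels, keeping only positive evidence.
    cam = np.maximum((activations * weights).sum(axis=-1), 0.0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam               # scaled to [0, 1]
```

The resulting H x W map is then resized to the input resolution and overlaid on the X-ray to show which regions pushed the prediction toward the chosen class.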

The app lets users export a detailed PDF report containing patient metadata, prediction results, and visual Grad-CAM overlays. This feature supports clinical documentation and easy sharing with healthcare professionals or patients. With one click, users can securely download the report for offline review or integration into medical records.

Built With

python
streamlit
tensorflow

Updates

Gandhiraj J started this project — Aug 24, 2025 12:32 PM EDT
