NeuroGAN: Bridging Class Gaps with Targeted Augmentation

Elevator Pitch

NeuroGAN uses GAN-based data augmentation to expand a minority class of only 49 samples into a balanced training set. This strategy achieved 95.26% overall accuracy and a perfect 1.00 F1-score for the minority class on our validation set. The goal is to eliminate diagnostic blind spots so that rare cases are classified as reliably as common ones.

The NeuroGAN Story

Inspiration

The inspiration for this project was the challenge of data scarcity in specialized classification tasks. In many datasets, certain classes are naturally rare, leading to extreme imbalances that cause models to ignore minority cases. We set out to prove that targeted data augmentation could bridge this gap, ensuring that infrequent occurrences are identified with the same precision as common ones.

How We Built the Project

The project was built using a multi-stage pipeline focused on balancing a dataset where the minority class (Label '1') initially had only 49 occurrences.

  1. Preprocessing: We isolated the minority class into a dedicated minority_class_df. Images were resized to 64×64 pixels and normalized to a range of [-1, 1].
  2. Augmentation: We used GAN-based data augmentation to synthesize new samples, effectively balancing the training distribution.
  3. Model Architecture: A CNN was trained over 10 epochs using the combined dataset.
  4. Evaluation: Performance was measured on a validation set, tracking categorical cross-entropy loss and weighted F1-scores.
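
The normalization in the preprocessing step can be sketched as follows. This is a minimal illustration, not our exact pipeline code: the function names are hypothetical, the random array stands in for an image already resized to 64×64, and the three-channel shape is an assumption. Scaling to [-1, 1] is a common convention for GANs because it matches the output range of a tanh generator layer.

```python
import numpy as np


def normalize_image(pixels: np.ndarray) -> np.ndarray:
    """Map uint8 pixel values in [0, 255] to floats in [-1, 1]."""
    return pixels.astype(np.float32) / 127.5 - 1.0


def denormalize_image(scaled: np.ndarray) -> np.ndarray:
    """Map floats in [-1, 1] back to uint8 pixel values in [0, 255]."""
    return np.clip(np.rint((scaled + 1.0) * 127.5), 0, 255).astype(np.uint8)


# Hypothetical stand-in for one resized minority-class image
# (64x64 pixels; the 3-channel shape is an assumption).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)

scaled = normalize_image(image)        # what a GAN would train on
restored = denormalize_image(scaled)   # what you would save or display

print(scaled.min(), scaled.max())
```

The inverse mapping is useful when converting generated samples back into viewable images.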

Challenges Faced

● Extreme Imbalance: The label distribution was heavily skewed, with Class '2' having 2,566 samples while the minority Class '1' had only 49.
● Technical Warnings: We resolved a SettingWithCopyWarning by explicitly copying the DataFrame and fixed a DeprecationWarning regarding image mode inference in Image.fromarray().
● Class-Specific Accuracy: While the minority class reached a perfect F1-score, Class 0 presented a challenge with a lower F1-score of 0.91 due to misclassifications toward Class 2 and Class 3.
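
The SettingWithCopyWarning fix mentioned above can be sketched like this. The toy table and its column names (image_id, label, is_minority) are hypothetical; only the minority_class_df name and the explicit copy come from our project.

```python
import pandas as pd

# Toy stand-in for the real label table; column names are hypothetical.
df = pd.DataFrame({
    "image_id": ["a", "b", "c", "d", "e"],
    "label": [2, 1, 2, 0, 1],
})

# Filtering and then calling .copy() yields an independent DataFrame,
# so later column assignments do not trigger SettingWithCopyWarning.
minority_class_df = df[df["label"] == 1].copy()

# Adding a column to the copy leaves the original frame untouched.
minority_class_df["is_minority"] = True

print(minority_class_df)
print("is_minority" in df.columns)
```

Without the .copy() call, pandas cannot tell whether the assignment is meant to modify the original frame or only the filtered subset, which is what the warning flags.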

What We Learned

Targeted augmentation is highly effective. By the end of training, the model achieved:
● An overall accuracy of 95.26%.
● A perfect 1.00 F1-score for the minority class, with zero misclassifications out of 198 instances.
● A training loss that decreased from approximately 0.8346 to approximately 0.0521.
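
To make the F1 figures concrete, here is a small sketch of how per-class and weighted F1-scores follow from a confusion matrix. The function names are our own for illustration, and the 3×3 matrix is a toy example, not our validation results. A class scores exactly 1.00 only when it has no false negatives and no false positives.

```python
def per_class_f1(confusion):
    """Per-class F1 from a square confusion matrix (rows = true, cols = predicted)."""
    n = len(confusion)
    scores = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                       # true k, predicted other
        fp = sum(confusion[r][k] for r in range(n)) - tp  # predicted k, true other
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


def weighted_f1(confusion):
    """Average of per-class F1 scores, weighted by each class's true sample count."""
    supports = [sum(row) for row in confusion]
    scores = per_class_f1(confusion)
    return sum(f * s for f, s in zip(scores, supports)) / sum(supports)


# Toy confusion matrix for illustration only (NOT the project's results).
toy = [
    [8, 0, 2],  # class 0: two samples misread as class 2
    [0, 5, 0],  # class 1: every sample correct, nothing else predicted as 1
    [1, 0, 9],  # class 2: one sample misread as class 0
]

print(per_class_f1(toy))
print(weighted_f1(toy))
```

In the toy matrix, class 1 plays the role of a perfectly classified class, while the confusion between classes 0 and 2 pulls their scores below 1.00.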
