U-Net with advanced encoder and decoder

We were tasked with segmenting cell boundaries in mouse brain images by training a model on a dataset that includes fruit fly brain images.

We trained the model on fruit fly images and on mouse brain images found online, and validated it on the provided mouse brain images and masks. The validation set was used both to select the model and to evaluate its performance with the Intersection over Union (IoU) metric, which is defined as

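\begin{equation} \mathrm{IoU} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP} + \mathrm{FN}} \end{equation}

where TP, FP and FN are the numbers of true positive, false positive and false negative pixels.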
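As a minimal sketch of how this metric could be computed for a single binary mask, the following assumes NumPy arrays, a predicted probability map thresholded at 0.5, and a score of 1.0 when both masks are empty. The function name `iou_score`, the threshold and the empty-mask convention are assumptions for illustration, not details taken from the project.

```python
import numpy as np


def iou_score(pred, target, threshold=0.5):
    """IoU = TP / (TP + FP + FN) for one binary segmentation mask."""
    # Binarize the prediction; the 0.5 threshold is an assumed default.
    pred = np.asarray(pred) > threshold
    target = np.asarray(target).astype(bool)

    tp = np.logical_and(pred, target).sum()    # predicted 1, truth 1
    fp = np.logical_and(pred, ~target).sum()   # predicted 1, truth 0
    fn = np.logical_and(~pred, target).sum()   # predicted 0, truth 1

    denom = tp + fp + fn
    if denom == 0:
        # Both masks are empty: treated here as a perfect match (assumption).
        return 1.0
    return float(tp / denom)
```

For example, a prediction that shares one foreground pixel with the ground truth, with one extra and one missed pixel, gives an IoU of 1/3.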


  • Fruit fly image and segmentation mask pair

Data augmentation
To improve generalization, we applied data augmentation techniques that have been shown to be effective on biomedical image data.

  • Random elastic deformation
  • Random shear (rotation, shifting and scaling)
  • Random contrast and brightness change
  • Random flip

Then, to avoid blank space introduced by these transformations, we randomly cropped the images to 256 x 256.
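Part of the pipeline above can be sketched with NumPy alone. The sketch covers only the flip, the contrast and brightness change, and the final 256 x 256 crop; elastic deformation and shear are left out because they need an interpolation library. It assumes grayscale images stored as floats in [0, 1], and the function names and parameter ranges are hypothetical choices, not the values used in the project.

```python
import numpy as np


def random_flip(image, mask, rng):
    """Flip image and mask together so they stay aligned."""
    if rng.random() < 0.5:  # horizontal flip
        image, mask = image[:, ::-1], mask[:, ::-1]
    if rng.random() < 0.5:  # vertical flip
        image, mask = image[::-1, :], mask[::-1, :]
    return image, mask


def random_brightness_contrast(image, rng, max_contrast=0.2, max_brightness=0.2):
    """Scale contrast around the mean and shift brightness (image only)."""
    alpha = 1.0 + rng.uniform(-max_contrast, max_contrast)
    beta = rng.uniform(-max_brightness, max_brightness)
    mean = image.mean()
    return np.clip((image - mean) * alpha + mean + beta, 0.0, 1.0)


def random_crop(image, mask, size, rng):
    """Cut the same random size x size window from image and mask."""
    height, width = image.shape[:2]
    if height < size or width < size:
        raise ValueError("image is smaller than the crop size")
    top = rng.integers(0, height - size + 1)
    left = rng.integers(0, width - size + 1)
    window = (slice(top, top + size), slice(left, left + size))
    return image[window], mask[window]


def augment(image, mask, rng, crop_size=256):
    """Apply flip, intensity change, then crop to crop_size x crop_size."""
    image, mask = random_flip(image, mask, rng)
    image = random_brightness_contrast(image, rng)
    return random_crop(image, mask, crop_size, rng)
```

The geometric steps are applied to the image and the mask together, while the intensity change touches only the image, so the mask labels stay valid.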

Test and Validation
  • Mouse brain image

As a baseline model, we adopted U-Net, which is characterized by an encoder-decoder architecture, because it has been shown to work well with biological images. The Intersection over Union metric was used to evaluate performance.

To improve performance, we incorporated the following changes.

  • Modified the encoder to Deep Residual Pyramid Net
  • Incorporated spatial and channel-wise squeeze-and-excitation (scSE) blocks
  • Optional: ShakeDrop regularization, to see whether it improves generalization
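To illustrate the squeeze-and-excitation change, here is a rough NumPy sketch of the forward pass of a combined spatial and channel-wise block on a single feature map of shape (C, H, W). All weights are hypothetical inputs, and combining the two branches by addition is an assumption; the project may have combined them differently. ShakeDrop and the encoder change are not covered.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def scse_block(x, w1, b1, w2, b2, ws, bs):
    """Forward pass of a spatial + channel squeeze-and-excitation block.

    x  : feature map, shape (C, H, W)
    w1 : (C // r, C), b1 : (C // r,)  first fully connected layer
    w2 : (C, C // r), b2 : (C,)       second fully connected layer
    ws : (C,), bs : scalar            1x1 convolution for the spatial gate
    """
    # Channel branch: squeeze over space, then excite each channel.
    squeezed = x.mean(axis=(1, 2))                  # (C,)
    hidden = np.maximum(w1 @ squeezed + b1, 0.0)    # ReLU, (C // r,)
    channel_gate = sigmoid(w2 @ hidden + b2)        # (C,)
    channel_out = x * channel_gate[:, None, None]

    # Spatial branch: squeeze over channels, then excite each pixel.
    spatial_logits = np.tensordot(ws, x, axes=([0], [0])) + bs  # (H, W)
    spatial_out = x * sigmoid(spatial_logits)[None, :, :]

    # Combine the two recalibrated maps (addition is an assumption here).
    return channel_out + spatial_out
```

With all weights and biases set to zero, both gates equal 0.5 and the block returns its input unchanged, which is a convenient sanity check.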

The provided fruit fly data does not allow this model to generalize effectively to the mouse data. We verified this by training the model with different percentages of the mouse data used for training (and held out from validation). Results, including randomly selected samples, can be seen below.

Best IoU: 70%

While we obtained good results on the training set, the outputs on the test images were not good. This is presumably because the distributions of the two datasets are so different.

In fact, we observed that performance improved dramatically when we included the test images in the training set.

We tried to address this issue by incorporating further regularization techniques, such as ShakeDrop.

However, we did not observe a significant improvement from this.

Challenges and future direction
The biggest challenge is poor generalization from the fruit fly data to the mouse data.
