Eroom’s law states that the cost of developing a new drug doubles approximately every nine years. The process of drug discovery involves observing the effect of chemicals on human cells. Although much of the physical process of mixing the chemicals and taking pictures of their effects has been automated with robots, a major part of the work is identifying the nuclei of the cells, from which further observation follows. This is currently done manually by researchers and graduate students all over the world, who painstakingly look at individual pictures and draw bounding boxes around the nuclei. Our tool seeks to free researchers from this misery by automating the identification of nuclei, so that they can spend their time making observations that advance drug discovery.
What it does
Our tool takes images of a collection of cells under different conditions as input and produces an image with bounding boxes and masks highlighting each individual nucleus as output.
How we built it
We tried and evaluated several techniques:
1) Region proposals (come up with bounding boxes around the nuclei of cells)
2) Instance segmentation (classify each pixel as either belonging to a cell or not)
3) Watershed method
4) Otsu's threshold technique
In the end, we used a Mask R-CNN (Mask Region-based Convolutional Neural Network) to generate candidate region proposals, training the network to select the regions of the image most likely to contain a nucleus.
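For reference, the simplest of the techniques listed above, Otsu's threshold, can be sketched in a few lines. This is a minimal illustration in plain Python, not the code we ran: it assumes an 8-bit grayscale image stored as a list of rows, and the function names `otsu_threshold` and `binarize` are made up for this example. It picks the intensity that maximizes the between-class variance of the "background" and "foreground" pixel groups.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.

    `pixels` is a flat list of integer intensities in [0, levels).
    Pixels <= t are treated as background, pixels > t as foreground.
    """
    # Build the intensity histogram.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels at or below the candidate threshold
    sum_bg = 0  # sum of intensities at or below the candidate threshold
    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break     # no foreground pixels left
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(image, t):
    """Return a 0/1 mask: 1 where the pixel is brighter than t."""
    return [[1 if p > t else 0 for p in row] for row in image]


if __name__ == "__main__":
    # Hypothetical toy image: dark background with a few bright "nuclei".
    image = [[10, 12, 200], [210, 10, 12], [200, 210, 10]]
    t = otsu_threshold([p for row in image for p in row])
    print(t, binarize(image, t))
```

A global threshold like this is fast and needs no training data, but it assumes nuclei are consistently brighter (or darker) than the background, which may not hold for noisy images.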
Challenges we ran into
- Not enough annotated samples
- Lack of out-of-the-box algorithms
- Noisy data
Accomplishments that we're proud of
- We achieved a 5% error rate by using U-Nets
- With effective preprocessing, we were able to overcome the limited number of samples and train the neural networks faster
What we learned
- Neural nets take a long time to train.
- In our experiments, complex networks like Mask R-CNN learned faster than simpler networks.
What's next for NucleID
- Combine Otsu's threshold with Mask R-CNN to reduce the number of bounding boxes.
- Use an ensemble technique and take the intersection of the region proposals and the instance segmentation to come up with more accurate annotations.
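
One possible reading of the intersection idea above is sketched below. This is a rough illustration under our own assumptions, not an implemented feature: boxes are `(x0, y0, x1, y1)` tuples with half-open pixel ranges, the mask is a list of rows of 0/1 values (for example from a thresholding step or a segmentation network), and the names `box_mask_fraction`, `intersect_proposals_with_mask` and `min_fraction` are hypothetical. A box is kept only if enough of its area is marked as nucleus, and the mask is kept only inside the surviving boxes.

```python
def box_mask_fraction(box, mask):
    """Fraction of the pixels inside `box` that the mask marks as nucleus."""
    x0, y0, x1, y1 = box  # half-open ranges: x0 <= x < x1, y0 <= y < y1
    area = (x1 - x0) * (y1 - y0)
    if area <= 0:
        return 0.0
    hits = sum(mask[y][x] for y in range(y0, y1) for x in range(x0, x1))
    return hits / area


def intersect_proposals_with_mask(boxes, mask, min_fraction=0.5):
    """Keep boxes supported by the mask, and mask pixels supported by a box.

    Returns (kept_boxes, refined_mask). `min_fraction` is an assumed,
    tunable cutoff, not a value taken from our experiments.
    """
    kept = [b for b in boxes if box_mask_fraction(b, mask) >= min_fraction]

    # Start from an empty mask and copy back only pixels inside kept boxes.
    refined = [[0] * len(row) for row in mask]
    for x0, y0, x1, y1 in kept:
        for y in range(y0, y1):
            for x in range(x0, x1):
                refined[y][x] = mask[y][x]
    return kept, refined


if __name__ == "__main__":
    # Hypothetical 4x4 mask with one solid nucleus and one stray pixel.
    mask = [
        [1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 0, 0],
    ]
    boxes = [(0, 0, 2, 2), (2, 2, 4, 4)]
    print(intersect_proposals_with_mask(boxes, mask))
```

The same filter could also serve the first item in the list, since a box with little thresholded foreground inside it is a natural candidate for removal.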