Chromosome anomalies, including numerical and structural abnormalities, are primary indicators of several genetic disorders. Numerical abnormalities arise from the gain or loss of an entire chromosome. This can be observed in several diseases such as Down syndrome, Turner syndrome and some types of leukaemia.
In clinical practice, an important procedure for chromosome diagnosis is karyotyping. Karyotyping, a standard method for presenting images of the human chromosomes for diagnostic purposes, is a long-standing, yet common technique in cytogenetics. Karyotypes allow us to visually examine the patient's chromosomes and therefore predict the genetic disorder or possible abnormalities and suggest the most appropriate treatment.
In the NHS Genetic Diagnostic services, samples are taken from patients who have been referred with suspected disease. Slides are prepared from the given sample, and microscope images of the chromosome spreads are analysed. Geneticists use basic software to crop each chromosome on the spread and organise them in pairs, from chromosome 1 to 22 followed by the sex chromosomes. Each chromosome is different and can be recognised by specific morphological features that are extremely difficult for the untrained eye to pick out.
For each patient, about 20 cells need to be fully karyotyped, which is a very time-consuming process. An automatic karyotyping tool would greatly increase the efficiency of Clinical Scientists, Genetic Technologists, and the NHS as a whole, freeing their time for more urgent, non-routine work and allowing them to take on more patients.
What it does
Receives an image of a person's cell showing their chromosomes, and automatically classifies each chromosome in the image.
How we built it
We developed two main components:
1) A function using MATLAB's Image Processing Toolbox to extract smaller, cropped, filtered images of individual chromosomes from a larger image of a cell interior.
2) A program using MATLAB's Statistics and Machine Learning Toolbox to identify each chromosome given to it as 1-22, X, or Y.
Challenges we ran into
Chromosomes that lie very close to each other in the image cannot be separated by the Image Processing Toolbox, as they appear to the computer as a single object. A manual pre-processing step is therefore required: cropping the image to just the chromosomes, colouring any large non-chromosome objects white, and drawing white lines between chromosomes that are almost touching. If any chromosomes overlap, the image cannot be used.
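To illustrate why touching chromosomes end up as one object, here is a minimal sketch of connected-component labelling in plain Python. It is not the MATLAB code used in the project. The toy grid, the function names, and the choice of 4-connectivity are all assumptions made for the example. Dark pixels are 1 and the white background is 0.

```python
from collections import deque

def label_components(grid):
    """Group 4-connected dark pixels (value 1); return a list of pixel lists."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited dark pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components

def bounding_box(pixels):
    """Return (top, left, bottom, right) of a component, inclusive."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    return min(ys), min(xs), max(ys), max(xs)

def draw_white_line(grid, col):
    """Return a copy of the grid with one whole column set to background."""
    return [[0 if x == col else v for x, v in enumerate(row)] for row in grid]

# Hypothetical toy image: two vertical blobs joined by one pixel at row 2, column 3.
touching = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 0, 1, 1, 0],
    [0, 1, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

separated = draw_white_line(touching, 3)
print(len(label_components(touching)))   # 1: the blobs are seen as one object
print(len(label_components(separated)))  # 2: the white line splits them
print([bounding_box(p) for p in label_components(separated)])
```

Each bounding box could then be used to crop one chromosome from the image. The single bridging pixel is enough to merge both blobs, which is why the white lines had to be drawn by hand.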
It is possible to use MATLAB's Image Processing Toolbox to rotate the chromosomes so that they are vertical, using the length and width of the 'blob' found in the image. However, there is no way to tell whether the chromosome is the 'correct way up' or 'upside down' as described by Clinical Scientists.
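The 'which way up' problem can be shown with a small sketch. It estimates a blob's long axis from second-order image moments, a common technique that we swap in here for illustration. It is not necessarily how the project measured orientation, and the function name and toy pixel list are assumptions. Pixels are given as (x, y) pairs.

```python
import math

def major_axis_angle(pixels):
    """Angle of the blob's long axis in degrees from the x-axis, in (-90, 90]."""
    n = len(pixels)
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n
    # Second-order central moments describe the spread of the blob.
    mu20 = sum((x - cx) ** 2 for x, _ in pixels)
    mu02 = sum((y - cy) ** 2 for _, y in pixels)
    mu11 = sum((x - cx) * (y - cy) for x, y in pixels)
    return math.degrees(0.5 * math.atan2(2 * mu11, mu20 - mu02))

# Hypothetical diagonal blob, ten pixels long.
blob = [(i, i) for i in range(10)]
# The same blob turned through 180 degrees about the origin.
flipped = [(-x, -y) for x, y in blob]

angle = major_axis_angle(blob)
print(angle)                      # long-axis angle of the blob
print(90 - angle)                 # rotation needed to make the axis vertical
print(major_axis_angle(flipped))  # same angle: the flip is invisible
```

Because the moments are built from squared and paired offsets, a blob and its 180-degree rotation give the same angle. The axis can be made vertical, but deciding which end is the top needs extra information.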
There is no open database of clear, classified images of chromosomes to use with machine learning. Therefore, only about 100 images were used and the features had to be chosen by hand. With much more pre-labelled data, the machine learning would likely have worked a lot better.
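As a rough sketch of classifying with hand-chosen features on a small dataset, the Python below uses a nearest-neighbour rule with leave-one-out scoring. Everything in it is an assumption made for illustration: the two features (relative length and centromere position), the class labels, and the numbers, which are invented toy values rather than real measurements. It is not the model used in the project.

```python
import math

def nearest_neighbour(train, features):
    """Return the label of the training example closest to `features`."""
    return min(train, key=lambda ex: math.dist(ex[0], features))[1]

def leave_one_out_accuracy(examples):
    """Classify each example using all the others; return the fraction correct."""
    correct = 0
    for i, (features, label) in enumerate(examples):
        rest = examples[:i] + examples[i + 1:]
        if nearest_neighbour(rest, features) == label:
            correct += 1
    return correct / len(examples)

# Invented toy data: (relative length, centromere position) -> hypothetical class.
toy = [
    ((1.00, 0.48), "long"),
    ((0.95, 0.47), "long"),
    ((0.30, 0.30), "short"),
    ((0.28, 0.31), "short"),
]

print(nearest_neighbour(toy, (0.90, 0.50)))
print(leave_one_out_accuracy(toy))
```

Leave-one-out scoring is one way to get an accuracy estimate from roughly 100 images without setting aside a separate test set.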
Accomplishments that we're proud of
Provided the pre-processing has been performed, the image processing function generally identifies chromosomes correctly and produces a clear image.
The machine learning would have worked much better with far more data; there was probably nothing wrong with the code itself.
What we learned
A much larger dataset is needed for successful machine learning.
Image processing cannot differentiate objects that are very close together and about the same shade of grey.
What's next for Automatic Karyotyping System
Build a GUI between the image processing and machine learning stages that allows a human user to reject images and flip images the other way up.
Get a larger dataset to improve the machine learning process.