Quantum DNA Analysis

Inspiration

Hello! Our names are Arjun Bakhale and Sasanka SN. In today's world, with a growing need for medical care, we have created an extremely efficient and useful solution. We present our DNA analysis program!

Our DNA analysis program makes use of the capabilities of quantum computers to process the immense amount of data that arises from studying DNA. Processing thousands of genes, each of which contains hundreds of characters, can produce data that occupies hundreds of thousands of bytes. This can be seen in the genome of Salmonella Paratyphi A, which has 4381 genes! This, on top of the processing power already consumed by other tasks and by regular OS and UI upkeep, leads to many inconveniences in trying to process this essential data with traditional computers. That is where quantum computers come in!

A quantum computer’s immense speed comes from the principle of superposition, which allows it to process large amounts of information simultaneously. Our “classical” program is run on, and further integrated with, quantum algorithms to aid users in processing the massive amounts of data that are imperative for further research.

The program works by simulating the intracellular process of protein synthesis. In protein synthesis, the DNA helix in the nucleus unwinds, and the strand traditionally represented as the bottom or left strand is used to create mRNA (messenger RNA). RNA is a single-stranded counterpart of DNA that uses the bases Adenine (A), Uracil (U), Guanine (G), and Cytosine (C), whereas DNA uses Thymine (T) instead of Uracil. A pairs only with T (or U in RNA), and G pairs only with C. After the DNA is unwound, an RNA strand is assembled from the bases complementary to the DNA strand; this process is called transcription. The mRNA is then sent out of the nucleus into the cytoplasm of the cell, where it is “translated” in the ribosome with the help of tRNA. The ribosome reads the mRNA in sets of three bases, called codons, and uses each codon to identify a specific amino acid. The amino acids are linked together to form proteins, the basic building blocks of life, which go on to accomplish various functions in cells.
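The two steps described above, transcription and translation, can be sketched classically in a few lines of Python. This is only an illustrative sketch, not our actual implementation: the names `transcribe`, `translate`, `DNA_TO_MRNA`, and `CODON_TABLE` are hypothetical, the codon table is the standard genetic code written with one-letter amino acid symbols, and we assume translation starts at the first base and ends at the first stop codon. Strand direction (5' to 3') is ignored for simplicity.

```python
from itertools import product

# Complementary pairing from a DNA template base to its mRNA base.
DNA_TO_MRNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

# Standard genetic code, built compactly: codons are enumerated in
# U, C, A, G order and matched to one-letter amino acid symbols.
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino
    for codon, amino in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna: str) -> str:
    """Build the mRNA strand complementary to a DNA template strand."""
    return "".join(DNA_TO_MRNA[base] for base in dna.upper())


def translate(mrna: str) -> list[str]:
    """Read the mRNA three bases at a time and list the amino acids.

    Assumption: reading starts at index 0 and stops at the first stop
    codon; a trailing incomplete codon is ignored.
    """
    amino_acids = []
    for i in range(0, len(mrna) - 2, 3):
        amino = CODON_TABLE[mrna[i:i + 3]]
        if amino == "*":
            break
        amino_acids.append(amino)
    return amino_acids


mrna = transcribe("TACAAACCGATT")
print(mrna)             # AUGUUUGGCUAA
print(translate(mrna))  # ['M', 'F', 'G']
```

Here the template `TACAAACCGATT` becomes the mRNA `AUGUUUGGCUAA`, which reads as the codons AUG (Methionine), UUU (Phenylalanine), GGC (Glycine), and the stop codon UAA.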

The program allows the user to upload a .txt file that contains the complete DNA sequence as it would be in the nucleus. The user then calls a function to transcribe the DNA into mRNA, stores the result in a string variable, and calls a second function to translate it codon by codon, printing a list of the amino acids produced in the order they were coded for in the DNA strand. Lastly, the user can call the check function, which takes two DNA strands of the same length as parameters and compares the second strand to the first, checking for mutations. If there is a mutation, it prints out the kind of mutation and the location of the base where it begins.
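As a rough illustration of the check step, the sketch below compares two equal-length strands and reports the first position where they differ. It is a simplified stand-in, not our actual code: `clean_sequence` and `check` are hypothetical names, we assume the uploaded .txt file may contain line breaks or lowercase letters, and we only report a base substitution, since the full set of mutation kinds the program distinguishes is not listed here. Positions are counted from 1.

```python
def clean_sequence(text: str) -> str:
    """Normalize the contents of a .txt file into one uppercase strand.

    Assumption: whitespace and line breaks are not part of the sequence.
    """
    return "".join(text.split()).upper()


def check(reference: str, sample: str):
    """Compare `sample` to `reference` and report the first mutation.

    Returns None when the strands match, otherwise a tuple of
    (kind, position, original_base, mutated_base) with a 1-based position.
    """
    if len(reference) != len(sample):
        raise ValueError("both strands must have the same length")
    for position, (ref_base, new_base) in enumerate(zip(reference, sample), start=1):
        if ref_base != new_base:
            return ("substitution", position, ref_base, new_base)
    return None


reference = clean_sequence("tacaaa\nccgatt\n")
sample = clean_sequence("TACAAACTGATT")
print(check(reference, sample))  # ('substitution', 8, 'C', 'T')
```

Returning a tuple instead of printing directly keeps the comparison easy to reuse; a caller can print the result in whatever format suits the user.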

Our program will greatly aid scientists trying to unravel the mysteries of DNA, companies that provide commercial DNA analysis, students learning about protein synthesis (who can input base sequences and see the results), and research projects in which DNA must be processed. It can also be integrated with software that may be able to track synthesized proteins and map their influence on the cell and organism, as in cancer research, where mutations and problems in protein synthesis play a role. This software will assist many groups in the continued study of DNA and its mysteries and potential.