Visualization Tool for Mapping Genomic Intersections

Inspiration

We chose this topic after seeing the impact it could have on people's lives. Without reliable and efficient visualization tools for researchers, identifying and analyzing rare diseases becomes more difficult. People often go years without a proper diagnosis. When patients are diagnosed sooner, their treatments generally have better results. This visualization tool should make it easier for doctors and researchers to identify areas of interest within a patient's genome, contributing to earlier diagnoses.

What it does:

This visualizer shows the overlap between different experimental trials. Each trial is its own BED file. The more the trials overlap, the more confident researchers can be that a particular physical interaction occurs in a laboratory experiment. We started with a local directory containing the provided BED files. In this program, you can upload an arbitrary number of BED files of genomic data for comparison.
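To make the input concrete, here is a minimal sketch of reading one trial's BED data in plain Python. It is not the project's code, which relies on pybedtools. The function name `parse_bed` and the sample records are hypothetical. It assumes only the first three BED columns (chromosome, start, end) matter, with coordinates 0-based and half-open as in the BED format.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # (start, end): 0-based, half-open, as in BED


def parse_bed(text: str) -> Dict[str, List[Interval]]:
    """Group the intervals of one BED file by chromosome (hypothetical helper)."""
    by_chrom: Dict[str, List[Interval]] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the optional header lines a BED file may carry.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        by_chrom.setdefault(chrom, []).append((start, end))
    # Sort each chromosome's intervals by start position.
    for intervals in by_chrom.values():
        intervals.sort()
    return by_chrom


# Made-up records standing in for one experimental trial.
example_trial = (
    "chr1\t150\t300\tpeak_b\n"
    "chr1\t100\t200\tpeak_a\n"
    "chr2\t10\t50\tpeak_c\n"
)
parsed = parse_bed(example_trial)
```

Each trial parsed this way becomes a dictionary keyed by chromosome, which is a convenient shape for comparing trials one chromosome at a time.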

The user can adjust three settings: the number of base pairs a shared region must span to count as a significant overlap, the number of trials that must overlap in a region (a higher number increases confidence in the result), and the chromosome to analyze.
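As a rough illustration of how these three settings could interact, the sketch below finds regions on one chromosome that are covered by at least a given number of trials and span at least a given number of base pairs. It uses a plain-Python sweep over interval endpoints rather than pybedtools, so it is a stand-in for the idea, not our backend. The names `shared_regions`, `min_trials`, and `min_bp` are hypothetical, and each trial is assumed to be a dictionary mapping a chromosome name to a list of half-open (start, end) intervals.

```python
from collections import Counter
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # (start, end), half-open


def shared_regions(
    trials: List[Dict[str, List[Interval]]],
    chrom: str,
    min_trials: int,
    min_bp: int,
) -> List[Interval]:
    """Regions on `chrom` covered by >= min_trials trials and >= min_bp long."""
    if min_trials < 1:
        raise ValueError("min_trials must be at least 1")

    # Net change in the number of covering trials at each coordinate.
    deltas: Counter = Counter()
    for trial in trials:
        # Merge intervals inside one trial so a trial counts at most once.
        merged: List[Interval] = []
        for start, end in sorted(trial.get(chrom, [])):
            if merged and start <= merged[-1][1]:
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        for start, end in merged:
            deltas[start] += 1
            deltas[end] -= 1

    regions: List[Interval] = []
    depth = 0
    region_start = None
    for pos in sorted(deltas):
        depth += deltas[pos]
        if depth >= min_trials and region_start is None:
            region_start = pos  # enough trials now overlap here
        elif depth < min_trials and region_start is not None:
            if pos - region_start >= min_bp:
                regions.append((region_start, pos))
            region_start = None
    return regions


# Three made-up trials.
trials = [
    {"chr1": [(100, 200)]},
    {"chr1": [(150, 300)], "chr2": [(0, 10)]},
    {"chr1": [(180, 250)]},
]
two_of_three = shared_regions(trials, "chr1", min_trials=2, min_bp=1)
all_three = shared_regions(trials, "chr1", min_trials=3, min_bp=1)
```

Raising `min_trials` or `min_bp` can only shrink the result, which matches the intent of the controls: stricter settings leave fewer, higher-confidence regions to inspect.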

How we built it:

The backend of the project was developed with pybedtools, a Python wrapper for the BEDTools program. We use pybedtools for the bioinformatic processing of our datasets. The front end of the project was coded in Python 3.6 using the Plotly module and the Dash framework.

Challenges we ran into:

Our biggest problems were in managing the interactions between the user and the backend. It was a challenge to manage how the user passes information to the server to specify the analysis and restrict the results. We overcame this by learning how to use the advanced capabilities of the Dash ecosystem.

Accomplishments that we are proud of:

The team is proud of tackling a challenging problem and developing a novel solution independently. We are proud of being resourceful in our research by learning preexisting bioinformatics programs and leveraging them in our own solution. We are also proud of the new knowledge we gained in the various Python libraries that we used.

What we learned:

We learned a great deal about processing BED files with BEDTools. We learned how to map overlaps, visualize the locations of genomic regions of interest, develop interactive visualizations for the easy exploration of genomic data, and build interactive solutions for the focused study of that data.

What's next for Visualization Tool for Mapping Genomic Intersections:

Next, we plan to develop a user interface for easily uploading multiple BED files to our server for server-side processing. A desktop application could also be worthwhile, as it would likely be faster than a web interface. Our vision for this application is an easy-to-use bioinformatics pipeline for the rapid processing of data from overlap experiments. We hope the program will be used to analyze significant overlaps across many BED files quickly and accurately, improving the efficiency of the research process. That gain in efficiency should shorten the turnaround time from data procurement to insight generation, and we hope it leads to faster diagnoses and faster discovery of important genomic markers.