Inspiration

Waiting for genetic answers to the most difficult disorder questions can put a family's life on hold. While genetic researchers process the genomic data, loved ones undergo test after test, some of them invasive and ultimately useless, because no one can sit idle waiting for an answer while the patient continues to get worse. One part of this genomic data processing that contributes to the wait is the comparison of fragment overlap in .BED files: overlap data associated with reported enhancer-promoter interactions for a given gene. Today, the available tools allow only two to three .BED files to be compared for overlap in a day, which limits the number of experiments researchers can run to get answers. GenoSense is a tool that can significantly increase the number of files compared and reduce the time needed to compare fragment overlap in .BED files. It also visualizes the overlapping data between experiments, getting families faster answers so life can go on.

What it does

Using a graphical user interface (GUI), a user can open multiple .BED files in GenoSense, visualize each file individually, and then have the program analyze the fragment overlaps between the files. The output is a visualization of the overlaps. The user enters the overlap values (positive and/or negative) directly in the GUI. GenoSense delivers this in an application that can be used immediately, without hours of training.

How we built it

GenoSense is a Python-based, MIT-licensed application that runs across computing platforms (Windows, Mac, Linux, ChromeOS, etc.) from a user's desktop.

GenoSense compares BED file fragments and computes the overlap between them. We want to know whether multiple chromosome fragments overlap a single time (many-to-many) and whether a single fragment is overlapped multiple times (single-to-many). Currently, we compute the overlap of two single fragments and store the pair for graphical visualization. The set of overlapping regions can be expanded via the overlap minimum and constrained via the gap minimum. In our future design, we will implement many-to-many and single-to-many overlap.
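To make the pairwise step concrete, here is a minimal sketch of how the overlap of two single fragments could be computed. It is not the GenoSense source: the names `Fragment`, `parse_bed_line`, and `signed_overlap` are hypothetical, and it assumes only the first three BED columns (chromosome, 0-based start, exclusive end) are needed. A positive result is the number of shared bases, a negative result is the size of the gap between the fragments, and zero means they are adjacent.

```python
from typing import NamedTuple, Optional


class Fragment(NamedTuple):
    """One BED record, reduced to its first three columns (hypothetical type)."""
    chrom: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # exclusive (BED convention)


def parse_bed_line(line: str) -> Fragment:
    """Parse a whitespace-separated BED line, ignoring any optional columns."""
    chrom, start, end = line.split()[:3]
    return Fragment(chrom, int(start), int(end))


def signed_overlap(a: Fragment, b: Fragment) -> Optional[int]:
    """Return shared bases (> 0), gap size as a negative number (< 0),
    0 for adjacent fragments, or None if they lie on different chromosomes."""
    if a.chrom != b.chrom:
        return None
    return min(a.end, b.end) - max(a.start, b.start)


a = parse_bed_line("chr1\t100\t200")
b = parse_bed_line("chr1\t150\t250")
c = parse_bed_line("chr1\t210\t300")
print(signed_overlap(a, b))  # 50 shared bases
print(signed_overlap(a, c))  # -10, i.e. a gap of 10 bases
```

Returning one signed number keeps the overlapping pair and its overlap value together, which is convenient when the pair is later stored for visualization.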

Challenges we ran into

Understanding the problem at hand took the team time; it is not just about the math of fragment overlap. Incorporating both positive and negative overlap bases as inputs to the comparison analysis required the team to refactor the code so that the program does not have to be run separately for the length of overlapping segments and for the gap.
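One way to handle both cases in a single run is to treat the user input as one signed threshold. The sketch below is an assumption about how that could look, not the team's actual refactoring: the function `pairs_meeting_threshold` and the `Interval` tuple layout are hypothetical, and the nested loop is a deliberately simple comparison rather than an optimized algorithm. A positive threshold requires at least that many shared bases; a negative threshold also accepts fragments separated by a gap of up to that many bases.

```python
from typing import List, Tuple

# (chromosome, start, end) with BED-style 0-based, half-open coordinates
Interval = Tuple[str, int, int]


def pairs_meeting_threshold(
    file_a: List[Interval], file_b: List[Interval], threshold: int
) -> List[Tuple[Interval, Interval, int]]:
    """Return every (a, b, signed_overlap) whose signed overlap >= threshold."""
    hits = []
    for a in file_a:
        for b in file_b:
            if a[0] != b[0]:
                continue  # fragments on different chromosomes never match
            # Positive: shared bases. Negative: gap between the fragments.
            signed = min(a[2], b[2]) - max(a[1], b[1])
            if signed >= threshold:
                hits.append((a, b, signed))
    return hits


file_a = [("chr1", 100, 200)]
file_b = [("chr1", 150, 250), ("chr1", 210, 300), ("chr2", 100, 200)]

# Require at least one shared base: only the first chr1 fragment matches.
print(len(pairs_meeting_threshold(file_a, file_b, 1)))    # 1
# Tolerate gaps of up to 10 bases: the second chr1 fragment matches too.
print(len(pairs_meeting_threshold(file_a, file_b, -10)))  # 2
```

Because the same comparison serves both the overlap-length case and the gap case, the analysis does not need a separate run for each.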

Accomplishments that we're proud of

A graphical user interface that analyzes .BED files and visualizes the overlapping fragments between 2 .BED files in 30 seconds, given a user-supplied minimum base overlap (+/-).

What we learned

We have a better appreciation for the difficulties that families affected by genetic disorders face in finding answers. Useful applications require iterating on code: releasing the current code for use, learning what works through user feedback, and then iterating and releasing more code. The goal is to improve the application constantly rather than waiting until everything is complete, so that users start getting answers sooner and get faster, better answers with every release.

What's next for GenoSense

Our application will be run by various genetic researchers to ensure that all combinations of .BED files can be analyzed. Upcoming releases will include:

  • Callouts that pop up on the overlap results visualization, showing the user the number of overlapping bases for a selected section.
  • Adaptive input of the number of overlaps ("k" value)
  • A move to parallel computing
  • Algorithm and framework optimization (numpy, Cython)

Collaborating with HudsonAlpha researchers to improve their daily work will be key to continuing to give Faster Answers...so Life can Go On.
