Gene Academy

Inspiration

Imagine you're a young kid going to your first day of school. You're super excited, but also very anxious, since you have no idea how everyone will treat you. You arrive with your parents and look around for your friends. You see your friend Marissa across the room and head over to her. She asks, "Why do you have green eyes? Your parents have brown eyes."

Shock flashes across your face as you realize she is right. The doubts you've always had about whether your parents are your biological parents come to the surface.

The next few weeks are crazy. You talk very little to your parents, and they become increasingly worried. You finally decide to go online to find out how this could be possible, but there is no easy source, and everything you find is too complicated to understand. Ten years ago, that kid was Lomeli.

Today, access to genetic information is far cheaper than it used to be. Massive datasets are available online to the public, but it seems like you'd need a PhD just to understand them. We want to democratize access to information on genetics for the millions of people who need to understand more but don't have the means to do so.

People often wonder where they come from because of their phenotypes and are curious about how those phenotypes work. Gene Academy provides the solution.

What it does

Gene Academy is an app that helps you learn about genes. It includes a database in which you can look up information about specific or random genes. It also includes an avatar whose phenotypes you can edit, specifically skin color, hair color, eye color, hair curliness, and freckles. As you change the phenotypes, you can see the corresponding changes in the genotypes and read the specifics of how each phenotype is affected by genes.

How we built it

We began our project by creating assets in applications such as Sketch, but the bulk of the programming was done in Android Studio, an IDE for Android development that builds Java code into an app that runs on the phone.

To build the gene database, we wrote a "data-miner" in Python that traversed the NIH website and collected relevant information about the more than 1300 genes available, such as each gene's function, its location, and a picture of that location.

For the phenotypes, we conducted extensive research to learn which genes affect each phenotype. We then created concept maps to visualize how the genotype affects the phenotype.

To present the genetic information in both a comprehensive and easy-to-digest manner, we created buttons and drop-down menus that would change an avatar's appearance based on user input as well as present the information retrieved from the NIH website's database.

Once we had retrieved the information from the database (and stored it in text files), all that was left to do was put everything in place, which our teammate Cristian Lomeli did expertly.

Challenges we ran into

At first, we were unsure where to obtain assets for our visual avatars, but we eventually found several GitHub repositories that had graphical representations of avatars.

At that point we had not yet decided between web and mobile development. Unfortunately, the repository we chose had no documentation and poor coding style. We spent several hours trying to understand the CSS behind the assets, but the large number of CSS and JavaScript files proved too difficult to untangle, so we decided to move to Android development.

One of the most interesting challenges came from the user interface. We offered the user many options, and as development went on it looked more and more likely that we would have to copy the same code for each one, which would reduce flexibility and increase the chance of errors. Two of our teammates realized they could write a single function that returns the relevant information instead, which saved about an hour of work.
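The idea of one function replacing many copied blocks can be sketched roughly as follows. The app itself was written in Java; this is a Python illustration only, and the names `TraitInfo`, `TRAIT_TABLE`, and `describe_choice`, as well as the summary strings, are hypothetical placeholders rather than the team's actual code or data.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TraitInfo:
    genes: Tuple[str, ...]  # gene symbols to show for this selection
    summary: str            # explanation text to show for this selection


# Hypothetical lookup table. The summaries are placeholders,
# not the text shipped in the app.
TRAIT_TABLE = {
    ("eye color", "brown"): TraitInfo(("OCA2", "HERC2"), "Placeholder summary for brown eyes."),
    ("eye color", "green"): TraitInfo(("OCA2", "HERC2"), "Placeholder summary for green eyes."),
    ("freckles", "yes"): TraitInfo(("MC1R",), "Placeholder summary for freckles."),
}


def describe_choice(trait: str, option: str) -> str:
    """Return the display text for one menu selection.

    Every button or drop-down handler can call this one function
    instead of carrying its own copy of the lookup logic.
    """
    info = TRAIT_TABLE.get((trait.lower(), option.lower()))
    if info is None:
        return f"No information available for {trait}: {option}."
    genes = ", ".join(info.genes)
    return f"{trait.title()} ({option}): genes {genes}. {info.summary}"
```

Adding a new option then means adding one row to the table rather than another copy of the handler code.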

The challenge on the scripting side was twofold: 1) how to find where the data was located, and 2) how to actually retrieve it. After some discussion, we decided to use Python, with libraries such as requests and BeautifulSoup, to gather the information. Luckily, the information on the NIH website was organized in a very structured manner: the genes were sorted first by initial letter and then by name, and the URLs reflected this, which made the information very easy to gather.
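A minimal sketch of how such a letter-then-name layout could be walked is shown here. The base URL is a placeholder and the exact path pattern is an assumption, since the writeup does not give the real addresses; the function names are hypothetical as well.

```python
import string

# Placeholder address, NOT the real NIH URL (assumption for illustration).
BASE_URL = "https://example.org/genes"


def letter_index_url(letter: str) -> str:
    """URL of the page listing all genes that start with `letter`."""
    return f"{BASE_URL}/{letter.lower()}"


def gene_page_url(gene_name: str) -> str:
    """URL of one gene's page, assuming a <letter>/<name> path pattern."""
    if not gene_name:
        raise ValueError("gene_name must not be empty")
    return f"{letter_index_url(gene_name[0])}/{gene_name}"


def all_index_urls() -> list:
    """One index URL per letter, a through z."""
    return [letter_index_url(c) for c in string.ascii_lowercase]


def fetch_gene_text(gene_name: str, timeout: float = 10.0) -> str:
    """Download one gene page and return its visible text."""
    # Third-party imports are kept local so the URL helpers above
    # work even when requests and BeautifulSoup are not installed.
    import requests
    from bs4 import BeautifulSoup

    response = requests.get(gene_page_url(gene_name), timeout=timeout)
    response.raise_for_status()  # stop on HTTP errors instead of parsing an error page
    soup = BeautifulSoup(response.text, "html.parser")
    return soup.get_text(separator="\n", strip=True)
```

Because the URLs are predictable, the crawl reduces to a loop over `all_index_urls()` followed by one `fetch_gene_text` call per gene name found on each index page.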

One last challenge was memory usage. We decided to bundle all the NIH information as .txt and .png files so that the app would not need internet access, but while accessing these files we frequently ran out of memory. The cause was that we were loading all 1300 chromosome images at once and simply changing their transparency! To fix this, we loaded and showed each chromosome image only when it actually needed to appear.
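The load-on-demand fix can be illustrated with a small sketch. The app did this in Java on Android; the Python below only shows the idea. The class name `LazyImageStore`, the `<gene>.png` file naming, and the cap on cached images are assumptions added for illustration and are not described in the original fix.

```python
from collections import OrderedDict
from pathlib import Path


def read_png(path: Path) -> bytes:
    """Default loader: read the raw image bytes from disk."""
    return path.read_bytes()


class LazyImageStore:
    """Loads an image only when it is first requested.

    At most `max_cached` images are kept in memory; the least recently
    used one is dropped when the limit is exceeded (an added assumption).
    """

    def __init__(self, image_dir, loader=read_png, max_cached=8):
        self._dir = Path(image_dir)
        self._loader = loader
        self._max = max_cached
        self._cache = OrderedDict()

    def get(self, gene_name):
        if gene_name in self._cache:
            # Already loaded: mark as most recently used and reuse it.
            self._cache.move_to_end(gene_name)
            return self._cache[gene_name]
        # Not loaded yet: read it now, assuming files are named <gene>.png.
        data = self._loader(self._dir / f"{gene_name}.png")
        self._cache[gene_name] = data
        if len(self._cache) > self._max:
            self._cache.popitem(last=False)  # evict the least recently used image
        return data

    def cached_count(self):
        return len(self._cache)
```

Nothing is read at construction time, so memory use grows with the number of images actually viewed rather than with the size of the whole dataset.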

Accomplishments that we're proud of and What we learned

Learned a LOT about genetics and bioinformatics
Built relationships with experts in the space who could give us mentorship
Solved a problem which our own teammate Lomeli experienced in his childhood

What's next for Gene Academy

Utilize machine learning and NLP to create an overview for the genes from the details
Compare two avatars and their genes
Create an avatar from a photo to analyze genes directly from a photo
Create a quiz-based game from the gene database
Ship to the Play Store and help millions understand how genes work