Butylated hydroxyanisole (BHA) is a potent antioxidant commonly used as a food preservative. Both the National Institutes of Health and the International Agency for Research on Cancer consider BHA a possible human carcinogen, so we wanted to find an alternative.

What it does

We searched a massive collection of molecules for alternatives to BHA, with the goal of finding a preservative that avoids its suspected cancer risk.

How we built it

Using a Jupyter notebook, we sifted through 1.9 million different molecules. We computed a molecular fingerprint for each molecule, then used RDKit (a cheminformatics library) to calculate the Tanimoto similarity between each molecule and BHA, which gave us a shortlist of 100 alternatives. To maximize efficiency and prevent our computers from frying, we wrote a custom data structure and algorithm in Python that mixes a max-heap with Timsort.
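To make the ranking step concrete, here is a minimal sketch of a Tanimoto-based top-k search. It is not our actual notebook code: the function names `tanimoto` and `top_k_similar` are hypothetical, fingerprints are represented as plain sets of on-bit indices so the example runs without RDKit (in the real project both the fingerprints and the similarity came from RDKit), and a bounded min-heap from Python's `heapq` stands in for our custom max-heap/Timsort mix.

```python
import heapq
from typing import FrozenSet, Iterable, List, Tuple


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto similarity: shared on-bits divided by total distinct on-bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def top_k_similar(
    query: FrozenSet[int],
    candidates: Iterable[Tuple[str, FrozenSet[int]]],
    k: int = 100,
) -> List[Tuple[float, str]]:
    """Return the k candidates most similar to `query`, best first.

    Only k entries are held in memory at a time, so the candidate
    stream can be much larger than RAM.
    """
    heap: List[Tuple[float, str]] = []  # min-heap: weakest kept score at heap[0]
    for name, fingerprint in candidates:
        score = tanimoto(query, fingerprint)
        if len(heap) < k:
            heapq.heappush(heap, (score, name))
        elif score > heap[0][0]:
            # New candidate beats the weakest one kept so far: swap it in.
            heapq.heapreplace(heap, (score, name))
    # Final ordering uses Python's built-in sort.
    return sorted(heap, reverse=True)


# Toy data: made-up bit sets, not real molecular fingerprints.
toy_query = frozenset({1, 2, 3, 4})
toy_candidates = [
    ("mol_a", frozenset({1, 2, 3, 4})),
    ("mol_b", frozenset({1, 2})),
    ("mol_c", frozenset({9})),
    ("mol_d", frozenset({1, 2, 3, 5})),
]
shortlist = top_k_similar(toy_query, toy_candidates, k=2)
```

Keeping only the current best k scores means each of the 1.9 million molecules costs at most one heap update, and the full list never has to be sorted.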

Challenges we ran into

We had never used any of these libraries, Jupyter notebooks, Anaconda (which was required for RDKit), or PostgreSQL (which we used to go through the database), so there were points where we were a little confused, but a mentor helped us out :)

Accomplishments that we're proud of

Honestly, outside of Python (and even that was new for 2 members), everything we did was new to us. We learned a lot about the field of cheminformatics doing this project.

What we learned

Python, RDKit, cheminformatics, PostgreSQL, Jupyter notebooks, Tanimoto similarity... basically everything in the project.

What's next for Alternative to BHA

Probably some lab work. After we analyze the 100 shortlisted molecules to narrow the list further, it would be great to test them in a lab!

Also, with my new domain, I'm going to start a wacky food blog :) Give me a couple of days to set it up and it'll be rolling!
