Inspiration
We believed that this prompt would challenge us and encourage us to try new approaches. We were also interested in the collagen gene and in seeing whether it has any conserved regions.
What it does
Our project takes in protein sequences for a specific gene and lets you view the multiple sequence alignment through 3D models and phylogenetic trees.
How we built it
We used MUSCLE for the multiple sequence alignment, Biopython and Trex-online for phylogenetic tree generation, PyMOL for 3D model generation, and SWISS-MODEL to produce the PDB files needed for the 3D models. UniProt helped us obtain the sequences for the alignment.
We did some file manipulation so that the sequences would work with the online tools we found. We generated the trees with the Biopython package, which we learned during the project.
PyMOL was used to generate the 3D models. Since the prompt references a human disease, we chose to align the models with the Homo sapiens model to show the conserved regions. Since shape implies function, the output suggests that there is a large conserved region.
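The kind of file manipulation described above can be sketched as follows. This is only an illustration, not the team's actual script: it assumes the input uses UniProt-style FASTA headers with an `OS=` organism field, and the function names (`read_fasta`, `species_label`, `clean_fasta`) and the two short example records are made up for the demonstration.

```python
import re

# Hypothetical input: made-up accessions and toy sequences in a
# UniProt-style header layout (the OS= field holds the organism name).
EXAMPLE = """>sp|X00001|DEMO_HUMAN Example protein OS=Homo sapiens OX=9606 GN=DEMO PE=1 SV=1
MFSFVD
LRLL
>tr|X00002|DEMO_FELCA Example protein OS=Felis catus OX=9685
MFSFVDSRLL
"""

def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

def species_label(header):
    """Return the organism name from the OS= field, with spaces replaced."""
    match = re.search(r"OS=(.+?)(?= [A-Z]{2}=|$)", header)
    # Fall back to the first token when no OS= field is present.
    name = match.group(1) if match else header.split()[0]
    return name.replace(" ", "_")

def clean_fasta(text, width=60):
    """Rewrite headers as short species labels and re-wrap the sequences."""
    lines = []
    for header, seq in read_fasta(text):
        lines.append(">" + species_label(header))
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"

print(clean_fasta(EXAMPLE))
```

Short, space-free labels tend to survive online alignment and tree tools better than full database headers, which is why the sketch keeps only the species name.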
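Conserved regions can also be estimated directly from the multiple sequence alignment. The sketch below is a sequence-level counterpart to the structural alignment described above, not the PyMOL workflow itself. It assumes the alignment is already loaded as a mapping from species label to gapped sequence; the function names, the `min_fraction` threshold, and the toy alignment are all hypothetical.

```python
from collections import Counter

def conserved_columns(aligned, min_fraction=1.0):
    """Return 0-based alignment columns whose most common residue
    (gaps excluded) appears in at least min_fraction of the sequences."""
    seqs = list(aligned.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must all have the same length")
    columns = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs if s[i] != "-")
        if counts and max(counts.values()) / len(seqs) >= min_fraction:
            columns.append(i)
    return columns

def to_runs(indices):
    """Collapse sorted column indices into inclusive (start, end) runs."""
    runs = []
    for i in indices:
        if runs and i == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], i)
        else:
            runs.append((i, i))
    return runs

# Toy alignment with made-up residues, for illustration only.
TOY = {
    "Homo_sapiens":   "MKT-GPPGA",
    "Felis_catus":    "MKTAGPPGS",
    "Grus_americana": "MRT-GPPGA",
}

print(to_runs(conserved_columns(TOY)))
```

The length check matters in practice: sequences that are not all the same length after alignment cannot be compared column by column, so the function raises an error instead of silently truncating.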
The species we used are: Ardeotis kori, C. brachydactyla, Callithrix jacchus, Felis catus, Gorilla gorilla gorilla 01, Gorilla gorilla gorilla 02, Grus americana, Homo sapiens, L. leucoptera, M. alba, Macaca mulatta, Orthonyx spaldingii, Panthera tigris altaica, Pelodiscus sinensis, Sarcophilus harrisii
Challenges we ran into
We could not automate grabbing the PDB files from UniProt, and we had trouble with the sequence alignment because the sequences did not have the correct length. Since we could not automate the collection of PDB files, we selected 14 files plus 1 nearly identical file.
Accomplishments that we're proud of
We met the goals of the project and learned new tools. We also completed the optional conserved-region generation and did some team bonding.
What we learned
Kyle learned PyMOL and generated the 3D models. Toby learned Biopython and tree generation. Austin learned a little Biopython for multiple sequence alignment and is now confident with UniProt and FASTA file manipulation. John learned sequence alignment, Biopython, another type of tree generation, and FASTA-to-PDB file conversion.
What's next for Team 4
Future directions:
- better models using AlphaFold
- interactive trees and 3D models
- potential drug targets
- to further support our hypothesis, we would look at affinity