Inspiration
In the New Jersey community around Rutgers University, pharmaceutical companies develop protein formulations at a higher density than anywhere else in the world. Furthermore, the RCSB PDB (Research Collaboratory for Structural Bioinformatics Protein Data Bank), which supplied the training data for Google's AlphaFold, the work behind a Nobel Prize in Chemistry, is hosted at Rutgers University. As a researcher with the Rutgers Artificial Intelligence and Data Science Collaboratory, I develop generalizable models for protein formulation stability. Specifically, I work with polymer protein hybrids (PPH) to enhance protein stability, and much of my time in the lab is spent analyzing and simulating experimental PPH data. This project serves as an initial screening tool, automating a large portion of the early experimentation across the chemical design space.
What it does
After the user searches for a PDB structure by name or uploads a file of their choosing, a rotating 3D structure of the protein is displayed. To the right of that main visualization panel, an overview of the protein's chemical design space informs future monomer decisions. Users can also upload stability data from experiments run in their own labs, which drives an automated exploration of the chemical design space exhibited by the PPH. As an added bonus, an AI assistant backed by a RAG pipeline over polymer chemistry and protein papers can be called upon for real-time conversation and experiment troubleshooting. The database for this RAG pipeline is continuously updated. User information is stored via encrypted Auth0 architecture, ensuring PDB and PPH data are cached securely.
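To give a rough idea of what a design space overview can start from, here is a minimal sketch that summarizes the residue chemistry of a PDB file. It is not the project's actual code: the function names, the four coarse residue classes, and the assignment of residues to them are all simplifying assumptions made for illustration. The download URL follows the pattern RCSB commonly uses for PDB-format files, and the parsing relies on the fixed-column layout of ATOM records.

```python
from collections import Counter

# Assumed, simplified grouping of the 20 standard residues (illustrative only).
RESIDUE_CLASS = {
    "ALA": "hydrophobic", "VAL": "hydrophobic", "LEU": "hydrophobic",
    "ILE": "hydrophobic", "MET": "hydrophobic", "PHE": "hydrophobic",
    "TRP": "hydrophobic", "PRO": "hydrophobic",
    "SER": "polar", "THR": "polar", "ASN": "polar", "GLN": "polar",
    "TYR": "polar", "CYS": "polar", "GLY": "polar",
    "LYS": "positive", "ARG": "positive", "HIS": "positive",
    "ASP": "negative", "GLU": "negative",
}


def rcsb_download_url(pdb_id: str) -> str:
    """Build the download URL for a PDB-format entry (hypothetical helper)."""
    return f"https://files.rcsb.org/download/{pdb_id.upper()}.pdb"


def residue_class_fractions(pdb_text: str) -> dict[str, float]:
    """Return the fraction of residues in each coarse chemical class.

    Each residue is counted once, identified by chain ID plus residue
    number and insertion code. Only ATOM records are read, so waters and
    ligands stored as HETATM records are ignored.
    """
    seen = set()
    counts = Counter()
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        res_name = line[17:20].strip()   # columns 18-20: residue name
        chain_id = line[21:22]           # column 22: chain identifier
        res_seq = line[22:27]            # columns 23-27: number + insertion code
        key = (chain_id, res_seq)
        if key in seen:
            continue
        seen.add(key)
        counts[RESIDUE_CLASS.get(res_name, "other")] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cls: n / total for cls, n in counts.items()}
```

The text of an uploaded file, or of a file fetched from the URL above, can be passed straight to residue_class_fractions. The resulting fractions are only a first-pass summary; a real monomer screen would likely also weigh surface exposure and local environment.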
How it was built
Hosting: GitHub and Hugging Face
GUI: Streamlit
Voice agent: Gemini 2.5 Flash, ElevenLabs
Authentication/Login: Auth0
Proteins: RCSB Protein Data Bank
Challenges
Web deployment was the main struggle. Running a Streamlit app from GitHub while keeping secret keys private proved very challenging. Once I overcame that issue, I tried to host on DigitalOcean, but the MLH credit redemption failed. I therefore turned to Hugging Face for deployment. The demo video was filmed from a local deployment. The AI chat assistant, which can be called via ElevenLabs, works perfectly, but the screen recording does not capture the assistant's audio. Because a large portion of the recording of the agent was silent, I attached a screenshot instead.
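One common way to keep keys out of a public repository is to read them from environment variables at startup and fail early when any are missing. The sketch below illustrates that pattern only; it is not the project's actual configuration, and the key names and the load_secrets helper are hypothetical.

```python
import os
from typing import Mapping

# Hypothetical secret names; the real app may use different ones.
REQUIRED_KEYS = ("GEMINI_API_KEY", "ELEVENLABS_API_KEY", "AUTH0_CLIENT_SECRET")


def load_secrets(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Collect the required secrets from the environment.

    Raises RuntimeError naming every missing key, so a misconfigured
    deployment stops at startup rather than on the first API call.
    """
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing secrets: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}
```

Because the values never appear in source files, the same code can run locally with exported variables and on a hosted platform that injects its configured secrets into the environment.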
What's next for ProteinPRO: Protein Polymer Reactivity Optimizer
Initial predictability metrics will be updated with actual stability data.

