Inspiration
We envision a future where allergies can be cured with a vaccine that induces immune tolerance. Our project centers on epitope research: assessing how likely a peptide is to activate B-cells or T-cells. Our aim is to find peptides that avoid triggering B-cells (which cause allergic reactions) while engaging Tregs to promote immune tolerance. Building prediction tools to identify these peptides became our mission.
What It Does
Features:
- Epitope Predictions: Conformational B, Linear B, Class I T, and Class II T epitopes.
- Sequence and Structure Support: Accepts a protein sequence or a PDB ID with chain selection.
- Prediction Method Selection: Supports different epitope prediction methods.
- Peptide Analysis: Breaks down proteins into peptides for detailed analysis.
- Job History & Sharing: Tracks prediction jobs and provides sharing options, sortable data tables, and downloadable results.
- Conformational Epitope Visualization: Visual representation for conformational B-cell epitopes.
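The Peptide Analysis feature above can be illustrated with a sliding window over the protein sequence. This is a minimal sketch, not the platform's actual implementation: the function name `split_into_peptides` and the default length of 9 residues are assumptions chosen for illustration.

```python
def split_into_peptides(sequence, lengths=(9,)):
    """Return (1-based start position, peptide) for every window of each length.

    Hypothetical helper: the name and the default length are illustrative
    assumptions, not taken from the project.
    """
    seq = sequence.strip().upper()
    peptides = []
    for k in lengths:
        # range() is empty when the sequence is shorter than k
        for start in range(len(seq) - k + 1):
            peptides.append((start + 1, seq[start:start + k]))
    return peptides


if __name__ == "__main__":
    for position, peptide in split_into_peptides("ACDEFGHIKL", lengths=(9,)):
        print(position, peptide)
```

Each peptide keeps its start position so that per-peptide scores can be mapped back onto the parent protein.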
Tools:
Class I T-Cell Epitope Tool:
- Inputs: Protein Sequence, Class I Alleles, Prediction Methods.
- Outputs: Peptide Sequence, TCR Recognition Probability, MHC Binding Affinity (per allele), pMHC Stability (per allele).
Class II T-Cell Epitope Tool:
- Inputs: Protein Sequence, Class II Alleles, Prediction Methods.
- Outputs: Peptide Sequence, TCR Recognition Probability, MHC Binding Affinity (per allele), pMHC Stability (per allele).
Linear B-Cell Epitope Tool:
- Inputs: Protein Sequence, Prediction Methods.
- Outputs: B-Cell Immunogenicity, BCR Recognition Probability.
Conformational B-Cell Epitope Tool:
- Inputs: Protein Sequence, PDB ID, Protein Chain, Prediction Methods.
- Outputs: Residue Details (residue position and amino acid, e.g., 23:K), RSA, ASA, N-glycosylation, BCR Recognition (work-in-progress), Surface Accessibility.
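As an illustration of the per-residue outputs, the sketch below flags candidate N-glycosylation sites using the standard N-X-S/T sequon (X is any residue except proline) and labels them in the position:residue format shown above. It is an assumption that the tool uses a sequon scan; the function name `find_n_glyc_sites` is hypothetical.

```python
import re

# N-X-S/T sequon, where X is any residue except proline.
# The lookahead lets overlapping sequons (e.g. "NNTS") all be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_n_glyc_sites(sequence):
    """Return labels such as '3:N' for each asparagine that starts a sequon.

    Hypothetical helper for illustration; positions are 1-based.
    """
    seq = sequence.strip().upper()
    return [f"{m.start() + 1}:{seq[m.start()]}" for m in SEQUON.finditer(seq)]


if __name__ == "__main__":
    print(find_n_glyc_sites("MKNGSANPTNNTS"))
```

A sequon match only marks a potential site; whether it is actually glycosylated depends on structural context such as surface accessibility.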
How We Built It
We built the web platform using the T3 stack (Next.js, Supabase, TypeScript) combined with FastAPI for backend functionality and AWS for infrastructure. We drew inspiration from the design philosophies of IEDB and SEMA 2.0, and we used published research papers to calibrate our approaches, departing from them where we saw room to improve.
Challenges We Ran Into
IEDB API Rate Limits: The IEDB Tools API imposed restrictive rate limits. We worked around them by obtaining access to the models from DTU Health Tech with an academic email and hosting them ourselves on EC2.
ESM3 Fine-Tuning Issues: We had trouble converting the fine-tuning approach used by SEMA from ESM2 to ESM3. SEMA used:
    _, esm1v_alphabet = esm.pretrained.esm1v_t33_650M_UR90S_1()
    self.esm1v_batch_converter = esm1v_alphabet.get_batch_converter()
We attempted using PreTrainedTokenizerFast and even building a custom protein sequence tokenizer, but couldn’t complete it in time.
SageMaker Deployment Issues: SageMaker does not accept OCI-formatted ECR images, which caused delays. Docker was defaulting to the OCI format, so we needed a third-party tool for conversion. Additionally, the ESM3 dependencies (esm, 6GB) significantly increased our Docker image size, resulting in long build times (over 30 minutes) and preventing us from deploying our TCR prediction model within the hackathon timeframe. We also weren't sure how to send requests to the API without bundling the client, which would have saved space.
Accomplishments That We're Proud Of
- Successfully built a functional web platform for running epitope analysis jobs.
- Ventured into TCR recognition prediction, an underexplored area, achieving competitive performance compared to other tools in the field.
What We Learned
- Gained deeper insight into the Bio x ML space and the application of generative biology models.
- Recognized significant untapped potential in epitope prediction, particularly for applications in drug discovery for immune-related diseases.
What's Next for the Epitope Prediction Web Platform
- Finish Fine-Tuning ESM3: Complete the fine-tuning process and deploy all prediction models.
- Production-Ready Platform: Continue refining and stabilizing the web platform for production use.
Built With
- amazon-ec2
- api-gateway
- colab
- docker
- drizzle-orm
- ecr
- esm2
- esm3
- fastapi
- lambda
- lambda-api-gateway
- netmhciipan
- netmhcpan
- next.js
- openapi-ts
- python
- sagemaker
- sklearn
- supabase
- tailwind
- trpc
- typescript
- vercel