Inspiration

Adverse drug reactions (ADRs) are a major cause of hospitalizations, drug recalls, and regulatory warnings. Traditional drug interaction databases are limited, static, and rule-based, missing complex biochemical interactions. Inspired by the need for a more dynamic, explainable, and AI-driven approach, we developed a system that learns from real-world data, retrieves scientific knowledge, and predicts risks more effectively.

What it does

✅ Predicts side effects from existing drugs using PubChem & SIDER datasets.
✅ Analyzes drug combinations to forecast potential adverse interactions.
✅ Uses chemical laws & Retrieval-Augmented Generation (RAG) to filter impossible interactions before AI processing.
✅ Incorporates molecular structures, body fluid reactions, and toxicity insights to improve accuracy.
✅ Provides interpretable explanations for predictions using real biochemical mechanisms.

How we built it

1️⃣ Data Integration: Combined SIDER side effects, PubChem molecular data, and drug interaction datasets.
2️⃣ Rule-Based Filtering: Applied known chemical interaction laws (charge, polarity, metabolism) to rule out unlikely interactions.
3️⃣ Retrieval-Augmented Generation (RAG): Pulled relevant scientific knowledge from FDA, PubMed, and biomedical literature.
4️⃣ Graph Neural Networks (GNNs): Modeled drug interactions as molecular graphs to predict unknown interactions.
5️⃣ Hybrid Model Execution: If RAG finds a known interaction, we skip AI processing to improve efficiency.
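
The control flow of steps 2, 3, and 5 can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Drug` fields, the `passes_rules` heuristic, the `KNOWN_INTERACTIONS` dictionary (standing in for the retrieval step), and the `model` callable (standing in for the GNN) are all hypothetical. Checking the lookup before the rules is also an assumption made here, so that a documented interaction is never discarded by a coarse rule. Polarity is left out for brevity and could be added as another field.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Drug:
    name: str
    charge: int          # net charge, used only by the toy rule below
    enzymes: frozenset   # metabolizing enzymes (illustrative labels)


# Hypothetical stand-in for what the retrieval step would return.
KNOWN_INTERACTIONS = {
    frozenset({"warfarin", "aspirin"}): "increased bleeding risk",
}


def passes_rules(a: Drug, b: Drug) -> bool:
    """Toy plausibility check: keep a pair only if the drugs share a
    metabolizing enzyme or carry opposite charges."""
    shares_enzyme = bool(a.enzymes & b.enzymes)
    opposite_charge = a.charge * b.charge < 0
    return shares_enzyme or opposite_charge


def assess_pair(a: Drug, b: Drug, model: Callable[[Drug, Drug], float]) -> dict:
    """Return a result dict; the model is called only as a last resort."""
    known = KNOWN_INTERACTIONS.get(frozenset({a.name, b.name}))
    if known is not None:
        # Known interaction found: skip the model entirely.
        return {"source": "retrieval", "risk": None, "note": known}
    if not passes_rules(a, b):
        # Pair ruled out before any model inference.
        return {"source": "rules", "risk": None, "note": "filtered as implausible"}
    return {"source": "model", "risk": model(a, b), "note": "model prediction"}
```

Only pairs that are neither documented nor ruled out reach `model`, which is where the efficiency gain described in step 5 would come from.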

Challenges we ran into

🔹 PubChem API Limitations – Frequent server busy errors required rate-limiting, retries, and synonym lookups.
🔹 Sparse Data Problem – Many drug combinations lack labeled interactions, requiring semi-supervised learning.
🔹 Complex Molecular Graphs – Scaling graph neural networks to handle high-dimensional biochemical structures.
🔹 Interpretable AI – Ensuring AI explanations align with real-world chemical knowledge.
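
The retry and synonym-lookup handling mentioned for the PubChem API could look something like the sketch below. It is an assumption-laden illustration: `ServerBusyError`, `with_retries`, and `lookup_with_synonyms` are hypothetical names, and `fetch_by_name` stands in for whatever call actually queries PubChem (for example a thin wrapper that raises `ServerBusyError` on a busy response). The exponential backoff schedule is one common choice, not necessarily the one we used.

```python
import time
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


class ServerBusyError(Exception):
    """Hypothetical error standing in for a 'server busy' response."""


def with_retries(fetch: Callable[[], T], max_attempts: int = 4,
                 base_delay: float = 0.5,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fetch(), retrying on ServerBusyError with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fetch()
        except ServerBusyError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)  # wait 0.5s, 1s, 2s, ...
    raise AssertionError("unreachable")


def lookup_with_synonyms(names: Iterable[str],
                         fetch_by_name: Callable[[str], Optional[T]],
                         **retry_kwargs) -> Optional[T]:
    """Try each synonym in order; return the first non-None result."""
    for name in names:
        result = with_retries(lambda: fetch_by_name(name), **retry_kwargs)
        if result is not None:
            return result
    return None
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays.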

Accomplishments that we're proud of

✅ Successfully integrated AI, RAG, and rule-based filtering for drug safety analysis.
✅ Optimized drug interaction prediction by eliminating non-relevant pairs before AI processing.
✅ Achieved interpretable AI predictions backed by real biochemical pathways.
✅ Reduced training time by up to 40% using rule-based pre-filtering.

What we learned

🔬 Scientific knowledge + AI = Best results – Pure deep learning is insufficient; chemical laws improve interpretability.
🔬 RAG enhances AI pipelines – Retrieving scientific evidence before predicting drug interactions improves accuracy.
🔬 Efficient data processing matters – Rate limits and API optimizations are crucial for real-world applications.
🔬 Hybrid AI approaches scale better – Combining rules + retrieval + AI outperforms black-box deep learning.

What's next for AI-Powered Drug Interaction & Side Effect Prediction

🔹 Expand dataset coverage – Integrate more toxicity studies, clinical trials, and real-world drug use cases.
🔹 Real-time API for drug safety checks – Allow pharmaceutical researchers to query interaction risks dynamically.
🔹 Explainable AI dashboard – Visualize drug interactions, molecular mechanisms, and confidence scores.
🔹 Integration with drug discovery pipelines – Improve preclinical drug testing by forecasting adverse reactions early.

Built With

  • Programming Languages: Python
  • Machine Learning Frameworks: PyTorch, PyTorch Geometric (GNNs)
  • Data Processing: Pandas, NumPy
  • Drug Data Sources: PubChem, SIDER, drug interaction databases
  • API & Data Retrieval: PubChemPy, Requests
  • Knowledge Retrieval: Retrieval-Augmented Generation (RAG) using Facebook RAG, Hugging Face Transformers
  • Graph Neural Networks (GNNs): GCNConv (Graph Convolutional Networks)
  • Cloud & Compute: AWS/GCP (optional for scaling), local Mac M4 Max Pro
  • Version Control & Collaboration: Git, GitHub
  • Development Environment: Conda, Jupyter Notebook