Inspiration

While searching the web for ideas, I read the research paper for the gpt-oss model, which described applications of the model in many fields, including medicine. That made me wonder whether it could become a research assistant for diseases that have no cure, so that scientists know which directions to explore in the search for one. My grandfather suffers from Parkinson's disease, for which there is no cure. For him the disease started at a very old age, but many people suffer from chronic diseases like this that have no cure. I hope this project can bring a change in medical science.

What it does

At its core, Synapse automates the process of reading and connecting vast amounts of scientific literature to uncover insights that a human researcher might miss. It operates in a three-step process:

Ingests Data: It automatically fetches and reads the abstracts of hundreds of recent medical research papers from databases like PubMed based on a specific topic (e.g., "Parkinson's and the LRRK2 gene").

Builds a Knowledge Graph: Using a powerful language model, it extracts key concepts (like genes, proteins, drugs, and diseases) and the relationships between them (e.g., "inhibits," "promotes," "treats"). It then pieces this information together into a large, interconnected network of knowledge, often called a knowledge graph.

Generates Novel Hypotheses: This is the project's key feature. Synapse analyzes the knowledge graph to find "hidden connections"—indirect links between two concepts that no single paper has explicitly stated. It then uses these connections to formulate new, testable scientific hypotheses, suggesting promising directions for future research.
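
The hidden-connection search in the third step can be pictured as a two-hop path search: concept A links to B, B links to C, but no extracted relationship links A and C directly. The sketch below is a minimal illustration of that idea, using plain dictionaries in place of the project's graph. The function name `find_hidden_connections` and the sample triples are hypothetical and are not taken from Synapse's code or from any paper.

```python
from collections import defaultdict


def find_hidden_connections(triples):
    """Return (a, b, c) chains where a->b and b->c exist but a and c share no direct edge.

    `triples` is an iterable of (subject, relation, object) tuples.
    """
    successors = defaultdict(set)  # node -> nodes it points to
    linked = set()                 # unordered pairs joined by a direct edge
    for subj, _relation, obj in triples:
        successors[subj].add(obj)
        linked.add(frozenset((subj, obj)))

    hidden = []
    for a in sorted(successors):
        for b in sorted(successors[a]):
            # .get avoids creating new keys while scanning
            for c in sorted(successors.get(b, ())):
                if c != a and frozenset((a, c)) not in linked:
                    hidden.append((a, b, c))
    return hidden


# Illustrative, made-up triples (placeholders, not real findings).
sample_triples = [
    ("drug_X", "inhibits", "protein_Y"),
    ("protein_Y", "promotes", "disease_Z"),
    ("drug_X", "treats", "disease_W"),
]

print(find_hidden_connections(sample_triples))
```

Each returned chain is only a candidate: it says that two concepts share an intermediate, which is the kind of indirect link that could then be handed to the language model to phrase as a hypothesis.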

How we built it

The project was built in Python as a modular five-step pipeline:
- Data Collection: Using the pymed library to fetch research paper abstracts from PubMed.
- AI Extraction: Sending the abstracts to a large language model, gpt-oss-120b, to extract medical concepts and relationships into a structured JSON format.
- Knowledge Graph: Using the networkx library to assemble the extracted information into a graph where concepts are nodes and relationships are edges. A critical feature is the "Trust Layer," which links every piece of data back to its source paper.
- Hypothesis Generation: Analyzing the graph to find indirect connections and using the LLM again to formulate them into clinical hypotheses.
- User Interface: Using streamlit to create an interactive dashboard that orchestrates the entire process and displays the results.
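
As a rough illustration of how the AI Extraction and Knowledge Graph steps could fit together with the "Trust Layer," the sketch below parses a JSON reply into edges and records which paper each edge came from. The JSON schema (`relationships`, `subject`, `relation`, `object`), the function `add_extraction`, and the PMID-style identifiers are all assumptions made for this example, since the write-up does not specify the actual format. A plain dictionary stands in for the networkx graph.

```python
import json

REQUIRED_FIELDS = ("subject", "relation", "object")  # assumed schema, not Synapse's


def add_extraction(graph, pmid, llm_reply):
    """Merge one paper's extracted relationships into `graph`.

    `graph` maps (subject, relation, object) -> set of source paper IDs,
    so every edge stays traceable to the papers that support it.
    Returns the number of relationships accepted from this reply.
    """
    try:
        payload = json.loads(llm_reply)
    except json.JSONDecodeError:
        return 0  # the model returned non-JSON text; skip this paper

    relationships = payload.get("relationships") if isinstance(payload, dict) else None
    if not isinstance(relationships, list):
        return 0

    accepted = 0
    for item in relationships:
        if not isinstance(item, dict):
            continue
        values = [item.get(field) for field in REQUIRED_FIELDS]
        if not all(isinstance(v, str) and v.strip() for v in values):
            continue  # drop entries with missing or empty fields
        edge = tuple(v.strip().lower() for v in values)
        graph.setdefault(edge, set()).add(pmid)
        accepted += 1
    return accepted


# Hypothetical replies and placeholder paper IDs, for illustration only.
graph = {}
reply_a = json.dumps({"relationships": [
    {"subject": "Drug_X", "relation": "inhibits", "object": "Protein_Y"},
]})
reply_b = json.dumps({"relationships": [
    {"subject": "drug_x", "relation": "inhibits", "object": "protein_y"},
    {"subject": "drug_x", "relation": "treats"},  # malformed: no object
]})
add_extraction(graph, "PMID-0001", reply_a)
add_extraction(graph, "PMID-0002", reply_b)
add_extraction(graph, "PMID-0003", "Sorry, I cannot answer.")

for edge, sources in graph.items():
    print(edge, sorted(sources))
```

With networkx, the same provenance set could be stored as an edge attribute (for example `G.add_edge(subject, obj, relation=relation, sources=sources)`), which keeps each relationship tied to the abstracts it was extracted from.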

Challenges we ran into

The primary challenge was an under-powered system configuration. Running large language models and processing extensive knowledge graphs are resource-intensive tasks that require significant GPU VRAM and system RAM.

Accomplishments that we're proud of

An accomplishment is not truly achieved until it proves useful to humankind. On a personal level, I am proud of creating a full end-to-end AI pipeline for a complex real-world problem.

What we learned

The project served as a hands-on tutorial in several areas:
- Integrating multiple Python libraries to build a full-stack application.
- Applying advanced prompt engineering, understanding the power of knowledge graphs, and using AI for novel scientific discovery.

What's next for Synapse

- Enhancing the tech with an interactive graph, more diverse data sources (like ClinicalTrials.gov), and a fine-tuned specialized AI model.
- Adding scientific features like automated drug repurposing analysis and hypothesis novelty scoring.
- Seeking real-world validation by collaborating with a research lab to test the AI-generated hypotheses experimentally.

Built With

  • gpt-oss-120b
  • matplotlib
  • networkx
  • pymed
  • python
  • streamlit