Inspiration
Diseases have a way of emerging and disrupting lives whenever they get the opportunity. In 1918, the Spanish flu caused a devastating pandemic, and roughly a century later COVID-19, which emerged in late 2019, brought lockdowns, shuttered businesses, and widespread disruption to daily life.
Now, with the rise of HMPV (Human Metapneumovirus), we have developed a model that leverages existing knowledge bases, such as the Drug Target Ontology. This model identifies relationships between diseases by analyzing their semantic and structural connections, enabling it to uncover patterns and propagate insights effectively.
What it does
It's a model that helps uncover patterns between diseases, such as finding a disease's ancestors or its medicines. The user inputs a node (the subject) and an attribute of interest (the predicate), and the model predicts possible nodes (the objects).
How we built it
We first chose our ontology of interest, the Drug Target Ontology (DTO). We then preprocessed it: we removed empty columns and split cell values on '|', since that character separates alternative ("or") values. Finally, we loaded it into Neo4j and checked the basic connection via Modus.
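The preprocessing described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not our actual script: the function name `preprocess`, the column names, and the sample rows are all hypothetical, and the real DTO export has a different layout.

```python
import csv
import io


def preprocess(csv_text, sep="|"):
    """Drop columns that are empty in every row and split multi-valued cells.

    Returns a list of dicts mapping each kept column to a list of values.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    columns = reader.fieldnames or []
    # Keep only columns that hold at least one non-blank value.
    kept = [c for c in columns if any((r.get(c) or "").strip() for r in rows)]
    cleaned = []
    for r in rows:
        cleaned.append(
            {
                # A cell such as "A|B" lists alternative ("or") values.
                c: [v.strip() for v in (r.get(c) or "").split(sep) if v.strip()]
                for c in kept
            }
        )
    return cleaned


# Hypothetical sample data; the "notes" column is empty in every row.
SAMPLE = (
    "id,label,parents,notes\n"
    "DTO_1,kinase,DTO_0|DTO_9,\n"
    "DTO_2,ion channel,DTO_0,\n"
)

records = preprocess(SAMPLE)
for record in records:
    print(record)
```

Storing every cell as a list keeps single-valued and multi-valued columns uniform, which makes it easier to emit one graph edge per value when loading into a graph database.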
We explored various methods to create embeddings for a knowledge base or ontology, experimenting with techniques such as DistMult, TransE, ComplEx, and ComplEx-GCN to generate knowledge graph embeddings. Among these, we found TransE to deliver the best results, so we decided to build our prediction model based on it.
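To illustrate how a TransE-style model turns a (subject, predicate) pair into ranked object candidates, here is a small sketch in plain Python. TransE treats a relation as a translation in embedding space, so a triple is plausible when the subject vector plus the relation vector lands close to the object vector. The embeddings, entity names, and the `predict_objects` helper below are made up for illustration; in the project the embeddings come from a trained model rather than being written by hand.

```python
import math


def transe_score(h, r, t):
    """TransE plausibility: negative L2 distance between h + r and t."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def predict_objects(subject, predicate, entity_emb, relation_emb, k=3):
    """Rank every other entity as a candidate object for (subject, predicate)."""
    h = entity_emb[subject]
    r = relation_emb[predicate]
    scored = [
        (name, transe_score(h, r, t))
        for name, t in entity_emb.items()
        if name != subject
    ]
    # Higher score means a smaller distance, so sort in descending order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical 2-dimensional embeddings, chosen by hand for the example.
ENTITY_EMB = {
    "disease_a": [0.0, 0.0],
    "drug_x": [1.0, 0.0],
    "drug_y": [0.0, 2.0],
    "disease_b": [3.0, 3.0],
}
RELATION_EMB = {"treated_by": [1.0, 0.0]}

top = predict_objects("disease_a", "treated_by", ENTITY_EMB, RELATION_EMB, k=2)
for name, score in top:
    print(name, round(score, 3))
```

In this toy setup `drug_x` ranks first because `disease_a + treated_by` lands exactly on its vector. Real embeddings have many more dimensions and are learned from the graph's triples.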
Initially, we attempted to host our application on Hugging Face but faced challenges integrating it with Modus. As a result, we opted to host the app locally. The app takes input through a separate web interface, which redirects control to Modus. This setup utilizes our pretrained model to generate relevant results.
Using these embedding algorithms, however, introduced a challenge: inconsistent mappings. To address this, we used Neo4j for filtering, so that only the most relevant nodes are added to the graph.
Challenges we ran into
Uploading the CSV data to Neo4j
Setting up Modus
Connecting the AI model to Modus
Connecting Modus to Neo4j
Handling Cypher queries
Accomplishments that we're proud of
A knowledge base model that makes informed predictions and is free to use
What we learned
A lot about GNNs
How to handle embeddings for graph data
What's next for Drug Investigator
Building a RAG pipeline and a GAN, and connecting more external knowledge bases, so that predictions are better informed and need less verification because they are backed by science
Built With
- assemblyscript
- django
- html
- javascript
- pykeen
- python
- torch