Inspiration
Rare diseases affect millions of people worldwide, yet the vast majority have no effective treatment. Drug development for rare diseases is notoriously slow and expensive, and pharmaceutical companies often deprioritize it because patient populations are small. We were inspired by the idea that existing approved drugs could hold the key to treating these overlooked conditions.
What it does
RareRx is an AI-powered platform that predicts therapeutic targets for rare diseases and identifies opportunities to repurpose existing drugs. Given a disease, RareRx analyzes biological pathways, gene-target relationships, and clinical evidence to suggest which drugs already on the market could be viable treatment candidates. It reasons through mechanism compatibility, target-pathway overlap, safety constraints, and human evidence to produce a scored verdict, helping researchers and clinicians make faster, more informed decisions.
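To make the idea of a "scored verdict" concrete, here is a minimal sketch of how the four criteria could be combined. Everything in it is an assumption made for illustration: the `Evidence` fields, the weights, the thresholds, and the choice to treat safety as a hard gate are hypothetical and are not RareRx's actual scoring model.

```python
from dataclasses import dataclass

# Hypothetical weights, for illustration only (not the project's real values).
WEIGHTS = {"mechanism": 0.4, "pathway_overlap": 0.3, "human_evidence": 0.3}


@dataclass
class Evidence:
    mechanism: float        # 0..1, mechanistic compatibility
    pathway_overlap: float  # 0..1, target-pathway overlap
    human_evidence: float   # 0..1, strength of human/clinical evidence
    safety_ok: bool         # False if a safety constraint rules the drug out


def score_candidate(ev: Evidence) -> tuple[float, str]:
    """Return (score, verdict) for one drug-disease pair."""
    if not ev.safety_ok:
        # Assumption: safety acts as a hard gate rather than a weighted term.
        return 0.0, "rejected"
    score = sum(weight * getattr(ev, name) for name, weight in WEIGHTS.items())
    if score >= 0.7:      # hypothetical threshold
        verdict = "candidate"
    elif score >= 0.4:    # hypothetical threshold
        verdict = "needs review"
    else:
        verdict = "unlikely"
    return round(score, 3), verdict
```

Gating on safety first means a drug with strong mechanistic support but a disqualifying safety constraint never surfaces as a candidate, which is one simple way to keep false positives down.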
How we built it
We built the RareRx backend in Python, integrating data from biological databases including Open Targets and ChEMBL to retrieve gene-disease associations, drug-target interactions, and pathway information. We used a large language model (LLM) via API to power the reasoning layer, which lets the system assess mechanistic compatibility between a drug and a disease in natural language. The frontend was built with Next.js, React, Tailwind CSS, and TypeScript to make the tool accessible and easy to interact with.
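Once drug targets and disease-associated genes have been retrieved, their overlap can be summarised before it is handed to the reasoning layer. The sketch below is one simple way to do that using set overlap. The function name, the returned fields, and the use of the Jaccard index are assumptions for illustration, not a description of how RareRx computes overlap.

```python
def target_overlap(drug_targets: set[str], disease_genes: set[str]) -> dict:
    """Summarise how a drug's targets overlap a disease's associated genes."""
    shared = drug_targets & disease_genes
    union = drug_targets | disease_genes
    return {
        "shared": sorted(shared),
        # Jaccard index: |A & B| / |A | B|, defined here as 0.0 for two empty sets.
        "jaccard": len(shared) / len(union) if union else 0.0,
        # Fraction of the drug's own targets that are disease-associated.
        "drug_coverage": len(shared) / len(drug_targets) if drug_targets else 0.0,
    }


# Placeholder symbols, not real gene names.
summary = target_overlap({"GENE_A", "GENE_B"}, {"GENE_B", "GENE_C", "GENE_D"})
print(summary)
```

Reporting both numbers is deliberate: a drug with many targets can have a low Jaccard index even when a meaningful share of its targets is disease-associated.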
Challenges we ran into
One of the biggest challenges was data quality: biological databases often have inconsistent formatting, incomplete target lists, and duplicated entries (for example, the same gene appearing dozens of times in a single drug profile). Mapping drugs to diseases across multiple databases with different schemas was also non-trivial. Another challenge was calibrating the model's confidence so that it didn't produce false positives by labeling a drug as a candidate without sufficient mechanistic evidence.
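The duplicated-gene problem described above can be handled with a small normalisation step before any scoring. This is a minimal sketch, assuming targets arrive as a flat list of gene-symbol strings; `dedupe_targets` is a hypothetical helper name, and case-insensitive matching is an assumption that may not suit every naming scheme.

```python
def dedupe_targets(symbols: list[str]) -> list[str]:
    """Collapse repeated gene symbols, keeping first-seen order and spelling."""
    seen: set[str] = set()
    unique: list[str] = []
    for raw in symbols:
        symbol = raw.strip()
        key = symbol.casefold()  # assumption: compare symbols case-insensitively
        if symbol and key not in seen:  # skip blanks and repeats
            seen.add(key)
            unique.append(symbol)
    return unique


# Placeholder symbols, not real gene names.
print(dedupe_targets(["GENE_A", " gene_a ", "GENE_B", "", "GENE_A"]))
```

Keeping the first-seen spelling, rather than forcing everything to upper case, avoids rewriting symbols whose official form is mixed case.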
Accomplishments that we're proud of
We're proud of building an end-to-end pipeline that goes from a disease name all the way to a scored, reasoned repurposing verdict. The system doesn't just retrieve information; it reasons through it, evaluating mechanism, pathway overlap, safety, and evidence tier together. We're also proud of applying this to rare diseases specifically, a space where computational tools can have real impact given how little research funding these conditions typically receive.
What we learned
We learned how complex the drug-target-disease relationship space really is. Biological data is messy, and building reliable pipelines on top of it requires extensive validation. We also learned the importance of subject matter expert (SME) evaluation: automated scores alone aren't enough, and human pharmacological judgment is essential to assess whether a model's reasoning is actually correct. This project gave us a much deeper appreciation for both the science and the engineering challenges in computational drug discovery.
What's next for RareRx
We want to expand RareRx to cover a larger catalog of rare diseases and integrate additional data sources such as clinical trial registries and genomic databases. We also plan to improve the scoring model by incorporating more evidence tiers and reducing false positives. Longer term, we envision RareRx as a tool that researchers, rare disease advocacy groups, and even clinicians can use to accelerate the path from hypothesis to treatment for patients who have no other options.
Built With
- bfloat16
- biopy
- colabt4/v100
- google-drive
- huggingfacetransformers
- jupyter
- lora
- lucidereact
- metallama3.1-8b-instruct
- networkx
- next.js16
- numpy
- pandas
- papaparse
- peft
- python3.9+
- qwen
- react18
- scikit-learn
- tailwindcss
- typescript