Problem
Rare disease patients often wait years for a diagnosis, and even after one arrives, the path to a therapeutic strategy is buried across dozens of disconnected databases. Clinicians and researchers have to manually cross-reference OMIM, ClinVar, UniProt, PubMed, gnomAD, and Open Targets just to build a basic picture of a gene's therapeutic potential. We wanted to collapse that process into seconds. RareSight was inspired by the idea that a physician holding a patient's sequencing report shouldn't need a bioinformatics team to answer the question: what do we do next?
What it does
RareSight is a gene therapy target analysis and validation platform for monogenic rare diseases. Given a gene symbol and an optional patient variant in HGVS notation, RareSight automatically queries six major biological databases and returns a structured clinical report. The report covers disease mechanism classification (gain-of-function, loss-of-function, dominant negative); a therapy modality recommendation (AAV gene replacement, ASO/siRNA silencing, CRISPR editing, mRNA therapy); CRISPR guide RNA candidates with quality scores, ranked by proximity to the variant; patient variant resolution against ClinVar, with protein-level feature intersection from UniProt; gnomAD population frequency; allele-specific ASO discrimination feasibility; literature evidence tier; known drugs from Open Targets; and pathway analysis. Therapy-target matches can then be converted into probable validation assays based on existing research and target analogs. To complete the pipeline, users can translate the generated assay into robot-readable code that runs it on an Opentrons Flex liquid handling robot for fully autonomous validation.
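To illustrate what "ranked by proximity to the variant" could look like, here is a minimal Python sketch. It is not RareSight's actual implementation: the names `GuideCandidate` and `find_guides` are hypothetical, it assumes SpCas9 with an NGG PAM, scans the forward strand only, and does not compute any quality score.

```python
import re
from dataclasses import dataclass


@dataclass
class GuideCandidate:
    protospacer: str  # guide_len nucleotides immediately 5' of the PAM
    pam: str          # the NGG motif that follows the protospacer
    cut_site: int     # 0-based index of the first base 3' of the predicted cut
    distance: int     # absolute distance from cut site to the variant


def find_guides(seq: str, variant_pos: int, guide_len: int = 20) -> list[GuideCandidate]:
    """Scan the forward strand for NGG PAMs and rank guides by cut-site distance.

    variant_pos is a 0-based index into seq (an assumption of this sketch).
    """
    seq = seq.upper()
    # Lookahead so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    guides = []
    for m in pattern.finditer(seq):
        # SpCas9 typically cuts 3 bp upstream of the PAM.
        cut = m.start() + guide_len - 3
        guides.append(
            GuideCandidate(m.group(1), m.group(2), cut, abs(cut - variant_pos))
        )
    # Closest cut site first.
    return sorted(guides, key=lambda g: g.distance)
```

A fuller version would also scan the reverse complement and attach an on-target or off-target score to each candidate before ranking.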
Stack
The backend is a Python Flask application wrapping a custom pipeline that orchestrates calls to OMIM, MyGene.info, UniProt, ClinVar (via NCBI Entrez), Ensembl REST, Open Targets GraphQL, gnomAD GraphQL, and PubMed. The pipeline uses dataclasses to build a structured AssayParameters object that captures everything from protein length and AAV feasibility to candidate guide RNA sequences. The frontend is a single-page HTML/CSS/JavaScript interface that renders the report interactively in the browser and sends the structured JSON to a ReportLab-powered PDF export endpoint on the backend.
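As a rough idea of how a dataclass can carry these parameters through to the JSON the frontend consumes, consider the sketch below. The field names, the `fits_in_aav` property, and the 4,700 bp threshold are assumptions made for illustration; the real AssayParameters object may be shaped quite differently.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional

# Assumed rough AAV cargo capacity in base pairs. A real check would also
# budget for the promoter, polyA signal, and ITRs.
AAV_PACKAGING_LIMIT_BP = 4700


@dataclass
class AssayParameters:
    gene_symbol: str
    protein_length_aa: int
    mechanism: Optional[str] = None  # e.g. "loss-of-function"; None if no signal
    guide_candidates: list[str] = field(default_factory=list)

    @property
    def cds_length_bp(self) -> int:
        # Three bases per residue, plus one stop codon.
        return (self.protein_length_aa + 1) * 3

    @property
    def fits_in_aav(self) -> bool:
        return self.cds_length_bp <= AAV_PACKAGING_LIMIT_BP

    def to_json(self) -> str:
        # asdict() only serializes fields, so derived values are added by hand.
        payload = asdict(self)
        payload["fits_in_aav"] = self.fits_in_aav
        return json.dumps(payload)
```

Keeping derived values such as AAV feasibility as properties means they cannot drift out of sync with the fields they are computed from.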
Challenges we ran into
Harmonizing data across six databases with different schemas, rate limits, and reliability was the biggest challenge. HGVS variant notation is notoriously difficult to parse consistently: extracting amino acid positions from cDNA notation requires accounting for intronic offsets, which make genomic coordinates only approximate. gnomAD's GraphQL API required careful query design so that variants absent from the database are handled gracefully. We also had to handle cases where OMIM returns no mechanism signal, or where ClinVar has no exact match for a patient variant.
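The intronic-offset problem can be made concrete with a small Python sketch. This is a simplified illustration, not the parser RareSight uses: `parse_cdna_position` and `CdnaPosition` are hypothetical names, only plain coding positions such as `c.76A>T` or `c.100+5G>A` are handled, and UTR positions (`c.-12`, `c.*5`) are deliberately rejected. For a range such as `c.100_102del`, only the first position is read.

```python
import re
from typing import NamedTuple, Optional


class CdnaPosition(NamedTuple):
    base: int             # nearest coding position (1-based)
    offset: int           # signed intronic offset, 0 if exonic
    is_intronic: bool
    codon: Optional[int]  # 1-based amino acid position, None if intronic


# "c." followed by a coding position and an optional +N / -N intronic offset.
_HGVS_C = re.compile(r"c\.(\d+)([+-]\d+)?")


def parse_cdna_position(hgvs: str) -> Optional[CdnaPosition]:
    """Extract the coding position from a simple HGVS c. expression."""
    m = _HGVS_C.search(hgvs)
    if m is None:
        return None  # unsupported or malformed notation
    base = int(m.group(1))
    offset = int(m.group(2)) if m.group(2) else 0
    # A codon number is only meaningful when the variant sits in an exon.
    codon = (base - 1) // 3 + 1 if offset == 0 else None
    return CdnaPosition(base, offset, offset != 0, codon)
```

Returning `None` for the codon on intronic variants, rather than guessing from the nearest exonic base, keeps the approximation visible to downstream code.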
What's next for RareSight
We want to add support for Cas12a (TTTV PAM) and base editor window validation so guide recommendations are more precise. A natural next step is integration with SpliceAI to assess exon-skipping feasibility for splice-region variants. We'd like to add a patient case history module so clinicians can track multiple variants across a family. Longer term, RareSight could serve as the front end for an automated assay parameter generator, taking the structured output and producing ready-to-order oligonucleotide sequences for ASO or siRNA synthesis.
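For reference, Cas12a support mostly changes where the PAM sits: the TTTV motif (V is A, C, or G) lies 5' of the protospacer, not 3' of it. A minimal Python sketch of that scan follows; `find_cas12a_sites` is a hypothetical name, the 20-nt spacer length is an assumption, and only the forward strand is searched.

```python
import re


def find_cas12a_sites(seq: str, spacer_len: int = 20) -> list[tuple[int, str, str]]:
    """Return (pam_start, pam, spacer) for each TTTV PAM on the forward strand.

    pam_start is a 0-based index into seq. The spacer is the spacer_len
    nucleotides immediately 3' of the PAM.
    """
    seq = seq.upper()
    # V = A, C, or G. Lookahead keeps overlapping sites.
    pattern = re.compile(r"(?=(TTT[ACG])([ACGT]{%d}))" % spacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]
```

Base editor window validation would be a separate check on where the target base falls within the spacer, and is not covered here.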