Inspiration
Between 25 and 30 million Americans are affected by approximately 7,000 rare diseases, according to the National Institutes of Health. Yet around 95 percent of these diseases still have no treatment. Even worse, many physicians receive little training on rare diseases during their medical education; on average, a patient visits six physicians before reaching an expert. These numbers raise troubling questions: how many Americans with rare diseases are going untreated, and how life-threatening is a late diagnosis? Consequences for patients range from up to $517,000 in avoidable costs per patient (EveryLife Foundation for Rare Diseases and the Lewin Group) to prolonged delays in receiving holistic care, which may in turn worsen symptoms and impairments and place an emotional and physical burden on the patient. Our project, MedConnect, aims to reduce these outcomes by streamlining the connection between physicians and specialists. Since most physicians will not encounter every rare disease, our web-based application lets doctors easily connect their patients to specialists, reducing financial burden, alleviating emotional suffering, and encouraging advances in medical research and the understanding of rare diseases.
What it does
At its core, MedConnect makes connecting physicians and patients to specialists easier than ever. The physician only needs to enter the name of the disease into our easy-to-read UI. Our application then searches our database of clustered topics, filters specialists by relevancy score (how closely each specialist’s specialty aligns with the physician’s input), and displays information about each match: location, labs, contact information, and specialties/sub-specialties.
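The lookup described above could be sketched roughly as follows. This is a minimal illustration, not MedConnect's actual code: the `Specialist` record, the `relevance` function (a plain word-overlap score standing in for the real relevancy score), and the `min_score` cutoff are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Specialist:
    """Hypothetical record shape; the real schema is not described in this writeup."""
    name: str
    location: str
    contact: str
    specialties: tuple[str, ...]


def relevance(query: str, specialist: Specialist) -> float:
    """Fraction of query words found in the specialist's specialties (0.0 to 1.0)."""
    query_words = set(query.lower().split())
    if not query_words:
        return 0.0
    specialty_words = {w for s in specialist.specialties for w in s.lower().split()}
    return len(query_words & specialty_words) / len(query_words)


def search(
    query: str, specialists: list[Specialist], min_score: float = 0.5
) -> list[tuple[float, Specialist]]:
    """Score every specialist, drop weak matches, and return the rest best-first."""
    scored = [(relevance(query, s), s) for s in specialists]
    kept = [pair for pair in scored if pair[0] >= min_score]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    # Placeholder entries, not real people or addresses.
    directory = [
        Specialist("Dr. A", "Boston, US", "a@example.org",
                   ("huntington disease", "movement disorders")),
        Specialist("Dr. B", "Lyon, FR", "b@example.org", ("cystic fibrosis",)),
    ]
    for score, doc in search("huntington disease", directory):
        print(f"{score:.2f}  {doc.name}  {doc.location}  {doc.contact}")
```

A word-overlap score is the simplest thing that can be filtered and sorted; a topic-model similarity could be dropped into `relevance` without changing `search`.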
How we built it
- Used Selenium to extract 11,000+ rare disease records from the NORD Rare Disease Database, establishing a structured knowledge base
- Compiled a database of roughly 30,000 experts by programmatically cross-referencing rare disease studies via the ClinicalTrials.gov API, using requests and PyArrow
- Designed and implemented an advanced topic-clustering search algorithm using BERTopic, improving relevance and discoverability
- Built a modern, responsive web interface with React and Next.js, optimized for user experience, readability, and accessibility
- Deployed on Vercel with CI/CD automation, enabling seamless updates and reliable production performance
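As a rough illustration of the cross-referencing step in the list above, the sketch below groups study officials into an expert index. It is not the project's actual pipeline: the function name `index_experts` is hypothetical, the field names (`protocolSection`, `contactsLocationsModule`, `overallOfficials`, `conditionsModule`) reflect our understanding of the ClinicalTrials.gov v2 JSON layout and should be treated as assumptions, and the network fetch (done with requests in the project) is left out so the example runs offline.

```python
from collections import defaultdict


def index_experts(studies: list[dict]) -> dict[str, dict]:
    """Group study officials by name, collecting affiliations and studied conditions.

    Field names are assumed to match the ClinicalTrials.gov v2 study JSON.
    """
    experts: dict[str, dict] = defaultdict(
        lambda: {"affiliations": set(), "conditions": set(), "studies": 0}
    )
    for study in studies:
        protocol = study.get("protocolSection", {})
        conditions = protocol.get("conditionsModule", {}).get("conditions", [])
        officials = protocol.get("contactsLocationsModule", {}).get("overallOfficials", [])
        for official in officials:
            name = official.get("name", "").strip()
            if not name:
                continue  # skip officials without a usable name
            entry = experts[name]
            entry["studies"] += 1
            entry["conditions"].update(conditions)
            if official.get("affiliation"):
                entry["affiliations"].add(official["affiliation"])
    return dict(experts)


if __name__ == "__main__":
    # Placeholder records shaped like API responses; not real studies.
    sample = [
        {"protocolSection": {
            "conditionsModule": {"conditions": ["Huntington Disease"]},
            "contactsLocationsModule": {"overallOfficials": [
                {"name": "Jane Doe", "affiliation": "Example University"}]}}},
        {"protocolSection": {
            "conditionsModule": {"conditions": ["Juvenile Huntington Disease"]},
            "contactsLocationsModule": {"overallOfficials": [
                {"name": "Jane Doe", "affiliation": "Example Hospital"}]}}},
    ]
    for name, info in index_experts(sample).items():
        print(name, info["studies"], sorted(info["conditions"]))
```

Keying on the name alone is a simplification; two different researchers with the same name would be merged, so a real index would likely need a stronger key such as name plus affiliation.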
Challenges we ran into
The first big challenge, which we hit early on, was the lack of centralized public information in the medical field, especially for such a niche problem. We could not find a centralized rare disease database that developers could access, so we had to rely on web scraping. There was also no single consolidated source for finding “rare disease doctors”: Orphanet, a rich database of rare diseases, only lists professionals and institutions in Europe, while resources like the NPI registry offer no way to search for rare disease doctors, who can be categorized under a variety of taxonomies. In the end, we decided to use studies as a way to find leading experts in rare diseases.
Another big challenge was optimizing our sorting and clustering algorithm, since we had to balance accuracy with speed on a large dataset. To solve this, we implemented a dual-path architecture: simple queries use fast cluster lookups, while complex searches use a weighted scoring system. We combined this with caching strategies that precompute normalized text, regex patterns, and keyword lookups to avoid recalculation.
On the search side, we struggled with normalizing queries so that variations like “Parkinson,” “Parkinson’s,” and “Parkinsons” matched correctly, and we also had to filter out irrelevant non-medical keywords without losing valuable context.
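To make the dual-path idea and the query normalization more concrete, here is a minimal sketch under stated assumptions. The `CLUSTERS`, `EXPERT_KEYWORDS`, and `STOPWORDS` tables, the weights, and the function names are all hypothetical stand-ins; the real system's data and scoring are not described in this writeup.

```python
import re
from functools import lru_cache

# Hypothetical data: cluster label -> experts, and expert -> weighted keywords.
CLUSTERS = {"parkinson": ["Expert A", "Expert B"], "huntington": ["Expert C"]}
EXPERT_KEYWORDS = {
    "Expert A": {"parkinson": 3.0, "juvenile": 2.0},
    "Expert B": {"parkinson": 3.0, "tremor": 1.0},
    "Expert C": {"huntington": 3.0, "juvenile": 1.0},
}
# Hypothetical list of non-medical words to drop from queries.
STOPWORDS = frozenset({"a", "the", "of", "for", "with", "doctor", "specialist"})

# Patterns are compiled once at import time instead of on every query.
POSSESSIVE = re.compile(r"['’]s\b")
PUNCTUATION = re.compile(r"[^\w\s]")


@lru_cache(maxsize=4096)
def normalize(query: str) -> tuple[str, ...]:
    """Lowercase, strip possessives and punctuation, fold known plurals, drop stopwords."""
    text = POSSESSIVE.sub("", query.lower())
    text = PUNCTUATION.sub(" ", text)
    tokens = []
    for token in text.split():
        # Fold a trailing "s" only when the shorter form is a known cluster label,
        # so "parkinsons" becomes "parkinson" but "sclerosis" is left alone.
        if token not in CLUSTERS and token.endswith("s") and token[:-1] in CLUSTERS:
            token = token[:-1]
        if token not in STOPWORDS:
            tokens.append(token)
    return tuple(tokens)


def search(query: str) -> list[str]:
    tokens = normalize(query)
    # Fast path: a single token that names a cluster is a plain dictionary lookup.
    if len(tokens) == 1 and tokens[0] in CLUSTERS:
        return list(CLUSTERS[tokens[0]])
    # Slow path: sum keyword weights per expert and rank by total score.
    scores = {
        expert: sum(weights.get(t, 0.0) for t in tokens)
        for expert, weights in EXPERT_KEYWORDS.items()
    }
    return sorted((e for e, s in scores.items() if s > 0),
                  key=lambda e: scores[e], reverse=True)


if __name__ == "__main__":
    print(search("Parkinsons"))            # fast path
    print(search("juvenile Parkinson's"))  # weighted path
```

Because `normalize` is wrapped in `lru_cache` and returns an immutable tuple, repeated queries skip the regex work entirely, which is one simple way to get the "avoid recalculation" behavior mentioned above.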
Accomplishments that we're proud of
- 11,000+ rare diseases catalogued
- ~30,000 rare disease experts found, including location and contact information
- 165 different countries represented across our dataset, including Vatican City!
- Sub-700-millisecond search response time
- Frontend and backend integration and deployment via Vercel
What we learned
For most of us, this was our first hackathon, so there was certainly a learning curve. We’re all talented programmers in our own right, but getting a team together to design, delegate, and ultimately create a project that we’re all proud of was an entirely different endeavor with its own challenges (and rewards). Learning to code in a group environment, using tools like GitHub to streamline our collaboration, and adapting to that whole structure was at times frustrating, but incredibly rewarding and valuable. We are now fluent in git commands and have all taken away at least one key lesson: pull often and commit your changes! In addition, none of us had any full-stack experience, as most of us were more experienced with backend work. Although we had a vision for how our frontend would look, implementing it was another matter. Most of us could write simple HTML, CSS, and JS, but learning more sophisticated technologies like React and Next.js is what let us realize that vision. Integrating our frontend and backend was a further challenge, and it felt like a true accomplishment when it finally clicked.
What's next for MedConnect
The current version of MedConnect uses a fixed database extracted from the ClinicalTrials.gov API. In the future, we intend to update our database in real time, adding new researchers and specialists as new papers and breakthroughs emerge. To improve functionality, MedConnect will offer filters that let physicians specify which locations to search in and whether certain insurance plans cover the cost of seeing the specialist. Furthermore, we’d like to visualize the topic clustering we use to determine which specialists are “most relevant” to a physician’s search. Since the symptoms of rare diseases are often unknown or closely related, this could help physicians identify other closely related rare diseases that their patient might have, as misdiagnosis is also a large issue in healthcare.
Built With
- bertopic
- clinicaltrials.gov
- flask
- nextjs
- pyarrow
- python
- react
- selenium
- typescript
- vercel
