Inspiration
Modern drug safety monitoring relies heavily on reporting systems like the FDA Adverse Event Reporting System (FAERS). However, most adverse drug events are never formally reported, so these systems capture only a fraction of the symptoms patients actually experience. While patients may not submit reports to regulators, they often discuss their experiences online in forums and communities dedicated to specific medications and conditions.
We were inspired by the idea that these conversations contain valuable evidence that is often available long before traditional reporting systems can identify emerging trends. We wanted to build a platform that could transform unstructured patient experiences into actionable drug safety insights and help bridge the gap between patients, researchers, and regulators.
What it does
PharmaWatch is an ML-powered pharmacovigilance platform that transforms patient-reported experiences into early warning signals for emerging Adverse Drug Events (ADEs).
The platform analyzes discussions from medication-specific Reddit communities, extracts drug-symptom relationships, and aggregates them into structured safety signals. These signals are then compared against FDA labeling data and validated using FAERS data.
Users can search for a medication and investigate potential adverse events that may not yet appear on official drug labels.
How we built it
Frontend: Built with React, Vite, Tailwind CSS, and DaisyUI, with Recharts powering interactive timelines and signal visualizations. Clerk handles authentication and user management.
Data & Backend: Firebase Firestore stores drug data, safety signals, platform statistics, and user watchlists. A Python pipeline processes Reddit data and uploads structured results to Firestore.
NLP Pipeline: Using BioBERT, BERT-based classifiers, and custom extraction logic, we identify drugs, symptoms, causality indicators, onset timing, and severity from patient discussions.
Signal Analysis: We aggregate reports into safety signals using reporter counts, onset consistency, causality confidence, and subreddit diversity, then compare them against FDA labeling data from DailyMed.
AI Assistant: We integrated Gemini 2.5 Flash with Google Search grounding to help users explore drug safety data and detected signals through a conversational interface.
We used Claude for assistance with debugging and minor code help.
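To make the Signal Analysis step more concrete, here is a minimal sketch of how reports might be aggregated into one score per drug-symptom pair using the four factors named above. The `Report` schema, the weights, the caps, and the onset-consistency formula are all illustrative assumptions, not the values or code PharmaWatch actually uses.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class Report:
    """One extracted patient report (hypothetical schema)."""
    drug: str
    symptom: str
    reporter: str       # anonymized author identifier
    subreddit: str
    onset_days: float   # days between starting the drug and symptom onset
    causality: float    # classifier confidence in [0, 1]


def score_signals(reports, weights=(0.4, 0.2, 0.3, 0.1),
                  reporter_cap=20, subreddit_cap=5):
    """Return {(drug, symptom): score}; weights and caps are assumed values."""
    groups = defaultdict(list)
    for report in reports:
        groups[(report.drug, report.symptom)].append(report)

    w_reporters, w_onset, w_causality, w_diversity = weights
    scores = {}
    for key, group in groups.items():
        reporters = {r.reporter for r in group}
        subreddits = {r.subreddit for r in group}
        onsets = [r.onset_days for r in group]

        # More distinct reporters means a stronger signal, capped at 1.0.
        reporter_score = min(len(reporters) / reporter_cap, 1.0)
        # Onset consistency: 1.0 when all onsets agree, lower as spread grows.
        avg_onset = mean(onsets)
        spread = pstdev(onsets) / avg_onset if avg_onset > 0 else 0.0
        onset_score = 1.0 / (1.0 + spread)
        # Average causality confidence across the group's reports.
        causality_score = mean(r.causality for r in group)
        # Reports spread across several communities are harder to explain away.
        diversity_score = min(len(subreddits) / subreddit_cap, 1.0)

        scores[key] = round(
            w_reporters * reporter_score
            + w_onset * onset_score
            + w_causality * causality_score
            + w_diversity * diversity_score,
            3,
        )
    return scores
```

Counting distinct reporters rather than raw posts keeps one very active user from inflating a signal, which is the same concern that motivates deduplication.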
Challenges we ran into
One of our biggest challenges was dealing with noisy social media data. Patients describe symptoms using informal language, abbreviations, and misspellings, which makes it harder to detect which symptom is actually being described.
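As a rough illustration of the problem, the sketch below maps informal or misspelled mentions onto a canonical symptom list using a small lookup table plus fuzzy string matching from Python's standard `difflib` module. The term lists and the cutoff are made-up examples, and this simple approach is a stand-in for illustration rather than the model-based extraction our pipeline uses.

```python
from difflib import get_close_matches

# Hypothetical canonical vocabulary; a real one would be far larger.
CANONICAL_SYMPTOMS = ["nausea", "headache", "insomnia", "dizziness", "fatigue"]

# Hypothetical informal phrases mapped to canonical terms.
INFORMAL_TERMS = {
    "cant sleep": "insomnia",
    "can't sleep": "insomnia",
    "wiped out": "fatigue",
    "queasy": "nausea",
}


def normalize_symptom(mention, cutoff=0.8):
    """Return the canonical symptom for a mention, or None if nothing is close."""
    # Lowercase and collapse whitespace before any lookup.
    cleaned = " ".join(mention.lower().split())
    if cleaned in INFORMAL_TERMS:
        return INFORMAL_TERMS[cleaned]
    # Fall back to fuzzy matching to absorb simple misspellings.
    matches = get_close_matches(cleaned, CANONICAL_SYMPTOMS, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

A misspelling such as "Naussea" resolves to "nausea" here, while an unknown term such as "rash" returns `None` instead of being forced onto the nearest entry.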
Another challenge was distinguishing true adverse event reports from more general conversation. We had to differentiate between a user asking whether a side effect exists and a user reporting that they personally experienced it.
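To show why this distinction is hard, here is a deliberately naive rule-based baseline that separates questions from first-person reports. The patterns and labels are assumptions made for illustration only. PharmaWatch relies on BERT-based classifiers for this step, precisely because rules like these break down quickly.

```python
import re

# Hypothetical patterns; real posts are far more varied than this.
FIRST_PERSON = re.compile(r"\b(i|i've|i'm|my|me)\b", re.IGNORECASE)
EXPERIENCE = re.compile(
    r"\b(got|had|have|having|experienced|developed|started|noticed|getting)\b",
    re.IGNORECASE,
)
QUESTION = re.compile(
    r"\?\s*$|^\s*(does|do|did|has|is|can|could|will|anyone|anybody)\b",
    re.IGNORECASE,
)


def classify_post(text):
    """Label a post as 'personal_report', 'question', or 'other'."""
    is_question = bool(QUESTION.search(text))
    is_personal = bool(FIRST_PERSON.search(text) and EXPERIENCE.search(text))
    if is_personal and not is_question:
        return "personal_report"
    if is_question:
        return "question"
    return "other"
```

A post like "I had nausea, is that normal?" is both a report and a question, and this baseline labels it only as a question, so a genuine report would be lost. Cases like that are why a learned classifier is a better fit.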
Signal validation also required careful thought. Building mechanisms for deduplication and reporter verification became essential to improving the platform's reliability.
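A minimal sketch of the deduplication idea, assuming each extracted report is a dictionary with hypothetical `reporter`, `drug`, and `symptom` keys: each reporter is counted at most once per drug-symptom pair, so repeated posts by the same person do not inflate a signal. Reporter verification is not covered here.

```python
def deduplicate_reports(reports):
    """Keep the first report per (reporter, drug, symptom), ignoring case."""
    seen = set()
    unique = []
    for report in reports:
        key = (
            report["reporter"],
            report["drug"].strip().lower(),
            report["symptom"].strip().lower(),
        )
        # Skip any later report that repeats an already-seen combination.
        if key not in seen:
            seen.add(key)
            unique.append(report)
    return unique
```

Keeping the first occurrence preserves the original order of the reports, which matters if later steps rely on posting dates.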
Accomplishments that we're proud of
We are proud of building a complete end-to-end platform rather than just a proof-of-concept model. It brings data collection, NLP extraction, signal scoring, FDA label comparison, visualization, authentication, personalized dashboards, and an AI research assistant together into a single experience.
We are also proud that, instead of surfacing every reported symptom, PharmaWatch evaluates signal quality through multiple verification layers, which makes the results more accurate and meaningful.
What we learned
Through this project, we gained experience working with natural language processing, information extraction, semantic similarity models, and large-scale data aggregation.
Most importantly, we learned how AI and ML can be used to enhance public health and drug safety efforts by extracting valuable insights from real-world patient experiences.
What's next for PharmaWatch
Our next goal is to expand beyond Reddit and incorporate additional sources of real-world evidence, including patient forums, public health datasets, and other healthcare communities.
We also plan to improve our NLP models with additional medical training data, implement stronger signal validation methods, and integrate live FAERS data for real-time comparisons.
Long term, we envision PharmaWatch becoming an open, accessible pharmacovigilance platform that helps researchers, clinicians, regulators, and patients identify emerging drug safety concerns earlier and more effectively than traditional reporting systems alone.