Inspiration
Traditional pharmacovigilance (drug safety monitoring) is often reactive and slow, relying on manual reports that take months to process. We built SentinelRx to act as a "Real-Time Nervous System" for drug safety: one that identifies emerging side effects from both official FDA streams and medical discussions on social media (r/Medicine) before they become public health crises.
What it does
SentinelRx is an autonomous surveillance engine that:
- Ingests high-volume data streams through Kafka, including FDA FAERS reports and medical social media.
- Analyzes text using BioBERT to extract specific drug-adverse event pairs.
- Cross-references signals against the SIDER dataset to distinguish between "Known" side effects and "Novel" emerging risks.
- Generates clinical narratives using GPT-4o grounded by a simulated MedGemma-7B reasoning layer, giving doctors a pharmacological explanation for why a specific reaction may be occurring.
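The "Known" versus "Novel" cross-reference step can be pictured with a small sketch. This is an illustration under stated assumptions, not the project's actual code: `KNOWN_SIDE_EFFECTS` and `classify_signal` are hypothetical names, and the dictionary holds placeholder entries rather than real SIDER data.

```python
from typing import Literal

# Hypothetical stand-in for a SIDER lookup table: lowercase drug name mapped
# to its documented side effects. Placeholder entries, not real SIDER data.
KNOWN_SIDE_EFFECTS: dict[str, set[str]] = {
    "drug_a": {"nausea", "headache"},
    "drug_b": {"dizziness"},
}


def classify_signal(drug: str, event: str) -> Literal["Known", "Novel"]:
    """Label a drug-adverse event pair as Known (listed) or Novel (not listed)."""
    known_events = KNOWN_SIDE_EFFECTS.get(drug.strip().lower(), set())
    return "Known" if event.strip().lower() in known_events else "Novel"


# Pairs as they might arrive from the extraction step (assumed shape).
pairs = [("Drug_A", "Nausea"), ("Drug_A", "Rash"), ("Drug_C", "Headache")]
labels = [classify_signal(drug, event) for drug, event in pairs]
```

In this sketch a drug that is missing from the lookup table is treated as having no documented side effects, so every event reported for it is labeled "Novel". A real system would likely also need to normalize drug and event names to a shared vocabulary before the lookup.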
How we built it
We architected a high-concurrency pipeline:
- Back-end: Python and FastAPI coupled with Kafka for stream processing.
- Analytics: DuckDB for lightning-fast disproportionality calculations (PRR/ROR).
- AI Stack: BioBERT for Named Entity Recognition, GPT-4o for narrative synthesis, and a simulated MedGemma-7B layer for specialized clinical grounding.
- Front-end: A premium Glassmorphism Neon Dashboard built with Vanilla CSS/JS for a high-performance, immersive medical monitoring experience.
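For readers unfamiliar with the disproportionality measures named above, here is a minimal sketch of PRR (proportional reporting ratio) and ROR (reporting odds ratio) computed from a 2x2 contingency table. It uses plain Python rather than DuckDB, and the `ReportCounts` class, the function names, and the example counts are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportCounts:
    """2x2 contingency table of report counts for one drug-event pair."""
    a: int  # reports with the drug and the event
    b: int  # reports with the drug, without the event
    c: int  # reports with other drugs and the event
    d: int  # reports with other drugs, without the event


def prr(t: ReportCounts) -> float:
    """Proportional reporting ratio: [a / (a + b)] / [c / (c + d)]."""
    if t.a + t.b == 0 or t.c == 0:
        raise ValueError("PRR is undefined for these counts")
    return (t.a / (t.a + t.b)) / (t.c / (t.c + t.d))


def ror(t: ReportCounts) -> float:
    """Reporting odds ratio: (a * d) / (b * c)."""
    if t.b == 0 or t.c == 0:
        raise ValueError("ROR is undefined for these counts")
    return (t.a * t.d) / (t.b * t.c)


# Made-up counts, chosen only to show the arithmetic.
table = ReportCounts(a=20, b=80, c=100, d=9800)
prr_value = prr(table)  # 0.2 / (100 / 9900), approximately 19.8
ror_value = ror(table)  # (20 * 9800) / (80 * 100) = 24.5
```

Both ratios compare how often the event is reported with the drug of interest against how often it is reported with all other drugs. Values well above 1 suggest disproportionate reporting, though the cut-off used to flag a signal is a design decision that this sketch does not make.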
Challenges we ran into
One of our biggest hurdles was "Signal Noise": differentiating between a random report and a statistically significant safety signal. We addressed this by implementing a disproportionality analysis engine based on PRR and ROR. Additionally, simulating the specialized clinical reasoning of MedGemma without requiring a 15GB local download was a complex engineering task that involved grounding our primary LLM with curated medical knowledge bases.
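One way to picture the grounding approach is a prompt builder that injects curated notes ahead of the question. The sketch below is hypothetical: `KNOWLEDGE_BASE`, `build_grounded_prompt`, and the prompt wording are illustrative assumptions, the notes are placeholders rather than real pharmacology, and no LLM is actually called.

```python
# Hypothetical curated knowledge base: lowercase drug name mapped to short
# reference notes. Placeholder text only, not real pharmacology.
KNOWLEDGE_BASE: dict[str, list[str]] = {
    "drug_a": [
        "Placeholder note on the primary receptor target of drug_a.",
        "Placeholder note on a known metabolic pathway of drug_a.",
    ],
}


def build_grounded_prompt(
    drug: str, event: str, knowledge_base: dict[str, list[str]]
) -> str:
    """Assemble a prompt that restricts the model to the curated notes."""
    notes = knowledge_base.get(drug.strip().lower(), [])
    if notes:
        context = "\n".join(f"- {note}" for note in notes)
    else:
        context = "- (no curated notes available)"
    return (
        "You are assisting with a pharmacovigilance review.\n"
        f"Curated reference notes for {drug}:\n{context}\n\n"
        f"Using only the notes above, explain plausible mechanisms linking "
        f"{drug} to {event}. If the notes are insufficient, say so."
    )


prompt = build_grounded_prompt("Drug_A", "rash", KNOWLEDGE_BASE)
fallback_prompt = build_grounded_prompt("Drug_Z", "rash", KNOWLEDGE_BASE)
```

The resulting string would then be sent to whichever LLM the pipeline uses. Telling the model to rely only on the supplied notes, and to say when they are insufficient, is one common way to reduce unsupported explanations.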
Accomplishments that we're proud of
We are particularly proud of our "Uniqueness Pipeline": the integration of MedGemma reasoning and SIDER benchmarking. Seeing the system correctly identify a "Novel" signal and then explain the underlying cytokine modulation or receptor pathway feels like looking into the future of automated medicine.
What we learned
Building SentinelRx taught us that for high-stakes fields like healthcare, generic AI isn't enough. Domain-specific grounding (like BioBERT and MedGemma) is essential for accuracy. We also learned how to bridge the gap between "Big Data" (millions of FDA records) and "Human-Readable Insights" (a 1-page clinical narrative).
What's next for SentinelRx
The next step for SentinelRx is deep integration with private electronic health records (EHRs) to move from "population-level" signals to "person-specific" safety alerts. We also plan to deploy the full MedGemma-7B model on dedicated medical hardware to improve the precision of our clinical reasoning engine.
Built With
- python
- javascript
- html
- css
- biobert (NER)
- gpt-4o (synthesis)
- medgemma-7b (clinical reasoning)
- apache kafka (streaming)
- duckdb (OLAP analytics)
- fastapi
- langchain
- fda openapi
- praw
- sider
- docker