Inspiration
Have you ever looked at a blood test result and felt like you were reading an ancient, undecipherable script? You see values like Albumin or HbA1c followed by numbers, but unless they are highlighted in red, you have no idea what they mean for your long-term wellness.
The inspiration for MediScan.ai came from the realization that while healthcare is becoming more data-driven, the "data" remains locked behind medical jargon. I wanted to build a tool that empowers patients to walk into their doctor's office with confidence, armed with clear, visualized insights and the right questions to ask.
What it does
MediScan.ai is a comprehensive medical intelligence portal that transforms static, confusing health documents into an interactive, visual experience. It acts as a bridge between the clinical laboratory and the patient's understanding.
Here is a breakdown of the core functionalities:
1. Multimodal Document Processing: The app utilizes advanced Optical Character Recognition (OCR) to "read" medical reports. Whether it is a high-resolution PDF or a slightly blurry smartphone photo of a printed blood test, the AI extracts biomarkers, units, and reference ranges with high precision.
2. Serious Condition Detection: The system logic scans for high-risk biomarkers that indicate serious medical conditions (e.g., anemia, hyperglycemia, or kidney dysfunction).
   - Severity Tagging: Categorizes risks as "Mild," "Moderate," or "Critical" (a minimal sketch of this logic follows the list).
   - Condition Explanation: Provides a plain-English explanation of what the condition is and why the specific data points triggered the alert.
3. AI Clinical Assistant (Contextual Chat): The built-in chat interface is "context-aware." Because it has already analyzed your report, you don't need to re-type your results. You can ask:
   - "What foods should I avoid based on my glucose levels?"
   - "What follow-up questions should I ask my doctor about these liver enzymes?"
How we built it
The project is built on a modern, high-performance tech stack designed for speed and clarity:
- Frontend: Built with React and Tailwind CSS. I utilized a chatbot architecture with a persistent chatbox to manage the complexity of medical data.
- Intelligence: The "brain" of the app is the Gemini 3 Flash API. Its native multimodal capabilities allow it to process both PDFs and smartphone photos of reports with high OCR accuracy (a minimal call sketch follows this list).
- State Management: React's useState and useEffect hooks handle the transition from the landing page to the live analysis portal, ensuring a seamless user experience.
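As an illustration of the Intelligence layer, here is a minimal sketch using the @google/generative-ai JavaScript SDK. The model identifier "gemini-3-flash" and the extraction prompt are assumptions based on this write-up, not the production code:

```typescript
import { GoogleGenerativeAI } from "@google/generative-ai";

// Sketch of the report-ingestion call. The model identifier is assumed
// from the write-up; substitute whatever identifier your API key exposes.
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
const model = genAI.getGenerativeModel({ model: "gemini-3-flash" });

async function analyzeReport(base64Data: string, mimeType: string): Promise<string> {
  const result = await model.generateContent([
    // The PDF or smartphone photo travels inline alongside the instruction.
    { inlineData: { data: base64Data, mimeType } },
    "Extract every biomarker, its value, unit, and reference range from this lab report.",
  ]);
  return result.response.text();
}
```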
Challenges we ran into
The journey wasn't without its hurdles:
- The "Role" Requirement: One of the most frustrating challenges was the GoogleGenerativeAIError. The API strictly requires that chat history starts with a user role. I had to "seed" the conversation with a hidden user prompt to make the initial analysis feel like a natural part of the chat (see the sketch below).
- Data Reliability: Medical data is sensitive. A major challenge was ensuring the AI didn't "hallucinate" values. I solved this by refining the system instructions to prioritize accuracy and flag any data it wasn't 100% sure about.
- Layout Complexity: Fitting a Sidebar, a Visual Dashboard, and a Chatbox on a single screen without it feeling cluttered required multiple iterations of Tailwind's grid system.
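Here is a sketch of that history-seeding workaround, again using the @google/generative-ai SDK; the seed wording and model name are illustrative, not the app's exact prompts:

```typescript
import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
const model = genAI.getGenerativeModel({ model: "gemini-3-flash" });

// The SDK rejects a chat history that opens with a "model" turn, so the
// initial analysis is seeded as a hidden user/model exchange. Both seed
// messages are illustrative wording.
function startReportChat(reportAnalysis: string) {
  return model.startChat({
    history: [
      { role: "user", parts: [{ text: "Here is my lab report. Please analyze it." }] },
      { role: "model", parts: [{ text: reportAnalysis }] },
    ],
  });
}

// Later turns can reference the report without re-pasting any values:
// const chat = startReportChat(analysis);
// const reply = await chat.sendMessage("What should I ask my doctor about my ALT?");
```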
Accomplishments that we're proud of
Building a medical AI application in 2026 is no small feat, especially when balancing clinical accuracy with a high-end user experience. Here are the milestones we are most proud of achieving with MediScan.ai:
- Mastering "JSON-Strict" Multimodal Logic One of our biggest technical wins was forcing a multimodal model (Gemini 3 Flash) to act as a structured data engine. Standard AI often responds with conversational "fluff," but we successfully engineered prompts that ensure the model returns a perfectly formatted JSON schema. This allowed us to map raw medical values directly into our visual components without manual data entry.
What we learned
Building this project was a masterclass in Prompt Engineering and Data Structuring. I learned that:
- AI is only as good as its constraints: To build the Visual Scoreboard, I had to learn how to force an LLM to output pure, valid JSON rather than conversational text (a defensive parsing sketch follows this list).
- Multimodal Context is King: Gemini 3's ability to "see" a document means it understands the spatial relationship of data (e.g., a value sitting next to its reference range) better than a simple text scraper.
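Even with a JSON-constrained response, the model's output is worth validating before it reaches the Visual Scoreboard. A minimal defensive parser, assuming the biomarker array shape sketched earlier:

```typescript
interface BiomarkerRow {
  name: string;
  value: number;
  unit: string;
  referenceRange?: string;
}

// Parse the model's reply and drop anything that doesn't match the
// expected shape, rather than letting a malformed row reach the UI.
function parseBiomarkers(raw: string): BiomarkerRow[] {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    throw new Error("Model did not return valid JSON");
  }
  if (!Array.isArray(data)) throw new Error("Expected a JSON array");
  return data.filter(
    (row): row is BiomarkerRow =>
      typeof row === "object" &&
      row !== null &&
      typeof (row as BiomarkerRow).name === "string" &&
      typeof (row as BiomarkerRow).value === "number" &&
      typeof (row as BiomarkerRow).unit === "string"
  );
}
```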
What's next for MediScan.ai
MediScan.ai is more than just a report reader; it’s a step toward a world where health literacy is a right, not a privilege. By turning "scary" numbers into friendly charts, we can help people take ownership of their biological story.