Inspiration

Many doctors have hard-to-read handwriting, which leaves patients misunderstanding what they are supposed to do.

What it does

MediScan reads doctors' handwriting with a vision model: Microsoft's Florence-2, which we fine-tuned on over 3,000 images to reach 94% accuracy*! The text it reads is then handed to a text LLM that uses custom instructions and RAG to refer to trusted sources (e.g. Mayo Clinic) for information.
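To make the second half of that flow concrete, here is a minimal sketch of the RAG step, assuming a simple keyword-overlap retriever. It is not our actual code: `TRUSTED_SNIPPETS`, `retrieve`, and `build_prompt` are hypothetical names, the snippet text is placeholder wording rather than real medical content, and a real setup would more likely use embeddings than word overlap.

```python
import re

# Hypothetical stand-in for passages pulled from trusted sources.
# The text is placeholder wording, not real medical guidance.
TRUSTED_SNIPPETS = [
    {"source": "Mayo Clinic", "text": "Placeholder passage about amoxicillin dosing."},
    {"source": "Mayo Clinic", "text": "Placeholder passage about ibuprofen and food."},
]


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, snippets, top_k=1):
    """Return up to top_k snippets that share at least one word with the query."""
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(s["text"])), s) for s in snippets]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [snippet for _, snippet in scored[:top_k]]


def build_prompt(ocr_text, snippets):
    """Combine the vision model's output with the retrieved passages."""
    sources = "\n".join(
        f"[{i}] ({s['source']}) {s['text']}" for i, s in enumerate(snippets, start=1)
    )
    return (
        "Answer using ONLY the numbered sources below.\n"
        f"Sources:\n{sources}\n\n"
        f"Handwritten text read by the vision model: {ocr_text}\n"
        "Explain in plain language what the patient should do."
    )


ocr_text = "Amoxicillin 500mg twice daily"  # assumed example output of the vision model
hits = retrieve(ocr_text, TRUSTED_SNIPPETS)
prompt = build_prompt(ocr_text, hits)
```

The resulting `prompt` string is what would be sent to the text LLM; numbering the sources makes it possible to check afterwards which ones the answer relied on.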

How we built it

The frontend is written in HTML, CSS, and JavaScript. For the AI side, we fine-tuned Microsoft's Florence-2 model for reading the handwriting and used a general Llama 3.1 model as the text LLM, all in Python with PyTorch. Every model was run and fine-tuned locally on an M5 MacBook Air with 32 GB of unified memory, using an external fan so it wouldn't explode (it's a fanless laptop💀).

Challenges we ran into

We tried deploying on Vercel and using a Groq model, but we ran into many issues, and as the timer ticked down we had to give in and run the text LLM and frontend locally (good thing we had a person with enough RAM for all that!).

Accomplishments that we're proud of

Getting it to work (30 minutes before the deadline💀). Fine-tuning a model (though just getting it to work was a miracle...).

What we learned

We learned how to fine-tune models, and not to procrastinate until the last second so that this kind of situation doesn't happen again💀

What's next for MediScan

We plan to expand the training data for the vision model and increase its capabilities, as well as refine (and possibly fine-tune) the text LLM. We also want stricter RAG that forces the LLM to stick to the trusted sources, since right now it can still ignore its instructions.
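One possible shape for "stricter RAG" is a check that runs after the LLM answers. The sketch below is an idea, not something we have built: it assumes the prompt asks the LLM to cite sources as `[1]`, `[2]`, and so on, and `enforce_citations` and `FALLBACK` are hypothetical names. It only verifies that citations exist and point at real retrieved sources; it does not verify that the answer matches what those sources say.

```python
import re

# Hypothetical fallback shown when the answer cannot be tied to a trusted source.
FALLBACK = (
    "I could not find this in the trusted sources. "
    "Please ask your doctor or pharmacist."
)


def enforce_citations(answer, num_sources):
    """Return the answer only if every citation points at a retrieved source.

    Assumes the LLM was told to cite sources as [1], [2], ... in its answer.
    """
    cited = [int(n) for n in re.findall(r"\[(\d+)\]", answer)]
    if num_sources == 0 or not cited:
        # Nothing was retrieved, or the model answered without citing anything.
        return FALLBACK
    if any(n < 1 or n > num_sources for n in cited):
        # The model cited a source number that was never provided.
        return FALLBACK
    return answer
```

For example, with two retrieved sources, `enforce_citations("Take it with food [1].", 2)` passes the answer through, while an answer with no citation at all is replaced by the fallback message.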

AI declaration

AI was used heavily in the development of this program. It wrote most of the frontend, helped fix the Groq model when it broke, and helped in minor ways with the fine-tuning script for the vision model.

* 94% accuracy measured on 780 images of single words, taken in perfect lighting conditions and zoomed to fill the screen.
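For anyone wondering how a number like this can be computed, here is a minimal sketch of word-level accuracy. It assumes accuracy means exact match per image after trimming whitespace and ignoring case, which is an assumption for illustration and not necessarily the exact rule in our evaluation script; `word_accuracy` is a hypothetical name. Under that reading, 94% of 780 images works out to roughly 733 words read correctly.

```python
def word_accuracy(predictions, labels):
    """Fraction of predictions that exactly match their label.

    Assumption: comparison ignores case and surrounding whitespace.
    """
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(
        pred.strip().lower() == label.strip().lower()
        for pred, label in zip(predictions, labels)
    )
    return correct / len(labels)


# Made-up example words, not taken from our test set.
example = word_accuracy(["Aspirin", "ibuprofn"], ["aspirin", "ibuprofen"])
```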

We are all from Parsippany Hills High School, in case a school is required.
