Medical Bill Analyzer - Bill Express

Inspiration

We come from immigrant families who share the struggle of reading and understanding medical documents, especially bills. We realized that this is likely a problem for many people across NYC.

What it does

Lets users upload their Patient Statement and their Explanation of Benefits (EOB) at the same time (other medical bills and statements work too!)
Using a local Llama 3.2 model, the system compares what the hospital is charging against the insurance company's records to identify discrepancies.
Instead of just summarizing text, the AI "consults" a specialized knowledge base of medical billing codes and financial assistance policies to provide a tailored & digestible summary.
Currently generates reports in English, Spanish, and French to serve New York City's diverse communities.
Integrates ElevenLabs to read the analysis aloud, providing a more accessible experience for users with visual impairments or users who just prefer to hear information.
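
The statement-versus-EOB comparison described above can be pictured with a minimal sketch. In the project this comparison is done by the Llama 3.2 model; the deterministic check below is only an illustration. The function name `find_discrepancies`, the dictionary layout, and the sample codes and amounts are assumptions made for this example, not taken from the project code.

```python
from decimal import Decimal


def find_discrepancies(statement, eob):
    """Compare amounts per billing code (hypothetical data layout).

    statement: {code: amount the hospital is charging the patient}
    eob:       {code: patient responsibility according to the insurer}
    Returns a list of (code, statement_amount, eob_amount) for mismatches;
    a code missing on either side is reported with None for that side.
    """
    issues = []
    for code in sorted(set(statement) | set(eob)):
        billed = statement.get(code)
        expected = eob.get(code)
        if billed != expected:
            issues.append((code, billed, expected))
    return issues


# Illustrative values only, not real billing data.
statement = {
    "99213": Decimal("45.00"),
    "80053": Decimal("30.00"),
    "36415": Decimal("12.00"),
}
eob = {
    "99213": Decimal("45.00"),
    "80053": Decimal("18.50"),
}

for code, billed, expected in find_discrepancies(statement, eob):
    print(f"{code}: statement says {billed}, EOB says {expected}")
```

Amounts are kept as `Decimal` rather than floats so that equality checks on currency values are exact.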

How we built it

Our front end is built with Vite, React.js, HTML, CSS & Bootstrap. Its buttons signal our middle layer (Flask) to send data to the RAG pipeline we built and to the voice services. Through Flask, the backend, written mainly in Python, receives data and returns results.
We used the ElevenLabs API to generate audio in multiple languages
Our RAG pipeline is built around a trained LLM, llama3.2, run through Ollama and its interface. Text is first extracted from uploaded images with DocTR and then passed to the model. The system searches a ChromaDB vector database for health regulations and billing standards relevant to the uploaded text.
We gathered publicly downloadable resources from NewYork-Presbyterian (Weill Cornell), plus the closest authoritative alternatives, on medical bills and EOBs as source material for our RAG pipeline & LLM
The summary and the language selected in the front end are passed to ElevenLabs, which outputs speech in the user's chosen language. We also used LangChain for language processing.
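
The retrieval step described above (find the knowledge-base passages most relevant to the uploaded text, then hand them to the model along with the chosen language) can be sketched as follows. This is a simplified, standard-library-only illustration: the toy bag-of-words similarity stands in for the ChromaDB vector search, and `embed`, `retrieve`, `build_prompt`, the sample passages, and the prompt wording are all hypothetical names and values, not the project's actual code.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words 'embedding' (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, passages, k=2):
    """Return the k passages most similar to the query."""
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]


def build_prompt(bill_text, passages, language):
    """Combine retrieved context, the bill text, and the target language."""
    context = "\n".join(f"- {p}" for p in retrieve(bill_text, passages))
    return (
        f"Use the reference notes to explain this bill in {language}.\n"
        f"Reference notes:\n{context}\n"
        f"Bill text:\n{bill_text}\n"
    )


# Hypothetical knowledge-base snippets and extracted bill text.
knowledge_base = [
    "Financial assistance may reduce a hospital bill for eligible patients.",
    "An explanation of benefits lists what the insurance plan paid.",
    "Parking validation is available at the front desk.",
]
bill_text = "Hospital bill: amount due after insurance plan paid"

prompt = build_prompt(bill_text, knowledge_base, "Spanish")
print(prompt)
```

In a real pipeline the resulting prompt would be sent to the LLM, and the returned summary would then go to the text-to-speech service.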

We researched many medical documents and related information about the NewYork-Presbyterian system, which many New Yorkers use. We focused on Weill Cornell to narrow down the data needed for the RAG processor, and we researched the languages commonly spoken at that hospital.

Challenges we ran into

We originally planned to add Chinese instead of French to better match NYC demographics. However, the model we are using (llama3.2) could not translate accurately and produced Pinyin instead, which would be an accessibility issue for older Chinese speakers who prefer Chinese script
The RAG pipeline & our model take a long time to load data
We considered uploading real medical bills or patient information but realized that would be a privacy violation. Instead, we trained our model on synthetic medical "bills" and insurance statements.
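
One way to produce synthetic records that contain no real patient data is sketched below. This is only an assumed approach for illustration: the function `make_synthetic_bill`, the field names, the service list, and the value ranges are all invented for this example and do not describe how our synthetic documents were actually made.

```python
import random


def make_synthetic_bill(rng):
    """Build one fake bill; every field here is invented for illustration."""
    services = ["Office visit", "Blood panel", "X-ray", "Physical therapy"]
    lines = []
    for name in rng.sample(services, k=rng.randint(1, 3)):
        charge = rng.randint(2000, 50000)      # amount in cents
        insurer_paid = rng.randint(0, charge)  # never more than the charge
        lines.append(
            {"service": name, "charge": charge, "insurer_paid": insurer_paid}
        )
    return {
        "patient_id": f"SYN-{rng.randint(0, 99999):05d}",  # clearly fake ID
        "lines": lines,
        "balance_due": sum(l["charge"] - l["insurer_paid"] for l in lines),
    }


rng = random.Random(0)  # fixed seed so the sample set is reproducible
bills = [make_synthetic_bill(rng) for _ in range(3)]
for bill in bills:
    print(bill["patient_id"], len(bill["lines"]), bill["balance_due"])
```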

Accomplishments that we're proud of

We were able to create a front end and connect it successfully to our LLM's output
We were able to switch between English, Spanish, and French and produce output in each
We built a product that _can_ be used to make healthcare information more accessible to non-English speakers! It also benefits English speakers by summarizing billing and insurance information in an easy-to-read form, so elderly users or people who don't want to wade through long documents can use it too!

What we learned

We learned what a RAG processor looks like and how to train an LLM, which was exciting for all of us because it was something new and we got to incorporate it into our project.
Almost all of our teammates were new to connecting frontend code (HTML, CSS, JavaScript) to a backend (Python), so we learned the process of bridging the two.
We also learned the importance of project management!

What's next for Medical Bill Analyzer - Bill Express

Train our LLM with comprehensive medical bill information by working with more NYC hospitals
Expand language coverage by supporting audio and text in more languages
Make data uploading more secure, upgrading from open-source material
Tailor payment and insurance suggestions to the user with a deeper model

Built With

chromadb
css3
elevenlabs
flask
html5
huggingface
langchain
ollama
python
react
vite

Submitted to

Hack Brooklyn
- Winner Best Beginner Hack

Created by

Nathan Chin - Front-end: React + Vite

Nathan Chin
I dipped my feet in both front-end & backend! I've worked with Flask before, but never on anything as complicated as this. I also added functions to the front end to take input and then fetch output from the backend. I specifically worked on having the translations show up in the application and connecting that to Flask. I made sure that the RAG processor didn't create multiple databases when one already exists. I learned a little here and there about how the RAG processor works and what information it is trained on!

Musfirat Rahman
I contributed to front-end development and UI design using Bootstrap and React.

Aqual 336
I worked primarily on the backend, building the primary OCR and RAG processors responsible for generating and translating summaries. I also built the ElevenLabs script that allows for audio narration of summaries in multiple languages.

Michelle Fronda