Inspiration

We come from immigrant families who share the struggle of reading and understanding medical documents, especially bills. We realized that many people across NYC likely face the same problem.

What it does

  • Lets users upload their Patient Statement and their Explanation of Benefits (EOB) together (other medical bills and statements work too!)
  • Using a local Llama 3.2 model, the system compares what the hospital is charging against the insurance company's records to identify discrepancies.
  • Instead of just summarizing text, the AI "consults" a specialized knowledge base of medical billing codes and financial assistance policies to provide a tailored & digestible summary.
  • Currently generates reports in English, Spanish, and French to serve New York City’s diverse communities.
  • Integrates ElevenLabs to read the analysis aloud, providing a more accessible experience for users with visual impairments or users who just prefer to hear information.
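
As a rough illustration of the comparison step, here is a minimal sketch, assuming both documents have already been parsed into line items keyed by billing code. The `LineItem` type, the `find_discrepancies` helper, and the sample amounts are hypothetical and made up for this example; in our app the Llama 3.2 model does the comparison, not hand-written rules like these.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class LineItem:
    """One charge on a statement or EOB (hypothetical structure)."""
    code: str        # billing code for the service
    amount: Decimal  # amount the document says the patient owes


def find_discrepancies(statement, eob):
    """Return readable notes wherever the two documents disagree."""
    billed = {item.code: item.amount for item in statement}
    allowed = {item.code: item.amount for item in eob}
    notes = []
    for code in sorted(billed.keys() | allowed.keys()):
        if code not in allowed:
            notes.append(f"{code}: billed by hospital but missing from EOB")
        elif code not in billed:
            notes.append(f"{code}: on EOB but not on the patient statement")
        elif billed[code] != allowed[code]:
            diff = billed[code] - allowed[code]
            notes.append(
                f"{code}: hospital bills {billed[code]}, "
                f"EOB says {allowed[code]} (difference {diff})"
            )
    return notes


# Made-up sample data for illustration only.
statement = [LineItem("99213", Decimal("150.00")), LineItem("85025", Decimal("40.00"))]
eob = [LineItem("99213", Decimal("120.00")), LineItem("80053", Decimal("25.00"))]

for note in find_discrepancies(statement, eob):
    print(note)
```

Using `Decimal` instead of floats keeps dollar amounts exact, which matters when the whole point is spotting small differences.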

How we built it

  • Our front end is built with Vite, React.js, HTML, CSS, and Bootstrap. Its buttons signal our middle layer (Flask) to send data to the RAG pipeline we built and to the voice services. Through Flask, the backend, written mainly in Python, receives and returns data.
  • We used an ElevenLabs API key to access and process audio in multiple languages.
  • Our RAG pipeline is built around a trained LLM, llama3.2, run through Ollama and its interface. Text is first extracted from uploaded images with DocTR and then passed to the model. The system searches a ChromaDB vector database for health regulations and billing standards relevant to the uploaded text.
  • We gathered publicly downloadable resources from NewYork-Presbyterian (Weill Cornell), plus the closest authoritative alternatives, as medical bill and EOB source material for our RAG pipeline and LLM.
  • The summary and the language selected on the front end are passed to ElevenLabs, which outputs speech in the user's chosen language. We also processed language through LangChain.

We researched many medical documents and related information about the NewYork-Presbyterian system, which many New Yorkers use. We focused on Weill Cornell to narrow down the data needed for the RAG pipeline, and we researched the languages commonly spoken at that hospital.

Challenges we ran into

  • We originally planned to support Chinese instead of French to better match NYC demographics. However, the model we are using (llama3.2) could not translate into Chinese accurately and output Pinyin instead, which would be an accessibility issue for older Chinese speakers who prefer Chinese characters.
  • It takes a long time for the RAG pipeline and our model to load data.
  • We considered uploading real medical bills or information but realized that would be a privacy violation. Instead, we trained our model on synthetic medical "bills" and insurance statements.
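
To illustrate the synthetic-data idea, here is a minimal sketch of generating a fake bill with Python's standard library. The field names, the service list, and the price range are all invented for this example and do not reflect the data we actually generated; `make_synthetic_bill` is a hypothetical helper.

```python
import random

# Illustrative services only; the charges generated below are random, not real prices.
SERVICES = [
    ("99213", "Office visit"),
    ("85025", "Complete blood count"),
    ("80053", "Comprehensive metabolic panel"),
]


def make_synthetic_bill(seed):
    """Build one fake bill; the same seed always yields the same bill."""
    rng = random.Random(seed)
    count = rng.randint(1, len(SERVICES))
    items = []
    for code, description in rng.sample(SERVICES, k=count):
        charge = round(rng.uniform(20, 300), 2)
        items.append({"code": code, "description": description, "charge": charge})
    return {
        "patient_id": f"SYN-{rng.randint(10000, 99999)}",  # never a real identifier
        "items": items,
        "total": round(sum(item["charge"] for item in items), 2),
    }


bill = make_synthetic_bill(seed=7)
print(bill["patient_id"], len(bill["items"]), bill["total"])
```

Seeding the generator makes each fake bill reproducible, so a test document can be regenerated without ever storing real patient data.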

Accomplishments that we're proud of

  • We were able to create a front end and connect it successfully to our LLM's output
  • We were able to switch between English, Spanish, and French and generate output in each
  • We built a product that _can_ be used to make healthcare information more accessible to non-English speakers! It also benefits English speakers because it summarizes billing and insurance information clearly, so elderly people or anyone who doesn't want to wade through long documents can use it too!

What we learned

  • We learned what a RAG pipeline looks like and how to train an LLM, which was exciting for all of us because it was something new and we got to incorporate it into our project.
  • Almost all of our teammates were new to connecting frontend code (HTML, CSS, JavaScript) to a backend (Python), so we learned the process of bridging the two.
  • We also learned the importance of project management!

What's next for Medical Bill Analyzer - Bill Express

  • Train our LLM with comprehensive medical bill information by working with more NYC hospitals
  • Expand language coverage by supporting audio and text in more languages
  • Make data uploading more secure and move beyond open-source material
  • Tailor payment and insurance suggestions to each user with a deeper model
