DocuDane

Inspiration

We noticed how much frustration many immigrants face when trying to understand lengthy, complex insurance forms. We wanted to create a tool that uses NLP and AI to summarize each section of a form, making it easier for users to understand what is required and how to fill out the document properly. Our goal is to make paperwork less daunting and more approachable.

What it does

DocuDane uses the Google Cloud Vision API to extract text from insurance forms and Vertex AI to generate concise, easy-to-understand summaries for each form box or section. The tool breaks down the content, identifies key fields, and explains them in plain language. Each block of text is numbered on the form so users can easily match a summary to its section.

How we built it

We built DocuDane by combining two technologies: the Google Cloud Vision API, which extracts text from image-based forms, and Google Vertex AI's generative models, which summarize and simplify the content. Using OpenCV in Python, we visually highlight form sections and pair each summary with its corresponding box on the form.
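
To make the numbering and pairing step more concrete, here is a minimal Python sketch. It is illustrative rather than the project's actual code: the `Block` shape, the `row_tolerance` value, and the `placeholder_summarize` function (standing in for a call to a generative model) are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Block:
    """One block of text found on a form (hypothetical shape for this sketch)."""
    text: str
    x: int  # left edge of the bounding box, in pixels
    y: int  # top edge of the bounding box, in pixels
    w: int  # box width
    h: int  # box height


def number_blocks(blocks: List[Block], row_tolerance: int = 20) -> List[Tuple[int, Block]]:
    """Number blocks from 1 in reading order: top to bottom, then left to right.

    Blocks whose top edges fall in the same row_tolerance-pixel band
    are treated as being on the same row.
    """
    ordered = sorted(blocks, key=lambda b: (b.y // row_tolerance, b.x))
    return list(enumerate(ordered, start=1))


def pair_with_summaries(
    numbered: List[Tuple[int, Block]],
    summarize: Callable[[str], str],
) -> List[Dict]:
    """Attach a summary to every numbered block so both can be shown together."""
    return [
        {"number": n, "box": (b.x, b.y, b.w, b.h), "summary": summarize(b.text)}
        for n, b in numbered
    ]


def placeholder_summarize(text: str) -> str:
    # Assumption: a real pipeline would call a generative model here.
    return "Explain: " + text


blocks = [
    Block("Policy number", 300, 105, 200, 40),
    Block("Full legal name of the insured", 20, 100, 250, 40),
    Block("Date of birth (MM/DD/YYYY)", 20, 180, 250, 40),
]
rows = pair_with_summaries(number_blocks(blocks), placeholder_summarize)
for row in rows:
    print(row["number"], row["box"], row["summary"])
```

The number printed for each row is the label that would be drawn next to the matching box on the form image, so the reader can look up summary 2 and find box 2.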

Challenges we ran into

One major challenge was dealing with sensitive or complex text, such as personal information or ambiguous terms that AI models might misinterpret. Another was ensuring that the summaries were both concise and informative, without losing details needed to fill out the forms accurately. The project is not perfect yet: it still occasionally picks up erroneous text from the form images and mistakes it for a valid form field. The biggest challenge, though, was scoping the project, since we initially set out to build a cross-platform mobile app that used the device's camera.
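
One possible way to cut down on stray text being treated as a form field is a simple pre-filter, sketched below. This is not something DocuDane currently does. It assumes each extracted block comes with a confidence score between 0 and 1, and the function name and thresholds are hypothetical values chosen for illustration.

```python
from typing import List, Tuple


def looks_like_field_text(
    text: str,
    confidence: float,
    min_confidence: float = 0.6,
    min_letters: int = 3,
) -> bool:
    """Heuristic check: keep text that is confidently read and mostly letters."""
    stripped = text.strip()
    letters = sum(ch.isalpha() for ch in stripped)
    if confidence < min_confidence or letters < min_letters:
        return False
    # Text made up mostly of punctuation or stray marks is likely scanner noise.
    return letters / max(len(stripped), 1) >= 0.5


# Hypothetical (text, confidence) pairs for illustration only.
candidates: List[Tuple[str, float]] = [
    ("Date of birth (MM/DD/YYYY)", 0.97),
    ("|| -- .", 0.91),
    ("Policy number", 0.30),
    ("Signature of applicant", 0.88),
]
kept = [text for text, conf in candidates if looks_like_field_text(text, conf)]
print(kept)
```

A filter like this trades recall for precision: a faint but genuine field label could be dropped, so the thresholds would need tuning against real forms.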

Accomplishments that we're proud of

We successfully integrated advanced AI models to solve a real-world problem in a way that’s practical and easy to use. We’re proud of how we were able to handle complex insurance forms and make the information significantly more digestible for the average user. The tool also provides a seamless user experience by visually guiding users through each form block with clear labeling and summaries.

What we learned

We learned a great deal about how to effectively leverage AI for natural language processing and document parsing. It was also an eye-opener in understanding how to manage sensitive text content and ensure that safety filters are respected while providing valuable, understandable information to users. We also learned the importance of knowing when to lower expectations and scale down to manageable goals and features.

What's next for DocuDane

Next, we plan to expand DocuDane to handle a wider variety of forms, including legal and medical documents, and to add multi-language support. We also aim to improve the AI's ability to summarize more nuanced form sections and to explore ways of integrating user feedback directly into the summaries for a more personalized and accurate experience. We would also like to port DocuDane to mobile using a cross-platform framework such as Kivy, React Native, or Flutter.

Built With

Submitted to

ShellHacks 2024

Created by

I generated the initial Python base code that identifies the object to focus on and scans the words in real time.

Jovan Pierre
I worked on tying together the backend and frontend, using OpenCV in Python with Google Cloud services.

Jocelyn Dzuong
I love movies, building things, and most of all, eating noodles.
I worked on finding the service we used to implement the AI features. We used Google Cloud and its APIs; I set up the account and helped research how to use the APIs.

Adrian E. Rodriguez Arcia
I worked on getting the Cloud Vision API to parse through the document to get its text and getting OpenCV to highlight the form boxes and words.

Robert Fontan

Updates

Jocelyn Dzuong started this project — Sep 29, 2024 09:41 AM EDT
