CGAdvocacy

Inspiration

The American healthcare system is guided by a very complicated collection of laws, regulations, and rules. Every week the government releases new proposals to change these rules, asking for input from the public. We created CGAdvocacy because we noticed that many people, even experts, find it hard to understand and get involved with complicated policy documents. These documents make important changes that affect all of us. The government often asks for public comments on proposed rules, but most people don’t know about this or how to give useful feedback. This leads to ineffective regulations, missed opportunities for valuable insights, poor data governance, and a lack of oversight, resulting in fraud, abuse, and potential losses of $978 billion annually.

Only a small percentage of public comments on healthcare policies come from the general public. This reveals a big unmet need for a tool that makes these documents easier to understand. CGAdvocacy was developed to help everyone—patients, doctors, advocates, and researchers—find and comment on healthcare issues that matter to them.

What it does

CGAdvocacy is a tool that helps people understand and review complicated health policy documents, then helps them deliver their ideas back to the government for consideration. The tool aims to make it easier for everyone to participate in shaping healthcare policies.

Here’s how CGAdvocacy works and its intended impact:

Customizes Information for You

Highlights Relevant Parts: Many policy documents can be thousands of pages long and deal with hundreds of distinct unrelated issues. CGAdvocacy focuses attention on the parts of the document that matter most to you.

Tailors Content: Adjusts information based on your background and interests.

Simplifies Complex Documents

Breaks Down Language: Converts long, complex policy documents into easy-to-understand language.

Explains Tricky Terms: Provides clear explanations for difficult terms and concepts.

Speeds Up Workflows: We used CGAdvocacy to understand a 1900-page healthcare policy document in under 15 minutes, significantly reducing review time.

Helps You Give Feedback

Guides the Process: Provides step-by-step guidance for writing and submitting comments on policies.

Improves Feedback: Suggests ways to enhance your comments and check them for improvements.

Ensures Your Voice is Heard: Collects and organizes feedback to ensure it's delivered to the right people.

Makes Participation Easier

Increases Access: Makes important policy information more accessible and understandable.

Finds Relevant Parts Quickly: This helps you quickly identify the most relevant sections of long documents. Enables Meaningful Participation: Assists you in writing effective comments that can influence policy decisions.

How we built it

We built CGAdvocacy by combining our knowledge in healthcare policy, AI, data retrieval, cloud software development, and prompt engineering. Here’s how we did it:

Frontend Development:

Python and Streamlit: We used Python and Streamlit due to the ease of integrating the backend with an inviting, intuitive interface.

UX/UI Design: Streamlit was chosen for its ease of use and user-friendly navigation.

Backend Development:

Python and Azure Cosmos DB: Our backend uses Python and Azure Cosmos DB for MongoDB, which helps store and manage unstructured data like our documents efficiently.

Cosmos Vector Search: We use Cosmos Vector Search to make relevant document partition retrieval more accurate.

Azure Tools: Used for cloud services monitoring and deployment to ensure scalability and reliability.

AI and NLP Models:

Document Parsing: We used PyPDF2 for document preprocessing. The metadata recorded when parsing the PDFs, such as page and paragraph numbers, were essential for precise document text retrieval.

Semantic Embeddings and GPT-4: Generated semantic embeddings using text-embedding-3-small and performed text completions with GPT-4-32k. The model with larger input token limits allowed us to offer more retrieved document context in a prompt.

Vector Search: Performed vector searches (cosine similarity between embedding vectors) to find the appropriate data needed for the AzureOpenAI chat model.

Using these technologies, we created a strong platform aiming to help users participate in the policy-making process more effectively.

Challenges we ran into

Creating CGAdvocacy had several challenges:

AI Integration - RAG: Ensuring the AI accurately understood and simplified policy documents required a lot of tuning and working with document pre-processing, vector storage, retrieval methods, and LLM prompts. For example, we did many iterations of prompt engineering to get the accuracy we wanted in document summarization.

Transparency: Keeping the source data for the document visible while using retrieval-augmented generation (RAG) techniques was essential for accuracy and trustworthiness. We leveraged a transparent system over a low-code one so that users can verify the information with visibility into the original documents.

Azure Model Token: We had to work within the limits of accessible AzureOpenAI models, which restricted access to some advanced models like GPT-4o and OpenAI Assistants. We optimized our use of available models, such as choosing GPT-4-32k over GPT-4, to give the best results.

Accomplishments that we're proud of

Developing a Robust AI-powered Document Processing System: We built a sophisticated AI-powered system that simplifies complex policy language into clear, understandable summaries. This was done through document pre-processing, RAG system tuning, and prompt engineering.

Deployment Solutions: We deployed our system using Azure container services. This involved setting up a robust database, establishing seamless calls to OpenAI, and creating a user-friendly interface connected to a backend inside a container app. This ensures our platform is efficient and scalable.

Prompt Engineering: We tailored our AI prompts by talking with end-users to understand their needs and interests. We tracked user inputs and LLM response data to create a series of interconnected OpenAI calls that kept context throughout, providing relevant and coherent outputs.

Successful Integration with Azure Tools: By using Azure tools for cloud services and deployment, we ensured that our platform was both scalable and secure. This setup lays a solid foundation for future enhancements and broader adoption.

Stakeholder Engagement: Throughout the development process, we actively sought feedback from diverse community members and stakeholders. This feedback was key to refining our platform, making sure it meets the needs of those it is designed to serve.

What we learned

Developing CGAdvocacy taught us valuable lessons in several key areas:

AI/NLP Model Usage and Prompt Engineering: We learned how to effectively use AI and natural language processing (NLP) models to simplify complex policy documents. Tuning prompts and optimizing model responses were essential for ensuring accurate and reliable outputs.

Retrieve and Generate (RAG) Techniques: Implementing RAG techniques allowed us to enhance document handling. By combining retrieval methods with generative models, we improved the accuracy and relevance of the information provided to users.

Azure Tools: Using Azure's cloud services for hosting, database management, and AI model deployment was crucial for building a scalable and secure platform. Azure's strong infrastructure and tools enabled us to maximize performance and reliability.

MongoDB: Using MongoDB for storing and managing document data highlighted the importance of efficient data structuring and retrieval. MongoDB's flexibility and scalability were key in handling large, complex documents quickly and effectively.

Integration and Deployment: Overcoming the challenges of integrating various technologies and deploying our solution emphasized the need for flexibility and adaptability. Azure's seamless integration capabilities and MongoDB's efficient data handling ensured our platform remained accessible and functional, even with local hosting limitations.

These insights have helped us create a robust, user-friendly platform that makes healthcare policy information accessible to everyone.

What's next for CG Advocacy

We have several exciting plans for near-term future development and enhancement of CGAdvocacy:

Enhance AI Capabilities:

Diverse Document Handling: Improve the AI's ability to process and simplify a wider range of policy documents and input sources, making the platform more versatile and effective.

Direct Government API Integration: Establish direct connections to government APIs for real-time fetching and filtering of policy documents. This will ensure that users can access the latest and most relevant information.

Timeline: Q3 2024

User Engagement and Notifications:

New Build + Push Notifications: Implement push notifications to keep advocates and consumers informed about issues they care about. Users will receive timely updates on new policy proposals, deadlines for public comments, and other relevant information.

Automatic Submission of Comments: Develop features for the automatic submission of user comments to relevant government platforms, streamlining the participation process and ensuring that user feedback is consistently delivered.

Timeline: Q4 2024

To ensure the continued success and growth of CGAdvocacy longer-term, we have identified several key strategies:

Human-Centered Design (HCD)

User Research: Engage with diverse stakeholders—academics, healthcare practitioners, patients, and community organizations—to gather insights on their specific needs and pain points.

Persona Development: Create detailed personas for each stakeholder group to guide the design and development of AI tools. Iterative Feedback: Regularly gather feedback from users to refine and improve the platform.

Explainable AI (XAI)

Transparency: Ensure that AI's decision-making processes are transparent, especially when filtering and simplifying policy documents.

User Education: Provide educational resources that explain how AI works and how it benefits users.

Feedback Mechanism: Allow users to provide feedback on AI-generated content to continuously improve accuracy and relevance.

Ethical AI Design

Bias Mitigation: Implement strategies to identify and mitigate biases in AI models to ensure fairness and inclusivity.

Data Privacy: Ensure robust data privacy measures to protect sensitive information, particularly from healthcare practitioners and patients.

Ethics Review Board: Establish an ethics review board to oversee the ethical implications of AI applications.

Data Quality and Management

Diverse Data Sources: Utilize a wide range of data sources to train AI models, ensuring they reflect diverse perspectives.

Regular Updates: Continuously update datasets to incorporate new information and maintain model accuracy.

Data Governance: Implement strong data governance practices to ensure data integrity and compliance with regulations.

User-Centric AI Solutions

Personalization: Tailor content and features to meet the specific needs of different user groups.

User-Friendly Interface: Design an intuitive and accessible user interface that simplifies interaction with the platform.

Clear Instructions: Provide clear instructions and contextual help to guide users through the platform’s features.

Scalability and Flexibility

Modular Architecture: Develop a modular architecture that allows easy updates and integration of new features.

Cloud Infrastructure: Utilize cloud infrastructure to scale the platform based on user demand.

APIs and Integrations: Offer APIs and integrations with other tools and platforms to enhance functionality and user experience.

Built With

Streamlit: A UI for Python front ends, chosen for its ease of implementation and user-friendly navigation.

Text Parsing: Utilized for breaking down and analyzing complex policy documents, enhancing readability and comprehension.

AI/NLP Models: Employed for understanding and simplifying policy documents, using state-of-the-art natural language processing techniques.

RAG (Retrieve and Generate): Enhances document handling and response accuracy, combining retrieval methods with generative models.

Azure Tools: Leveraged for cloud services, deployment, and data management, ensuring scalability and reliability.

MongoDB: A non-relational database used for storing document data, known for its flexibility and scalability.

Built With

ai
api
azure
canva
mongodb
natural-language-processing
parsing
rag
streamlit
text
ux

Submitted to

Microsoft Developers AI Learning Hackathon

Created by

I identified the problem, scoped the vision, and recruited the development team.

Christine Galligan
I worked on the document parsing and storage into CosmosDB. I also learned and applied LangChain RAG approach to finding answers about the documents in the stored data.

Andrew Bovey
Christine Galligan, MHA
Kevin Gilbert