Our Inspiration

Generative AI is a double-edged sword. Its potential to revolutionize industries is matched only by its risks: data leaks, harmful content, and regulatory non-compliance. We were inspired not to limit AI, but to enable it. Our goal was to build the "seatbelt" for Large Language Models: an intelligent guardian that allows businesses to innovate confidently, knowing that their AI interactions are safe, ethical, and fully governed.

How We Built It

We designed our solution as a Multi-Agent AI Governance System, creating a robust, layered defense that analyzes every interaction in real-time. Our journey involved building five specialized agents, each with a critical mission:

  • Prompt Guard: Our first line of defense, scanning user input for inappropriate content.
  • Policy Enforcer: Verifies role-based permissions, ensuring each user can only make the requests their role allows.
  • Output Auditor: A final check to ensure the AI's response adheres to our safety policies.
  • Advisory Agent: A unique feature where we use the LLM itself to provide helpful, AI-generated explanations to the user whenever a request is blocked, ensuring transparency.
  • Audit Logger: The system's memory, recording every detail of every transaction to Amazon DynamoDB for complete traceability.
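To make the layered pipeline concrete, here is a minimal sketch of how such checks could be chained in Python. Every name here (`prompt_guard`, `policy_enforcer`, `output_auditor`, `govern`) and the term and role lists are illustrative stand-ins, not our production code; the Advisory Agent and DynamoDB logging steps are indicated only as comments.

```python
from dataclasses import dataclass

BLOCKED_TERMS = {"ssn", "credit card"}                     # hypothetical policy list
ROLE_PERMISSIONS = {"analyst": {"reports"}, "admin": {"reports", "pii"}}

@dataclass
class Verdict:
    allowed: bool
    reason: str = ""

def prompt_guard(prompt: str) -> Verdict:
    """First line of defense: scan user input for inappropriate content."""
    for term in BLOCKED_TERMS:
        if term in prompt.lower():
            return Verdict(False, f"blocked term: {term}")
    return Verdict(True)

def policy_enforcer(role: str, topic: str) -> Verdict:
    """Verify role-based permissions before the request reaches the model."""
    if topic in ROLE_PERMISSIONS.get(role, set()):
        return Verdict(True)
    return Verdict(False, f"role '{role}' may not access '{topic}'")

def output_auditor(response: str) -> Verdict:
    """Final check on the model's answer before it is shown to the user."""
    for term in BLOCKED_TERMS:
        if term in response.lower():
            return Verdict(False, f"response leaked: {term}")
    return Verdict(True)

def govern(role: str, topic: str, prompt: str, model) -> str:
    """Run every gate in order; any failed gate blocks the interaction."""
    for verdict in (prompt_guard(prompt), policy_enforcer(role, topic)):
        if not verdict.allowed:
            # Advisory Agent: an LLM-generated explanation would go here.
            # Audit Logger: the verdict would also be written to DynamoDB.
            return f"Request blocked: {verdict.reason}"
    answer = model(prompt)
    audit = output_auditor(answer)
    return answer if audit.allowed else f"Response withheld: {audit.reason}"
```

The design choice worth noting is that the output auditor runs even when the input passes, so a response that leaks sensitive content is caught regardless of how innocuous the prompt looked.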

The entire application was built on a modern, serverless stack using Python and Streamlit for a reactive UI. The core AI capabilities are powered by Amazon Bedrock, giving us access to state-of-the-art models like Claude 3 Sonnet.
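As an illustration of the Bedrock integration, the sketch below builds a Claude 3 Sonnet request in the Anthropic Messages format and sends it with boto3's `invoke_model`. The helper names (`build_request`, `ask_claude`) are hypothetical, and the `us-east-1` region is an assumption.

```python
import json

MODEL_ID = "anthropic.claude-3-sonnet-20240229-v1:0"

def build_request(prompt: str, max_tokens: int = 512) -> str:
    # Bedrock's Anthropic models expect the Messages API request shape.
    return json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    })

def ask_claude(prompt: str) -> str:
    import boto3  # deferred import so the module loads even before AWS is configured
    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    response = client.invoke_model(modelId=MODEL_ID, body=build_request(prompt))
    payload = json.loads(response["body"].read())
    return payload["content"][0]["text"]
```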

To polish the project, we also implemented a fully bilingual interface (English/Spanish), response caching for performance, and a user feedback mechanism.
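The response cache can be as simple as memoizing on the (prompt, language) pair. This sketch uses `functools.lru_cache` for illustration (Streamlit's `st.cache_data` gives the same effect with a TTL in the actual UI); `call_model` is a hypothetical stand-in for the Bedrock round-trip.

```python
import functools

def call_model(prompt: str, language: str) -> str:
    """Hypothetical placeholder for the real Bedrock call."""
    return f"[{language}] answer to: {prompt}"

@functools.lru_cache(maxsize=256)
def cached_answer(prompt: str, language: str) -> str:
    # Repeated (prompt, language) pairs are served from memory, skipping
    # the model round-trip. Keying on language keeps the bilingual
    # (English/Spanish) interface from mixing cached responses.
    return call_model(prompt, language)
```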

Challenges We Faced

Our most significant challenge wasn't in the AI logic, but in the real-world complexities of cloud deployment. After building a stable local application, deploying to the cloud revealed a cascade of issues, from a NoRegionError to what seemed like a "stuck" deployment cache in Streamlit Cloud.

We systematically debugged the entire chain:

  1. We corrected the code to explicitly define the AWS region.
  2. We resolved GitHub App permission issues for private repositories.
  3. We refactored the code to use "lazy initialization" for the AWS client, a best practice for cloud environments.
  4. Finally, we isolated the issue to the way Streamlit Cloud handled secrets and successfully reconfigured them using standard environment variables.
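Steps 1 and 3 above combine naturally: create the boto3 client lazily, with an explicit region read from the environment. A minimal sketch of the pattern (the name `get_bedrock_client` and the `us-east-1` fallback are assumptions, not our exact code):

```python
import os

_bedrock_client = None

def get_bedrock_client():
    """Create the Bedrock client on first use (lazy initialization).

    An explicit region avoids NoRegionError on hosts with no AWS config
    file, such as Streamlit Cloud, where settings arrive as environment
    variables; deferring creation keeps module import from touching AWS.
    """
    global _bedrock_client
    if _bedrock_client is None:
        import boto3  # imported here so the app starts even before AWS is configured
        _bedrock_client = boto3.client(
            "bedrock-runtime",
            region_name=os.environ.get("AWS_DEFAULT_REGION", "us-east-1"),
        )
    return _bedrock_client
```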

Overcoming these infrastructure challenges was a major part of the project and a phenomenal learning experience in building real-world, cloud-native applications.

What We Learned

This hackathon was an incredible journey. Our key takeaways are:

  • The Best Agent is a Great Model: Our most fascinating discovery was seeing the Claude 3 model itself act as a governance layer, gracefully refusing harmful or out-of-scope requests that our custom agents missed. It proved that a modern approach to AI safety relies on both custom rules and a powerful, well-aligned foundation model.
  • Cloud is More Than Code: A working application is only half the battle. We learned firsthand the importance of understanding IAM roles, service permissions (like Bedrock Model Access), and environment configuration to make a project succeed in the cloud.
  • The Power of Systematic Debugging: Faced with persistent deployment errors, we learned to isolate variables, form a hypothesis, and test it methodically. This structured approach was key to solving the final, complex challenges.

Built With

  • Python
  • Streamlit
  • Amazon Bedrock (Claude 3 Sonnet)
  • Amazon DynamoDB