Inspiration

Every year, governments lose billions of dollars to tax evasion, sham consulting arrangements, offshore abuse, and weak compliance enforcement. Through exposure to professionals in finance, audit, IT support, and regulation, I noticed a recurring problem: regulators are often forced to trust submitted documents instead of actively challenging them.

Most existing compliance tools automate checklists or summarize documents. Very few attempt to reason like a skeptical regulator whose job is to find inconsistencies, question intent, and assess economic substance.

This project was inspired by a simple but powerful idea: What if AI didn’t act like a helpful assistant, but like a regulator trained to challenge everything?

What it does

Regulatory Red-Team AI is a multimodal AI system powered by Gemini 3 Flash Preview that simulates the reasoning process of a government regulator.

The system analyzes transaction descriptions; cross-checks uploaded documents such as PDFs, images, emails, and spreadsheets; identifies inconsistencies and red flags; evaluates the credibility of company explanations; and produces a professional, regulator-style risk assessment.

Instead of asking whether documents look valid, the system asks whether they would survive regulatory scrutiny.

How we built it

The project was built using:

  • Gemini 3 Flash Preview for multimodal reasoning
  • Google GenAI SDK
  • Google AI Studio
  • Streamlit for the interactive web interface
  • VS Code

The system follows a three-stage investigation pipeline:

  1. Initial Investigation detects inconsistencies, red flags, and documentation gaps.

  2. Evaluation Stage assesses the credibility of explanations and remaining risks.

  3. Final Risk Assessment assigns a risk level (Low, Medium, High) and recommends regulatory action.

Each stage feeds into the next, creating a structured, regulator-style reasoning flow rather than a single AI response.
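The staged flow above can be sketched as a simple prompt chain, where each stage's output becomes context for the next. This is a minimal sketch, not the project's actual code: `ask_model` is a stand-in for a real Gemini call (for example via the Google GenAI SDK), and the prompts and function names are illustrative.

```python
def ask_model(prompt: str) -> str:
    """Placeholder for a real model call, e.g. via the Google GenAI SDK.
    Stubbed here so the control flow can be shown end to end."""
    return f"[model response to: {prompt[:40]}...]"

def run_investigation(transaction: str) -> dict:
    # Stage 1: Initial Investigation — inconsistencies, red flags, gaps.
    findings = ask_model(
        "Act as a skeptical regulator. Identify inconsistencies, red flags, "
        f"and documentation gaps in this transaction:\n{transaction}"
    )
    # Stage 2: Evaluation — credibility of explanations and remaining risks,
    # conditioned on the stage-1 findings.
    evaluation = ask_model(
        "Assess the credibility of the company's explanations and the "
        f"remaining risks, given these findings:\n{findings}"
    )
    # Stage 3: Final Risk Assessment — risk level and recommended action.
    assessment = ask_model(
        "Based on the evaluation below, assign a risk level (Low, Medium, "
        f"High) and recommend regulatory action:\n{evaluation}"
    )
    return {"findings": findings, "evaluation": evaluation, "assessment": assessment}

report = run_investigation("Consulting fee of $2.4M paid to an offshore entity.")
```

The key design choice is that later stages only ever see earlier stages' conclusions, which keeps each prompt focused on one regulatory question at a time.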

Challenges we ran into

One major challenge was working with rapidly evolving Gemini SDK APIs while ensuring compatibility with Gemini 3 Flash Preview.

Another challenge was handling multimodal inputs correctly. Raw file bytes cannot be passed directly to the model, so files had to be safely uploaded and referenced in supported formats.
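The upload-and-reference pattern looks roughly like the sketch below, using the Google GenAI SDK (`google-genai` package). The model ID, file path, and helper function are illustrative assumptions, and a real run requires a Gemini API key in the environment.

```python
# Sketch: upload the file first, then pass the returned file reference to the
# model alongside the text prompt, rather than embedding raw bytes.
from google import genai

def review_document(path: str, question: str) -> str:
    client = genai.Client()  # reads the API key from the environment
    uploaded = client.files.upload(file=path)  # returns a File reference
    response = client.models.generate_content(
        model="gemini-2.5-flash",  # illustrative; any multimodal Gemini model ID
        contents=[uploaded, question],
    )
    return response.text
```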

Designing the AI’s behavior was also difficult. Most AI models are optimized to assist users, not challenge them. Significant effort went into prompt design to ensure the system remained skeptical, professional, and regulator-focused without hallucinating or overreaching.

Accomplishments that we're proud of

  1. Successfully built a true multimodal regulatory reasoning system
  2. Designed an AI that challenges submissions instead of summarizing them
  3. Generated regulator-grade investigative memos automatically
  4. Demonstrated a novel use of Gemini 3 beyond chat or summarization

What we learned

We learned that AI behavior is shaped as much by intent as by intelligence. Prompt design and reasoning structure matter more than raw model power.

We also learned that Gemini 3 excels at cross-modal inconsistency detection, especially when analyzing documents that appear legitimate at first glance but fail under scrutiny.

Most importantly, we learned that the next generation of AI systems must not only be helpful, but skeptical when operating in high-stakes domains.

What's next for Regulatory Red-Team AI

Future development plans include:

  1. Jurisdiction-specific regulation packs
  2. Confidence scoring for each identified red flag
  3. Timeline reconstruction across multiple documents
  4. Multi-agent regulator debate simulations
  5. Integration with transaction databases and audit systems

The long-term vision is to help regulators, auditors, and compliance teams detect risk faster, more accurately, and at scale.

Built With

  • gemini
  • genai
  • googleaistudio
  • promptengineering
  • python
  • streamlit
  • vscode