About B.O.B.B.I.E.

Bedrock-Orchestrated Baseline & Behavior Intelligence Engine


Inspiration

Federal compliance is broken. Security assessors spend weeks manually reviewing logs, documents, and configurations to validate NIST SP 800-53 controls — a process that's expensive, error-prone, and produces point-in-time snapshots instead of continuous insight. Having worked in and around federal security, we knew there had to be a better way. When Amazon Nova Pro launched with its 128K context window and production-grade reasoning, we saw the opening: an AI system capable of ingesting entire OSCAL System Security Plans, cross-referencing live AWS telemetry, and reasoning over Windows Event Logs — all in a single assessment run. BOBBIE was born from the frustration that the agencies who need security the most are often the ones least equipped to keep up with it.


What it does

BOBBIE (Bedrock-Orchestrated Baseline & Behavior Intelligence Engine) is a multi-agent AI system that automates NIST SP 800-53 Rev 5 security control assessments. It dispatches family-based AI agents in parallel, each specialized for a different control family, to collect and evaluate evidence from Amazon CloudWatch, AWS Systems Manager, OSCAL documents, Windows Event Logs, the NIST National Vulnerability Database (NVD), and the CISA Known Exploited Vulnerabilities (KEV) catalog. The result is a compliance assessment with a scored report, prioritized findings, and a machine-readable Plan of Action and Milestones (POA&M) ready for engineering handoff, delivered in minutes instead of weeks.


How we built it

The core stack is Amazon Nova Pro (via Amazon Bedrock) orchestrated with LangChain. We designed a hierarchical multi-agent architecture: one master orchestrator dispatches control assessments to specialized family agents, each owning one NIST control family. Agents run in parallel with timeout isolation, so a failure or hang in one agent does not crash the rest of the assessment. We built deterministic pre-check layers for objective criteria (patch SLA calculations, audit log field completeness, password policy thresholds) and pass that structured context to Nova Pro for reasoning, narrative generation, and remediation recommendations. Output artifacts are emitted as structured JSON (assessment report, POA&M) and a human-readable summary. The CLI runs a full 10-control assessment end-to-end with a single command.
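
The parallel dispatch with timeout isolation described above can be sketched roughly as follows. This is a minimal illustration using Python's asyncio, not BOBBIE's actual orchestrator: the agent functions, the family codes chosen, the result fields, and the timeout value are all hypothetical stand-ins.

```python
import asyncio

# Hypothetical family agents. Each coroutine stands in for one
# control-family agent; the real agents would call Bedrock and AWS APIs.
async def assess_access_control() -> dict:
    await asyncio.sleep(0.01)
    return {"family": "AC", "status": "satisfied"}

async def assess_audit() -> dict:
    # Simulates an agent that fails outright.
    raise RuntimeError("log source unavailable")

async def assess_system_integrity() -> dict:
    # Simulates an agent that hangs well past the timeout.
    await asyncio.sleep(5)
    return {"family": "SI", "status": "satisfied"}

async def run_isolated(family: str, agent, timeout_s: float) -> dict:
    """Run one agent; turn a timeout or exception into an error result."""
    try:
        return await asyncio.wait_for(agent(), timeout=timeout_s)
    except asyncio.TimeoutError:
        return {"family": family, "status": "error", "reason": "timeout"}
    except Exception as exc:
        return {"family": family, "status": "error", "reason": str(exc)}

async def run_assessment(agents: dict, timeout_s: float = 0.2) -> list:
    """Dispatch all agents concurrently; results keep the input order."""
    tasks = [run_isolated(family, agent, timeout_s)
             for family, agent in agents.items()]
    return await asyncio.gather(*tasks)

AGENTS = {
    "AC": assess_access_control,
    "AU": assess_audit,
    "SI": assess_system_integrity,
}

if __name__ == "__main__":
    for result in asyncio.run(run_assessment(AGENTS)):
        print(result)
```

Because every agent is wrapped before it reaches `asyncio.gather`, one failing or hung agent produces an error entry in the results while the other families still report normally.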


Challenges we ran into

The biggest challenge was control atomicity — ensuring each NIST control had exactly one owning agent, with no duplication or gaps. We had to build runtime routing validation into the orchestrator to enforce this. Getting Nova Pro to produce consistent, structured outputs across all agent types required careful prompt engineering and hybrid deterministic logic to anchor the AI's reasoning to verifiable evidence. We also had to design graceful degradation for AWS connectivity — the system needed to operate in demo/mock mode without live AWS credentials while still exercising the full assessment pipeline. Scrubbing sensitive data before going public was a late-stage reminder that dev shortcuts can become real security issues.
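
One way to picture the routing validation mentioned above is a check that runs before dispatch and rejects any mapping with unowned or multiply owned controls. The sketch below is an assumption about how such a check could look; `validate_routing`, `RoutingError`, and the agent names are hypothetical and not taken from the project.

```python
from collections import defaultdict

class RoutingError(ValueError):
    """Raised when a control has no owning agent or more than one."""

def validate_routing(in_scope_controls, agent_ownership):
    """Return {control_id: agent_name} if every in-scope control has
    exactly one owner; otherwise raise RoutingError."""
    owners = defaultdict(list)
    for agent, controls in agent_ownership.items():
        for control_id in controls:
            owners[control_id].append(agent)

    # Gaps: in-scope controls that no agent claims.
    gaps = sorted(c for c in in_scope_controls if c not in owners)
    # Duplicates: controls claimed by more than one agent.
    duplicates = {c: sorted(a) for c, a in owners.items() if len(a) > 1}

    if gaps or duplicates:
        raise RoutingError(
            f"unowned controls: {gaps}; multiply owned: {duplicates}"
        )
    return {c: owners[c][0] for c in in_scope_controls}

if __name__ == "__main__":
    routing = validate_routing(
        ["AC-2", "AU-3"],
        {"ac_agent": ["AC-2"], "au_agent": ["AU-3"]},
    )
    print(routing)
```

Failing fast at startup, rather than during an assessment run, keeps a routing mistake from silently producing a report with a missing or double-counted control.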


Accomplishments that we're proud of

We're proud that BOBBIE actually runs. A full 10-control assessment executes from CLI, produces real artifacts, and the outputs are coherent, auditable, and structured for downstream use. The family-agent architecture held up under pressure — parallel execution, failure isolation, and deterministic scoring all work as designed. We also built a proper evidence-driven foundation rather than a prompt-wrapper; every finding traces back to a specific data source and check. That matters in federal compliance where auditability isn't optional.
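
To illustrate what a finding that traces back to a data source and check might look like, here is a small sketch of a deterministic patch SLA pre-check. Everything specific in it is an assumption made for illustration: the SLA thresholds, the `Finding` fields, the check ID, the mapping to control SI-2, and the placeholder CVE identifier are not taken from BOBBIE.

```python
from dataclasses import dataclass, asdict
from datetime import date
from typing import Optional

# Assumed SLA windows in days per severity (illustrative values only).
PATCH_SLA_DAYS = {"critical": 15, "high": 30, "medium": 90}

@dataclass(frozen=True)
class Finding:
    control_id: str   # control the check supports (assumed: SI-2)
    check_id: str     # identifier of the deterministic check
    data_source: str  # where the evidence came from
    passed: bool
    detail: str

def check_patch_sla(cve_id: str, severity: str, published: date,
                    as_of: date, patched_on: Optional[date] = None) -> Finding:
    """Compare days of exposure against the SLA window for the severity."""
    sla_days = PATCH_SLA_DAYS[severity]
    # Exposure ends at the patch date, or is still running as of today.
    exposure_days = ((patched_on or as_of) - published).days
    state = "patched" if patched_on else "unpatched"
    return Finding(
        control_id="SI-2",
        check_id="patch-sla",
        data_source="aws-systems-manager",
        passed=exposure_days <= sla_days,
        detail=f"{cve_id} {state}, {exposure_days}d exposure vs {sla_days}d SLA",
    )

if __name__ == "__main__":
    finding = check_patch_sla(
        "CVE-0000-0000", "critical",
        published=date(2025, 1, 1), as_of=date(2025, 2, 1),
    )
    # asdict() gives a plain dict that can be serialized to JSON
    # and handed to the model as structured context.
    print(asdict(finding))
```

The pass/fail decision here is plain date arithmetic, so it can be re-run and audited independently of any model output; the model would only be asked to explain and prioritize the result.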


What we learned

Amazon Nova Pro's 128K context window is genuinely transformative for document-heavy domains like compliance — ingesting an entire OSCAL SSP in a single call and reasoning over it coherently was something we weren't sure would work until it did. We also learned that the hardest part of agentic AI isn't the AI — it's the data wrangling, schema validation, and failure handling around it. Hybrid deterministic + AI architectures are significantly more reliable and auditable than pure LLM pipelines for use cases where correctness is non-negotiable.


What's next for B.O.B.B.I.E.

The current demo covers 10 controls across 8 NIST control families. The immediate roadmap is expanding to the full 800-53 Rev 5 High baseline — all 20 control families, priority-ranked by ATO impact. Beyond coverage, we want to build out the end-to-end authorization workflow: continuous monitoring with scheduled assessment runs, delta reporting to surface new findings between cycles, integration with enterprise tools like ServiceNow and Splunk, and a reviewer interface for ISSOs to approve, annotate, and track remediation. The ultimate goal is making continuous authorization practical for agencies that currently only get a compliance snapshot once a year — turning BOBBIE from an assessment tool into an always-on compliance co-pilot.

Built With

  • amazon-nova-pro
  • aws-bedrock
  • aws-cloudwatch-logs
  • aws-systems-manager
  • boto3
  • cisa
  • langchain
  • nist-nvd-api
  • nist-oscal
  • pydantic
  • python-3.10+
  • python-evtx