Inspiration
With the increasing need for data privacy and compliance (like GDPR or HIPAA), many companies face challenges identifying sensitive information in user-submitted content. We wanted to build a simple, scalable, and cost-effective way to automatically detect personally identifiable information (PII) in uploaded files without requiring a dedicated security team.
What it does
This project provides a serverless API that accepts text file uploads. Each uploaded file is analyzed with Amazon Bedrock Guardrails to detect sensitive PII (e.g., names, ID numbers, phone numbers). The detection results are returned in a structured response and logged in Amazon DynamoDB for auditing and traceability.
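The scan step described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `scan_text` and the shape of the returned dictionary are assumptions, and `client` is assumed to behave like `boto3.client("bedrock-runtime")`, whose `apply_guardrail` call returns an `assessments` list. PII entities only appear in that list if the guardrail has sensitive information filters configured.

```python
from typing import Any


def scan_text(client: Any, text: str, guardrail_id: str, guardrail_version: str) -> dict:
    """Run one text payload through a guardrail and summarize the PII it reports.

    `client` is assumed to behave like boto3.client("bedrock-runtime").
    The return shape is hypothetical, chosen for illustration.
    """
    response = client.apply_guardrail(
        guardrailIdentifier=guardrail_id,
        guardrailVersion=guardrail_version,
        source="INPUT",  # treat the uploaded file as user-supplied input
        content=[{"text": {"text": text}}],
    )

    entities = []
    for assessment in response.get("assessments", []):
        policy = assessment.get("sensitiveInformationPolicy", {})
        # Built-in PII types such as EMAIL, PHONE, or NAME
        for item in policy.get("piiEntities", []):
            entities.append({"type": item.get("type"), "action": item.get("action")})
        # Custom regex patterns configured on the guardrail
        for item in policy.get("regexes", []):
            entities.append({"type": item.get("name"), "action": item.get("action")})

    return {
        "pii_detected": bool(entities),
        "guardrail_action": response.get("action"),
        "entities": entities,
    }
```

The sketch deliberately drops the matched text and keeps only the entity type and action, so the structured response and any audit log built from it do not store the PII itself.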
How we built it
- FastAPI was used to define the API logic and endpoints.
- Mangum allows FastAPI to run inside an AWS Lambda function.
- AWS Lambda executes the API in a serverless environment.
- Amazon API Gateway (HTTP API) exposes the endpoint over HTTPS.
- Amazon Bedrock Guardrails analyzes the file content for PII.
- Amazon DynamoDB stores audit logs including timestamp, filename, file size, and detected entities.
- AWS SAM (Serverless Application Model) handles infrastructure-as-code and deployment.
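As a rough sketch of the audit-logging piece of this stack, the record written to DynamoDB might look like the following. The attribute names and the `scan_id` key are hypothetical (the actual table schema is not described here), and `table` is assumed to behave like a boto3 DynamoDB `Table` resource, which provides `put_item(Item=...)`.

```python
import uuid
from datetime import datetime, timezone
from typing import Any


def build_audit_item(filename: str, size_bytes: int, entities: list[dict]) -> dict:
    """Assemble one audit record; attribute names are assumptions."""
    return {
        "scan_id": str(uuid.uuid4()),  # hypothetical partition key
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "filename": filename,
        "file_size": size_bytes,
        "pii_detected": bool(entities),
        "detected_entities": entities,
    }


def log_scan(table: Any, filename: str, size_bytes: int, entities: list[dict]) -> dict:
    """Write the audit record and return it so the API can echo the scan ID.

    `table` is assumed to behave like boto3.resource("dynamodb").Table(name).
    """
    item = build_audit_item(filename, size_bytes, entities)
    table.put_item(Item=item)
    return item
```

Keeping the record builder separate from the write makes it possible to unit test the record shape without touching AWS.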
Challenges we ran into
- Tuning the Guardrail response parsing to extract meaningful PII metadata.
- Managing environment variables and credentials securely between local and cloud environments.
- Handling file encoding and file size validations consistently across all flows.
- Integrating Bedrock Guardrails with limited documentation for fine-grained control over assessments.
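One way to keep encoding and size checks consistent is to route every upload through a single validation function. The sketch below is an assumption about how that could look, not the project's implementation: the 1 MB limit, the `UploadError` class, and the UTF-8-only policy are all illustrative choices.

```python
MAX_UPLOAD_BYTES = 1_000_000  # assumed limit; tune to the deployment


class UploadError(ValueError):
    """Raised when an uploaded file fails validation."""


def decode_upload(data: bytes, max_bytes: int = MAX_UPLOAD_BYTES) -> str:
    """Validate size, then decode the upload as UTF-8 text."""
    if not data:
        raise UploadError("file is empty")
    if len(data) > max_bytes:
        raise UploadError(f"file is {len(data)} bytes; limit is {max_bytes}")
    try:
        # "utf-8-sig" also strips a leading byte-order mark if one is present
        return data.decode("utf-8-sig")
    except UnicodeDecodeError as exc:
        raise UploadError("file is not valid UTF-8 text") from exc
```

Because every failure raises the same exception type, an API layer can map `UploadError` to a single client-error response instead of handling each case separately.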
Accomplishments that we're proud of
- Built and deployed a fully serverless pipeline using modern AWS services.
- Achieved sub-second inference time for typical PII scans.
- Logged every result to DynamoDB in a traceable, audit-ready structure.
- Designed the solution to be easily extendable for other content policies beyond PII.
What we learned
- How to use Amazon Bedrock Guardrails in a real-world scenario.
- How to integrate FastAPI into Lambda using Mangum effectively.
- Best practices for lightweight serverless architecture with rapid iteration.
- The importance of structured error handling and observability in serverless pipelines.
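To illustrate the last point, here is a minimal sketch of structured logging around a pipeline step, using only the standard library. The `logged_step` helper, the logger name, and the field names are hypothetical; the idea is simply that each step emits one JSON line with its status and duration, which is easy to query once Lambda forwards the output to CloudWatch Logs.

```python
import json
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("pii_scanner")  # hypothetical logger name
logger.setLevel(logging.INFO)


@contextmanager
def logged_step(step: str, **fields):
    """Emit one JSON log line per step with its status and duration."""
    start = time.perf_counter()
    record = {"step": step, **fields}
    try:
        yield record  # callers may attach extra fields to the record
        record["status"] = "ok"
    except Exception as exc:
        record["status"] = "error"
        record["error_type"] = type(exc).__name__
        raise  # never swallow the error; only annotate the log line
    finally:
        record["duration_ms"] = round((time.perf_counter() - start) * 1000, 2)
        logger.info(json.dumps(record))
```

A caller would wrap each stage, for example `with logged_step("guardrail_scan", filename=name): ...`, so failures and timings are recorded in the same shape for every step.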
What's next for PII Scanning Pipeline with AWS Lambda and Bedrock Guardrails
- Add support for PDFs and image-based content with OCR integration.
- Integrate notification systems (e.g., email or Slack alerts) for flagged uploads.
- Create a dashboard UI for viewing scan history and file audit trails.
- Allow organizations to upload custom guardrails for internal policies.
- Package it as an installable solution on AWS Marketplace or as a GitHub template repository.
Built With
- amazon-web-services
- bedrock
- boto3
- dynamo
- fastapi
- lambda
- mangum
- python