DigitalTwin
Inspiration
As AI chatbots become part of daily workflows, users risk leaking sensitive personal information (PII) such as emails, addresses, and identification numbers.
What it does
DigitalTwin mitigates this by detecting and anonymizing PII in real time with a fine-tuned DistilBERT model before messages are sent to cloud-based AI systems. It uses AI to protect users' privacy and security, making those AI systems safer to use.
How it works
The DigitalTwin extension finds every text input area on the page and attaches itself to each one. It saves the input text to Chrome storage together with the user's preferences for which kinds of information should be censored, then sends both to our fine-tuned DistilBERT model. The model detects the sensitive information in the text and replaces it with text of similar semantic value. The result can then replace the user's original text in real time, before the user accidentally divulges any sensitive information.
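The replacement step described above can be sketched in a few lines of Python. This is only an illustration, not the project's actual code: the `Detection` class, the `anonymize` function, the label names, and the stand-in values are all hypothetical, and the sketch assumes the model returns non-overlapping character spans.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One detected PII span: text[start:end] has entity type `label`."""
    start: int
    end: int
    label: str


def anonymize(text, detections, enabled_labels, replacements):
    """Replace each enabled detection with a stand-in value.

    Spans are applied from right to left so that earlier offsets stay
    valid even when a stand-in is longer or shorter than the original.
    Assumes the spans do not overlap.
    """
    result = text
    for det in sorted(detections, key=lambda d: d.start, reverse=True):
        if det.label not in enabled_labels:
            continue  # the user chose not to censor this entity type
        # Fall back to a bracketed tag when no stand-in is configured.
        stand_in = replacements.get(det.label, f"[{det.label}]")
        result = result[:det.start] + stand_in + result[det.end:]
    return result


message = "Email me at jane@example.com or call 91234567"
email_at = message.index("jane@example.com")
phone_at = message.index("91234567")
detections = [
    Detection(email_at, email_at + len("jane@example.com"), "EMAIL"),
    Detection(phone_at, phone_at + len("91234567"), "PHONE"),
]

# Hypothetical user preference: censor emails only.
safe = anonymize(message, detections, {"EMAIL"}, {"EMAIL": "alex@example.org"})
print(safe)  # Email me at alex@example.org or call 91234567
```

Working from the last span backwards avoids having to recompute offsets after each substitution, which matters because synthetic values rarely have the same length as the text they replace.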
Key Features:
- Real-time PII detection in chatbot inputs (ChatGPT, Claude, Gemini, Bing, and more)
- Color-coded highlighting of detected PII entities with badges and visual indicators
- Fake data replacement with synthetic values (configurable by entity type)
- Persistent Replace Button with turquoise/pink scheme for one-click anonymization
- Dual-layer detection: Hugging Face Piiranha AI model + regex fallback
- Singapore-optimized patterns: NRIC, phone numbers, addresses, common local names
- Chrome storage integration for settings and detection logs
- Non-intrusive overlay system with popup interface for detections
- Debounced detection pipeline for smooth, performance-friendly monitoring
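To illustrate the dual-layer detection and the Singapore-specific patterns listed above, here is a rough Python sketch using only the standard `re` module. The patterns, the label names, and the `regex_detect` and `merge_detections` helpers are assumptions made for illustration; the project's real regexes and merge policy are not shown in this write-up. The sketch assumes the regex layer only fills in spans the model missed.

```python
import re

# Illustrative patterns only, not the project's actual regexes.
FALLBACK_PATTERNS = {
    # Assumed NRIC/FIN shape: prefix letter, 7 digits, trailing letter.
    "NRIC": re.compile(r"\b[STFGM]\d{7}[A-Z]\b"),
    # Assumed local phone shape: optional +65, then 8 digits starting with 6, 8 or 9.
    "PHONE": re.compile(r"(?<!\d)(?:\+65[\s-]?)?[689]\d{3}[\s-]?\d{4}(?!\d)"),
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
}


def regex_detect(text):
    """Return sorted (start, end, label) tuples for every pattern match."""
    spans = []
    for label, pattern in FALLBACK_PATTERNS.items():
        for match in pattern.finditer(text):
            spans.append((match.start(), match.end(), label))
    return sorted(spans)


def merge_detections(model_spans, regex_spans):
    """Keep all model spans, then add regex spans that overlap none of them."""
    merged = list(model_spans)
    for start, end, label in regex_spans:
        overlaps = any(
            start < m_end and m_start < end for m_start, m_end, _ in model_spans
        )
        if not overlaps:
            merged.append((start, end, label))
    return sorted(merged)


text = "I'm Wei Ling, NRIC S1234567D, call +65 9123 4567."
name_at = text.index("Wei Ling")
nric_at = text.index("S1234567D")

# Pretend the model found the name and the NRIC but missed the phone number.
model_spans = [
    (name_at, name_at + len("Wei Ling"), "NAME"),
    (nric_at, nric_at + len("S1234567D"), "ID_NUMBER"),
]
merged = merge_detections(model_spans, regex_detect(text))
print([label for _, _, label in merged])  # ['NAME', 'ID_NUMBER', 'PHONE']
```

Giving the model's spans priority keeps its richer labels where both layers agree, while the regex layer still catches rigidly formatted identifiers that the model overlooks.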
How we built it
Backend Development
- Python 3.11/3.12 – Core backend runtime
- FastAPI – Modern web API framework
- Uvicorn – ASGI server for FastAPI
- pip + venv – Dependency isolation
Frontend Development
- Chrome Extension Manifest V3 – Extension framework
- Vanilla JavaScript (ES6+) – Lightweight, no frameworks
- CSS3 & HTML5 – Styling and popup/test UI
Development Environment
- Git – Version control
- WSL2 – Local dev environment
- Node.js ecosystem – Minimal usage
Libraries Used
Backend Libraries
- fastapi – Web API framework
- pydantic – Data validation and serialization
- uvicorn – ASGI server
- transformers – Hugging Face model integration
- faker – Synthetic data generation
- torch – PyTorch for model inference (required by transformers)
Python Standard Library
- re – Regex pattern matching
- logging – Application logging
- typing – Type hints
- json – Data processing
Frontend Libraries
- No external JS libraries – Pure vanilla JavaScript
- Chrome Extension APIs – Native browser APIs
Web APIs:
- DOM manipulation
- Fetch API for backend comms
- MutationObserver for dynamic content monitoring
- Range API for precise text highlighting
Challenges we ran into
- Designing a user-friendly UI that does not compromise the functionality of the platform the user is on
- Fine-tuning the DistilBERT model to increase the accuracy of PII detection
- Handling edge cases to increase the accuracy of PII detection
Accomplishments that we're proud of
Our AI model identifies 17 personal information fields with up to 99.4% accuracy.
One-click automated text replacement lets users seamlessly swap input that may expose personal data for a secure version.
What we learned
- How to work collaboratively
- Fine-tuning AI models
- Working with PyTorch
- Creative thinking and impact potential
What's next for DigitalTwin
We plan to extend DigitalTwin beyond AI chatbots to other digital platforms where users may be at risk of exposing sensitive personal information, such as social media platforms (post captions and comments) and online collaborative tools (e.g. Google Docs, Microsoft Word, Google Sheets).
Built With
- chrome
- css3
- extension
- faker
- fastapi
- html5
- javascript
- manifest
- python
- torch
- transformers
- uvicorn
- vanilla