[GIF: Video demonstration]
[GIF: UI/UX demonstration. DigitalTwin Chrome extension automatically identifying and replacing sensitive personal information]

DigitalTwin

Inspiration

As AI chatbots become part of daily workflows, users risk leaking personally identifiable information (PII) such as emails, addresses, and identification numbers.

What it does

DigitalTwin mitigates this risk by detecting and anonymizing PII in real time with a fine-tuned DistilBERT model before messages are sent to cloud-based AI systems. It uses AI to protect user privacy and security, making interactions with those AI systems more private.

How it works

The DigitalTwin extension finds every text input area on the page and attaches itself to each one. It stores the input text in Chrome storage together with the user's preferences for which information should be censored, then sends both to our fine-tuned DistilBERT model. The model detects the sensitive information in the text and replaces it with text of similar semantic value. The anonymized text can then replace the user's input in real time, before the user accidentally divulges any sensitive information.
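
The replacement step described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's code: the `Detection` type, the `PLACEHOLDERS` table, and the label names are hypothetical, and fixed stand-in values take the place of generated fake data. It shows one detail that matters for in-place replacement: applying edits from the end of the string backwards so that earlier character offsets stay valid.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Detection:
    start: int   # inclusive character offset
    end: int     # exclusive character offset
    label: str   # entity type, e.g. "EMAIL" (hypothetical label names)


# Hypothetical fixed stand-ins; a real deployment would generate varied values.
PLACEHOLDERS = {
    "EMAIL": "alex.tan@example.com",
    "PHONE": "9123 4567",
    "NRIC": "S0000000X",
}


def anonymize(text: str, detections: List[Detection], enabled: Set[str]) -> str:
    """Replace each detection whose entity type the user has enabled."""
    # Work from the last span to the first so earlier offsets are not shifted.
    for det in sorted(detections, key=lambda d: d.start, reverse=True):
        if det.label not in enabled:
            continue  # the user chose not to censor this entity type
        replacement = PLACEHOLDERS.get(det.label, f"[{det.label}]")
        text = text[:det.start] + replacement + text[det.end:]
    return text


message = "Mail me at jane@corp.sg or call 81234567."
found = [Detection(11, 23, "EMAIL"), Detection(32, 40, "PHONE")]
print(anonymize(message, found, enabled={"EMAIL"}))
```

With only `EMAIL` enabled in the preferences, the email address is swapped out and the phone number is left untouched.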

Key Features:

Real-time PII detection in chatbot inputs (ChatGPT, Claude, Gemini, Bing, and more)
Color-coded highlighting of detected PII entities with badges and visual indicators
Fake data replacement with synthetic values (configurable by entity type)
Persistent Replace button with a turquoise/pink color scheme for one-click anonymization
Dual-layer detection: Hugging Face Piiranha AI model + regex fallback
Singapore-optimized patterns: NRIC, phone numbers, addresses, common local names
Chrome storage integration for settings and detection logs
Non-intrusive overlay system with popup interface for detections
Debounced detection pipeline for smooth, performance-friendly monitoring
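
One way to read the dual-layer detection listed above is sketched below in Python: regex patterns catch well-structured identifiers, and their hits are merged with whatever the model returns. The patterns, the `regex_detect` and `detect` helpers, and the label names are assumptions for illustration, not the extension's actual rules. The NRIC pattern checks only the shape (prefix letter, seven digits, suffix letter) and does not validate the checksum.

```python
import re
from typing import List, Optional, Tuple

Span = Tuple[int, int, str]  # (start, end, label)

# Assumed fallback patterns, not the project's actual ones.
FALLBACK_PATTERNS = {
    # NRIC/FIN shape: prefix letter, seven digits, suffix letter.
    "NRIC": re.compile(r"\b[STFGM]\d{7}[A-Z]\b"),
    # Singapore numbers: optional +65, then eight digits starting with 6, 8 or 9.
    "PHONE": re.compile(r"(?<!\d)(?:\+65[ -]?)?[689]\d{3}[ -]?\d{4}(?!\d)"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def regex_detect(text: str) -> List[Span]:
    """Return every regex hit as a (start, end, label) span."""
    return [
        (m.start(), m.end(), label)
        for label, pattern in FALLBACK_PATTERNS.items()
        for m in pattern.finditer(text)
    ]


def overlaps(a: Span, b: Span) -> bool:
    """True when the two half-open character ranges intersect."""
    return a[0] < b[1] and b[0] < a[1]


def detect(text: str, model_spans: Optional[List[Span]]) -> List[Span]:
    """Merge model output with regex hits; model spans win on overlap.

    Pass None for model_spans when the model is unavailable, in which case
    the regex layer alone supplies the detections.
    """
    merged = list(model_spans or [])
    for span in regex_detect(text):
        if not any(overlaps(span, kept) for kept in merged):
            merged.append(span)
    return sorted(merged)


text = "I am S1234567D, reach me on +65 9123 4567."
for start, end, label in detect(text, model_spans=None):
    print(label, text[start:end])
```

Letting model spans take priority on overlap is a design choice in this sketch; a stricter setup could prefer the regex label for identifiers with a fixed format.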

How we built it

Backend Development

Python 3.11/3.12 – Core backend runtime
FastAPI – Modern web API framework
Uvicorn – ASGI server for FastAPI
pip + venv – Dependency isolation

Frontend Development

Chrome Extension Manifest V3 – Extension framework
Vanilla JavaScript (ES6+) – Lightweight, no frameworks
CSS3 & HTML5 – Styling and popup/test UI

Development Environment

Git – Version control
WSL2 – Local dev environment
Node.js ecosystem – Minimal usage

Libraries Used

Backend Libraries

fastapi – Web API framework
pydantic – Data validation and serialization
uvicorn – ASGI server
transformers – Hugging Face model integration
faker – Synthetic data generation
torch – PyTorch for model inference (required by transformers)

Python Standard Library

re – Regex pattern matching
logging – Application logging
typing – Type hints
json – Data processing

Frontend Libraries

No external JS libraries – Pure vanilla JavaScript
Chrome Extension APIs – Native browser APIs

Web APIs:

DOM manipulation
Fetch API for backend communication
MutationObserver for dynamic content monitoring
Range API for precise text highlighting

Challenges we ran into

Designing a user-friendly UI that does not compromise the functionality of the platform the user is on
Fine-tuning the DistilBERT model to increase the accuracy of PII detection
Handling edge cases to increase the accuracy of PII detection

Accomplishments that we're proud of

Our AI model managed to identify 17 personal information fields with up to 99.4% accuracy.

Automated text replacement at the click of a button, which lets users seamlessly turn input that may expose personal data into a safe version.

What we learned

How to work collaboratively
Fine-tuning AI models
Working with PyTorch
Creative thinking and impact potential

What's next for DigitalTwin

We plan to extend DigitalTwin beyond AI chatbots to other digital platforms, such as social media platforms where users post captions and comments, and online collaborative tools (e.g. Google Docs, Microsoft Word, Google Sheets) where users may be at risk of exposing their sensitive personal information.