Project Story

Inspiration

As both corporations and individuals increasingly adopt Gen-AI tools like ChatGPT, Gemini, and Claude, a critical challenge emerges: preventing accidental exposure of sensitive data through clipboard operations. Employees may inadvertently paste confidential information, PII, or trade secrets into these AI platforms, creating significant compliance and security risks for organisations. Similarly, in personal usage, users may unknowingly share private information—such as financial details, health records, or personal messages—while interacting with these tools. This raises serious concerns around privacy, data security, and misuse of personal information, highlighting the need for robust safeguards to ensure sensitive data remains protected in all contexts.

What it does

Night's Watch redacts sensitive data in the clipboard, ensuring it never gets exposed to external systems.

How we built it

We identify specific events that indicate a user's interest in utilising a tool like ChatGPT. Upon detection, we analyse the clipboard content to identify sensitive information and apply masking as needed. Using the Prompt API, we generate sanitised text, which is then seamlessly placed back into the clipboard for the user's convenience.
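As a rough illustration of the read, sanitise, and write-back flow described above, here is a minimal sketch. It is not the extension's actual code: `ClipboardLike`, `Sanitiser`, `maskWithRegex`, and `sanitiseClipboard` are hypothetical names, and the regex masker is only a stand-in for the Prompt API call so that the example runs without a browser.

```typescript
// Minimal clipboard shape; in a browser, navigator.clipboard satisfies it.
interface ClipboardLike {
  readText(): Promise<string>;
  writeText(text: string): Promise<void>;
}

// Hypothetical sanitiser signature. In the extension this would wrap
// a Prompt API call; here it is left abstract on purpose.
type Sanitiser = (text: string) => Promise<string>;

// Stand-in masking (assumption): replaces email addresses and long digit
// runs such as phone or card numbers with fixed placeholders.
function maskWithRegex(text: string): string {
  return text
    .replace(/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g, "[EMAIL]")
    .replace(/\b\d(?:[ -]?\d){8,}\b/g, "[NUMBER]");
}

const regexSanitiser: Sanitiser = async (text) => maskWithRegex(text);

// Reads the clipboard, sanitises the text, and writes it back only if
// something changed. Returns true when the clipboard was rewritten.
async function sanitiseClipboard(
  clipboard: ClipboardLike,
  sanitise: Sanitiser,
): Promise<boolean> {
  const original = await clipboard.readText();
  const cleaned = await sanitise(original);
  if (cleaned === original) {
    return false;
  }
  await clipboard.writeText(cleaned);
  return true;
}
```

In a real extension, `sanitiseClipboard(navigator.clipboard, regexSanitiser)` would be called from whatever event handler signals that the user is about to paste into a Gen-AI tool. Keeping the sanitiser injectable makes it easy to swap the regex stand-in for a model-backed one.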

Challenges we ran into

While exploring solutions, the Rewriter API initially seemed like a promising choice for our task. However, we faced significant challenges during implementation and moved to the Prompt API, which proved to be a better choice.

Response Time: The API's response time was too slow for real-time usage. To mitigate this, we started processing clipboard updates as soon as they were detected in the context of a browser tab. Unfortunately, this approach often left us with incomplete data, which made accurate predictions difficult.

Dynamic Masking Needs: Masking requirements vary based on the user's intent. For example:
Case 1: A product manager analysing shopping trends across genders doesn't require masking of gender or order price but must mask name and address.
Case 2: When analysing geographic trends, name and gender should be masked, but address and order details must remain unaltered.
Achieving tailored masking requires understanding both the dataset (clipboard data) and the user's query context from the website.

Query Context Dependency: Simply relying on clipboard data isn't sufficient. Incorporating the user's typed query to dynamically decide which columns to mask is crucial, but accessing and combining this data in real time poses technical challenges.

Selective and Real-Time Processing: To address these challenges, we're exploring a multi-step approach:
Initial Processing: Perform a general sanitisation of clipboard data when it's first added.
Selective Processing: Refine masking based on the user's query when they press enter to submit data.

Data Re-masking and Unmasking: Another layer of complexity arises when users fetch data back from the website. Ensuring the data is properly unmasked for authorised use while maintaining privacy controls is tricky and requires careful handling.

We are actively working on faster alternatives to improve real-time performance and align masking strategies dynamically with user intent.
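To make the intent-dependent masking in the two cases above concrete, here is a small sketch under stated assumptions. The intent labels, the column names, the `columnsToMask` table, and `maskRows` are all hypothetical. Inferring the intent from the user's typed query is the hard part and is deliberately left out, so the intent is passed in directly.

```typescript
// One clipboard row, keyed by column name (illustrative shape).
type Row = Record<string, string>;

// Hypothetical intent labels matching Case 1 and Case 2.
type Intent = "shopping-trends-by-gender" | "geographic-trends";

// Assumed policy table: which columns to hide for each intent.
const columnsToMask: Record<Intent, string[]> = {
  "shopping-trends-by-gender": ["name", "address"],
  "geographic-trends": ["name", "gender"],
};

// Returns copies of the rows with the intent's columns replaced by a
// placeholder. The input rows are not modified.
function maskRows(
  rows: Row[],
  intent: Intent,
  placeholder: string = "[MASKED]",
): Row[] {
  const hidden = new Set(columnsToMask[intent]);
  return rows.map((row) => {
    const out: Row = {};
    for (const column of Object.keys(row)) {
      out[column] = hidden.has(column) ? placeholder : row[column];
    }
    return out;
  });
}
```

A call such as `maskRows(rows, "geographic-trends")` keeps address and order columns intact while hiding name and gender. Keeping the policy as a plain lookup table means new intents can be added without touching the masking logic.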

Accomplishments that we're proud of

Within a limited timeframe, we successfully developed a robust proof-of-concept that incorporates all the essential functionalities for basic usage. This solution is poised to save significant time for users who currently clean data manually before inputting it into Gen-AI tools. Additionally, we've identified several potential enhancements that will make the extension even more intuitive and user-friendly, paving the way for a seamless and efficient experience.

What we learned

We discovered the importance of leveraging smaller, on-device models to ensure low-latency performance while safeguarding sensitive data without relying heavily on external APIs. Dynamic masking tailored to user intent and query context emerged as a critical requirement, highlighting the need for flexibility in processing workflows. Balancing real-time processing with accuracy taught us the value of multi-step pipelines, combining initial sanitisation with selective refinement. We also faced challenges in managing the lifecycle of sanitised data, particularly for re-masking and unmasking, reinforcing the importance of robust safeguards. Finally, we learned that usability and security must go hand-in-hand to create a seamless, privacy-first user experience.
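One way to think about the re-masking and unmasking lifecycle is to keep a local map from placeholder tokens back to the original values. The sketch below is only an illustration of that idea, not our implementation: `maskReversibly`, `unmask`, the `<EMAIL_n>` token format, and the email-only pattern are assumptions chosen to keep the example short.

```typescript
// Simple email pattern used as the only "sensitive" type in this sketch.
const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

interface MaskResult {
  masked: string;
  // Maps each placeholder token back to the value it replaced.
  // This map stays on the device and is never sent anywhere.
  vault: Map<string, string>;
}

// Replaces each distinct email with a numbered token, reusing the same
// token when the same email appears more than once.
function maskReversibly(text: string): MaskResult {
  const vault = new Map<string, string>();
  const tokenFor = new Map<string, string>();
  const masked = text.replace(EMAIL, (match) => {
    let token = tokenFor.get(match);
    if (token === undefined) {
      token = `<EMAIL_${tokenFor.size + 1}>`;
      tokenFor.set(match, token);
      vault.set(token, match);
    }
    return token;
  });
  return { masked, vault };
}

// Restores original values in any text that contains known tokens,
// for example a response copied back from the website.
function unmask(text: string, vault: Map<string, string>): string {
  let result = text;
  vault.forEach((original, token) => {
    result = result.split(token).join(original);
  });
  return result;
}
```

Because the token carries a closing `>`, `<EMAIL_1>` cannot be confused with `<EMAIL_10>` during unmasking. A production version would also need to decide how long the vault lives and who is authorised to unmask.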

What's next for Night's Watch

Looking ahead, we plan to refine and optimise Night's Watch by using smaller on-device models for the easier subtasks, aiming for faster and more accurate real-time performance. This will also reduce reliance on external APIs, which helps both privacy and efficiency. We're also improving our dynamic masking so that the masks are more relevant, and we need to make our unmasking logic more stable. We're exploring ways to integrate Night's Watch with other privacy-focused solutions and platforms, expanding its reach while keeping its user-friendly design. Ultimately, the goal is a comprehensive tool that keeps sensitive data protected across a range of environments, in both corporate and personal contexts. We also plan to add a summary block that gives users feedback on the privacy mistakes they commonly make while interacting with these tools.