DocShield

Inspiration

The NJX Hackathon challenged us to build technology that protects people and their data from AI-driven risks. We focused on a growing problem: employees upload contracts, reports, and client files into AI tools, but the policies protecting those documents rarely travel with them.

Enterprises currently rely on separate tools for classification, signatures, secure sharing, and data-loss prevention. Once a document crosses company boundaries, those systems stop communicating. We created DocShield to keep a document’s identity, integrity, and AI-use policy attached wherever it travels.

What it does

DocShield uses a two-pronged approach:

  1. Tamper-evident document passport: A hidden encrypted watermark carries the issuer’s identity, document fingerprint, signing history, and AI-use policy. DocShield can verify who issued the document, whether it changed, and whether actions such as external AI uploads are permitted.

  2. Anomaly detection: Secure links record access behavior and flag suspicious activity, including download spikes, blocked attempts, new geographic locations, and access from multiple clients.

Organizations can also revoke documents, restrict access, review lifecycle history, and export audit evidence.

How we built it

We built DocShield as a React and TypeScript web application backed by a FastAPI service.

The browser calculates a SHA-256 fingerprint for each PDF or DOCX and signs a canonical manifest using Ed25519. The backend verifies the signature and stores the document’s policy and cryptographically linked lifecycle events.
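The fingerprint and canonical manifest steps can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the project's actual code: the manifest fields (`issuer`, `fingerprint`, `policy`) and the canonicalization rule (sorted keys, compact separators, UTF-8) are assumptions chosen for the example. Ed25519 signing is left out; it would be applied to the bytes returned by `canonical_manifest`.

```python
import hashlib
import json


def fingerprint(data: bytes) -> str:
    """Return the hex SHA-256 digest of the raw file bytes."""
    return hashlib.sha256(data).hexdigest()


def canonical_manifest(manifest: dict) -> bytes:
    """Serialize a manifest deterministically so every signer and verifier
    hashes the same bytes: sorted keys, no extra whitespace, UTF-8."""
    text = json.dumps(
        manifest,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return text.encode("utf-8")


# Hypothetical manifest; field names are illustrative only.
manifest = {
    "issuer": "example-org",
    "fingerprint": fingerprint(b"%PDF-1.7 example bytes"),
    "policy": {"external_ai_upload": False},
}
payload = canonical_manifest(manifest)  # these are the bytes that would be signed
```

Because the signature covers the serialized bytes rather than the object, the signing side and the verifying side have to apply exactly the same serialization rule, including key order and whitespace.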

For protected downloads, DocShield encrypts the passport using AES-256-GCM and embeds it into the file without changing the original document bytes that the fingerprint covers. During verification, the system extracts the passport, recalculates the fingerprint, verifies the issuer and history signatures, and checks the requested operation against the document’s policy.
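The last two verification steps, recalculating the fingerprint and checking an operation against the policy, might look like the sketch below. It assumes the passport has already been extracted and decrypted into a dictionary, and it skips signature verification entirely. The passport layout, the operation name `external_ai_upload`, and the default-deny rule for unknown operations are all assumptions made for illustration.

```python
import hashlib
import hmac


def verify_and_authorize(
    original_bytes: bytes, passport: dict, operation: str
) -> tuple[bool, str]:
    """Return (allowed, reason) for a requested operation on a document."""
    # Step 1: integrity. Recompute the fingerprint and compare it with
    # the value carried in the passport.
    recomputed = hashlib.sha256(original_bytes).hexdigest()
    if not hmac.compare_digest(recomputed, passport["fingerprint"]):
        return False, "fingerprint mismatch: document content changed"

    # Step 2: policy. Operations not listed in the policy are denied.
    allowed = passport.get("policy", {}).get(operation, False)
    if not allowed:
        return False, f"operation '{operation}' not permitted by policy"
    return True, "ok"


# Hypothetical passport for a small example document.
doc = b"example contract text"
passport = {
    "fingerprint": hashlib.sha256(doc).hexdigest(),
    "policy": {"internal_view": True, "external_ai_upload": False},
}

print(verify_and_authorize(doc, passport, "internal_view"))
print(verify_and_authorize(doc, passport, "external_ai_upload"))
print(verify_and_authorize(doc + b"!", passport, "internal_view"))
```

Returning a reason string alongside the decision keeps the result explainable to the recipient, which matters as much for denied operations as for raised alerts.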

Our anomaly engine analyzes eight behavioral signals and combines statistical deviation with a small PyTorch autoencoder to generate a risk score and human-readable reasons.
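As a rough illustration of the statistical-deviation half of this design, the sketch below scores each signal by its z-score against that link's own history and collects a readable reason for every signal that crosses a threshold. The autoencoder is not shown. The signal names, the threshold of 3.0, and the way the z-score is squashed into a 0 to 1 risk value are assumptions for the example, not the values DocShield uses.

```python
from statistics import mean, pstdev


def score_access(
    baseline: dict[str, list[float]],
    current: dict[str, float],
    threshold: float = 3.0,
) -> tuple[float, list[str]]:
    """Return (risk in [0, 1], reasons) from per-signal z-scores."""
    reasons: list[str] = []
    max_z = 0.0
    for signal, value in current.items():
        history = baseline.get(signal, [])
        if len(history) < 2:
            continue  # not enough history to judge this signal
        mu = mean(history)
        sigma = pstdev(history) or 1.0  # constant history: avoid dividing by zero
        z = abs(value - mu) / sigma
        max_z = max(max_z, z)
        if z >= threshold:
            reasons.append(f"{signal}: {value:g} vs typical {mu:g} (z={z:.1f})")
    # Map the largest deviation to [0, 1]; twice the threshold saturates at 1.
    risk = min(1.0, max_z / (2 * threshold))
    return risk, reasons


# Hypothetical history for one secure link.
baseline = {"downloads_per_hour": [2, 3, 2, 3], "distinct_clients": [1, 1, 2, 1]}

risk, reasons = score_access(baseline, {"downloads_per_hour": 40, "distinct_clients": 1})
print(risk, reasons)
```

A score produced this way signals that behavior is unusual for that link. It is a prompt for review, not evidence of malicious intent.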

Challenges we faced

The largest challenge was separating what a portable document can prove from what a controlled sharing system can enforce.

A watermark cannot prevent every screenshot, printout, format conversion, or deliberately reconstructed copy. Similarly, an AI-policy tag cannot block an arbitrary application unless that application or gateway honors it. We learned to describe DocShield precisely: the passport is tamper-evident, while stronger access control and monitoring come from secure links.

We also faced challenges around canonicalizing data consistently across TypeScript and Python, maintaining a valid cryptographic event chain, embedding metadata without invalidating the original fingerprint, and explaining anomaly scores without presenting them as proof of malicious behavior.
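To make the event-chain problem concrete, here is a minimal hash-chain sketch in which each lifecycle event commits to the hash of the previous one, so editing, removing, or reordering an earlier entry breaks verification. It is a simplified stand-in: the entry layout, the all-zero genesis value, and the event fields are assumptions, and the per-event signatures mentioned above are omitted.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def event_hash(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with the canonical event body."""
    body = json.dumps(event, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + body).hexdigest()


def append_event(chain: list[dict], event: dict) -> None:
    """Append an event linked to the current tail of the chain."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"event": event, "prev": prev, "hash": event_hash(prev, event)})


def verify_chain(chain: list[dict]) -> bool:
    """Recompute every link; any edited or reordered entry fails the check."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != event_hash(prev, entry["event"]):
            return False
        prev = entry["hash"]
    return True


chain: list[dict] = []
append_event(chain, {"type": "issued", "by": "example-org"})
append_event(chain, {"type": "link_created", "by": "example-org"})
print(verify_chain(chain))  # chain is intact at this point

chain[0]["event"]["by"] = "someone-else"  # tamper with history
print(verify_chain(chain))  # the recomputed hash no longer matches
```

The canonicalization issue shows up here too: if the TypeScript and Python sides serialize an event differently, the hashes diverge even though the events are logically identical.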

What we learned

We learned that the biggest opportunity is not another isolated document-security tool. It is connecting the fragmented systems enterprises already use.

DocShield should complement products such as Microsoft Purview, Adobe signatures, secure data rooms, and AI gateways. The document supplies a trusted identity and policy; the existing enforcement point makes the decision.

We also learned that security products depend as much on clear guarantees and low user friction as they do on cryptography. Recipient verification must be simple, policies must be understandable, and every alert must explain why it was raised.

What we are proud of

We completed an end-to-end working prototype that can:

  • Register organizations and signing keys
  • Protect PDF and DOCX files
  • Verify issuer identity and document integrity
  • Carry machine-readable AI-use policies
  • Create revocable secure links
  • Detect suspicious access behavior
  • Display explainable alerts
  • Export document and access audit history

What is next

Our next step is a design-partner pilot connecting one document issuer, one external recipient, and one AI gateway.

We also plan to add enterprise identity, customer-managed signing keys, customer-hosted document processing, standards-compatible provenance, and integrations with existing DLP and AI-security products.

Our long-term vision is simple: every document should be verifiable before AI can act on it.
