Inspiration

The project was inspired by a recurring problem observed during due diligence, audits, and fundraising processes. In these scenarios, digital data such as financial records, metrics, and internal documents are treated as factual evidence, yet there is no reliable mechanism to prove that the data has not been altered after being created or shared.

Most organizations rely on centralized storage systems, exported files, or version histories that are fully controlled by the data owner. This creates a trust gap where reviewers must assume honesty rather than verify integrity. While blockchain systems offer immutability, storing complete datasets on-chain is impractical due to performance, cost, and privacy limitations.

This project was initiated to design a system that provides verifiable data integrity while keeping the actual data off-chain and under the owner’s control.

What the Project Does

The project provides a method to prove that a dataset existed at a specific time and has not been modified since. Instead of storing raw data on a blockchain, the system generates cryptographic proofs that represent the data state.

Each dataset is converted into a cryptographic hash, which acts as a unique fingerprint. This fingerprint is timestamped and stored in an append-only registry. Any future version of the data can be verified by recomputing the hash and comparing it with the original proof.

If the data changes, even slightly, the hash changes completely, as shown below:

Hash(D) ≠ Hash(D′) whenever D ≠ D′ (except with negligible probability)

This allows third parties to independently verify the integrity of any copy of the data they are given, using only that copy and the stored proof, without access to the owner's storage systems.
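
The fingerprint-and-compare idea can be illustrated with a minimal sketch. It assumes SHA-256 as the hash function (the write-up does not name one) and uses Node's built-in crypto module; the helper names fingerprint and verify are hypothetical and not taken from the project's code.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: SHA-256 is an assumption, the project may use another hash.
function fingerprint(data: Uint8Array | string): string {
  return createHash("sha256").update(data).digest("hex");
}

// Verification: recompute the hash of the provided copy and compare it
// with the proof that was stored earlier.
function verify(data: Uint8Array | string, storedHash: string): boolean {
  return fingerprint(data) === storedHash;
}

const original = "revenue,2023,1000000";
const tampered = "revenue,2023,1000001"; // a single character differs

const storedProof = fingerprint(original);

console.log(verify(original, storedProof)); // true
console.log(verify(tampered, storedProof)); // false
```

Only the 64-character hex digest needs to be published; the dataset itself stays with its owner until a reviewer is handed a copy to check.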

What We Learned

Through this project, we learned that data integrity can be established without exposing or storing the data itself. A cryptographic hash can represent a large, complex dataset as a short fixed-length value, and finding two different datasets with the same hash is computationally infeasible.

Key technical learnings included understanding cryptographic hash functions, designing append-only systems, implementing time-based anchoring, managing data versioning, and building verification workflows that do not rely on trust in the data owner.

The project also highlighted the importance of separating data storage from data verification to achieve both privacy and integrity.

How the Project Was Built

The system follows a hybrid architecture combining off-chain processing with on-chain verification.

The implementation flow is as follows:

  1. The dataset is processed locally or on a server
  2. A cryptographic hash of the dataset is generated
  3. Metadata such as size, structure, and version is recorded
  4. The hash and metadata are anchored with a timestamp
  5. The proof is stored in an append-only registry

The blockchain layer stores only verification proofs, while all raw data remains off-chain. Verification is performed by recomputing the hash of the provided dataset and matching it against the stored proof.
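
The flow above can be sketched roughly as follows. This is an in-memory illustration under stated assumptions, not the project's contract or schema: the ProofRecord fields, the AppendOnlyRegistry class, and the use of SHA-256 are all hypothetical, and the "structure" metadata from step 3 is omitted for brevity.

```typescript
import { createHash } from "node:crypto";

// Hypothetical record shape; field names are illustrative only.
interface ProofRecord {
  readonly hash: string;      // SHA-256 of the dataset bytes (assumed hash function)
  readonly sizeBytes: number; // metadata: dataset size
  readonly version: number;   // metadata: caller-supplied version number
  readonly timestamp: string; // ISO 8601 time of anchoring
}

class AppendOnlyRegistry {
  private readonly records: ProofRecord[] = [];

  // Steps 2-5: hash the data, record metadata, timestamp it, append the proof.
  anchor(data: Uint8Array, version: number, now: Date = new Date()): ProofRecord {
    const record: ProofRecord = Object.freeze({
      hash: createHash("sha256").update(data).digest("hex"),
      sizeBytes: data.byteLength,
      version,
      timestamp: now.toISOString(),
    });
    this.records.push(record); // no update or delete method is exposed
    return record;
  }

  // Verification: recompute the hash and look for a matching stored proof.
  find(data: Uint8Array): ProofRecord | undefined {
    const hash = createHash("sha256").update(data).digest("hex");
    return this.records.find((r) => r.hash === hash);
  }
}

const proofRegistry = new AppendOnlyRegistry();
const dataset = Buffer.from("q1_revenue,1000000\n", "utf8");
const anchored = proofRegistry.anchor(dataset, 1);

console.log(anchored.sizeBytes);                   // 19
console.log(proofRegistry.find(dataset)?.version); // 1
console.log(proofRegistry.find(Buffer.from("q1_revenue,1000001\n", "utf8"))); // undefined
```

In a deployed system the registry would live in a smart contract rather than in process memory, so the append-only property would be enforced by the chain instead of by the class interface.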

Challenges Faced

One major challenge was handling data updates while maintaining trust. Real-world data changes frequently, so the system had to support multiple versions without allowing previous records to be modified or deleted.
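
One common way to support versions without letting earlier records be rewritten is to link each version entry to the hash of the entry before it. The sketch below illustrates that general technique; it is an assumption for illustration, not a description of how this project stores versions, and the names VersionEntry, appendVersion, and chainIsIntact are hypothetical.

```typescript
import { createHash } from "node:crypto";

interface VersionEntry {
  version: number;
  dataHash: string;      // fingerprint of this dataset version
  prevEntryHash: string; // hash of the previous entry, "" for the first one
}

const sha256 = (s: string): string =>
  createHash("sha256").update(s).digest("hex");

// Deterministic serialisation of an entry so the entry itself can be hashed.
const entryHash = (e: VersionEntry): string =>
  sha256(`${e.version}|${e.dataHash}|${e.prevEntryHash}`);

// Returns a new chain with one more entry; existing entries are never mutated.
function appendVersion(chain: readonly VersionEntry[], data: string): VersionEntry[] {
  const prev = chain[chain.length - 1];
  const entry: VersionEntry = {
    version: chain.length + 1,
    dataHash: sha256(data),
    prevEntryHash: prev ? entryHash(prev) : "",
  };
  return [...chain, entry];
}

// True only if every entry still points at the unmodified entry before it.
function chainIsIntact(chain: readonly VersionEntry[]): boolean {
  return chain.every((e, i) =>
    e.version === i + 1 &&
    e.prevEntryHash === (i === 0 ? "" : entryHash(chain[i - 1])));
}

let history: VersionEntry[] = [];
history = appendVersion(history, "headcount,42");
history = appendVersion(history, "headcount,45");
console.log(chainIsIntact(history)); // true

// Rewriting version 1 after the fact breaks the link stored in version 2.
const forged: VersionEntry[] = [
  { ...history[0], dataHash: sha256("headcount,40") },
  history[1],
];
console.log(chainIsIntact(forged)); // false
```

The check only protects entries that have a successor, so the hash of the latest entry still has to be anchored somewhere the data owner cannot change, such as the on-chain registry.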

Another challenge was minimizing on-chain operations to avoid performance and cost issues while still maintaining strong verification guarantees.

Finally, designing the system to be understandable to non-technical users while preserving cryptographic correctness required careful abstraction and clear verification logic.

Built With

  • abstraction
  • account
  • alchemyapi
  • axios
  • cid
  • dotenv
  • ecosystem
  • ethereum
  • ethers.js
  • express.js
  • fetch
  • git
  • github
  • hardhat
  • ipfs
  • magic
  • magic.link
  • next.js
  • node.js
  • npm
  • pdfkit
  • provider
  • react
  • sepolia
  • signer
  • solidity
  • tailwindcss
  • testing
  • vite