Inspiration
- Perplexity is an LLM search engine
- Problems:
  - lack of privacy, since the company collects our data
  - reliance on a single company to keep providing high-quality results
So, let's decentralize the database portion! Assuming sound incentives, this allows users to add web results and host nodes themselves. The hope is that it eventually converges to become the world's single source of truth.
What it does
A vector database stores key-value pairs. The key is a vector; the value can be anything. In this example, the key is an embedding (a list of numbers that represents the meaning of some text) and the value is a reference to the website whose data was used to create the embedding. The vector DB is designed to perform similarity search and return the top-n results extremely quickly.
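To make the key-value idea concrete, here is a minimal sketch of a brute-force top-n search in plain Rust. It is an illustration only, not the code from our PoC: the `Entry` struct, the `top_n` function, and the use of `f32` cosine similarity as the ranking score are all assumptions made for this example.

```rust
/// One stored record: the embedding is the key, the URL is the value.
/// (Hypothetical layout for this sketch.)
struct Entry {
    embedding: Vec<f32>,
    url: String,
}

/// Cosine similarity between two vectors; returns 0.0 if either has zero length.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Score every entry against the query and keep the n most similar ones,
/// best match first.
fn top_n<'a>(db: &'a [Entry], query: &[f32], n: usize) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&'a str, f32)> = db
        .iter()
        .map(|e| (e.url.as_str(), cosine_similarity(&e.embedding, query)))
        .collect();
    // Sort by descending similarity; total_cmp gives a total order on f32.
    scored.sort_by(|x, y| y.1.total_cmp(&x.1));
    scored.truncate(n);
    scored
}
```

Calling `top_n(&db, &query_embedding, 3)` would return the three stored URLs whose embeddings point in the direction closest to the query. A linear scan like this is the simplest thing that works; a production vector DB would normally use an index to stay fast at scale.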
How we built it
An abstraction of the DB was built in Rust. We used Succinct SP1 to prove that the search executed correctly. The proof can be verified on-chain, which also means users can interact with a smart contract to use the database while minimizing load on the network.
Challenges we ran into
ZK proofs are tricky to get right with anything related to AI, because of the heavy use of vectors, floating-point numbers, and complex calculations (e.g. vector cosine similarity). Many of the options we tried failed because they relied on bindings to other programming languages or had dependencies on randomness that we could not remove. The easiest route for us was to build a PoC in plain Rust with minimal libraries. Also, we were not that familiar with Rust.
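One common way around the floating-point problem is to do the comparison in integers only. The sketch below is an assumption about how that could look, not what our PoC does: embeddings are quantized to fixed-point `i32` values (ideally outside the proven program), and candidates are ranked by the signed square of their cosine similarity, kept as an exact fraction so that no square root or division is needed. The names `SCALE`, `quantize`, `Score`, `score`, and `is_closer` are hypothetical.

```rust
/// Fixed-point scale: 1.0 is stored as 4096. Hypothetical choice for this sketch;
/// small enough that the i128 products below cannot overflow for typical embeddings.
const SCALE: f32 = 4096.0;

/// Convert a float embedding to integers. Meant to run outside the proven program,
/// so the proven code itself never touches floating point.
fn quantize(v: &[f32]) -> Vec<i32> {
    v.iter().map(|x| (x * SCALE).round() as i32).collect()
}

/// Signed squared cosine similarity to a fixed query, stored as num/den.
/// The query's own norm is dropped because it is the same for every candidate.
#[derive(Debug, Clone, Copy)]
struct Score {
    num: i128,
    den: i128,
}

/// Integer-only score of one candidate against the query.
fn score(candidate: &[i32], query: &[i32]) -> Score {
    let dot: i128 = candidate
        .iter()
        .zip(query)
        .map(|(a, q)| i128::from(*a) * i128::from(*q))
        .sum();
    let norm_sq: i128 = candidate.iter().map(|a| i128::from(*a) * i128::from(*a)).sum();
    Score {
        // sign(dot) * dot^2 preserves the ordering of the cosine itself.
        num: dot.signum() * dot * dot,
        // A zero vector gets score 0/1 instead of 0/0.
        den: norm_sq.max(1),
    }
}

/// True if x is strictly more similar to the query than y.
/// Compares the two fractions by cross-multiplication (both denominators are positive).
fn is_closer(x: Score, y: Score) -> bool {
    x.num * y.den > y.num * x.den
}
```

Because everything after `quantize` is integer arithmetic, the result is deterministic across machines, which is the property a proof system needs. The price is a small quantization error relative to the original floats.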
Accomplishments that we're proud of
Despite having only bare-minimum knowledge of ZK, we got SP1 to generate a proof and also verify it!
What we learned
Designing a database for a specific use case is beyond the scope of a two-day hackathon.
What's next for ZK Vector DB
Designing a database for the specific use case, beyond the scope of a two-day hackathon!