ZK Vector DB (ignore)

This is a proof of concept for a decentralized vector database. Reproducing Web2 AI applications on-chain would be wildly inefficient; instead, we run the computation off-chain and use SP1 to generate proofs of its execution. SP1 made it easy to generate and verify ZK proofs with minimal understanding of the topic!

Motivation

Existing market options and drawbacks:

Perplexity.ai is an amazing AI-powered search engine
- TLDR: an LLM that is focused on summarizing relevant websites and cites its sources
- But it's centralized. Why is this an issue?
  - Lack of privacy; web search data is often a cornerstone of demographic modeling for big tech
  - Reliance on the company to maintain data quality and continually process new websites

Privacy-focused search engines: decentralized ones like Presearch, self-hosted ones like Searx, etc.
- Not popular, often due to poor result quality: low relevance or missing the latest websites

The end goal for this PoC is a decentralized version of Perplexity that solves the problems of both options above.

Proposal

The project is split into two components:

Self-hosted LLM
- users are free to use whatever models they want to do summarization, including self-hosted open-source models like Llama
- because the LLM summarizes the search results, one of the two result-quality concerns (low relevance) is handled by the LLM itself and is fixable through prompt engineering
Decentralized vector database
- a public DB of website results for users
- assuming appropriate incentives:
  - users can freely host nodes
  - users can contribute new websites to the DB, addressing the data-availability concern

This repo contains a webapp to demonstrate the end use case with the LLM, as well as a simple vector database built in Rust.
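
To make the "simple vector database" idea concrete, here is a minimal sketch of a brute-force in-memory store with cosine-similarity search. It is not the code from this repo: `Entry`, `VectorDb`, `cosine_similarity`, and the example URLs are hypothetical names chosen for illustration, and the embeddings are assumed to come from whatever model the user runs. In the full design, a search like this would presumably run inside an SP1 program so that its result can be proven; that part is left out here.

```rust
// Hypothetical sketch: all names below are illustrative, not taken from the repo.

/// One indexed website: its URL and an embedding of its content.
#[derive(Debug, Clone)]
pub struct Entry {
    pub url: String,
    pub embedding: Vec<f32>,
}

/// In-memory vector store that answers queries by scanning every entry.
#[derive(Debug, Default)]
pub struct VectorDb {
    entries: Vec<Entry>,
}

/// Cosine similarity of two vectors.
/// Returns 0.0 if the lengths differ or either vector has zero norm.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}

impl VectorDb {
    /// Creates an empty store.
    pub fn new() -> Self {
        Self::default()
    }

    /// Adds a website and its embedding to the store.
    pub fn insert(&mut self, url: &str, embedding: Vec<f32>) {
        self.entries.push(Entry {
            url: url.to_string(),
            embedding,
        });
    }

    /// Returns up to `k` (url, score) pairs, highest similarity first.
    pub fn search(&self, query: &[f32], k: usize) -> Vec<(String, f32)> {
        let mut scored: Vec<(String, f32)> = self
            .entries
            .iter()
            .map(|e| (e.url.clone(), cosine_similarity(query, &e.embedding)))
            .collect();
        // total_cmp gives a total order on f32, so sorting never panics on NaN.
        scored.sort_by(|a, b| b.1.total_cmp(&a.1));
        scored.truncate(k);
        scored
    }
}
```

A caller would `insert` each contributed website with its embedding, then pass the embedded user query to `search` and hand the top results to the LLM for summarization. A linear scan keeps the logic small and deterministic, which seems like a reasonable fit for a PoC; a larger deployment would likely need an approximate index instead.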

Built With

express.js
javascript
ollama
react.js
rust
sp1

Updates

Sailesh P started this project — Aug 17, 2024 06:36 PM EDT
