Knowledge Vault

Human expertise infrastructure for the agentic web — encrypted, Lightning-gated, owned by you.


Inspiration

Every large language model was trained on human expertise. The people who wrote the papers, built the frameworks, and made the judgment calls got paid once — if at all. The model earns billions. They moved on.

At the same time, AI agents are becoming capable of searching, reasoning, and transacting autonomously. But they still hit a wall: public data is abundant, calibrated private judgment is not. An agent can find anything that's been published. It can't find what an expert knows but never wrote down.

We kept coming back to one question: what if the scarce resource isn't compute — it's context?

The second piece clicked when we looked at L402. The protocol lets a server return a Lightning invoice as the authentication credential. No accounts, no OAuth, no platform. An agent can pay for something and prove it paid, cryptographically, in a single round trip. That means machines can pay for private expertise the same way they call an API — autonomously, instantly, without a human in the loop.

That combination — encrypted private knowledge plus machine-native micropayments — is what Knowledge Vault is built on.


What It Does

Knowledge Vault turns an expert's private corpus into a sovereign, Lightning-gated service that AI agents can query and pay for directly.

The expert runs a local server. Their documents — research notes, frameworks, unpublished analysis — are encrypted with Fernet symmetric encryption and stored on their own machine. Nothing leaves until a query is paid for.

When an agent wants to query a vault node, it hits an L402-gated endpoint. The server returns a Lightning invoice. The invoice is the access credential — paying it is how the agent proves intent and funds the query. The agent pays autonomously. No human approval needed.
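
A minimal sketch of that flow from the agent's side. The endpoint path and challenge format are illustrative, and pay_invoice is a hypothetical stand-in for whatever Lightning wallet the agent controls:

    import requests

    VAULT_URL = "https://example.ngrok.app/query/ai_safety"   # illustrative endpoint
    QUESTION = {"question": "How would you red-team a RAG pipeline?"}

    def pay_invoice(bolt11: str) -> str:
        """Hypothetical wallet call: pays the invoice and returns the preimage as proof."""
        raise NotImplementedError("wallet-specific")

    # 1. Unauthenticated request: the server answers 402 with a Lightning invoice.
    resp = requests.post(VAULT_URL, json=QUESTION)
    assert resp.status_code == 402
    challenge = resp.headers["WWW-Authenticate"]               # e.g. 'L402 invoice="lnbc...", ...'
    invoice = challenge.split('invoice="')[1].split('"')[0]

    # 2. Pay it autonomously, then retry with proof of payment.
    preimage = pay_invoice(invoice)
    answer = requests.post(VAULT_URL, json=QUESTION,
                           headers={"Authorization": f"L402 {preimage}"})
    print(answer.json())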

Once payment is confirmed, and only then, the Vault Guardian decrypts the relevant documents to RAM, builds an ephemeral index, and runs the query through six quality gates before answering. When the query is complete, RAM is cleared. The documents never persist in plaintext.
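
Conceptually, the post-payment path looks something like the sketch below. The names are illustrative, and run_gates is a stand-in for the six-gate pipeline, not the actual Vault Guardian code:

    from pathlib import Path
    from cryptography.fernet import Fernet

    def answer_after_payment(node_dir: str, key_path: str, question: str) -> dict:
        """Decrypt the node's documents into memory only, answer, then drop the plaintext."""
        fernet = Fernet(Path(key_path).read_bytes())
        plaintext_docs = [
            fernet.decrypt(enc.read_bytes()).decode("utf-8")   # plaintext exists only in this list
            for enc in sorted(Path(node_dir).glob("*.enc"))
        ]
        try:
            return run_gates(question, plaintext_docs)         # hypothetical gate-pipeline entry point
        finally:
            plaintext_docs.clear()                             # nothing is ever written back to disk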

The response includes a confidence score. High confidence returns a grounded, cited answer. Low confidence triggers escalation — the agent is directed to a human session, and the micropayment becomes the funnel entry into a consulting relationship.
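
The field names and values below are illustrative, but the two response shapes look roughly like this:

    # High confidence: a grounded answer with citations into the expert's corpus.
    {"answer": "...", "confidence": 0.87, "sources": ["frameworks/eval_rubric.md"], "escalate": False}

    # Low confidence: the answer is withheld and the agent is pointed at the human instead.
    {"answer": None, "confidence": 0.31, "escalate": True,
     "escalation": {"contact": "https://example.com/book", "note": "Out of corpus depth."}}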

The expert keeps 100% of the payment. No platform cut. No lock-in. The payment hashes accumulate as a portable, cryptographically verifiable reputation — proof that agents valued your expertise enough to pay for it.


How We Built It

The stack was chosen to prove the core loop with minimum infrastructure:

Backend: FastAPI on Python 3.10+, exposed via ngrok — no cloud server needed for the demo.

Encryption: Fernet symmetric crypto. Documents are stored as .enc files. The vault key lives in config/vault.key and never touches the index or the network.
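
Setup-time encryption is the standard Fernet pattern; a sketch, with the plaintext source directory chosen for illustration:

    from pathlib import Path
    from cryptography.fernet import Fernet

    KEY_PATH = Path("config/vault.key")
    KEY_PATH.parent.mkdir(exist_ok=True)
    if not KEY_PATH.exists():
        KEY_PATH.write_bytes(Fernet.generate_key())            # generated once, never logged or served

    fernet = Fernet(KEY_PATH.read_bytes())
    out_dir = Path("vault/ai_safety")
    out_dir.mkdir(parents=True, exist_ok=True)
    for doc in Path("corpus/ai_safety").glob("*.md"):          # plaintext source dir is illustrative
        (out_dir / f"{doc.stem}.enc").write_bytes(fernet.encrypt(doc.read_bytes()))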

Vector store: ChromaDB with a persistent local index. The index is built from document chunks at setup time; the documents themselves stay encrypted until query time.
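
A sketch of the setup-time indexing, assuming sentence-transformers embeddings (the model name is a placeholder); only embeddings and metadata pointers go into ChromaDB, never the source text:

    import chromadb
    from sentence_transformers import SentenceTransformer

    embedder = SentenceTransformer("all-MiniLM-L6-v2")          # embedding model is an assumption
    client = chromadb.PersistentClient(path="vault/index")      # persistent local index
    collection = client.get_or_create_collection("ai_safety")

    # `chunks` is produced at setup time, before the documents are encrypted.
    chunks = [("doc1-0", "first chunk of text..."), ("doc1-1", "second chunk of text...")]
    collection.add(
        ids=[cid for cid, _ in chunks],
        embeddings=embedder.encode([text for _, text in chunks]).tolist(),
        metadatas=[{"doc": cid.rsplit("-", 1)[0]} for cid, _ in chunks],  # pointer to the encrypted file
    )

    # At query time, retrieval returns ids/metadata; the matching .enc files are decrypted to RAM.
    hits = collection.query(query_embeddings=embedder.encode(["example query"]).tolist(), n_results=4)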

Payment layer: LNbits for invoice creation and payment confirmation via webhook. LNbits was chosen for its simple REST API and zero KYC friction in a demo environment. The L402 middleware intercepts unauthenticated requests and returns a 402 response with the invoice carried in the WWW-Authenticate challenge header.
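
A sketch of the server-side gate, assuming LNbits' /api/v1/payments endpoint and an X-Api-Key invoice key. The route name and challenge format are illustrative, and the real build confirms payment via webhook rather than the inline check shown here:

    import requests
    from fastapi import FastAPI, Request
    from fastapi.responses import JSONResponse

    app = FastAPI()
    LNBITS_URL = "https://legend.lnbits.com"        # illustrative instance
    INVOICE_KEY = "replace-with-invoice-key"

    def create_invoice(sats: int, memo: str) -> dict:
        r = requests.post(f"{LNBITS_URL}/api/v1/payments",
                          headers={"X-Api-Key": INVOICE_KEY},
                          json={"out": False, "amount": sats, "memo": memo}, timeout=10)
        return r.json()                             # includes payment_hash and payment_request

    def is_paid(payment_hash: str) -> bool:
        r = requests.get(f"{LNBITS_URL}/api/v1/payments/{payment_hash}",
                         headers={"X-Api-Key": INVOICE_KEY}, timeout=10)
        return bool(r.json().get("paid"))

    @app.post("/query/{node}")
    async def query_node(node: str, request: Request):
        proof = request.headers.get("Authorization", "")
        if not proof.startswith("L402 "):
            inv = create_invoice(500, f"Knowledge Vault query: {node}")
            return JSONResponse(
                status_code=402,
                content={"detail": "Payment required"},
                headers={"WWW-Authenticate":
                         f'L402 invoice="{inv["payment_request"]}", payment_hash="{inv["payment_hash"]}"'},
            )
        # verify the proof against the settled invoice, then decrypt and answer (not shown)
        ...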

AI layer: Claude Sonnet for answer generation, Claude Haiku for screening queries through the intent classifier (Gate 1). Anthropic's API handles both.
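
Gate 1 is a single cheap classification call; a sketch using the Anthropic Python SDK, with the model id and labels as placeholders:

    import anthropic

    client = anthropic.Anthropic()                  # reads ANTHROPIC_API_KEY from the environment

    def classify_intent(question: str) -> str:
        """Return one of: 'legitimate', 'exfiltration', 'injection' (labels are illustrative)."""
        msg = client.messages.create(
            model="claude-3-haiku-20240307",        # fast screening model
            max_tokens=5,
            system="Classify the query as exactly one word: legitimate, exfiltration, or injection.",
            messages=[{"role": "user", "content": question}],
        )
        return msg.content[0].text.strip().lower()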

The six quality gates (a minimal pipeline sketch follows the list):

  1. Intent classification — blocks manipulation, injection, and exfiltration attempts
  2. Keyword screening — catches private-topic abuse and prompt injection patterns
  3. Relevance scoring — rejects queries with weak retrieval matches from the corpus
  4. Grounding verification — withholds answers not supported by the retrieved documents
  5. PII screening — removes sensitive data from outputs
  6. Confidence calibration — decides whether to answer, withhold, or escalate
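
A minimal chaining sketch of the pipeline. Every helper here (retrieve, generate_answer, is_grounded, scrub_pii, calibrate), the blocklist, and every threshold is a hypothetical stand-in; classify_intent is the Gate 1 call sketched above:

    BLOCKLIST = {"system prompt", "vault.key", "ignore previous"}    # illustrative patterns

    def run_gates(question: str, docs: list[str]) -> dict:
        if classify_intent(question) != "legitimate":                # Gate 1
            return {"answer": None, "reason": "blocked: intent"}
        if any(term in question.lower() for term in BLOCKLIST):      # Gate 2
            return {"answer": None, "reason": "blocked: keyword"}
        hits, relevance = retrieve(question, docs)                   # Gate 3
        if relevance < 0.55:                                         # threshold is illustrative
            return {"answer": None, "reason": "withheld: weak retrieval"}
        draft = generate_answer(question, hits)                      # Claude Sonnet call
        if not is_grounded(draft, hits):                             # Gate 4
            return {"answer": None, "reason": "withheld: ungrounded"}
        draft = scrub_pii(draft)                                     # Gate 5
        confidence = calibrate(draft, hits)                          # Gate 6
        if confidence < 0.6:
            return {"answer": None, "confidence": confidence, "escalate": True}
        return {"answer": draft, "confidence": confidence, "escalate": False}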

Payment replay prevention: each payment hash is consumed on use. Replaying a settled invoice returns a rejection.
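
The replay check can be as small as a consumed-hash set (kept in memory here for brevity; is_paid is the LNbits check from the sketch above):

    consumed_hashes: set[str] = set()

    def redeem(payment_hash: str) -> bool:
        """Allow each settled invoice to unlock exactly one query."""
        if payment_hash in consumed_hashes or not is_paid(payment_hash):
            return False
        consumed_hashes.add(payment_hash)
        return True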

Node structure: the vault supports multiple knowledge nodes under one server — ai_safety, data_science, bitcoin_agentic, local_knowledge — each with its own encrypted corpus, price in sats, and escalation conditions.
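
Node definitions are just per-topic configuration; the shape below is illustrative, with placeholder prices, thresholds, and contact URL:

    NODES = {
        "ai_safety": {
            "corpus_dir": "vault/ai_safety",            # encrypted .enc files
            "price_sats": 500,
            "escalate_below_confidence": 0.6,
            "escalation_contact": "https://example.com/book",
        },
        "bitcoin_agentic": {
            "corpus_dir": "vault/bitcoin_agentic",
            "price_sats": 800,
            "escalate_below_confidence": 0.5,
            "escalation_contact": "https://example.com/book",
        },
    }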


Challenges We Ran Into

Making payment-as-authentication feel real, not bolted on. L402 is elegant on paper but the client flow — receive invoice, pay, prove payment, get answer — is unfamiliar. We spent significant time making the sequence feel inevitable rather than awkward, and writing the middleware so the payment hash genuinely gates decryption rather than just gating a flag.

Tuning the six gates without false positives. A vault that blocks everything is useless. A vault that answers everything defeats the purpose. Calibrating the gates so that legitimate in-corpus queries pass and exfiltration or out-of-scope queries fail — without over-blocking — required iteration on the intent classifier prompts and the relevance thresholds.

Keeping the corpus genuinely private. The temptation in a demo is to cut corners on encryption. We deliberately did not. Documents are never decrypted to disk, the index contains only embeddings not source text, and the key is never logged or exposed through any endpoint. Making sure this held under all code paths was careful work.

Escalation as a feature, not a failure. Early versions treated low-confidence answers as errors. Reframing escalation as the product — the payment becomes a qualified lead, not a failed query — required rethinking the response schema and the UX narrative.


Accomplishments That We're Proud Of

The core loop works end-to-end: an agent hits the endpoint, receives an invoice, pays in sats over Lightning, and gets a grounded answer from an encrypted corpus — or an honest refusal — with no account, no OAuth, and no platform in the middle.

The six-gate pipeline genuinely withholds. In live testing, exfiltration queries, injection attempts, and out-of-scope questions all fail cleanly. The vault knows when not to talk.

Payment hashes accumulate as reputation. Every settled invoice is a permanent, verifiable record that someone paid for this expertise. That's a reputation signal no platform can manufacture, and no platform can take away.

The architecture is intentionally minimal. A laptop, ngrok, and an LNbits instance are enough to run a vault node. The barrier to entry for an expert who wants to sell private judgment to agents is lower than anything that exists today.


What We Learned

Context is the scarce resource, not compute. Skills are commoditized. What an expert knows but never published — the pattern they've seen fail five times, the framework they developed for a specific class of problem — is genuinely non-reproducible from public data. That's what agents will pay for.

L402 is a better fit for agentic payments than most people realize. The framing of "payment as authentication" solves a real problem: how does an agent prove it's serious without an account or a human in the loop? The payment hash is the credential. That's elegant, and it's underused.

Honest limits are a product feature. A system that refuses to hallucinate is more valuable than one that always produces an answer. The escalation path — where the vault says "I can't answer this from my corpus, talk to the human" — is where the real economic value is. The 500-sat query is the funnel entry. The $300 consultation is the business.

Minimal architecture is a design choice, not a constraint. Deliberately avoiding Docker, a dedicated Lightning node, and cloud infrastructure for the hackathon build made the demo more credible, not less. The thesis proves faster when you can't hide behind infra complexity.


What's Next for Knowledge Vault

Multi-expert routing. Right now each vault is a single expert's node. The next step is a discovery layer where an agent can find the right vault for a given query across multiple experts — routing based on node descriptions, pricing, and reputation signals built from payment history.

Richer reputation primitives. Payment hashes are a start. Next: query volume, repeat rate, escalation rate, and endorsements from other vault operators. A portable reputation graph that travels with the expert, not with the platform.

Cleaner onboarding. The current setup requires comfort with Python and a local server. A simple dashboard for encrypting documents, setting prices, and reading query logs would lower the bar significantly for non-technical experts.

Stronger identity layer. Connecting vault nodes to a verifiable identity — a Nostr key, a DID, or a Lightning address — so agents can trust they're paying the right person.

Agent-native discovery. An MCP server or well-known endpoint schema so that AI agents can find and negotiate with vault nodes without human configuration. The long-term vision is a network where any agent, anywhere, can find and pay for the right human judgment — permissionlessly, on open rails.

Built With

  • chromadb
  • fastapi
  • fernet
  • l402
  • lightning
  • lnbits
  • ngrok
  • python
  • sentence-transformers
  • claude-sonnet