LLeMur

About the project

We built LLeMur around a simple idea: AI inference should be able to work like a peer-to-peer marketplace rather than depending entirely on centralized APIs. Instead of sending every request to one hosted service, providers share local model capacity, consumers discover them dynamically, and payments happen directly as part of the network flow. The motivation came from the belief that AI infrastructure is becoming more powerful but also more centralized, and we wanted to build a system that feels more open, local, and cooperative.

The project lets one machine act as a provider serving QVAC-backed models while another acts as a consumer through a local daemon with Ollama-compatible and OpenAI-compatible chat endpoints. Providers announce which models they can serve, consumers discover them over the network, and requests are delegated to an available peer. Just as importantly, those roles are not fixed: the same person can be both a provider and a consumer, contributing compute in one moment and using someone else's compute in another. On top of inference, the system includes a ledger flow for payments and a ratings flow for reputation, so the network behaves like a real marketplace rather than just a routing layer.
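To make the discovery step concrete, here is a minimal sketch of how a consumer might pick a peer from the announcements it has seen. The `ProviderAnnouncement` shape, its field names, and the cheapest-first rule are all assumptions for illustration; the actual announcement format and selection policy are not described above.

```typescript
// Hypothetical shape of a provider announcement; field names are illustrative.
interface ProviderAnnouncement {
  peerId: string;
  models: string[];
  pricePerRequest: number; // assumed unit: ledger credits
  available: boolean;
}

// Pick the cheapest available provider that advertises the requested model.
// Cheapest-first is an assumed policy, not necessarily what LLeMur does.
function selectProvider(
  announcements: ProviderAnnouncement[],
  model: string
): ProviderAnnouncement | undefined {
  return announcements
    .filter((a) => a.available && a.models.some((m) => m === model))
    .sort((a, b) => a.pricePerRequest - b.pricePerRequest)[0];
}

const announcements: ProviderAnnouncement[] = [
  { peerId: "peer-a", models: ["model-small"], pricePerRequest: 3, available: true },
  { peerId: "peer-b", models: ["model-small", "model-large"], pricePerRequest: 1, available: false },
  { peerId: "peer-c", models: ["model-small"], pricePerRequest: 2, available: true },
];

const chosen = selectProvider(announcements, "model-small");
const missing = selectProvider(announcements, "model-unknown");
```

Because the roles are not fixed, the same machine could both publish an announcement like these and run `selectProvider` against the announcements of its peers.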

Under the hood, we used QVAC for delegated inference, Hyperswarm for peer discovery, Pear/Bare for the runtime and CLI, and Hypercore/Autobase-style local storage for ledger and ratings data. A chat request comes into the local daemon, the daemon finds a matching provider, creates a signed transfer proposal, waits for a signed acceptance, and then routes the inference request to that peer. This gives the system a complete flow covering discovery, pricing, execution, settlement, and feedback.
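The request flow described above can be sketched as a short sequence of steps. Everything named here (`Network`, `delegateChat`, the method names, and the placeholder signature strings) is hypothetical and only mirrors the order of operations in the text; it does not reproduce the QVAC, Hyperswarm, or Hypercore APIs.

```typescript
// Hypothetical types and method names; they only mirror the steps in the text.
interface Provider { peerId: string; price: number }
interface Proposal { payer: string; payee: string; amount: number; signature: string }
interface Acceptance { proposalSignature: string; signature: string }

interface Network {
  findProvider(model: string): Promise<Provider | undefined>;
  proposeTransfer(payer: string, provider: Provider): Promise<Proposal>;
  awaitAcceptance(proposal: Proposal): Promise<Acceptance | undefined>;
  runInference(provider: Provider, prompt: string): Promise<string>;
}

// Discovery -> signed proposal -> signed acceptance -> delegated inference.
// Inference is only attempted once the provider has accepted the transfer.
async function delegateChat(
  net: Network,
  payer: string,
  model: string,
  prompt: string
): Promise<string> {
  const provider = await net.findProvider(model);
  if (!provider) throw new Error("no provider available for " + model);

  const proposal = await net.proposeTransfer(payer, provider);
  const acceptance = await net.awaitAcceptance(proposal);
  if (!acceptance) throw new Error("provider did not accept the transfer");

  return net.runInference(provider, prompt);
}

// In-memory stand-in for the real network that records the order of steps.
// The signature values are placeholder strings, not real signatures.
const steps: string[] = [];
const fakeNetwork: Network = {
  async findProvider(_model) {
    steps.push("discover");
    return { peerId: "peer-a", price: 2 };
  },
  async proposeTransfer(payer, provider) {
    steps.push("propose");
    return { payer, payee: provider.peerId, amount: provider.price, signature: "sig-payer" };
  },
  async awaitAcceptance(proposal) {
    steps.push("accept");
    return { proposalSignature: proposal.signature, signature: "sig-provider" };
  },
  async runInference(_provider, prompt) {
    steps.push("infer");
    return "echo: " + prompt;
  },
};

const reply: Promise<string> = delegateChat(fakeNetwork, "consumer-1", "model-small", "hello");
```

Passing the network in as an interface keeps the ordering logic separate from transport details, which makes the flow easy to exercise with a stub like `fakeNetwork`.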

One of the most important parts of the project is the ledger design. Payments are not just local counters or loose acknowledgements: ledger events are signed, validated, and stored as structured records. Because transfer proposals and acceptances carry signatures, each one can be verified, which makes the ledger much harder to tamper with than an unsigned credit system. That signed flow is what lets the marketplace handle value exchange in a way that is consistent with a decentralized network.
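As an illustration of what a signed, validated ledger event could look like, the sketch below signs a transfer proposal and verifies it before it would be stored. It assumes Ed25519 signatures via Node's built-in `node:crypto` module; the signature scheme, the `TransferProposal` fields, and the helper names are assumptions, since the actual record format is not specified above.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

// Hypothetical record shape; the real ledger schema is not given in the text.
interface TransferProposal {
  type: "transfer-proposal";
  from: string;
  to: string;
  amount: number;
  nonce: string;
}

interface SignedEvent {
  payload: TransferProposal;
  signature: string; // base64-encoded Ed25519 signature (assumed scheme)
}

// Serialize fields in a fixed order so signer and verifier see the same bytes.
function canonicalize(p: TransferProposal): Buffer {
  return Buffer.from(JSON.stringify([p.type, p.from, p.to, p.amount, p.nonce]));
}

function signEvent(payload: TransferProposal, privateKey: KeyObject): SignedEvent {
  // For Ed25519 keys, the algorithm argument to crypto.sign must be null.
  const signature = sign(null, canonicalize(payload), privateKey).toString("base64");
  return { payload, signature };
}

// Structural validation plus signature check, run before storing an event.
function verifyEvent(event: SignedEvent, publicKey: KeyObject): boolean {
  const p = event.payload;
  if (!Number.isInteger(p.amount) || p.amount <= 0) return false;
  return verify(null, canonicalize(p), publicKey, Buffer.from(event.signature, "base64"));
}

const { publicKey, privateKey } = generateKeyPairSync("ed25519");

const signed = signEvent(
  { type: "transfer-proposal", from: "consumer-1", to: "peer-a", amount: 2, nonce: "n-001" },
  privateKey
);

// A copy with an altered amount keeps the old signature and should fail.
const tampered: SignedEvent = { ...signed, payload: { ...signed.payload, amount: 999 } };

const validOk = verifyEvent(signed, publicKey);
const tamperedOk = verifyEvent(tampered, publicKey);
```

An acceptance could be handled the same way, with the provider signing a record that references the proposal it is accepting.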

One of the biggest challenges was making all of these moving parts work together cleanly. Peer-to-peer discovery is less predictable than calling a standard API, and coordinating runtime setup, provider selection, signed payment events, and model execution introduced a lot of edge cases. We also had to think carefully about trust: if a system is going to behave like a marketplace, then settlement and reputation need to be treated as first-class parts of the architecture, not as afterthoughts.

We learned a lot about distributed systems, developer experience, and trust. The biggest takeaway was that inference is only one part of the problem. A real compute marketplace also needs discovery, pricing, settlement, and reputation to work together in a way that still feels usable for developers. That is what we are most proud of: LLeMur is a peer-to-peer AI marketplace where model serving, signed payments, and reputation are all built into the network itself.