Problem: as compute gets cheaper, running hundreds of agents is now bottlenecked on cloning git repositories and creating git worktrees.

sui makes it possible to clone git repositories in seconds and create new worktrees in milliseconds.

How does it work?

sui is a prefetching virtual file system. It achieves this performance by lazy-loading repository content and by sharing a single store across worktrees.

By default, git downloads the entire repository history before you can start working, whereas sui downloads what your agent needs first, so the agent can start working before everything is fully downloaded.

This also means that sui makes it possible to work with repositories that are much larger than what was previously practical with git. Because files are only downloaded as they are needed, a repository can contain files that are multiple gigabytes in size without you paying the storage or bandwidth cost for them up front.
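
To make the lazy-loading and shared-store idea concrete, here is a minimal sketch in Python. It is not sui's implementation: the class names `SharedStore` and `Worktree`, the in-memory fake remote, and the use of SHA-256 as an object id are all assumptions made for illustration. The point it shows is that a worktree can be just a small path-to-object manifest, while file contents are fetched once, on first read, and then reused by every worktree.

```python
import hashlib


class SharedStore:
    """Content-addressed blob cache shared by every worktree (hypothetical)."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable: object id -> bytes (the network, in practice)
        self._blobs = {}         # object id -> bytes already downloaded
        self.fetch_count = 0     # how many times the remote was actually hit

    def read(self, oid):
        # Lazy load: only contact the remote the first time an object is needed.
        if oid not in self._blobs:
            self._blobs[oid] = self._fetch(oid)
            self.fetch_count += 1
        return self._blobs[oid]


class Worktree:
    """A worktree is only a path -> object id manifest; contents live in the store."""

    def __init__(self, store, manifest):
        self._store = store
        self._manifest = dict(manifest)

    def read(self, path):
        return self._store.read(self._manifest[path])


# Fake remote for the demo: two files keyed by a hash of their contents.
FILES = {"README": b"hello\n", "main.c": b"int main(void) { return 0; }\n"}
REMOTE = {hashlib.sha256(data).hexdigest(): data for data in FILES.values()}
MANIFEST = {path: hashlib.sha256(data).hexdigest() for path, data in FILES.items()}

store = SharedStore(REMOTE.__getitem__)
a = Worktree(store, MANIFEST)   # creating a worktree downloads nothing
b = Worktree(store, MANIFEST)

a.read("README")
b.read("README")                # served from the shared store, no second download
print(store.fetch_count)
```

In this sketch the cost of a new worktree is the size of the manifest, not the size of the files, which is the same shape of saving as the worktree benchmark below.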

Benchmarks

Cloning the Linux kernel

git: 82 seconds for a shallow clone (no history) and 3600+ seconds for a full clone (the default option)

sui: 1 second

Creating a new worktree

git: 2GB of storage, 20 seconds

sui: 1 kB of storage, 150 milliseconds

Challenges and next steps

One of the most difficult pieces of machinery in sui is its prefetching algorithm, which predicts which files agents are going to access. It still needs some tuning, and download speed is nowhere near the theoretical maximum because sui currently downloads files one by one. In the future it should be able to download multiple files together to avoid redundant round trips between client and server.
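
As a rough illustration of why batching helps, the sketch below counts round trips for one-by-one versus batched downloads. It is an assumption-laden toy, not sui's prefetcher: `plan_batches`, `round_trips`, the batch size, and the list of predicted paths are all hypothetical, and it ignores transfer time entirely.

```python
import math


def round_trips(num_files, batch_size):
    """Number of request/response exchanges needed to fetch num_files."""
    return math.ceil(num_files / batch_size)


def plan_batches(paths, batch_size):
    """Split predicted paths into request batches, preserving priority order."""
    return [paths[i:i + batch_size] for i in range(0, len(paths), batch_size)]


# Hypothetical output of a prefetch predictor, most likely access first.
predicted = ["Makefile", "README", "kernel/fork.c", "kernel/exit.c", "kernel/sched/core.c"]

batches = plan_batches(predicted, batch_size=2)
one_by_one = round_trips(len(predicted), batch_size=1)
print(len(batches), one_by_one)
```

With five predicted files, fetching one at a time costs five round trips, while batches of two cost three. On a high-latency link, round trips rather than bytes tend to dominate for small files, which is why batching is the natural next step.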

sui also relies on a recent addition to the git protocol which lets a client query the size of a git object. Unfortunately, popular git forges like GitHub and GitLab do not yet have this command enabled, which means that sui has to download the whole file to find out how large it is.
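
The fallback described above can be sketched as follows. The `Remote` class is a hypothetical stand-in for a git server, not a real client API, and `query_size` only models the idea of a size query; the sketch does not speak the git protocol.

```python
class Remote:
    """Hypothetical stand-in for a git server; not a real client API."""

    def __init__(self, objects, supports_size_query):
        self._objects = objects
        self.supports_size_query = supports_size_query
        self.bytes_downloaded = 0   # tracks bandwidth spent on full downloads

    def query_size(self, oid):
        if not self.supports_size_query:
            raise NotImplementedError("server does not expose a size query")
        return len(self._objects[oid])

    def download(self, oid):
        data = self._objects[oid]
        self.bytes_downloaded += len(data)
        return data


def object_size(remote, oid):
    """Prefer the cheap metadata query; fall back to downloading the object."""
    try:
        return remote.query_size(oid)
    except NotImplementedError:
        return len(remote.download(oid))


objects = {"abc123": b"x" * 4096}   # made-up object id and contents
with_query = Remote(objects, supports_size_query=True)
without_query = Remote(objects, supports_size_query=False)

print(object_size(with_query, "abc123"), with_query.bytes_downloaded)
print(object_size(without_query, "abc123"), without_query.bytes_downloaded)
```

Both calls report the same size, but only the server without the size query forces the client to pay for the full object just to learn how large it is.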

sui gets its clone performance by downloading as few files as possible. However, the files it does download currently arrive very slowly. I haven't spent time optimizing that path, and speeding it up should help a lot. Listing files was actually much faster a couple of hours before project submission, fast enough that you could list the files as soon as the clone finished, but a regression means it now takes a couple of seconds.

One of the cool things about sui is that I was able to use it to develop sui and run many agents simultaneously.
