I was inspired by Louis Guthmann at EthCC who said that even the biggest Zero-knowledge companies do not use GPU's to accelerate their Proofs. I thought that we can easily change that by executing the slowest (and highly parallelizable) on a graphics card.

What it does

GPU-Snark executes the Fast Fourier Transformation on Finite Elements for Snarks. It is several times faster than the existing CPU implementation Libsnark (used by ZCash). It is fully templated, so it can calculate different datatypes. The Interface matches the interface of Libsnark and Bellman which makes it easy to use GPU-Snark instead of Libfft for the fourier transformation. It can also be used for DIZK The implementation is in CUDA so it can only run on NVIDIA Graphics Cards.

How I built it & Challenges I ran into

I've started Friday evening and build the first version without compiling it since I do not have a Nvidia GPU in my Laptop. I tried to set up a AWS Student account to test it in the cloud but Amazon was a bit slow. In the end I got access to a university server to test and benchmark GPU-Snark.

What's next for GPU_Snarks

Next Steps for GPU_Snarks are cleanup of the code, with more time I can probably make it even more efficient, Proper integration into [Libsnark] and Bellman. Porting the code to OpenCL. Proper Open Source Licensing. Looking for other parts in Snarks that can be executed in a highly parallel fashion.


My benchmarks were run on a CPU Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz runs with 8 threads

GPU: GTX 1080

Constraints Constraints GPU CPU
2^16 65536 0.32 s 0.21 s
2^20 1048576 0.33 s 4.40 s
2^22 4194304 0.43 s 18.03 s
2^25 33554432 10.57 s tbd*
2^28 268435456 227.25 s tbd*

2^28 the CPU implementation was not finished after over 4 hours on 8 cores. Which suggests a speedup of over **60x* for the GPU implementation.

2^28 took around 1.8GB of Memory on the GPU

Built With

Share this project: