A company working in the security token space was complaining about KYC whitelists and how they revealed a lot of information about who was participating in the sale and the ownership percentages post-sale. I thought a bit about how to solve this and came up with using a Bloom filter to anonymize the whitelist. At the same time, I realized that using a Bloom filter would also greatly reduce the cost associated with doing on-chain KYC.
As governments begin to scrutinize ICOs more closely, token issuers have begun to realize the importance of stronger KYC/AML procedures to prevent unknown or unapproved entities from acquiring tokens. The need to prevent such entities from acquiring tokens is especially acute for security tokens, which are explicitly assumed to fall under regulatory oversight. Proposed protocols such as Zeppelin's Transaction Permission Layer (TPL) offer a framework for controlling token transactions and modeling regulations in smart contract code.
The ability to restrict token sales and transactions to KYC'ed addresses is typically implemented by publishing a "whitelist" of approved addresses to the chain. However, this approach suffers from two major drawbacks. First, the cost of publishing addresses to the blockchain is quite high; as an example, whitelisting just 7473 addresses cost Bluzelle 9.345 Eth (over $11k at the time). Second, publishing a whitelist forces all approved participants to be publicly known ahead of time. This is problematic for token holders who wish to spread their holdings over multiple addresses or those who wish to remain anonymous before the token issuance has started. This problem is especially acute for security tokens, where ownership percentages, transfers, or the total number of holders can convey important market information.
To ameliorate both of these problems, we propose nameless-kyc, a technique for efficiently whitelisting large numbers of addresses using Bloom filters, reducing cost by over an order of magnitude. Moreover, the use of Bloom filters effectively anonymizes the whitelist, preventing public disclosure of the approved addresses and the exact size of the whitelist.
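To make the idea concrete, here is a minimal off-chain sketch of a Bloom filter whitelist in Python. It is an illustration under stated assumptions, not the repository's contract code: the class name `BloomWhitelist`, the use of SHA-256 with a one-byte counter prefix to derive bit positions, and the filter parameters are all hypothetical choices made for this example.

```python
import hashlib


class BloomWhitelist:
    """Illustrative Bloom filter over address strings (not the on-chain code)."""

    def __init__(self, num_bits: int, num_hashes: int):
        if not 1 <= num_hashes <= 255:
            raise ValueError("num_hashes must fit in one byte for this sketch")
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int serves as the bit array

    def _positions(self, address: str):
        # Assumption: derive k positions by hashing the address with a counter prefix.
        data = address.lower().encode()
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + data).digest()
            yield int.from_bytes(digest, "big") % self.num_bits

    def add(self, address: str) -> None:
        # Set every bit position derived from the address.
        for pos in self._positions(address):
            self.bits |= 1 << pos

    def might_contain(self, address: str) -> bool:
        # True means "possibly whitelisted"; False means "definitely not".
        return all((self.bits >> pos) & 1 for pos in self._positions(address))


# Hypothetical addresses, used only for illustration.
approved = "0x00000000000000000000000000000000000000a1"
stranger = "0x00000000000000000000000000000000000000b2"

whitelist = BloomWhitelist(num_bits=1024, num_hashes=7)
whitelist.add(approved)
```

Only the bit array needs to be published, so an observer cannot enumerate the approved addresses or read off their exact count. An approved address always passes the check, while an unapproved address passes only with a small false-positive probability that depends on the filter size.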
We include in this code repository a working implementation of a TransactionChecker on the TPL protocol, restricting on-chain transfers of a hypothetical ERC20 token to a set of whitelisted addresses. Our results demonstrate large cost savings; storing 7500 addresses costs only 0.63 Eth, with a small probability of false positives. We believe nameless-kyc can efficiently whitelist potentially hundreds of thousands or even millions of addresses.
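The trade-off between filter size and false-positive probability can be estimated with the standard Bloom filter formulas. The sketch below is a rough sizing aid only: the target rate of 0.1% is an assumed value chosen for illustration, not the figure used in this repository, and gas costs are not modeled.

```python
import math


def bloom_parameters(n: int, p: float) -> tuple[int, int]:
    """Return (bits, hash_count) for n items at target false-positive rate p."""
    # Standard optimum: m = -n ln p / (ln 2)^2 and k = (m / n) ln 2.
    m = math.ceil(-n * math.log(p) / (math.log(2) ** 2))
    k = max(1, round(m / n * math.log(2)))
    return m, k


def false_positive_rate(n: int, m: int, k: int) -> float:
    """Approximate false-positive rate after inserting n items."""
    return (1 - math.exp(-k * n / m)) ** k


# Assumed inputs: 7500 addresses and a 0.1% target rate (illustrative only).
bits, hashes = bloom_parameters(7500, 0.001)
rate = false_positive_rate(7500, bits, hashes)
print(f"{bits} bits (~{bits / 8 / 1024:.1f} KiB), {hashes} hashes, rate ~{rate:.4%}")
```

Under these assumptions the filter needs roughly 14 bits per address, far less than the 160 bits of a full Ethereum address, which is where the storage savings come from.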