Understanding privacy and regulatory issues on the blockchain in one article

In the recent article "[Researcher's Perspective] Blockchain: From Entry to Mastery", Yan Ying and Chen Yang, researchers on the blockchain project at Microsoft Research Asia, used the second Ethereum developer conference, held in Shanghai in September, as an entry point to share the latest trends in blockchain technology. The article was well received by researchers in the industry, so the authors are back with a second installment. This article discusses a recent hot topic in blockchain: how a public ledger can protect privacy. Come and take a look!

       As a public ledger, a blockchain solves the problem of establishing trust between parties, but it also raises a new problem: how can privacy be protected? When all of a user's transaction information is exposed to the public, that information can be maliciously mined and exploited, which poses a serious threat to user privacy. This article introduces and analyzes the privacy issues in blockchain technology and the cutting-edge solutions to them.

Blockchain privacy concerns

       Many readers may first ask: aren't blockchain systems such as Bitcoin "anonymous"? Why would there be a privacy problem? To answer this question, we must first distinguish between two concepts: "pseudonymity" and "anonymity". A pseudonym is easy to understand: it is an identity that we use on the Internet and that is unrelated to our real identity. For example, in a Bitcoin transaction the user does not need to use a real name, but instead uses the hash of a public key as the transaction identifier. Here the public key hash represents the user's identity and has nothing to do with the user's real name, so Bitcoin is pseudonymous.

       But anonymity is not the same as pseudonymity. In computer science, anonymity means pseudonymity together with unlinkability [1]. Unlinkability means that, from the attacker's point of view, no two interactions between a user and the system can be associated with each other. In Bitcoin, users repeatedly use public key hashes as transaction identifiers, so their transactions can obviously be linked. Therefore, Bitcoin is not anonymous.
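The linkability described above can be made concrete with a minimal sketch. The toy ledger, the `pkh_...` identifiers, and the `link_by_identifier` helper are all hypothetical names invented for illustration; the only point is that grouping transactions by a reused public key hash requires no secret knowledge at all.

```python
from collections import defaultdict

# Hypothetical toy ledger: (transaction id, public key hash used as identifier).
ledger = [
    ("tx1", "pkh_a1"),
    ("tx2", "pkh_b7"),
    ("tx3", "pkh_a1"),
    ("tx4", "pkh_a1"),
]

def link_by_identifier(entries):
    """Group transaction ids by the identifier that appears in them."""
    groups = defaultdict(list)
    for tx_id, identifier in entries:
        groups[identifier].append(tx_id)
    return dict(groups)

linked = link_by_identifier(ledger)
print(linked["pkh_a1"])  # every transaction that reused the same pseudonym
```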

       If transacting from a single address does not ensure anonymity, what about using multiple addresses? The answer is still no. As shown in Figure 1, if user X uses multiple accounts to transfer money to user Y within a certain period of time, the attacker can guess with high probability that these addresses belong to the same user, and the addresses are grouped into one address cluster.

Figure 1: Transactions from multiple accounts to a single account can be linked
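One way an analyst might build such address clusters is sketched below, under the assumption that all addresses appearing together as inputs of one transaction belong to the same user. The `cluster_addresses` function and the sample addresses are hypothetical; a union-find structure simply merges addresses that ever co-occur.

```python
def cluster_addresses(transactions):
    """Merge the input addresses of each transaction into one cluster.

    `transactions` is a list of input-address lists (a hypothetical format).
    """
    parent = {}

    def find(addr):
        parent.setdefault(addr, addr)
        while parent[addr] != addr:
            parent[addr] = parent[parent[addr]]  # path halving
            addr = parent[addr]
        return addr

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for inputs in transactions:
        if not inputs:
            continue
        find(inputs[0])  # register single-input transactions too
        for other in inputs[1:]:
            union(inputs[0], other)

    clusters = {}
    for addr in parent:
        clusters.setdefault(find(addr), set()).add(addr)
    return sorted(sorted(members) for members in clusters.values())

# A and B are spent together, then B and C, so A, B, C end up in one cluster.
print(cluster_addresses([["A", "B"], ["B", "C"], ["D"]]))
```

Note that the clustering is transitive: A and C never appear in the same transaction, yet they are linked through B.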

       In addition, change addresses also expose links between a user's addresses. As shown in Figure 2, user X pays 40 to user Y out of a total input of 50, so 10 is returned as change. The attacker can guess with high probability that account D is a change address, and thereby link accounts D and E. Early versions of Bitcoin-Qt had a privacy problem because the change address always appeared first among the output addresses (this was fixed in 2012).

Figure 2: A change account can be linked to the user's other accounts
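A minimal sketch of one heuristic of this kind follows. It assumes that the change output goes to an address that has never appeared on the ledger before, while the payee's address has already been seen; this rule, the `guess_change_output` helper, and the exact account layout (E as the input account, D as the change account) are illustrative assumptions, since the figure itself is not reproduced here.

```python
def guess_change_output(outputs, seen_addresses):
    """Return the only output address never seen before, or None if ambiguous."""
    fresh = [addr for addr, _amount in outputs if addr not in seen_addresses]
    return fresh[0] if len(fresh) == 1 else None

# Account E spends 50: 40 goes to Y's known address, 10 goes to a new address D.
outputs = [("Y", 40), ("D", 10)]
seen = {"E", "Y"}

change = guess_change_output(outputs, seen)
print(change)  # the attacker now links this address to the input account E
```

When both outputs are fresh, or neither is, the heuristic returns `None` rather than guessing.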

       Once a user's multiple addresses have been merged into address clusters, the clusters can be labeled by transacting directly with real-world entities, which yields the labeled cluster graph shown in Figure 3. Each connecting line in the figure represents a transaction, and the size of each circle represents transaction volume. Although the figure only lists labels for service providers, exchanges, and mining pools, similar means can also be used to obtain the real-life identities of individual users. By combining the address information held by service providers with the public ledger, all of an individual user's spending records can be fully revealed. This raises serious user privacy concerns.

Figure 3: Building labeled clusters from real transactions and address clusters [2]

       How can the defining features of a blockchain (verifiable transactions, auditable history, and so on) be preserved while privacy is protected (transaction contents are hidden)? Below we introduce three of the most representative anonymization schemes: Dash, Monero, and Zcash.

Dash

       Dash relies on a key technique called CoinJoin. Simply put, CoinJoin uses masternodes to mix multiple transactions from multiple users (at least 3) into a single transaction. In CoinJoin, each user provides an input address and an output address and sends them to a masternode for mixing (that is, the pairing of input and output addresses is scrambled). Transactions can only be made in units of fixed denominations (0.1, 1, 10, 100), which makes it harder for an attacker to infer links between transactions from their amounts. The masternode must also ensure that outputs are listed in shuffled order. As shown in Figure 4, different colors indicate that the amounts come from different users, and DASH is the currency symbol of Dash. Through mixing, the yellow user transfers 10 DASH to the green user, and it is difficult for an outsider to find this transfer inside the mixed transaction.

Figure 4: CoinJoin can mix multiple transactions from multiple users [3]
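The two ideas in this scheme, fixed denominations and shuffled outputs, can be sketched as follows. This is a simplified illustration, not Dash's actual protocol: the function names, the request format, and the greedy splitting rule are assumptions made for the example.

```python
import random
from decimal import Decimal

# Denominations named in the text, largest first.
DENOMINATIONS = [Decimal("100"), Decimal("10"), Decimal("1"), Decimal("0.1")]

def split_into_denominations(amount):
    """Greedily split an amount into the fixed denominations."""
    remaining = Decimal(str(amount))
    parts = []
    for denomination in DENOMINATIONS:
        while remaining >= denomination:
            parts.append(denomination)
            remaining -= denomination
    if remaining != 0:
        raise ValueError("amount is not a multiple of the smallest denomination")
    return parts

def mix(requests, rng=random):
    """Combine (input_address, output_address, amount) requests into a single
    transaction whose outputs are listed in shuffled order."""
    if len({src for src, _dst, _amt in requests}) < 3:
        raise ValueError("mixing needs at least 3 participants")
    inputs = [(src, amt) for src, _dst, amt in requests]
    outputs = [(dst, amt) for _src, dst, amt in requests]
    rng.shuffle(outputs)  # output order no longer mirrors input order
    return {"inputs": inputs, "outputs": outputs}

print(split_into_denominations("12.3"))

requests = [
    ("yellow_in", "green_out", Decimal("10")),
    ("blue_in", "red_out", Decimal("10")),
    ("red_in", "blue_out", Decimal("10")),
]
mixed = mix(requests, random.Random(0))
print(mixed["outputs"])
```

Because every participant contributes the same denomination, the amounts carry no information about which input paid which output.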

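       The masternode plays a key privacy-protecting role in CoinJoin, yet a masternode can still be controlled by an attacker. To address this, Dash introduces chaining and blinding. With chaining, a user's transaction randomly selects several masternodes and is mixed by each of them in turn before the final output is produced. With blinding, the user does not send the input and output addresses directly to the transaction pool; instead, the user randomly picks a masternode and has it relay the inputs and outputs to a designated masternode, so that the latter can hardly learn the user's real identity. With these two techniques, it is almost impossible to link a given transaction unless the attacker controls a large number of masternodes.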

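       Besides defending against linking attacks based on transaction amounts and on input and output addresses, Dash also defends against linking attacks based on transaction timing. Users tend to have their own transaction habits, such as a preferred time of day or a tendency to make several transactions in quick succession. This timing information also exposes the user's identity to some extent. To address this, Dash proposes passive anonymization, in which the user's client issues transaction requests at fixed time intervals in order to take part in masternode mixing.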

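       Since entering the market in 2014, Dash has been highly sought after. As of December 2, 2016, its market capitalization exceeded 60 million US dollars, ranking 7th among all cryptocurrencies; its unit price was 8.85 US dollars, ranking 4th among all cryptocurrencies [4].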

Monero

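       In Dash there remains a risk that masternodes are controlled by an attacker and that malicious users take part in mixing, which can leak user privacy to some extent. To solve this problem, Monero proposes a cryptographic mixing scheme that does not rely on central nodes. Monero has two key techniques: the stealth address and the ring signature [5].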

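       Stealth addresses solve the problem of linkability between input and output addresses. Whenever a sender wants to send an amount to a recipient, the sender first uses the recipient's address and elliptic-curve cryptography to compute a one-time public key, which is freshly generated for every payment. The sender then publishes this public key on the blockchain together with an additional piece of information. The recipient can scan each transaction block with their own private key to determine whether a sender has sent them an amount. When the recipient wants to spend the amount, they compute a signing private key from their own private key and the transaction information, and sign the transaction with it.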
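The derivation can be illustrated with a toy sketch. Two loud assumptions apply. First, real stealth addresses use elliptic-curve arithmetic; the sketch replaces the curve with modular exponentiation over a small prime, which is easy to run but not secure. Second, it assumes the recipient's address consists of two public keys (a "view" key and a "spend" key), a layout borrowed from the CryptoNote design that the text does not spell out. All function and variable names are hypothetical.

```python
import hashlib
import secrets

P = 2**127 - 1   # toy prime modulus (illustrative only, NOT secure)
G = 3            # toy base element
ORDER = P - 1    # exponents can be reduced modulo P - 1 (Fermat's little theorem)

def hash_to_scalar(value: int) -> int:
    digest = hashlib.sha256(value.to_bytes(16, "big")).digest()
    return int.from_bytes(digest, "big") % ORDER

def keypair():
    secret = secrets.randbelow(ORDER - 1) + 1
    return secret, pow(G, secret, P)

def send(view_pub, spend_pub):
    """Sender: derive a one-time public key plus the extra value R to publish."""
    r, R = keypair()                      # fresh randomness for every payment
    shared = pow(view_pub, r, P)          # Diffie-Hellman style shared secret
    one_time_pub = pow(G, hash_to_scalar(shared), P) * spend_pub % P
    return one_time_pub, R

def is_mine(one_time_pub, R, view_secret, spend_pub):
    """Recipient: recompute the shared secret from R and check the output."""
    shared = pow(R, view_secret, P)
    return one_time_pub == pow(G, hash_to_scalar(shared), P) * spend_pub % P

def one_time_secret(R, view_secret, spend_secret):
    """Recipient: private key that matches the one-time public key."""
    shared = pow(R, view_secret, P)
    return (hash_to_scalar(shared) + spend_secret) % ORDER

view_secret, view_pub = keypair()
spend_secret, spend_pub = keypair()
pub1, R1 = send(view_pub, spend_pub)
pub2, R2 = send(view_pub, spend_pub)
print(is_mine(pub1, R1, view_secret, spend_pub), pub1 != pub2)
```

Two payments to the same recipient produce different one-time public keys, so an observer cannot tell from the ledger that they share a destination.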

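       Although stealth addresses make the recipient's address change every time, so that an outside attacker cannot see any link between addresses, they do not guarantee anonymity between sender and recipient. Monero therefore adopts a ring signature scheme. In fact, a similar idea already existed in ancient times: as shown in Figure 5, when a joint petition was submitted, the petitioners' names could be written in a ring. Since every name in the ring appears to have equal standing, it is hard for outsiders to guess who the initiator is.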

Figure 5: In ancient times, joint petitions were signed in a ring to protect the initiator's identity [6]

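       How are ring signatures implemented in Monero? As shown in Figure 6, whenever a sender creates a transaction, the sender signs it with their own private key together with several public keys chosen at random from other users' public keys. Verifying the signature also requires the other users' public keys and the parameters contained in the signature. In addition, the sender must provide a key image along with the signature as proof of identity. Both the private key and the key image are used only once, which ensures untraceability.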

Figure 6: A ring signature can hide the identity of the transaction initiator
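To show how a signature can be verified against a whole ring of public keys without revealing which member signed, here is a toy Schnorr-style ring signature. It is a sketch under stated assumptions, not Monero's actual scheme: it works over a small prime modulus instead of an elliptic curve (not secure), and it omits the key image, so it does not provide the double-spend protection described above. All names are hypothetical.

```python
import hashlib
import secrets

P = 2**127 - 1   # toy prime modulus (illustrative only, NOT secure)
G = 3
ORDER = P - 1    # exponents are reduced modulo P - 1

def challenge(message: bytes, point: int) -> int:
    data = message + point.to_bytes(16, "big")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % ORDER

def keygen():
    secret = secrets.randbelow(ORDER - 1) + 1
    return secret, pow(G, secret, P)

def ring_sign(message, ring, signer_index, signer_secret):
    """Sign with one secret key so that any ring member could be the signer."""
    n = len(ring)
    c = [0] * n
    s = [0] * n
    k = secrets.randbelow(ORDER - 1) + 1
    i = (signer_index + 1) % n
    c[i] = challenge(message, pow(G, k, P))
    # Walk around the ring, simulating every other member with random responses.
    while i != signer_index:
        s[i] = secrets.randbelow(ORDER)
        nxt = (i + 1) % n
        c[nxt] = challenge(message, pow(G, s[i], P) * pow(ring[i], c[i], P) % P)
        i = nxt
    # Close the ring: only the holder of the signer's secret can compute this.
    s[signer_index] = (k - signer_secret * c[signer_index]) % ORDER
    return c[0], s

def ring_verify(message, ring, signature):
    """Recompute the chain of challenges and check that it closes."""
    c0, s = signature
    c = c0
    for public_key, s_i in zip(ring, s):
        c = challenge(message, pow(G, s_i, P) * pow(public_key, c, P) % P)
    return c == c0

keys = [keygen() for _ in range(4)]
ring = [public for _secret, public in keys]
signature = ring_sign(b"pay 10", ring, 2, keys[2][0])
print(ring_verify(b"pay 10", ring, signature))
```

The verifier uses every public key in the ring in exactly the same way, which is why the signature does not single out the real signer.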

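       Besides transaction addresses, transaction amounts also reveal some private information. Monero additionally offers a technique called Ring Confidential Transactions (RingCT), which hides transaction addresses and transaction amounts at the same time. This technique is being deployed step by step in order to achieve true anonymity. It is based on a protocol called Multi-layered Linkable Spontaneous Anonymous Group signature. For reasons of space we do not cover it here; interested readers can refer to the original paper [7].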

       Monero's market capitalization currently exceeds 100 million US dollars, ranking 5th; its unit price is 8.21 US dollars, ranking 6th.

Zcash

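       Monero's scheme seems close to perfect, but one possible problem remains: a ring signature still has to be mixed with other users' public keys, so a user may run into malicious participants and have their privacy exposed. Zcash avoids this problem by using zero-knowledge proofs, which let users hide transaction information by interacting only with the cryptocurrency itself, so that "all coins are created equal" [8].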

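       Let us first briefly introduce the zero-knowledge proof. As shown in the left panel of Figure 7, there is a door between the north branch and the south branch, and Lao Wang claims that he can open it. How can he convince everyone of this without showing how he opens the door? Suppose we use an interactive zero-knowledge proof. Lao Wang first enters a branch of his own choice (the blue dot in the figure), and the verifier does not need to know which one. The verifier (the red dot in the figure) then randomly names the branch from which Lao Wang must come out. If Lao Wang cannot open the door, he meets the requirement with probability only one half. Repeat this process N times: if Lao Wang cannot open the door, the probability that he comes out of the named branch all N times is (1/2)^N. So if Lao Wang succeeds in enough rounds, we can conclude that he can open the door. The drawback is that the cost of interaction is too high. An improved scheme is shown in the right panel of Figure 7: suppose there are 100 paths, and the verifier randomly names one of them. If Lao Wang comes out of the named path, the probability that he managed this without being able to open the door is only 1/100. This significantly improves the efficiency of the interaction.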

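Figure 7: A simple example of a zero-knowledge proof. The left panel represents an interactive proof, and the right panel represents a non-interactive proof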
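The arithmetic behind this example can be checked with a few lines of code. The helper names below are hypothetical; exact fractions are used so that the results are not affected by floating-point rounding.

```python
from fractions import Fraction

def cheat_probability(rounds, choices=2):
    """Chance that a prover who cannot open the door passes every round by luck."""
    return Fraction(1, choices) ** rounds

def rounds_needed(max_cheat_probability, choices=2):
    """Smallest number of rounds that pushes the cheating chance to the bound."""
    rounds = 0
    while cheat_probability(rounds, choices) > max_cheat_probability:
        rounds += 1
    return rounds

one_in_a_million = Fraction(1, 10**6)
print(cheat_probability(10))                          # 1/1024
print(rounds_needed(one_in_a_million))                # 20 rounds with 2 branches
print(rounds_needed(one_in_a_million, choices=100))   # 3 rounds with 100 paths
```

With two branches, a verifier needs 20 rounds to bring the chance of being fooled down to one in a million; with 100 paths, 3 rounds are enough, which is the efficiency gain the example describes.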

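       Zcash uses a non-interactive zero-knowledge proof called zk-SNARK. We will not go into the details of zk-SNARK here, and only outline how Zcash uses it. Consider the simplest case first, in which every coin has a fixed value, for example 1 BTC. Minting a coin then amounts to the user depositing 1 BTC into an escrow pool and writing a commitment to a list. The commitment can only be computed from a serial number together with the user's private key, and it is one-way. When the user wants to spend the coin, two things are required: (1) reveal the serial number, and (2) use zk-SNARK to prove knowledge of the user private key that generated a commitment in the list. In this way, the user can spend the coin without revealing their identity at all, and the uniqueness of the serial number guarantees that there is no double spending.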
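The bookkeeping in this simplified case can be sketched as follows. One assumption must be stated loudly: the sketch cannot reproduce a zk-SNARK, so `spend` simply takes the secret and recomputes the commitment, which is NOT zero-knowledge and would let an observer link the spend to a specific commitment. The `ToyPool` class, the SHA-256 commitment, and all other names are hypothetical; only the roles of the commitment list and the serial numbers follow the text.

```python
import hashlib
import secrets

def commit(serial: bytes, secret: bytes) -> str:
    """One-way commitment to a serial number and a user secret."""
    return hashlib.sha256(serial + secret).hexdigest()

class ToyPool:
    """Escrow pool with a public commitment list and a set of spent serials."""

    def __init__(self):
        self.commitments = set()
        self.spent_serials = set()

    def mint(self):
        """Deposit one fixed-value coin and publish only its commitment."""
        serial = secrets.token_bytes(16)
        secret = secrets.token_bytes(16)
        self.commitments.add(commit(serial, secret))
        return serial, secret

    def spend(self, serial, secret):
        """Accept a spend if the serial is unused and a matching commitment exists.

        STAND-IN: a real system would check a zk-SNARK proof here instead of
        receiving `secret` in the clear.
        """
        if serial in self.spent_serials:
            return False  # double spend detected via the serial number
        if commit(serial, secret) not in self.commitments:
            return False  # no such coin was ever minted
        self.spent_serials.add(serial)
        return True

pool = ToyPool()
serial, secret = pool.mint()
print(pool.spend(serial, secret))  # first spend is accepted
print(pool.spend(serial, secret))  # second spend of the same serial is rejected
```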

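       This simple case has three problems: (1) a fixed coin value is very inconvenient, (2) the sender can tell from the serial number when the recipient is spending the coin, and (3) the recipient must spend a received coin immediately, otherwise the sender may withdraw it. To solve these three problems, Zcash introduces an operation called pour for spending coins. Simply put, a pour operation uses a series of zero-knowledge proofs to recast one coin into several coins, such that the total input value equals the total output value. Each new coin has its own key, value, serial number, and so on, which solves the three problems above. Zcash also adopts a series of optimizations to improve the performance of the overall system.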
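The value-conservation rule of the pour operation can be sketched in a few lines. This shows only the plain accounting; in the real system the same check is carried out inside zero-knowledge proofs so that the values stay hidden. The `Coin` class and the `pour` function are hypothetical names for illustration.

```python
import secrets
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Coin:
    value: int
    # Every coin gets its own fresh serial number.
    serial: bytes = field(default_factory=lambda: secrets.token_bytes(16))

def pour(old_coins, new_values):
    """Recast old coins into new coins while conserving the total value."""
    if any(value < 0 for value in new_values):
        raise ValueError("coin values must be non-negative")
    if sum(coin.value for coin in old_coins) != sum(new_values):
        raise ValueError("input and output totals differ")
    return [Coin(value) for value in new_values]

# Pay 3 out of a coin worth 5 and keep 2 as a new coin for oneself.
payment, change = pour([Coin(5)], [3, 2])
print(payment.value, change.value)
```

Since each output coin carries a new serial number, the sender of the original coin cannot recognize it when the recipient later spends the payment.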

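       Zcash currently offers the best anonymity of all cryptocurrencies, and it has therefore been wildly sought after by the market: around its launch at the end of October 2016, the price of a single coin was at one point valued at several thousand bitcoins. Now that the price has stabilized, Zcash's market capitalization is about 8 million US dollars, and its unit price is 62 US dollars, second only to Bitcoin.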

Summary

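       Privacy has long been a much-criticized aspect of blockchain technology. On the one hand, the transaction privacy of ordinary users on the blockchain should be protected; on the other hand, malicious users should be prevented from using the blockchain as a platform for illegal transactions. Current anonymization techniques still cannot guarantee perfect anonymity. Zcash, for example, depends on secret parameters generated at initialization, which are held by a few people. This also exposes users to transaction and privacy risks. Beyond transaction privacy, the privacy of smart contracts in blockchain systems such as Ethereum also deserves attention, and some work in this area is already under way. We hope that in the near future blockchains will be able to protect privacy while providing an open and trustworthy technical foundation for the digital world.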

Authors: Zhang Xian, Yan Ying, Chen Yang

References

[1] Arvind Narayanan, et al. "Bitcoin and Cryptocurrency Technologies: A Comprehensive Introduction", 2016.

[2] Meiklejohn, Sarah, et al. "A fistful of bitcoins: characterizing payments among men with no names." Proceedings of the 2013 conference on Internet measurement conference. ACM, 2013.

[3] https://github.com/dashpay/dash/wiki/Whitepaper

[4] https://coinmarketcap.com/

[5] https://getmonero.org/home

[6] http://www.nihonkoenmura.jp/theme3/takarabito07.htm

[7] Noether, Shen. "Ring signature confidential transactions for monero."

[8] https://z.cash/

Origin blog.csdn.net/weixin_45709013/article/details/128993589