2016012030+Wang Chaochao+Application of Hash Function and Its Security

First, specific applications of hash functions

1. Application of one-way hash function in cryptography

A. Digital Signature Technology

a. Use a one-way hash function to compute the digest of the message to be signed, then apply the signature algorithm to the digest instead of signing the original message directly.

b. This improves the efficiency and speed of signing, reduces the amount of information transmitted, and saves network bandwidth.
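The hash-then-sign idea above can be sketched with textbook RSA. This is only an illustration: the key below is built from two small known Mersenne primes, the names `sign`, `verify`, and `digest_as_int` are hypothetical, SHA-256 is an assumed choice of hash, and no padding scheme is applied, so it must not be used as a real signature implementation.

```python
import hashlib

# Toy RSA key built from two known Mersenne primes. These parameters are
# assumptions for illustration only and are far too small for real use.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def digest_as_int(message: bytes) -> int:
    """Hash the message and reduce the digest so it fits below the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Sign the digest of the message, not the message itself."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Recompute the digest and compare it with the value recovered from the signature."""
    return pow(signature, E, N) == digest_as_int(message)
```

However long the message is, the signature algorithm only ever processes one fixed-size digest, which is where the efficiency gain described in item b comes from.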

B. Message Integrity Authentication

a. The usual practice is that the owner of a file computes its hash value with a hash algorithm, keeps a copy of the hash value, and then stores the file in a public place. When the integrity of the file needs to be verified, the same hash algorithm is applied to the stored file and the result is compared with the saved hash value. If the two are equal, the file is intact; otherwise it has been changed.

b. In practice, hash functions are also used for integrity authentication over the network.

This method achieves integrity authentication over an insecure communication channel and is widely used in the authentication systems of electronic commerce.
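The file-integrity procedure in item a can be sketched with Python's standard `hashlib`. The function names `file_digest` and `is_intact` and the default choice of SHA-256 are assumptions made for this example, not part of the original scheme.

```python
import hashlib
import hmac


def file_digest(path: str, algorithm: str = "sha256") -> str:
    """Return the hex digest of a file, reading it in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def is_intact(path: str, saved_digest: str, algorithm: str = "sha256") -> bool:
    """Compare the current digest of the file with the owner's saved copy."""
    return hmac.compare_digest(file_digest(path, algorithm), saved_digest)
```

The owner would call `file_digest` once before publishing the file and keep the result privately; anyone holding that value can later call `is_intact` on the public copy.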

C. An Improved User Key Management Scheme

a. The user's password is encrypted with the DES algorithm and stored on the machine, but the algorithm restricts the length of its input, so the user cannot enter an excessively long key.

b. A secure user key management scheme based on a one-way hash function is proposed: K=E(H(P1))

The improved method lets users enter passwords of any length. Because the hash function is one-way, the scheme also makes up for some weaknesses of DES and helps resist exhaustive-search attacks.

2. Other applications

A. Privacy Enhanced Mail (PEM)

The Internet Privacy Enhanced Mail standard specifies the one-way hash functions MD2 and MD5 for message integrity checking. Version 2.6.3 of the mail encryption software PGP uses MD5 as its one-way hash function.

B. File verification

a. Parity checks and CRC checks detect accidental errors but do not resist deliberate data tampering.

b. The "digital fingerprint" property of the MD5 hash algorithm has made it the most widely used file checksum algorithm at present.

c. The application scenarios are:

After a file is transferred, the checksum computed for the received file is compared with that of the source file.

Stored as a digital fingerprint of the binary files in a file system, in order to detect whether they have been modified without permission.
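Item a above can be made concrete with a small experiment. The sketch below shows that flipping two bits leaves a parity bit unchanged, and that CRC-32 (via the standard `zlib.crc32`) is affine over XOR, so the checksum of a modified message can be predicted without any secret. The helper names and sample messages are made up for the example.

```python
import zlib


def parity(data: bytes) -> int:
    """Parity (0 or 1) of all bits in the data."""
    return sum(bin(b).count("1") for b in data) % 2


def xor_bytes(*blocks: bytes) -> bytes:
    """XOR equal-length byte strings together."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)


# Flipping any two bits leaves the parity unchanged.
original = b"transfer 100"
tampered = bytearray(original)
tampered[0] ^= 0b00000001
tampered[1] ^= 0b00000001
same_parity = parity(original) == parity(bytes(tampered))

# CRC-32 is affine over XOR: for equal-length inputs a, b, c,
# crc(a ^ b ^ c) == crc(a) ^ crc(b) ^ crc(c).
a, b, c = b"AAAAAAAA", b"BBBBBBBB", b"CCCCCCCC"
predicted = zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(c)
crc_is_predictable = zlib.crc32(xor_bytes(a, b, c)) == predicted
```

A cryptographic hash is designed to have no such exploitable structure, which is why it is preferred where tampering is a concern.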

C. Authentication Protocol

The POP3 protocol contains this application:

The authenticating party sends a random string (the "challenge") to the party being authenticated. The party being authenticated hashes the random string together with its own authentication password and returns the result. The authenticating party then compares the received hash value with the result of hashing the same random string and the other party's stored password at its own end (the "authentication"). If the two are equal, authentication succeeds.
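The challenge-response exchange can be sketched as follows. This is a simplified model in the spirit of POP3's APOP command, not a protocol implementation: the function names are hypothetical, the challenge format is an assumption, and MD5 is used only because it is the hash historically associated with this mechanism.

```python
import hashlib
import hmac
import secrets


def make_challenge() -> str:
    """Authenticating side: generate a fresh random string for each attempt."""
    return secrets.token_hex(16)


def response(challenge: str, password: str) -> str:
    """Authenticated side: hash the challenge together with the password."""
    return hashlib.md5((challenge + password).encode("utf-8")).hexdigest()


def authenticate(challenge: str, received: str, stored_password: str) -> bool:
    """Authenticating side: recompute the hash locally and compare."""
    expected = response(challenge, stored_password)
    return hmac.compare_digest(expected, received)
```

The password itself never crosses the channel, and because each challenge is fresh, a captured response cannot simply be replayed for a later login.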

Second, with reference to the birthday attack, Professor Wang Xiaoyun's attacks on MD5 in 2004 and 2005, and Google's attack on SHA-1 in 2017, explain the security of hash functions and the current development of secure hash functions.

1. Security of Hash Functions

A. MD5 and SHA-1 Algorithms

a. Both MD5 and SHA-1 are cryptographic hash functions (hash functions for short). A cryptographic hash function is an algorithm with wide and important applications in information security, mainly data integrity verification and message authentication. Its output works much like a fingerprint, so it is sometimes called a "digital fingerprint": if the original message changes even slightly, by as little as a few bits, the resulting message digest changes drastically.

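b. On August 17, 2004, at the international cryptology conference Crypto 2004 held in Santa Barbara, California, Professor Wang Xiaoyun of Shandong University, China, gave an evening report on breaking the MD5, HAVAL-128, MD4, and RIPEMD algorithms.

c. SHA-1 was examined very closely by the public cryptographic community, and with no practical attack known it was considered secure for some time, until Google announced on February 23, 2017 that it had broken SHA-1.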
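The "digital fingerprint" behaviour described in item a, where a tiny change to the input changes the digest drastically, can be observed directly. The sketch below flips one input bit and counts how many output bits differ; SHA-256 and the sample message are arbitrary choices for the illustration.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length digests differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


message = bytearray(b"The quick brown fox jumps over the lazy dog")
d1 = hashlib.sha256(bytes(message)).digest()

message[0] ^= 0x01  # flip a single bit of the input
d2 = hashlib.sha256(bytes(message)).digest()

changed = bit_difference(d1, d2)  # typically close to half of the 256 bits
```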


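B. On the birthday attack

The birthday attack uses the birthday problem from probability theory to find colliding hash values, forge messages, and thereby defeat authentication algorithms.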
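A birthday-style collision search is easy to demonstrate on a deliberately shortened hash. The sketch below truncates SHA-256 to 3 bytes (an assumption made so the search finishes quickly; the function names are hypothetical) and hashes distinct messages until two of them share a value.

```python
import hashlib
from itertools import count


def truncated_hash(data: bytes, n_bytes: int = 3) -> bytes:
    """First n_bytes of SHA-256, standing in for a hash with a short output."""
    return hashlib.sha256(data).digest()[:n_bytes]


def find_collision(n_bytes: int = 3):
    """Hash distinct messages until two of them share a truncated hash value."""
    seen = {}
    for i in count():
        msg = b"message-%d" % i
        h = truncated_hash(msg, n_bytes)
        if h in seen:
            return seen[h], msg, i + 1
        seen[h] = msg


first, second, attempts = find_collision()
```

With a 24-bit output there are about 16.7 million possible values, yet a collision is expected after only a few thousand attempts, which is the birthday effect at work.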

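Countermeasures:

a. Use a secure hash algorithm: a secure hash algorithm produces hash values with enough bits, so it is very difficult for an attacker to find two files with the same hash value.

b. Add salt: before signing a file, add a random value to it, then compute the hash value, and send the file, the signature, and the random value to the recipient together. The attacker must then find a forged file with one specific hash value, which is very difficult.

c. Modify the file: before signing, make a small change to the message or file. The attacker must then find a forged file with one specific hash value, which is very difficult.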
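The salting countermeasure in item b might look like the following. This is a minimal sketch under assumptions: the salt is 16 random bytes prepended to the document, SHA-256 is the hash, and the signing step itself is left out; `prepare_for_signing` and `salted_digest` are hypothetical names.

```python
import hashlib
import os


def salted_digest(document: bytes, salt: bytes) -> str:
    """Hash the random value together with the document."""
    return hashlib.sha256(salt + document).hexdigest()


def prepare_for_signing(document: bytes):
    """Signer side: pick a fresh salt and return it with the digest to sign."""
    salt = os.urandom(16)
    return salt, salted_digest(document, salt)
```

Because the signer chooses the salt only at signing time, an attacker cannot prepare a colliding pair of documents in advance; the recipient recomputes `salted_digest` with the salt that was sent along.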

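C. Security of hash functions

a. What we know about the birthday attack shows that the security of hash functions still needs improvement. The birthday attack exploits no property of any particular hash function; it is a generic attack that applies to every hash function. The countermeasure is simple in principle: increase the length of the hash. In practice, however, this is hard to carry out.

b. Because collisions have been demonstrated one after another for MD5 and SHA-1, both once considered relatively secure, it is foreseeable that finding collisions in later algorithms is only a matter of how long it takes for computers to become powerful enough.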

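For example: 12306, the website of China's Ministry of Railways, uses the SHA-1 algorithm. It was submitted to Google's verification website.

The result shows that it is currently not affected by the collision attack method Google discovered. The SHA standard was first published in 1993 and revised as SHA-1 in 1995, so the design is now about 24 years old. Computer technology has changed rapidly over these two decades, so lasting more than twenty years is already remarkable; every cryptographic algorithm has to trade off computational efficiency against difficulty of breaking. Generally a new generation appears about every ten years: the successor SHA-2 was published in 2001 and SHA-3 in 2015. The practical impact should be quite limited. As for Chinese websites, many have not even deployed HTTPS and most still store passwords in plaintext; even the sites that show more respect for their customers merely hash passwords with MD5, whose collision methods were published long ago. Google, Facebook, Microsoft, Apple, and others switched long ago to SHA-256, SHA-512, and similar algorithms that are secure for now; these belong to the SHA-2 family, which has itself been public for about fifteen years. We should soon see mainstream international websites move to SHA-3.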
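Item a above notes that lengthening the hash is the generic defense against the birthday attack. A rough calculation with the standard birthday approximation shows how the attacker's effort grows with output length. The function names are hypothetical, and the formulas are approximations that assume hash values are uniformly distributed.

```python
import math


def collision_probability(space_size: float, k: int) -> float:
    """Approximate chance that k random values drawn from space_size collide."""
    return 1.0 - math.exp(-k * (k - 1) / (2.0 * space_size))


def trials_for_half(n_bits: int) -> float:
    """Approximate number of hashes needed for a 50% collision chance."""
    return math.sqrt(2.0 * math.log(2.0) * 2.0**n_bits)


# The classic birthday problem: 23 people, 365 possible birthdays.
p_birthday = collision_probability(365, 23)

# Generic collision effort for 128-bit and 160-bit outputs.
effort_128 = trials_for_half(128)
effort_160 = trials_for_half(160)
```

Under this approximation the effort is on the order of 2^(n/2) for an n-bit hash, so each additional output bit multiplies the attacker's work by about the square root of two.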

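2. Development of secure hash functions

A. MD4

Designed by Ronald L. Rivest in 1990. It transforms a message of arbitrary length into a 128-bit hash value through three rounds of operations.

B. MD5

Rivest's 1991 improvement on MD4. It uses four rounds, and each step adds in the result of the previous step.

C. HAVAL

An improved version of MD5. The number of rounds can be 3, 4, or 5, and the output length can be 128, 160, 192, 224, or 256 bits.

D. SHA-1

Developed by NIST. The original SHA standard was published in 1993 and the revised SHA-1 in 1995. It accepts input of less than 2^64 bits and outputs a 160-bit message digest.

E. SHA-256

The output grows from 160 bits to 256 bits. The compression function runs 64 rounds on 32-bit words (SHA-1 runs 80 rounds).

F. SHA-384

The output grows to 384 bits, obtained by truncating a SHA-512-style computation that runs 80 rounds on 64-bit words.

G. SHA-512

The output grows to 512 bits, with 80 rounds on 64-bit words.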
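The output lengths listed above can be checked against Python's standard `hashlib`. Only the algorithms that `hashlib` guarantees are queried here; MD4 and HAVAL are left out because their availability depends on how the interpreter was built.

```python
import hashlib

# Output length in bits for each algorithm, as reported by hashlib.
digest_bits = {
    name: hashlib.new(name).digest_size * 8
    for name in ("md5", "sha1", "sha256", "sha384", "sha512")
}
```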

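Third, with reference to the chosen-prefix collision in the MD5 algorithm, and to the MD5 message digests and execution results of the two executables helloworld.exe and goodbyworld.exe from the second link, explain the problems that may arise when MD5 is used to verify software integrity.

MD5 is a digest algorithm: from a string of any number of bytes it computes a 128-bit (16-byte) "fingerprint", usually written as 32 hexadecimal characters. Because there are far more possible input strings than possible digests, two (or more) different strings must exist that produce the same MD5 value. Such a case is called an MD5 collision.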
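The counting argument above can be tried out on a tiny scale. The sketch below keeps only the first byte of MD5, so there are just 256 possible outputs, and feeds in 257 distinct inputs; by the pigeonhole principle at least two must collide. The names `tiny_hash` and `first_collision` are hypothetical.

```python
import hashlib


def tiny_hash(data: bytes) -> int:
    """First byte of MD5: a hash with only 256 possible values."""
    return hashlib.md5(data).digest()[0]


def first_collision(limit: int = 257):
    """With 257 distinct inputs and 256 outputs, two inputs must collide."""
    seen = {}
    for i in range(limit):
        msg = str(i).encode()
        h = tiny_hash(msg)
        if h in seen:
            return seen[h], msg
        seen[h] = msg
    return None


pair = first_collision()
digest_bytes = len(hashlib.md5(b"").digest())   # length of the raw digest
hex_chars = len(hashlib.md5(b"").hexdigest())   # length of the hex form
```

The same argument applies to full MD5; the difference is only that 2^128 outputs make accidental collisions astronomically unlikely, so real attacks rely on weaknesses in the algorithm rather than on chance.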

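Several cryptographers carried out the attack using "chosen-prefix collisions" (an improved version of the attack method used by Wang Xiaoyun). The computer they used was a single Sony PS3, and the computation took less than two days. If the goal is merely to generate files with the same MD5 value but different contents, it can be done in a few seconds on any mainstream computer.

Their conclusion: the MD5 algorithm should no longer be used for any software integrity check or code signing purpose.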
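One way to see the problem on real files is to compare them under MD5 and under a second, stronger hash. The sketch below is an assumed helper, not a tool from the linked site: `digests` and `is_md5_collision` are hypothetical names, and SHA-256 is chosen as the second hash.

```python
import hashlib


def digests(path: str) -> dict:
    """MD5 and SHA-256 hex digests of one file."""
    with open(path, "rb") as f:
        data = f.read()
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def is_md5_collision(path_a: str, path_b: str) -> bool:
    """True when two files share an MD5 digest but differ in content."""
    a, b = digests(path_a), digests(path_b)
    return a["md5"] == b["md5"] and a["sha256"] != b["sha256"]
```

If the two executables from the link behave as described there, calling `is_md5_collision("helloworld.exe", "goodbyworld.exe")` on local copies would be expected to return True, while an MD5-only integrity check would treat the two files as identical.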

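Problems that may arise when verifying software integrity:

A. Cases in which a file is not intact

a. Virus infection

b. Implanted Trojan or backdoor, or deliberate tampering

c. Transmission failure

B. Possible problems

a. If a third party intercepts the software while its integrity is being verified, uses a fast MD5 collision generator to forge a version with the same MD5 value in a short time, and maliciously modifies the software, security is greatly reduced.

b. When the software is very large, the time needed for verification also increases greatly, and for a third party the probability of a successful attack increases as well.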

c. The Vulnerability analysis section of the linked website also analyzes some of the problems:

On the other hand, there is the viewpoint of the relying party, i.e. the user downloading hashed or signed code who needs some guarantee that this software can be trusted. This relying party can not be sure anymore that the published hash value or the digital signature is valid for only the executable file he downloaded. There might very well be a sibling file with the same hash value or digital signature, while only one of these siblings went through the proper hashing or signing procedure. Especially when the software integrity verification takes place under the hood, with the user not knowing that the operating system or some hidden application is silently verifying digital signatures on software to be installed, the user may be more easily lured into installing malware.

Note that it is not necessary for an attacker to build both executables from source code. It is perfectly well possible to take as the first file any executable from any source, and as the second file produce a second executable as malware. Then a byte block to be appended to both files can be found such that the resulting files have the same MD5 hash value. If an attacker can then get the first file to be signed, e.g. by the original software vendor, this signature will also be valid for the attacker-constructed malware.

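d. Even if the user can be sure that the website the file is downloaded from is trustworthy, the website cannot guarantee that the file the user ends up with locally is the right one. Especially when software integrity verification takes place under the hood, with the user unaware that the operating system or some hidden application is silently verifying digital signatures on the software being installed, the user may be more easily lured into installing malware.

e. The attacker does not need to build both files from source code either. Any executable from any source can serve as the first file, and a second executable, the malware, is produced as the second file. A byte block to be appended to both files can then be found such that the resulting files have the same MD5 hash value. The attacker then only needs to get the first file signed, and the signature is also valid for the malware.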
