1. 前言

SHA系列算法是一种密码散列函数，由美国国家安全局设计，并由美国国家标准技术研究所（NIST）发布为联邦数据处理标准（FIPS）。现在已经被破解。

我们本文主要研究SHA256算法。

2. 什么是SHA ?

SHA算法的名称是安全散列算法，英文名称是Secure Hash Algorithm。

SHA算法分为很多版本。可以分为SHA-1和SHA-2两大类。其中SHA-2的子版本包括SHA-224，SHA-256，SHA-384，SHA-512,其输出结果分别为224、256、384、512位。与之对应的MD5算法的输出只有128位。

3. SHA256算法的特点

SHA算法具有以下特点：

压缩性：任意长度的数据，算出的SHA256值长度都是固定的。
容易计算：从原数据计算出SHA256值很容易。
抗修改性：对原数据进行任何改动，哪怕只修改1个字节，所得到的SHA256值都有很大区别。
强抗碰撞：已知原数据和其SHA256值，想找到一个具有相同SHA256值的数据（即伪造数据）是非常困难的。

可以看到，这些特性和前面学习的MD5算法区块链中的密码学系列之MD5算法（二）十分的相似。除此之外SHA256相比于MD5算法还具有优势。我们来看：

2的128次方为340282366920938463463374607431768211456，也就是10的39次方级别
2的160次方为1.4615016373309029182036848327163e+48，也就是10的48次方级别
2的256次方为1.1579208923731619542357098500869 × 10的77次方，也就是10的77次方

宇宙中原子数大约在10的60次方到80次方之间，所以2的256次方有足够的空间容纳所有的可能，算法好的情况下冲突碰撞的概率很低。

此外，我们还可以将SHA256和MD5进行混合使用，我们可以取MD5摘要的一部分和SHA256摘要的一部分，拼成一个合成摘要。这样别人不知道规则，自然无从破解。

4. SHA256算法的底层原理

4.1 MD5的原理和SHA原理的对比

MD5把128bit的信息摘要分成A，B，C，D四段（Words），每段32bit，在循环过程中交替运算A，B，C，D，最终组成128bit的摘要结果。如下图所示：

SHA-1算法，核心过程大同小异，主要的不同点是把160bit的信息摘要分成了A，B，C，D，E五段。

SHA-2系列算法，核心过程更复杂一些，把信息摘要分成了A，B，C，D，E，F，G，H八段。

MD5算法大概可以分为以下四步：原文处理，设置初始值，循环加工，拼接结果。下面我们进行详细的分析。

4.2设置初始值

SHA256算法中用到了8个哈希初值以及64个哈希常量：

SHA256算法的8个哈希初值如下：

h0 := 0x6a09e667
h1 := 0xbb67ae85
h2 := 0x3c6ef372
h3 := 0xa54ff53a
h4 := 0x510e527f
h5 := 0x9b05688c
h6 := 0x1f83d9ab
h7 := 0x5be0cd19

这些初值是对自然数中前8个质数（2,3,5,7,11,13,17,19）的平方根的小数部分取前32bit而来。我们容易知道2的平方根为0.414213562373095048,小数部分转为16进制，即为0x6a09e667。

在SHA256算法中，用到的64个常量如下：

428a2f98 71374491 b5c0fbcf e9b5dba5
3956c25b 59f111f1 923f82a4 ab1c5ed5
d807aa98 12835b01 243185be 550c7dc3
72be5d74 80deb1fe 9bdc06a7 c19bf174
e49b69c1 efbe4786 0fc19dc6 240ca1cc
2de92c6f 4a7484aa 5cb0a9dc 76f988da
983e5152 a831c66d b00327c8 bf597fc7
c6e00bf3 d5a79147 06ca6351 14292967
27b70a85 2e1b2138 4d2c6dfc 53380d13
650a7354 766a0abb 81c2c92e 92722c85
a2bfe8a1 a81a664b c24b8b70 c76c51a3
d192e819 d6990624 f40e3585 106aa070
19a4c116 1e376c08 2748774c 34b0bcb5
391c0cb3 4ed8aa4a 5b9cca4f 682e6ff3
748f82ee 78a5636f 84c87814 8cc70208
90befffa a4506ceb bef9a3f7 c67178f2

和8个哈希初值类似，这些常量是对自然数中前64个质数(2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71,73,79,83,89,97…)的立方根的小数部分取前32bit而来。

4.3 数据预处理

由于用户输入的信息的长度不一致，所以我们需要对其进行处理，对其进行补位。和MD5补位的原理类似。

SHA256算法的原理是：用输入的长度对448取余，不足448的进行补位，填充的方法是第一位补1，其他位补0。经过补位后，现在的长度为512*N+448。事实上，补位之后的结果已经不能代表真正的字符串，所以我们还需要记录下原来字符串的长度，方法是用（512-448=64）来记录实际的长度。

经过上面的处理之后，我们处理之后的信息的长度为512*(N+1)。

4.4 循环加工

我们首先学习下预备知识，SHA256散列函数中涉及的操作全部是逻辑的位运算。

其中Sn表示循环右移n个bit，Rn表示右移n个bit。

首先：将消息分解成512-bit大小的块。

假设消息M可以被分解为n个块，于是整个算法需要做的就是完成n次迭代，n次迭代的结果就是最终的哈希值，即256bit的数字摘要。

一个256-bit的摘要的初始值H0，经过第一个数据块进行运算，得到H1，即完成了第一次迭代。H1经过第二个数据块得到H2，……，依次处理，最后得到Hn，Hn即为最终的256-bit消息摘要。

256位被分为8个小块，每个小块32位。此外，第一次迭代中，映射的初值设置为前面介绍的8个哈希初值，如下图所示：

下面来学习Map(H_{i-1}) = H_{i}的具体算法：

将每一块（512位）分解为16个32位的字，记为w[0], …, w[15]。

for i := 0; i < 16; i++ {
        j := i * 4
        w[i] = uint32(p[j])<<24 | uint32(p[j+1])<<16 | uint32(p[j+2])<<8 | uint32(p[j+3])
    }
    for i := 16; i < 64; i++ {
        v1 := w[i-2]
        t1 := (v1>>17 | v1<<(32-17)) ^ (v1>>19 | v1<<(32-19)) ^ (v1 >> 10)
        v2 := w[i-15]
        t2 := (v2>>7 | v2<<(32-7)) ^ (v2>>18 | v2<<(32-18)) ^ (v2 >> 3)
        w[i] = t1 + w[i-7] + t2 + w[i-16]
    }

然后是64步迭代运算：

for i := 0; i < 64; i++ {
        t1 := h + ((e>>6 | e<<(32-6)) ^ (e>>11 | e<<(32-11)) ^ (e>>25 | e<<(32-25))) + ((e & f) ^ (^e & g)) + _K[i] + w[i]

        t2 := ((a>>2 | a<<(32-2)) ^ (a>>13 | a<<(32-13)) ^ (a>>22 | a<<(32-22))) + ((a & b) ^ (a & c) ^ (b & c))

        h = g
        g = f
        f = e
        e = d + t1
        d = c
        c = b
        b = a
        a = t1 + t2
    }

    h0 += a
    h1 += b
    h2 += c
    h3 += d
    h4 += e
    h5 += f
    h6 += g
    h7 += h

生成256-bit的报文摘要 所有的512-bit分组处理完毕后，对于SHA-256算法最后一个分组产生的输出便是256-bit的报文摘要。

dig.h[0], dig.h[1], dig.h[2], dig.h[3], dig.h[4], dig.h[5], dig.h[6], dig.h[7] = h0, h1, h2, h3, h4, h5, h6, h7

5. java实现SHA256算法

import java.nio.ByteBuffer;


/**
 * Offers a {@code hash(byte[])} method for hashing messages with SHA-256.
 */
public class Sha256
{
    private static final int[] K = { 0x428a2f98, 0x71374491, 0xb5c0fbcf,
            0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
            0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74,
            0x80deb1fe, 0x9bdc06a7, 0xc19bf174, 0xe49b69c1, 0xefbe4786,
            0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc,
            0x76f988da, 0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7,
            0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967, 0x27b70a85,
            0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb,
            0x81c2c92e, 0x92722c85, 0xa2bfe8a1, 0xa81a664b, 0xc24b8b70,
            0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,
            0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3,
            0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3, 0x748f82ee, 0x78a5636f,
            0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7,
            0xc67178f2 };

    private static final int[] H0 = { 0x6a09e667, 0xbb67ae85, 0x3c6ef372,
            0xa54ff53a, 0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19 };

    // working arrays
    private static final int[] W = new int[64];
    private static final int[] H = new int[8];
    private static final int[] TEMP = new int[8];

    /**
     * Hashes the given message with SHA-256 and returns the hash.
     *
     * @param message The bytes to hash.
     * @return The hash's bytes.
     */
    public static byte[] hash(byte[] message)
    {
        // let H = H0
        System.arraycopy(H0, 0, H, 0, H0.length);

        // initialize all words
        int[] words = toIntArray(pad(message));

        // enumerate all blocks (each containing 16 words)
        for (int i = 0, n = words.length / 16; i < n; ++i) {

            // initialize W from the block's words
            System.arraycopy(words, i * 16, W, 0, 16);
            for (int t = 16; t < W.length; ++t) {
                W[t] = smallSig1(W[t - 2]) + W[t - 7] + smallSig0(W[t - 15])
                        + W[t - 16];
            }

            // let TEMP = H
            System.arraycopy(H, 0, TEMP, 0, H.length);

            // operate on TEMP
            for (int t = 0; t < W.length; ++t) {
                int t1 = TEMP[7] + bigSig1(TEMP[4])
                        + ch(TEMP[4], TEMP[5], TEMP[6]) + K[t] + W[t];
                int t2 = bigSig0(TEMP[0]) + maj(TEMP[0], TEMP[1], TEMP[2]);
                System.arraycopy(TEMP, 0, TEMP, 1, TEMP.length - 1);
                TEMP[4] += t1;
                TEMP[0] = t1 + t2;
            }

            // add values in TEMP to values in H
            for (int t = 0; t < H.length; ++t) {
                H[t] += TEMP[t];
            }

        }

        return toByteArray(H);
    }

    /**
     * Internal method, no need to call. Pads the given message to have a length
     * that is a multiple of 512 bits (64 bytes), including the addition of a
     * 1-bit, k 0-bits, and the message length as a 64-bit integer.
     *
     * @param message The message to pad.
     * @return A new array with the padded message bytes.
     */
    public static byte[] pad(byte[] message)
    {
        final int blockBits = 512;
        final int blockBytes = blockBits / 8;

        // new message length: original + 1-bit and padding + 8-byte length
        int newMessageLength = message.length + 1 + 8;
        int padBytes = blockBytes - (newMessageLength % blockBytes);
        newMessageLength += padBytes;

        // copy message to extended array
        final byte[] paddedMessage = new byte[newMessageLength];
        System.arraycopy(message, 0, paddedMessage, 0, message.length);

        // write 1-bit
        paddedMessage[message.length] = (byte) 0b10000000;

        // skip padBytes many bytes (they are already 0)

        // write 8-byte integer describing the original message length
        int lenPos = message.length + 1 + padBytes;
        ByteBuffer.wrap(paddedMessage, lenPos, 8).putLong(message.length * 8);

        return paddedMessage;
    }

    /**
     * Converts the given byte array into an int array via big-endian conversion
     * (4 bytes become 1 int).
     *
     * @param bytes The source array.
     * @return The converted array.
     */
    public static int[] toIntArray(byte[] bytes)
    {
        if (bytes.length % Integer.BYTES != 0) {
            throw new IllegalArgumentException("byte array length");
        }

        ByteBuffer buf = ByteBuffer.wrap(bytes);

        int[] result = new int[bytes.length / Integer.BYTES];
        for (int i = 0; i < result.length; ++i) {
            result[i] = buf.getInt();
        }

        return result;
    }

    /**
     * Converts the given int array into a byte array via big-endian conversion
     * (1 int becomes 4 bytes).
     *
     * @param ints The source array.
     * @return The converted array.
     */
    public static byte[] toByteArray(int[] ints)
    {
        ByteBuffer buf = ByteBuffer.allocate(ints.length * Integer.BYTES);
        for (int i = 0; i < ints.length; ++i) {
            buf.putInt(ints[i]);
        }

        return buf.array();
    }

    private static int ch(int x, int y, int z)
{
        return (x & y) | ((~x) & z);
    }

    private static int maj(int x, int y, int z)
{
        return (x & y) | (x & z) | (y & z);
    }

    private static int bigSig0(int x)
{
        return Integer.rotateRight(x, 2) ^ Integer.rotateRight(x, 13)
                ^ Integer.rotateRight(x, 22);
    }

    private static int bigSig1(int x)
{
        return Integer.rotateRight(x, 6) ^ Integer.rotateRight(x, 11)
                ^ Integer.rotateRight(x, 25);
    }

    private static int smallSig0(int x)
{
        return Integer.rotateRight(x, 7) ^ Integer.rotateRight(x, 18)
                ^ (x >>> 3);
    }

    private static int smallSig1(int x)
{
        return Integer.rotateRight(x, 17) ^ Integer.rotateRight(x, 19)
                ^ (x >>> 10);
    }
}

6. 总结

经过上面的学习，我们已经详细的了解了SHA256算法的历史，特点，应用，实现原理。其中，实现原理是十分重要的。实现原理部分其实和MD5十分的相似。在我们的学习过程中，我们可以和MD5算法进行类比学习。其实这两类算法都属于Hash算法。

最后，我们用java代码实现了SHA256加密算法。

源代码：

https://github.com/Anapodoton/Encryption/blob/master/hash/SHA256/java/Sha256.java

区块链中的密码学系列之SHA256算法（三）