Halo2 学习笔记——背景资料之Polynomials(3)

1. 引言

A ( X ) A(X) A(X)为基于 F p \mathbb{F}_p Fp的degree-3多项式:
A ( X ) = a 0 + a 1 X + a 2 X 2 + a 3 X 3 A(X)=a_0+a_1X+a_2X^2+a_3X^3 A(X)=a0+a1X+a2X2+a3X3
其中 a 0 a_0 a0称为constant term。

degree n − 1 n-1 n1的多项式,其有 n n n个系数。

通常,将变量 X X X替换为具体的值 x x x,来计算相应的结果 A ( x ) A(x) A(x)——数学上称为“evaluating A ( X ) A(X) A(X) at a point x x x”。(注意,此处的point不要与椭圆曲线上的point混淆。)

多项式的主要属性有:

  • d e g ( A ( X ) B ( X ) ) = d e g ( A ( X ) ) + d e g ( B ( X ) ) , d e g ( A ( X ) / B ( X ) ) = d e g ( A ( X ) ) − d e g ( B ( X ) ) deg(A(X)B(X))=deg(A(X))+deg(B(X)), deg(A(X)/B(X))=deg(A(X))-deg(B(X)) deg(A(X)B(X))=deg(A(X))+deg(B(X)),deg(A(X)/B(X))=deg(A(X))deg(B(X))
  • A ( X ) A(X) A(X)的degree为 n − 1 n-1 n1,则对该多项式的 n n n个不同点进行evaluation,这些evaluation值可完全定义该多项式。换句话说,由这 n n n个不同点的evaluation值,通过多项式插值获得唯一的degree为 n − 1 n-1 n1的多项式 A ( X ) A(X) A(X)
  • 多项式 A ( X ) A(X) A(X)的系数表示法为: [ a 0 , a 1 , ⋯   , a n − 1 ] [a_0,a_1,\cdots,a_{n-1}] [a0,a1,,an1],其点值表示法为:
    [ ( x 0 , A ( x 0 ) ) , ( x 1 , A ( x 1 ) ) , ⋯   , ( x n − 1 , A ( x n − 1 ) ) ] [(x_0,A(x_0)), (x_1,A(x_1)), \cdots, (x_{n-1},A(x_{n-1}))] [(x0,A(x0)),(x1,A(x1)),,(xn1,A(xn1))]
    其中 x 0 , x 1 , ⋯   , x n − 1 x_0,x_1,\cdots,x_{n-1} x0,x1,,xn1 n n n个不同的值。
    这两种方式都可唯一定义同一多项式。

2. Horner’s rule

可借助Horner’s rule来对一个 n − 1 n-1 n1 degree多项式进行高效evaluation,仅需 n − 1 n-1 n1个乘法运算 + n − 1 n-1 n1个加法运算:
a 0 + a 1 X + a 2 X 2 + ⋯ + a n − 1 X n − 1 = a 0 + X ( a 1 + X ( a 2 + ⋯ + X ( a n − 2 + X a n − 1 ) ) ) a_0+a_1X+a_2X^2+\cdots+a_{n-1}X^{n-1}=a_0+X(a_1+X(a_2+\cdots+X(a_{n-2}+Xa_{n-1}))) a0+a1X+a2X2++an1Xn1=a0+X(a1+X(a2++X(an2+Xan1)))

3. FFT

FFT可将多项式的系数表示法 和 点值表示法 之间高效相互转换。

点值表示法时,evaluate的point为 n n n-th roots of unity { w 0 , w 1 , ⋯   , w n − 1 } \{w^0,w^1,\cdots,w^{n-1}\} { w0,w1,,wn1},其中 w w w为a primitive n n n-th root of unity。

根据roots of unity的对城乡,FFT每轮运算将reduce the evaluation into a problem only half the size。因此,通常使用的多项式长度为some power of two,即 n = 2 k n=2^k n=2k,从而可apply the halving reduction recursively。

3.1 将FFT用于快速多项式乘法

如需计算 A ( X ) ⋅ B ( X ) = C ( X ) A(X)\cdot B(X)=C(X) A(X)B(X)=C(X),若采用系数表示法时,需要进行 O ( n 2 ) O(n^2) O(n2)次运算:
A ( X ) = a 0 + a 1 X + a 2 X 2 + ⋯ + a n − 1 X n − 1 A(X)=a_0+a_1X+a_2X^2+\cdots+a_{n-1}X^{n-1} A(X)=a0+a1X+a2X2++an1Xn1
B ( X ) = b 0 + b 1 X + b 2 X 2 + ⋯ + b n − 1 X n − 1 B(X)=b_0+b_1X+b_2X^2+\cdots+b_{n-1}X^{n-1} B(X)=b0+b1X+b2X2++bn1Xn1
C ( X ) = A ( X ) ⋅ B ( X ) = a 0 ⋅ ( b 0 + b 1 X + b 2 X 2 + ⋯ + b n − 1 X n − 1 ) + a 1 X ⋅ ( b 0 + b 1 X + b 2 X 2 + ⋯ + b n − 1 X n − 1 ) + ⋯ + a n − 1 X n − 1 ⋅ ( b 0 + b 1 X + b 2 X 2 + ⋯ + b n − 1 X n − 1 ) C(X)=A(X)\cdot B(X)=a_0\cdot (b_0+b_1X+b_2X^2+\cdots+b_{n-1}X^{n-1})\\+ a_1X\cdot (b_0+b_1X+b_2X^2+\cdots+b_{n-1}X^{n-1})\\ +\cdots \\+ a_{n-1}X^{n-1}\cdot (b_0+b_1X+b_2X^2+\cdots+b_{n-1}X^{n-1}) C(X)=A(X)B(X)=a0(b0+b1X+b2X2++bn1Xn1)+a1X(b0+b1X+b2X2++bn1Xn1)++an1Xn1(b0+b1X+b2X2++bn1Xn1)

若采用点值表示法,则多项式乘法仅需 O ( n ) O(n) O(n)次运算:
A : { ( x 0 , A ( x 0 ) ) , ( x 1 , A ( x 1 ) ) , ⋯   , ( x n − 1 , A ( x n − 1 ) ) } A:\{(x_0,A(x_0)), (x_1,A(x_1)),\cdots,(x_{n-1}, A(x_{n-1}))\} A{ (x0,A(x0)),(x1,A(x1)),,(xn1,A(xn1))}
B : { ( x 0 , B ( x 0 ) ) , ( x 1 , B ( x 1 ) ) , ⋯   , ( x n − 1 , B ( x n − 1 ) ) } B:\{(x_0,B(x_0)), (x_1,B(x_1)),\cdots,(x_{n-1}, B(x_{n-1}))\} B{ (x0,B(x0)),(x1,B(x1)),,(xn1,B(xn1))}
C : { ( x 0 , A ( x 0 ) B ( x 0 ) ) , ( x 1 , A ( x 1 ) B ( x 1 ) ) , ⋯   , ( x n − 1 , A ( x n − 1 ) B ( x n − 1 ) ) } C:\{(x_0,A(x_0)B(x_0)), (x_1,A(x_1)B(x_1)),\cdots,(x_{n-1}, A(x_{n-1})B(x_{n-1}))\} C{ (x0,A(x0)B(x0)),(x1,A(x1)B(x1)),,(xn1,A(xn1)B(xn1))}
其中 C C C中的值是直接multiplied pointwise的。

进行快速多项式乘法运算的主要步骤为:

  • 1)Evaluate polynomials at all n n n points。
  • 2)Perform fast pointwise multiplication in the evaluation representation O ( n ) O(n) O(n)
  • 3)将结果再转换为系数表示法(通过IFFT)。

主要难点在于高效的对多项式进行evaluate和插值。
直观地,evaluate a polynomial at n n n points需要 O ( n 2 ) O(n^2) O(n2)次运算(每个point采用Horner’s rule需 O ( n ) O(n) O(n)次运算, n n n个point对应为 O ( n 2 ) O(n^2) O(n2)次运算):
[ A ( 1 ) A ( w ) A ( w 2 ) ⋮ A ( w n − 1 ) ] = [ 1 1 1 ⋯ 1 1 w w 2 ⋯ w n − 1 1 w 2 w 2 ⋅ 2 ⋯ w 2 ⋅ ( n − 1 ) ⋮ ⋮ ⋮ ⋮ 1 w n − 1 w 2 ( n − 1 ) ⋯ w ( n − 1 ) 2 ] ⋅ [ a 0 a 1 a 2 ⋮ a n − 1 ] \begin{bmatrix} A(1) \\ A(w) \\ A(w^2) \\ \vdots \\ A(w^{n-1}) \end{bmatrix} = \begin{bmatrix} 1& 1 & 1 & \cdots & 1\\ 1& w & w^2 & \cdots & w^{n-1} \\ 1& w^2 & w^{2\cdot 2} & \cdots & w^{2\cdot (n-1)} \\ \vdots & \vdots & \vdots & &\vdots \\ 1 & w^{n-1} & w^{2(n-1)} & \cdots & w^{(n-1)^2} \end{bmatrix} \cdot \begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ \vdots \\ a_{n-1} \end{bmatrix} A(1)A(w)A(w2)A(wn1)=11111ww2wn11w2w22w2(n1)1wn1w2(n1)w(n1)2a0a1a2an1

可将以上多项式表示为:
A ^ = V w ⋅ A \hat{\mathbf{A}}=\mathbf{V}_w\cdot \mathbf{A} A^=VwA
其中 A ^ \hat{\mathbf{A}} A^可称为 A \mathbf{A} A的离散傅里叶变换(DFT), V w \mathbf{V}_w Vw称为Vandermonde matrix。

3.2 The (radix-2) Cooley-Tukey algorithm

将size为 n n n的DFT分为2个交错的size为 n / 2 n/2 n/2的DFT。
对于多项式 A ( X ) = a 0 + a 1 X + a 2 X 2 + ⋯ + a n − 1 X n − 1 A(X)=a_0+a_1X+a_2X^2+\cdots+a_{n-1}X^{n-1} A(X)=a0+a1X+a2X2++an1Xn1,可将其分为奇数项和偶数项:
A e v e n = a 0 + a 2 X + ⋯ + a n − 2 X n / 2 − 1 A_{even}=a_0+a_2X+\cdots+a_{n-2}X^{n/2-1} Aeven=a0+a2X++an2Xn/21
A o d d = a 1 + a 3 X + ⋯ + a n − 1 X n / 2 − 1 A_{odd}=a_1+a_3X+\cdots +a_{n-1}X^{n/2-1} Aodd=a1+a3X++an1Xn/21
然后基于此,有 A ( X ) = A e v e n ( X 2 ) + X A o d d ( X 2 ) A(X)=A_{even}(X^2)+XA_{odd}(X^2) A(X)=Aeven(X2)+XAodd(X2)

A ( X ) A(X) A(X) evaluate at points w n i , w n n 2 + i w_n^i, w_n^{\frac{n}{2}+i} wni,wn2n+i,其中 i ∈ [ 0 , ⋯   , n 2 − 1 ] i\in[0,\cdots,\frac{n}{2}-1] i[0,,2n1],有:
A ( w n i ) = A e v e n ( ( w n i ) 2 ) + w n i A o d d ( ( w n i ) 2 ) A(w_n^i)=A_{even}((w_n^i)^2)+w_n^iA_{odd}((w_n^i)^2) A(wni)=Aeven((wni)2)+wniAodd((wni)2)
A ( w n n 2 + i ) = A e v e n ( ( w n n 2 + i ) 2 ) + w n n 2 + i A o d d ( ( w n n 2 + i ) 2 ) = A e v e n ( ( − w n i ) 2 ) − w n i A o d d ( ( − w n i ) 2 ) 【 ← ( 根 据 n e g a t i o n l e m m a ) 】 = A e v e n ( ( w n i ) 2 ) − w n i A o d d ( ( w n i ) 2 ) A(w_n^{\frac{n}{2}+i})=A_{even}((w_n^{\frac{n}{2}+i})^2)+w_n^{\frac{n}{2}+i}A_{odd}((w_n^{\frac{n}{2}+i})^2)\\ =A_{even}((-w_n^i)^2)-w_n^iA_{odd}((-w_n^i)^2) 【\leftarrow (根据negation lemma)】\\ =A_{even}((w_n^i)^2)-w_n^iA_{odd}((w_n^i)^2) A(wn2n+i)=Aeven((wn2n+i)2)+wn2n+iAodd((wn2n+i)2)=Aeven((wni)2)wniAodd((wni)2)(negationlemma)=Aeven((wni)2)wniAodd((wni)2)

由上可看出,两者存在一定的对称性。仅需evaluate A e v e n A_{even} Aeven A o d d A_{odd} Aodd over half the domain ( w n 0 ) 2 , ( w n ) 2 , ⋯   , ( w n n 2 − 1 ) 2 (w_n^0)^2,(w_n)^2,\cdots,(w_n^{\frac{n}{2}-1})^2 (wn0)2,(wn)2,,(wn2n1)2,其中 i = [ 0 , ⋯   , n 2 − 1 ] i=[0,\cdots, \frac{n}{2}-1] i=[0,,2n1](根据halving lemma)。
即意味着,可将length- n n n DFT 转换为 2个length- n 2 \frac{n}{2} 2n DFTs。

n = 2 k n=2^k n=2k时(为power of two,若不够可zero-padding),然后递归使用divide-and-conquer策略。根据Master Theorem,相应的evaluation算法仅需 O ( n log ⁡ 2 n ) O(n\log_2n) O(nlog2n)次运算,也称为Fast Fourier Transform (FFT)。

3.3 Inverse FFT

至此,已完成polynomial evaluation 和 multiply pointwise,现在需要将 C ( X ) = A ( X ) ⋅ B ( X ) C(X)=A(X)\cdot B(X) C(X)=A(X)B(X)由 点值表示法 转换为 系数表示法。仅需,在点值表示法 上进行FFT运算:

  • 1)将Vandermonde matrix中的 w i w^i wi替换为 w − i w^{-i} wi
  • 2)将最终结果乘以 1 / n 1/n 1/n

即:
A = 1 n V w − 1 ⋅ A ^ \mathbf{A}=\frac{1}{n}\mathbf{V}_{w^{-1}}\cdot \hat{\mathbf{A}} A=n1Vw1A^

IFFT与FFT具有类似的格式,详细参见 The Fast Fourier Transform and Polynomial Multiplication

4. The Schwartz-Zippel lemma

The Schwartz-Zippel lemma表达的是“different polynomials are different at most points”:
p ( x 1 , x 2 , ⋯   , x n ) p(x_1,x_2,\cdots,x_n) p(x1,x2,,xn)为nonzero polynomial of n n n variables with degree d d d,令 S S S为a finite set of numbers with at least d d d elements in it。若从 S S S中随机选出 α 1 , α 2 , ⋯   , α n \alpha_1,\alpha_2,\cdots,\alpha_n α1,α2,,αn,则有:
Pr [ p ( α 1 , α 2 , ⋯   , α n ) = 0 ] ≤ d ∣ S ∣ \text{Pr}[p(\alpha_1,\alpha_2,\cdots,\alpha_n)=0]\leq \frac{d}{|S|} Pr[p(α1,α2,,αn)=0]Sd

对于degree为 d d d的单变量非零多项式 p ( X ) p(X) p(X),即意味着最多有 d d d个根。

The Schwartz-Zippel lemma用于polynomial equality testing:
已知2个多变量多项式, p 1 ( x 1 , ⋯   , x n ) p_1(x_1,\cdots, x_n) p1(x1,,xn)的degree为 d 1 d_1 d1 p 2 ( x 1 , ⋯   , x n ) p_2(x_1,\cdots,x_n) p2(x1,,xn)的degree为 d 2 d_2 d2,可随机选择 α 1 , ⋯   , α n ← S \alpha_1,\cdots,\alpha_n\leftarrow S α1,,αnS,其中 ∣ S ∣ ≥ ( d 1 + d 2 ) |S|\geq (d_1+d_2) S(d1+d2),测试 p 1 ( α 1 , ⋯   , α n ) − p 2 ( α 1 , ⋯   , α n ) = 0 p_1(\alpha_1,\cdots,\alpha_n)-p_2(\alpha_1,\cdots,\alpha_n)=0 p1(α1,,αn)p2(α1,,αn)=0是否成立。若 p 1 , p 2 p_1,p_2 p1,p2这两个多项式完全相同,必然成立。若两者不同,则该等式成立的概率不高于 m a x ( d 1 , d 2 ) ∣ S ∣ \frac{max(d_1,d_2)}{|S|} Smax(d1,d2)

5. Vanishing polynomial

对于order n n n multiplicative subgroup H \mathcal{H} H with primitive root of unity w w w,对于所有的 w i ∈ H , i ∈ [ n − 1 ] w^i\in\mathcal{H},i\in[n-1] wiH,i[n1],有 ( w i ) n = ( w n ) i = ( w 0 ) i = 1 (w^i)^n=(w^n)^i=(w^0)^i=1 (wi)n=(wn)i=(w0)i=1,换句话说:
Z H ( X ) = X n − 1 = ( X − w 0 ) ( X − w 1 ) ( X − w 2 ⋯ ( X − w n − 1 ) Z_H(X)=X^n-1=(X-w^0)(X-w^1)(X-w^2\cdots(X-w^{n-1}) ZH(X)=Xn1=(Xw0)(Xw1)(Xw2(Xwn1)
H \mathcal{H} H中的每个元素为 Z H ( X ) Z_H(X) ZH(X)的根。称 Z H ( X ) Z_H(X) ZH(X)为the vanishing polynomial over H \mathcal{H} H,因为其evaluate to zero on all elements of H \mathcal{H} H

vanishing polynomial 用于验证polynomial constraints时将特别实用。如,为了验证 A ( X ) + B ( X ) = C ( X ) A(X)+B(X)=C(X) A(X)+B(X)=C(X) over H \mathcal{H} H,可改为验证 A ( X ) + B ( X ) − C ( X ) A(X)+B(X)-C(X) A(X)+B(X)C(X)为 some multiple of Z H ( X ) Z_H(X) ZH(X)。换句话说,若将constraints除以vanishing polynomial,仍然可产生某多项式 A ( X ) + B ( X ) − C ( X ) Z H ( X ) = H ( X ) \frac{A(X)+B(X)-C(X)}{Z_H(X)}=H(X) ZH(X)A(X)+B(X)C(X)=H(X),则可说明 A ( X ) + B ( X ) − C ( X ) = 0 A(X)+B(X)-C(X)=0 A(X)+B(X)C(X)=0 over H \mathcal{H} H

6. Lagrange basis functions

多项式通常以monomial basis(如 X , X 2 , ⋯   , X n X,X^2,\cdots,X^n X,X2,,Xn)来表示,但是,当机遇multiplicative subgroup of order n n n时,采用Lagrange basis表示更自然。

对于order- n n n multiplicative subgroup H \mathcal{H} H with primitive root of unity w w w,该subgroup的Lagrange basis为a set of functions { L i } i = 0 n − 1 \{\mathcal{L}_i\}_{i=0}^{n-1} { Li}i=0n1,其中:
L i ( w j ) = { 1 if  i = j , 0 otherwise. \mathcal{L}_i(w^j)=\left\{\begin{matrix} 1 & \text{if } i=j,\\ 0 & \text{otherwise.} \end{matrix}\right. Li(wj)={ 10if i=j,otherwise.

可将 L i ( w j ) \mathcal{L}_i(w^j) Li(wj)更精简的表示为 δ i j \delta_{ij} δij,其中 δ \delta δ为the Kronecker delta function。

至此,可将多项式表示为a linear combination of Lagrange basis functions:
A ( X ) = ∑ i = 0 n − 1 a i L i ( X ) , X ∈ H A(X)=\sum_{i=0}^{n-1}a_i\mathcal{L}_i(X), X\in\mathcal{H} A(X)=i=0n1aiLi(X),XH
即等价为说, p ( X ) p(X) p(X) evaluates to a 0 a_0 a0 at w 0 w^0 w0, to a 1 a_1 a1 at w 1 w^1 w1, to a 2 a_2 a2 at w 2 w^2 w2等等。

当基于multiplicative subgroup时,Lagrange basis function具有很方便的sparse表示形式:
L i ( X ) = c i ⋅ ( X n − 1 ) X − w i \mathcal{L}_i(X)=\frac{c_i\cdot (X^n-1)}{X-w^i} Li(X)=Xwici(Xn1)
其中 c i c_i ci为barycentric weight。详细可参看Barycentric Lagrange Interpolation*
i = 0 i=0 i=0时,有 c = 1 / n ⇒ L 0 ( X ) = 1 n ( X n − 1 ) X − 1 c=1/n\Rightarrow \mathcal{L}_0(X)=\frac{1}{n}\frac{(X^n-1)}{X-1} c=1/nL0(X)=n1X1(Xn1)

对于evaluation point set { x 0 , x 1 , ⋯   , x n − 1 } \{x_0,x_1,\cdots,x_{n-1}\} { x0,x1,,xn1},若不假设 x i x_i xi为multiplicative subgroup,仍可采用Lagrange 多项式 L i \mathcal{L}_i Li来表示:
L i ( X ) = ∏ j ≠ i X − x j x i − x j , i ∈ [ 0.. n − 1 ] \mathcal{L}_i(X)=\prod_{j\neq i}\frac{X-x_j}{x_i-x_j}, i\in[0..n-1] Li(X)=j=ixixjXxj,i[0..n1]

对于每个 X = x j ≠ x i X=x_j\neq x_i X=xj=xi,都有零分子项 ( x j − x j ) (x_j-x_j) (xjxj),从而使整个值为0。若 X = x i X=x_i X=xi,有 x i − x j x i − x j \frac{x_i-x_j}{x_i-x_j} xixjxixj,结果为1。从而可实现desired Kronecker delta behaviour L i ( x j ) = δ i j \mathcal{L}_i(x_j)=\delta_{ij} Li(xj)=δij on the set { x 0 , x 1 , ⋯   , x n − 1 } \{x_0,x_1,\cdots,x_{n-1}\} { x0,x1,,xn1}

6.1 Lagrange interpolation

已知多项式的点值表示:
A : { ( x 0 , A ( x 0 ) ) , ( x 1 , A ( x 1 ) ) , ⋯   , ( x n − 1 , A ( x n − 1 ) ) } A: \{(x_0,A(x_0)), (x_1,A(x_1)), \cdots, (x_{n-1},A(x_{n-1}))\} A:{ (x0,A(x0)),(x1,A(x1)),,(xn1,A(xn1))}

可基于Lagrange basis来构建其 系数表示:
A ( X ) = ∑ i = 0 n − 1 A ( x i ) L i ( X ) A(X)=\sum_{i=0}^{n-1}A(x_i)\mathcal{L}_i(X) A(X)=i=0n1A(xi)Li(X)
其中 X ∈ { x 0 , x 1 , ⋯   , x n − 1 } X\in\{x_0,x_1,\cdots,x_{n-1}\} X{ x0,x1,,xn1}

参考资料

[1] Halo2 背景资料之Polynomials

猜你喜欢

转载自blog.csdn.net/mutourend/article/details/120366466