KCF算法公式推导

1 最小二乘法求解矩阵形式推导

  • 设训练样本集为 ( x i , y i ) (x_i,y_i) 一元(向量)线性回归可表示为: f ( x i ) = w T x i + b f(x_i)=w^T\vec{x_i}+b
  • 若把样本输入 x i \vec{x_i} 表示成矩阵形式(设有n个样本输入,每个输入有d个特性),有: X = [ x 11 x 12 . . . x 1 d 1 x 21 x 22 . . . x 2 d 1 . . . . . . . . . . . . . . . x n 1 x n 2 . . . x n d 1 ] = [ x 1 T 1 x 2 T 1 . . . . . . x n T 1 ] \boldsymbol X=\begin{bmatrix}x_{11} &x_{12}&...&x_{1d}&1\\x_{21} &x_{22}&...&x_{2d}&1\\...&...&...&...&...\\x_{n1} &x_{n2}&...&x_{nd}&1\end{bmatrix}=\begin{bmatrix}{x_1^T}&&1\\{x_2^T}&&1\\...&&...\\{x_n^T}&&1\end{bmatrix} 其中1表示偏置 b b w ~ = ( w ; b ) {\widetilde w}=(\vec w;b)
  • 多元线性回归可表示为: y = X w ~ \vec y=\boldsymbol X{\widetilde w} 其中 y = ( y 1 , y 2 , . . . , y n ) \vec y=(y_1,y_2,...,y_n) 表示样本标签
  • 最小二乘法可表示为 m i n w y X w ~ 2 2 = m i n w ( y X w ~ ) T ( y X w ~ ) {\underset {w}{min}||\vec y - \boldsymbol X{\widetilde w}||_2}^2=\underset{w}{min}(\vec y - \boldsymbol X{\widetilde w})^T(\vec y - \boldsymbol X{\widetilde w}) L w = 1 2 ( y X w ~ ) T ( y X w ~ ) = 1 2 ( y T w ~ T X T ) ( y X w ~ ) = 1 2 ( y T y y T X w ~ w ~ T X T y + w ~ T X T X w ~ ) \begin{aligned} L_w & =\frac{1}{2}(\vec y - \boldsymbol X{\widetilde w})^T(\vec y - \boldsymbol X{\widetilde w})\\ &=\frac{1}{2}({\vec y} ^T-{\widetilde w}^T{\boldsymbol X}^T)(\vec y - \boldsymbol X{\widetilde w})\\ &=\frac{1}{2}({\vec y} ^T{\vec y} -{\vec y} ^T{\boldsymbol X}{\widetilde w}-{\widetilde w}^T{\boldsymbol X}^T{\vec y} + {\widetilde w}^T{\boldsymbol X}^T{\boldsymbol X}{\widetilde w}) \end{aligned} L w w = 1 2 [ ( y T X ) T X T y + X T X w ~ + w ~ T X T X ) T ] = 1 2 ( 2 X T y + 2 X T X w ~ ) \begin{aligned} \frac {\partial L_w} {\partial w} &=\frac{1}{2}[-(\vec y ^ T\boldsymbol X )^T-\boldsymbol X ^ T \vec y + \boldsymbol X ^ T \boldsymbol X {\widetilde w} + {\widetilde w}^T\boldsymbol X ^ T \boldsymbol X )^T] \\ &=\frac{1}{2}(-2\boldsymbol X^T \vec y+2\boldsymbol X ^T \boldsymbol X {\widetilde w} ) \end{aligned} 注: ( X θ ) θ = X T \frac{\partial (\boldsymbol X \theta)}{\partial \theta}=\boldsymbol X ^ T ( θ T X ) θ T = X T \frac{\partial (\theta ^ T\boldsymbol X )}{\partial \theta ^ T}=\boldsymbol X ^ T ( θ T X ) θ = X \frac{\partial (\theta ^T \boldsymbol X)}{{\partial \theta }}=\boldsymbol X 即:若上下向量一样,则结果为矩阵的转置,若互为转置,则结果为原矩阵
    L w w = 0 \frac {\partial L_w} {\partial w}=0 X T y = 2 X T X w ~ \Rightarrow \boldsymbol X^T \vec y=2\boldsymbol X ^T \boldsymbol X {\widetilde w} w ~ = ( X T X ) 1 X T y \Rightarrow {\widetilde w} = (\boldsymbol X^T \boldsymbol X)^{-1}\boldsymbol X^T \vec y

2 岭回归求解公式

  • 当数据特征较样本数多时,即样本数不足(d>n:未知数个数大于方程个数),输入数据不是满秩矩阵,这将导致非满秩矩阵 X T X \boldsymbol X^T \boldsymbol X 在求逆时会发生问题。岭回归是在 X T X \boldsymbol X^T \boldsymbol X 上加一个正则项 λ I \lambda \boldsymbol I 从而使矩阵非奇异,进而能对 X T X + λ I \boldsymbol X^T \boldsymbol X+\lambda \boldsymbol I 求逆,即: w ~ = ( X T X + λ I ) 1 X T y \Rightarrow {\widetilde w} = (\boldsymbol X^T \boldsymbol X+\lambda \boldsymbol I)^{-1}\boldsymbol X^T \vec y λ \lambda 是正则化系数,可提供分类器泛化性,防止过拟合。当 λ \lambda 较小时,系数与普通矩阵一样,而较大时,使得求解的参数都接近于0
    注:岭回归的最小二乘法可表示为: m i n w ( y X w ~ 2 2 + λ w ~ 2 2 ) {\underset {w}{min}(||\vec y - \boldsymbol X{\widetilde w}||_2}^2+{\lambda||\widetilde w||_2}^2)
  • 由于KCF算法是在傅里叶域内计算,牵涉到复数矩阵,所以我们将结果都统一写成复数域中形式 (1) w ~ = ( X H X + λ I ) 1 X H y \Rightarrow {\widetilde w} = (\boldsymbol X^H \boldsymbol X+\lambda \boldsymbol I)^{-1}\boldsymbol X^H \vec y \tag{1}
    其中 X \boldsymbol X 表示由基样本生成的循环矩阵 X H \boldsymbol X ^H 表示其复数的共轭转置

3 循环矩阵在傅氏空间对角化

  • 循环矩阵公式表达及直观表示 X = C ( x ) = [ x 1 x 2 x 3 . . . x n x n x 1 x 2 . . . x n 1 x n 1 x n x 1 . . . x n 2 . . . . . . . . . . . . . . . x 2 x 3 x 4 . . . x 1 ] \boldsymbol X=\boldsymbol C(\vec x)=\begin{bmatrix} x_1& x_2 & x_3&...&x_n \\ x_n&x_1& x_2 &...&x_{n-1}\\ x_{n-1}&x_{n}& x_1 &...&x_{n-2}\\ ...&...&...&...&...&\\ x_2 & x_3&x_4&...&x_1\\ \end{bmatrix}
    在这里插入图片描述
  • 任何循环矩阵可以被傅里叶变换矩阵对角化,即 (2) X = C ( x ) = F d i a g ( x ^ ) F H \boldsymbol X=\boldsymbol C(x)=\boldsymbol F diag(\widehat x)\boldsymbol F ^ H \tag{2} 其中 x ^ X \widehat x由\boldsymbol X 的第1行元素(即基样本)经傅里叶变换后得到, F \boldsymbol F 是离散傅里叶变换矩阵,是一个常量 F = 1 n [ 1 1 . . . 1 1 1 ω . . . ω n 2 ω n 1 1 ω 2 . . . ω 2 ( n 2 ) ω 2 ( n 1 ) . . . . . . . . . . . . . . . 1 ω n 1 . . . ω ( n 1 ) ( n 2 ) ω ( n 1 ) 2 ] \boldsymbol F=\frac {1}{\sqrt n}\begin{bmatrix}1&1&...&1&1\\1& \omega &...&\omega ^{n-2}&\omega ^{n-1}\\1&\omega ^ 2 &...&\omega ^{2(n-2)}&\omega ^{2(n-1)}\\...&...&...&...&...\\1&\omega ^{n-1}&...&\omega ^{(n-1)(n-2)}&\omega ^{(n-1)^2}\end{bmatrix}
  • 要计算 ( 1 ) (1) 式,先求 X H X \boldsymbol X^H \boldsymbol X ,将 ( 2 ) (2) 式代可得 (3) X H X = ( F d i a g ( x ^ ) F H ) H F d i a g ( x ^ ) F H = ( F H ) H d i a g ( x ^ ) H F H F d i a g ( x ^ ) F H = F d i a g ( x ^ ) F H F d i a g ( x ^ ) F H = F d i a g ( x ^ x ^ ) F H \begin{aligned} \boldsymbol X^H \boldsymbol X & =(\boldsymbol F diag(\widehat x)\boldsymbol F^H)^H \boldsymbol F diag(\widehat x)\boldsymbol F^H\\ &=(\boldsymbol F^H)^H diag(\widehat x)^H\boldsymbol F ^H \boldsymbol F diag(\widehat x)\boldsymbol F^H \\ &=\boldsymbol F diag(\widehat x ^ *)\boldsymbol F^H \boldsymbol F diag(\widehat x)\boldsymbol F^H\\ &= \boldsymbol F diag(\widehat x ^ * \odot \widehat x ) \boldsymbol F^H \tag{3} \end{aligned} 1.其中 x ^ \widehat x ^ * x ^ \widehat x 是共轭关系
    2. x ^ \widehat x 表示 x \vec x 的离散傅里叶变换,即 x ^ = F ( x ) = n F x \widehat x=\mathcal F(\vec x)=\sqrt n \boldsymbol F x
    3. ( A B C ) H = C H B H A H (\boldsymbol A\boldsymbol B\boldsymbol C)^H=\boldsymbol C^H \boldsymbol B^H \boldsymbol A^H
    4. F H F = F F H = I \boldsymbol F^H \boldsymbol F = \boldsymbol F \boldsymbol F^H = \boldsymbol I
    5. d i a g { ( x ^ ) H } = d i a g { ( x ^ ) T } = d i a g ( x ^ ) diag\left\{(\widehat x)^H\right\}=diag\left\{(\widehat x ^ *)^T\right\}=diag(\widehat x ^ *) ----对角矩阵的转置不变
    6. d i a g ( A ) d i a g ( B ) = d i a g ( A B ) diag(\boldsymbol A) diag(\boldsymbol B)=diag(\boldsymbol A \odot \boldsymbol B) ----符号 \odot 表示矩阵element-wise的乘法
  • ( 3 ) (3) 代入 ( 1 ) (1) 得: (4) w ~ = ( X H X + λ I ) 1 X H y = ( F d i a g ( x ^ x ^ ) F H + λ F I F H ) 1 X H y = ( F d i a g ( x ^ x ^ ) F H + F d i a g ( λ ) F H ) 1 X H y = ( F d i a g ( x ^ x ^ + λ ) F H ) 1 X H y = [ ( F H ) 1 d i a g ( x ^ x ^ + λ ) 1 F 1 ] X H y = [ F d i a g ( 1 x ^ x ^ + λ ) F 1 ] X H y \begin{aligned} {\widetilde w} &= (\boldsymbol X^H \boldsymbol X+\lambda \boldsymbol I)^{-1}\boldsymbol X^H \vec y\\ &=(\boldsymbol F diag(\widehat x ^ * \odot \widehat x ) \boldsymbol F^H+\lambda \boldsymbol F \boldsymbol I \boldsymbol F^H)^{-1}\boldsymbol X^H \vec y\\ &=(\boldsymbol F diag(\widehat x ^ * \odot \widehat x ) \boldsymbol F^H+\boldsymbol F diag(\lambda)\boldsymbol F^H)^{-1}\boldsymbol X^H \vec y\\ &=(\boldsymbol F diag(\widehat x ^ * \odot \widehat x+\lambda ) \boldsymbol F^H)^{-1}\boldsymbol X^H \vec y\\ &=[(\boldsymbol F^H)^{-1}diag(\widehat x ^ * \odot \widehat x+\lambda ) ^{-1}\boldsymbol F^{-1}]\boldsymbol X^H \vec y\\ &=[\boldsymbol Fdiag(\frac {1}{\widehat x ^ * \odot \widehat x+\lambda} ) \boldsymbol F^{-1}]\boldsymbol X^H \vec y \tag{4} \end{aligned} 1. λ I = λ F F H = λ F I F H \lambda \boldsymbol I = \lambda \boldsymbol F \boldsymbol F^H=\lambda \boldsymbol F \boldsymbol I \boldsymbol F^H
    2. A B C + A D C = A ( B + D ) C \boldsymbol A \boldsymbol B \boldsymbol C+\boldsymbol A\boldsymbol D\boldsymbol C=\boldsymbol A(\boldsymbol B +\boldsymbol D)\boldsymbol C
    3. F F H = I F H = F 1 \boldsymbol F \boldsymbol F^H=\boldsymbol I\Rightarrow \boldsymbol F^H=\boldsymbol F^{-1}
    4. F F H = I F = ( F H ) 1 \boldsymbol F \boldsymbol F^H=\boldsymbol I\Rightarrow \boldsymbol F=(\boldsymbol F^H)^{-1}
    5. d i a g ( λ i ) 1 = d i a g ( 1 λ i ) diag(\lambda_i)^{-1}=diag(\frac{1}{\lambda_i})
  • ( 2 ) (2) 式代入 ( 4 ) (4) 式得: (5) w ~ = [ F d i a g ( 1 x ^ x ^ + λ ) F 1 ] [ F d i a g ( x ^ ) F H ] H y = [ F d i a g ( 1 x ^ x ^ + λ ) F 1 ] [ ( F H ) H d i a g ( x ^ ) H F H ] y = F d i a g ( 1 x ^ x ^ + λ ) F 1 F d i a g ( x ^ ) H F H y = F d i a g ( 1 x ^ x ^ + λ ) I d i a g ( x ^ ) H F H y = F d i a g ( 1 x ^ x ^ + λ ) d i a g ( x ^ ) H F H y = F d i a g ( x ^ H x ^ x ^ + λ ) F H y = F d i a g ( x ^ x ^ x ^ + λ ) F H y \begin{aligned} {\widetilde w} &=[\boldsymbol Fdiag(\frac {1}{\widehat x ^ * \odot \widehat x+\lambda} ) \boldsymbol F^{-1}][\boldsymbol F diag(\widehat x)\boldsymbol F ^ H ]^H \vec y\\ &=[\boldsymbol Fdiag(\frac {1}{\widehat x ^ * \odot \widehat x+\lambda} ) \boldsymbol F^{-1}][(\boldsymbol F^H)^Hdiag(\widehat x)^H\boldsymbol F^H]\vec y\\ &=\boldsymbol Fdiag(\frac {1}{\widehat x ^ * \odot \widehat x+\lambda} ) \boldsymbol F^{-1}\boldsymbol Fdiag(\widehat x)^H\boldsymbol F^H \vec y\\ &=\boldsymbol Fdiag(\frac {1}{\widehat x ^ * \odot \widehat x+\lambda} ) \boldsymbol I diag(\widehat x)^H\boldsymbol F^H \vec y\\ &=\boldsymbol Fdiag(\frac {1}{\widehat x ^ * \odot \widehat x+\lambda} ) diag(\widehat x)^H\boldsymbol F^H \vec y\\ &=\boldsymbol Fdiag(\frac {\widehat x^H}{\widehat x ^ * \odot \widehat x+\lambda} ) \boldsymbol F^H \vec y\\ &=\boldsymbol Fdiag(\frac {\widehat x^*}{\widehat x ^ * \odot \widehat x+\lambda} ) \boldsymbol F^H \vec y\tag{5} \end{aligned}
  • 继续推导 x ^ = F ( x ) x = F 1 ( x ^ ) ( F 1 ) \widehat x=\mathcal F(x)\Rightarrow x=\mathcal F^{-1}(\widehat x) (\mathcal F^{-1}表示傅里叶逆变换) (6) X = C ( x ) = C ( F 1 ( x ^ ) ) = F d i a g ( x ^ ) F H \begin{aligned}\boldsymbol X=\boldsymbol C(\vec x)&=\boldsymbol C(\mathcal F^{-1}(\widehat x) )\\ &=\boldsymbol F diag(\widehat x)\boldsymbol F ^ H \tag{6} \end{aligned}
  • 结合 ( 5 ) (5) 式和 ( 6 ) (6) 式得: (7) w ~ = C [ F 1 ( x ^ x ^ x ^ + λ ) ] y {\widetilde w} = \boldsymbol C[\mathcal F^{-1} (\frac {\widehat x^*}{\widehat x ^ * \odot \widehat x+\lambda})]\vec y\tag{7}
  • 利用循环卷积性质: (8) F ( X y ) = F [ C ( x ) y ] = x ^ y ^ = F ( x ) F ( y ) \begin{aligned}\mathcal F(\boldsymbol X \vec y)&=\mathcal F[\boldsymbol C(\vec x)\vec y]\\ &=\widehat x ^ * \odot \widehat y\\ &=\mathcal F^*(\vec x)\odot\mathcal F(\vec y)\tag{8} \end{aligned}
  • 结合 ( 7 ) (7) 式和 ( 8 ) (8) 式得: F ( w ~ ) = F ( C [ F 1 ( x ^ x ^ x ^ + λ ) ] y ) = F [ F 1 ( x ^ x ^ x ^ + λ ) ] F ( y ) = ( x ^ x ^ x ^ + λ ) y ^ = ( x ^ ) ( x ^ x ^ + λ ) y ^ = x ^ y ^ x ^ x ^ + λ \begin{aligned}\mathcal F({\widetilde w})&=\mathcal F(\boldsymbol C[\mathcal F^{-1} (\frac {\widehat x^*}{\widehat x ^ * \odot \widehat x+\lambda})]\vec y)\\ &=\mathcal F^*[\mathcal F^{-1} (\frac {\widehat x^*}{\widehat x ^ * \odot \widehat x+\lambda})]\mathcal F(\vec y)\\ &=(\frac {\widehat x^*}{\widehat x ^ * \odot \widehat x+\lambda})^*\odot \widehat y\\ &=\frac {(\widehat x^*)^*}{(\widehat x ^ * \odot \widehat x+\lambda)^*}\odot \widehat y\\ &=\frac {\widehat x\odot \widehat y}{\widehat x ^ * \odot \widehat x+\lambda} \end{aligned} (9) F ( w ~ ) = w ^ = x ^ y ^ x ^ x ^ + λ \Rightarrow \mathcal F({\widetilde w})=\widehat w = \frac {\widehat x\odot \widehat y}{\widehat x ^ * \odot \widehat x+\lambda}\tag{9} 1. w ^ \widehat w 表示 w ~ \widetilde w 的傅里叶变换, w ~ {\widetilde w} 表示 ( w ; b ) (\vec w;b) 即实域空间的分类器参数
    2.由于 x ^ \widehat x ^ * x ^ \widehat x 的共轭,所以 x ^ x ^ \widehat x ^ * \odot \widehat x 是实数,其共轭为本身
    3. \odot 表示对应元素相乘,——表示对应元素相除
  • 由上述推导可得分类器参数 w ~ {\widetilde w} (10) w ~ = F 1 ( w ^ ) = F 1 ( x ^ y ^ x ^ x ^ + λ ) {\widetilde w}={\mathcal F}^{-1}(\widehat w)={\mathcal F}^{-1}(\frac {\widehat x\odot \widehat y}{\widehat x ^ * \odot \widehat x+\lambda})\tag{10} 1. x ^ y ^ \widehat x,\widehat y 分别表示基样本、标签的傅里叶变换
    2.将矩阵运算在傅里叶域转化为点积运算,成其是矩阵求逆运算,大大提高了计算速度
    3.上式为线性回归下利用循环矩阵其滤波器的计算公式

4. 非线性回归滤波器求解

  • 求解方式:找到一个非线性映射函数 φ ( x ) \varphi(x) 使映射后的样本在新空间中线性可分,那么在新空间中就可以使用脊回归来寻找一个分类器 f ( x i ) = w T φ ( x i ) f(\boldsymbol x_i)=\boldsymbol w^T\varphi(\boldsymbol x_i) ,其中 φ ( x i ) \varphi \boldsymbol {(x_i)} 表示对样本 x i \boldsymbol x_i 通过非线性映射函数 φ \varphi 进行变换。
  • 将线性滤波器的解 w \boldsymbol w 用样本的线性组合来表示: w = i α i φ ( x i ) \boldsymbol w=\sum_i \alpha_i {\varphi(\boldsymbol x_i)} 则最优化问题不再是求变量 w w ,而是 α \alpha 。该表达式是在对偶空间中进行的,具体参考SVM相关理论。
  • 线性条件下的回归问题,经过非线性变换后为: f ( z ) = w T z = ( i n α i φ ( x i ) ) T . φ ( z ) = i n α i φ T ( x i ) φ ( z ) = i n α i K ( x i , z ) \begin{aligned} f(z)&=\boldsymbol w^T\boldsymbol z\\ &=(\sum_i^n \alpha_i {\varphi(\boldsymbol x_i)})^T.\varphi(\boldsymbol z)\\ &=\sum_i^n {\alpha_i}\varphi^T(\boldsymbol x_i)\varphi(\boldsymbol z)\\ &=\sum_i^n\alpha_i\mathcal K(\boldsymbol x_i,\boldsymbol {z}) \end{aligned} 1. K ( x , x ) = φ T ( x ) φ ( x ) \mathcal K(\boldsymbol x,\boldsymbol x')=\varphi^T{(\boldsymbol x)}{\varphi(\boldsymbol x')} K \mathcal K 表示核函数,如高斯或多项式
    2. K i j = K ( x i , x j ) K_{ij}=\mathcal K(\boldsymbol x_i,\boldsymbol x_j) K K n × n n \times n 的核矩阵,表示所有样本对的点乘操作
    3. n n 表示训练样本个数
  • 核函数下岭回归的解为: (11) α = ( K + λ I ) 1 y \boldsymbol \alpha=(K+\lambda I)^{-1}\boldsymbol y\tag{11} 1. α \boldsymbol \alpha 为线性组合系数 α i \alpha_i 组成的向量
    2. K K 为核矩阵,其各个元素 K i , j K_{i,j} 如前所述
    3. λ \lambda 为正则化系数
    4.推导过程参考Kernel ridge Regression
  • 定理 1. 给定循环数据 C ( x ) C(\boldsymbol x) ,对于任意的变换矩阵 M M ,如果核函数 K \mathcal K 满足 K ( x , x ) = K ( M x , M x ) \mathcal K(\boldsymbol x,\boldsymbol x')=\mathcal K(M\boldsymbol x,M\boldsymbol x') ,则核矩阵 K K 是循环矩阵,证明如下: K i j = K ( x , x ) = K ( P i x P j x ) \begin{aligned}K_{ij}&=\mathcal K(\boldsymbol x,\boldsymbol x')\\ &=\mathcal K({P^i \boldsymbol x},{P^j \boldsymbol x})\end{aligned} K ( M x , M x ) = K ( P i P i x , P i P j x ) = K ( x , P j i x ) = K ( x , P ( j i ) % n x ) = K i j \begin{aligned}\mathcal K(M\boldsymbol x,M\boldsymbol x')&=\mathcal K(P^{-i}P^i \boldsymbol x,{P^{-i}P^j \boldsymbol x})\\ &=\mathcal K(\boldsymbol x,P^{j-i}\boldsymbol x)\\ &=\mathcal K(\boldsymbol x,P^{(j-i) \% n}\boldsymbol x)=K_{ij} \end{aligned} 1.由上式可看出, K i , j K_{i,j} 只依赖于 ( j i ) (j-i) n n ,所以 K K 为循环矩阵
    2. P P 为置换矩阵,如: P = [ 0 0 0 . . . 1 1 0 0 . . . 0 0 1 0 . . . 0 . . . . . . . . . . . . . . . 0 0 1 . . . 0 ] P=\begin{bmatrix} 0&0&0&...&1\\ 1&0&0&...&0\\ 0&1&0&...&0\\ ...&...&...&...&...\\ 0&0&1&...&0 \end{bmatrix} x 1 = P 0 x = x = [ x 1 , x 2 , . . . , x n ] T \boldsymbol x_1=P^0\boldsymbol x=\boldsymbol x=[x_1,x_2,...,x_n]^T
    x 2 = P 1 x = [ x n , x 1 , . . . , x n 1 ] T \boldsymbol x_2=P^1\boldsymbol x=[x_n,x_1,...,x_{n-1}]^T

    x n = P n 1 x = [ x 2 , x 3 , . . . , x 1 ] T \boldsymbol x_n=P^{n-1}\boldsymbol x=[x_2,x_3,...,x_1]^T
  • K K 为循环矩阵,利用对角化性质,核滤波器表达式 ( 11 ) (11) 变换如下: (12) α = [ C ( k x x ) + λ I ] 1 y = [ F d i a g ( k ^ x x ) F H + λ I ] 1 y = [ F d i a g ( k ^ x x ) F H + λ F I F H ] 1 y = [ F d i a g ( k ^ x x ) F H + F d i a g ( λ ) F H ] 1 y = [ F d i a g ( k ^ x x + λ ) F H ] 1 y = ( F H ) 1 ( d i a g ( k ^ x x + λ ) 1 ) F 1 y = F d i a g ( k ^ x x + λ ) 1 F H y \begin{aligned} \boldsymbol \alpha&=[C({\boldsymbol k ^{\boldsymbol x \boldsymbol x})}+\lambda I]^{-1} \boldsymbol y\\ &=[Fdiag({\widehat \boldsymbol k ^{\boldsymbol x \boldsymbol x}})F^H+\lambda I]^{-1} \boldsymbol y\\ &=[Fdiag({\widehat \boldsymbol k ^{\boldsymbol x \boldsymbol x}})F^H+\lambda FIF^H]^{-1} \boldsymbol y\\ &=[Fdiag({\widehat \boldsymbol k ^{\boldsymbol x \boldsymbol x}})F^H+ Fdiag(\lambda)F^H]^{-1} \boldsymbol y\\ &=[Fdiag({\widehat \boldsymbol k ^{\boldsymbol x \boldsymbol x}+\lambda})F^H]^{-1} \boldsymbol y\\ &=(F^H)^{-1}(diag({\widehat \boldsymbol k ^{\boldsymbol x \boldsymbol x}+\lambda})^{-1})F^{-1} \boldsymbol y\\ &=Fdiag({\widehat \boldsymbol k ^{\boldsymbol x \boldsymbol x}+\lambda)}^{-1}F^H\boldsymbol y \tag{12} \end{aligned} 1. K = C ( k x x ) K=C({\boldsymbol k ^{\boldsymbol x \boldsymbol x}}) k x x {\boldsymbol k ^{\boldsymbol x \boldsymbol x}} 为矩阵 K K 的第一行
    2.由 F F H = I FF^H=I \Rightarrow F H = F 1 F^H=F^{-1}
  • ( 12 ) (12) 两边同时左乘 F H F^H F H α = F H F d i a g ( k ^ x x + λ ) 1 F H y = d i a g ( k ^ x x + λ ) 1 F H y \begin{aligned} F^H \boldsymbol \alpha&=F^HFdiag({\widehat \boldsymbol k ^{\boldsymbol x \boldsymbol x}+\lambda)}^{-1}F^H\boldsymbol y\\ &=diag({\widehat \boldsymbol k ^{\boldsymbol x \boldsymbol x}+\lambda)}^{-1}F^H\boldsymbol y \end{aligned} F H α F^H \boldsymbol \alpha 表示 α {\boldsymbol \alpha} 经傅里叶变换后的共轭转置,则 F H α = [ α ^ ] T F^H\boldsymbol \alpha=[\widehat{\boldsymbol \alpha}^*]^T ,则上式可转换为: [ α ^ ] T = d i a g ( k ^ x x + λ ) 1 F H y = d i a g ( 1 k ^ x x + λ ) [ y ^ ] T = [ d i a g ( 1 k ^ x x + λ ) y ^ ] T \begin{aligned} [\widehat{\boldsymbol \alpha}^*]^T &= diag({\widehat \boldsymbol k ^{\boldsymbol x \boldsymbol x}+\lambda)}^{-1}F^H\boldsymbol y\\ &=diag(\frac{1}{{\widehat \boldsymbol k ^{\boldsymbol x \boldsymbol x}+\lambda}})[{\widehat \boldsymbol y^*}]^T\\ &=[diag(\frac{1}{{\widehat \boldsymbol k ^{\boldsymbol x \boldsymbol x}+\lambda}}){{\widehat \boldsymbol y^*}}]^T \end{aligned} α ^ = d i a g ( 1 k ^ x x + λ ) y ^ \Rightarrow \widehat{\boldsymbol \alpha}^*=diag(\frac{1}{{\widehat \boldsymbol k ^{\boldsymbol x \boldsymbol x}+\lambda}}){{\widehat \boldsymbol y^*}} 1. d i a g ( λ ) 1 = d i a g ( 1 λ ) diag(\lambda)^{-1}=diag(\frac{1}{\lambda})
    2. F H y = y ^ F^H\boldsymbol y=\widehat \boldsymbol y^*
    由于一个对角矩阵与一个向量相乘,相当于元素级乘法,因此: α ^ = y ^ k ^ x x + λ \widehat{\boldsymbol \alpha}^*=\frac{{{\widehat \boldsymbol y^*}}}{{\widehat \boldsymbol k ^{\boldsymbol x \boldsymbol x}+\lambda}} 等式两边同时取共轭,得: (13) α ^ = y ^ k ^ x x + λ \widehat{\boldsymbol \alpha}=\frac{{{\widehat \boldsymbol y}}}{{\widehat \boldsymbol k ^{\boldsymbol x \boldsymbol x}+\lambda}}\tag{13} 更一般地,矩阵 K K 中每一行 K i x x = K ( x P i 1 x ) K_i^{\boldsymbol x \boldsymbol x'}=\mathcal K(\boldsymbol x',P^{i-1}\boldsymbol x) 1. i i 为第 K K 的第 i i 行,即在基样本上进行 i 1 i-1 次循环移位
    2. x x 表示基样本,即 K K 的第一行
    3. k ^ x x {\widehat \boldsymbol k ^{\boldsymbol x \boldsymbol x}} 表示 x x 在傅里叶域进行自相关

5 快速检测

  • 由式 ( 11 ) (11) 可得: (14) f ( z ) = K α f(\boldsymbol z) =K {\boldsymbol \alpha}\tag{14} 1.正则化仅用来求解 α \boldsymbol \alpha ,在回归时不需要正则项
    2. z {\boldsymbol z} 表示待检测的图像块
  • 在实用场景下, ( 14 ) (14) 可表示为: (15) f ( z ) = ( K z ) T α f(\boldsymbol z) =(K^{\boldsymbol z})^T \boldsymbol \alpha \tag{15} 1. α \boldsymbol \alpha 为训练好的分类器参数
    2. K z = C ( k x z ) = K ( P i 1 z , P j 1 x ) K^{\boldsymbol z}=C({\boldsymbol k ^{\boldsymbol x \boldsymbol z}})=\mathcal K(P^{i-1} \boldsymbol z,P^{j-1} \boldsymbol x) ,表示训练样本和待检测样本之间的核矩阵,是一个非对称矩阵
    3. x \boldsymbol x 表示待训练基样本, z \boldsymbol z 表示待检测基样本
  • 继续 ( 15 ) (15) 式推导 (16) f ( z ) = [ C ( k x z ) ] T α = F d i a g ( ( k ^ x z ) ) F H α = C ( ( k x z ) ) α \begin{aligned} f(\boldsymbol z) &=[C({\boldsymbol k ^{\boldsymbol x \boldsymbol z}})]^T \boldsymbol \alpha\\ &=Fdiag({ (\widehat\boldsymbol k ^{\boldsymbol x \boldsymbol z}})^*)F^H\boldsymbol \alpha\\ &=C(({\boldsymbol k ^{\boldsymbol x \boldsymbol z}})^*)\boldsymbol \alpha \tag{16} \end{aligned} 注:上述推导利用了循环矩阵的转置性质,即转置后的特征值与原特征值互为共轭,用公式表达为: X T = F d i a g ( ( x ^ ) ) F H X^T=Fdiag((\widehat\boldsymbol x)^*)F^H 其中矩阵 d i a g ( ( x ^ ) ) diag((\widehat\boldsymbol x)^*) 对角线上的值为矩阵 X T X^T 的特征值
  • ( 16 ) (16) 式两边同时进行傅里叶变换,得: F ( f ( z ) ) = F ( C ( ( k x z ) ) α ) \mathcal F(f(\boldsymbol z))=\mathcal F(C(({\boldsymbol k ^{\boldsymbol x \boldsymbol z}})^*)\boldsymbol \alpha) 利用循环卷积性质(如式 ( 8 ) (8) 所示),得: F ( f ( z ) ) = F ( ( k x z ) ) F ( α ) = F ( k x z ) F ( α ) \begin{aligned} \mathcal F(f(\boldsymbol z))&=\mathcal F ^ *(({\boldsymbol k ^{\boldsymbol x \boldsymbol z}})^*) \odot \mathcal F(\boldsymbol \alpha)\\ &=\mathcal F ({\boldsymbol k ^{\boldsymbol x \boldsymbol z}}) \odot \mathcal F(\boldsymbol \alpha) \end{aligned} 即: (17) f ^ ( z ) = k ^ x z α ^ \widehat f(\boldsymbol z)={\widehat \boldsymbol k ^{\boldsymbol x \boldsymbol z}} \odot \widehat \boldsymbol \alpha \tag{17} 1. k x z { \boldsymbol k ^{\boldsymbol x \boldsymbol z}} 为核矩阵 K K 的第一行
    2. k ^ x z {\widehat \boldsymbol k ^{\boldsymbol x \boldsymbol z}} 为训练样本 x x 与待测样本 z z 在傅里叶域的核相关

6 快速核相关

6.1 点积与多项式核

  • 对于某种点积核函数 g g ,其核函数可表示为: K ( x , x ) = g ( x T x ) \mathcal K(\boldsymbol x,\boldsymbol x')=g(\boldsymbol x^T\boldsymbol x') 注: g g 表示输入向量间的元素级操作 k i x x = K ( x , P i 1 x ) = g ( x T P i 1 x ) {k_i^{\boldsymbol x\boldsymbol x'}}=\mathcal K(\boldsymbol x',P^{i-1}\boldsymbol x)=g(\boldsymbol {x'}^TP^{i-1}\boldsymbol x) k x x = g ( C ( x ) x ) ( ) k^{\boldsymbol x \boldsymbol x'}=g(C(\boldsymbol x)\boldsymbol x') (证明略) 由循环矩阵性质可知: F ( C ( x ) x ) = x ^ x \mathcal F(C(\boldsymbol x)\boldsymbol x')=\widehat \boldsymbol x^* \odot \boldsymbol x'
    C ( x ) x = F 1 ( x ^ x ^ ) \Rightarrow C(\boldsymbol x)\boldsymbol x' = \mathcal F^{-1}(\widehat \boldsymbol x^* \odot \widehat \boldsymbol x') k x x = g ( F 1 ( x ^ x ^ ) ) \Rightarrow k^{\boldsymbol x \boldsymbol x'}=g(\mathcal F^{-1}(\widehat \boldsymbol x^* \odot \widehat \boldsymbol x'))
  • 特殊地,对于多项式核 K ( x , x ) = ( x T x + a ) b \mathcal K(\boldsymbol x,\boldsymbol x')=(\boldsymbol x^T\boldsymbol x'+a)^b (18) k x x = ( F 1 ( x ^ x ^ ) + a ) b \Rightarrow k^{\boldsymbol x \boldsymbol x'}=(\mathcal F^{-1}(\widehat \boldsymbol x^* \odot \widehat \boldsymbol x')+a)^b\tag{18}

6.2 径向基函数与高斯核

  • 对于某种径向基函数 h h ,其核函数可表示为: K ( x , x ) = h ( x x 2 ) \mathcal K(\boldsymbol x,\boldsymbol x')=h(||\boldsymbol x-\boldsymbol x'||^2) k i x x = K ( x , P i 1 x ) = h ( x P i 1 x 2 ) = h ( x 2 + P i 1 x 2 2 x T P i 1 x ) = h ( x 2 + x 2 2 x T P i 1 x ) = h ( x 2 + x 2 2 F 1 ( x ^ x ^ ) ) \begin{aligned} k_i^{\boldsymbol x \boldsymbol x'}=\mathcal K(\boldsymbol x',P^{i-1}\boldsymbol x)&=h(||\boldsymbol x'-P^{i-1}\boldsymbol x||^2)\\ &=h(||\boldsymbol x'||^2+||P^{i-1}\boldsymbol x||^2-2\boldsymbol x'^TP^{i-1}\boldsymbol x)\\ &=h(||\boldsymbol x'||^2+||\boldsymbol x||^2-2\boldsymbol x'^TP^{i-1}\boldsymbol x)\\ &=h(||\boldsymbol x'||^2+||\boldsymbol x||^2-2\mathcal F^{-1}(\widehat \boldsymbol x^*\odot\widehat\boldsymbol x')) \end{aligned} 注:上式转换中去除了转换矩阵 P i 1 P^{i-1} ,因为矩阵的循环移位不影响其范数
  • 特殊地,对于高斯核 K ( x , x ) = e x p ( 1 σ 2 x x 2 ) \mathcal K(\boldsymbol x,\boldsymbol x')=exp(-\frac{1}{\sigma^2}||\boldsymbol x-\boldsymbol x'||^2) (19) k x x = e x p ( 1 σ 2 ( x 2 + x 2 2 F 1 ( x ^ x ^ ) ) ) \Rightarrow k^{\boldsymbol x \boldsymbol x'}=exp(-\frac{1}{\sigma^2}(||\boldsymbol x'||^2+||\boldsymbol x||^2-2\mathcal F^{-1}(\widehat \boldsymbol x^*\odot\widehat\boldsymbol x')))\tag{19}

7 伪代码

//训练分类器alphaf
function alphaf = train(x, y, sigma, lambda)
	k = kernel_correlation(x, x, sigma);
	alphaf = fft2(y) ./ (fft2(k) + lambda);
end
//计算响应f(z)
function responses = detect(alphaf, x, z, sigma)
	k = kernel_correlation(z, x, sigma);
	responses = real(ifft2(alphaf .* fft2(k)));
end
//计算核相关矩阵
function k = kernel_correlation(x1, x2, sigma)
	c = ifft2(sum(conj(fft2(x1)) .* fft2(x2), 3));
	d = x1(:)*x1(:) + x2(:)*x2(:) - 2 * c;
	k = exp(-1 / sigma^2 * abs(d) / numel(d));
end

猜你喜欢

转载自blog.csdn.net/u013187057/article/details/83897549