Input centering
Centering shifts the origin of the input \(\mathtt{X}\) to remove the bias in the inputs, so that the transformed input \(\mathtt{Z}\) has zero mean
Mean of the input \(\bar{\mathtt{x}} = \frac{1}{N}\mathtt{X^T1}\)
- \[\mathtt x=\begin{bmatrix} {x_0}\\ {x_1}\\ {\vdots}\\ {x_d}\\ \end{bmatrix}\] \(\mathtt x\in R^{(d+1)*1}\)
- \[\mathtt{X}=\begin{bmatrix}{\mathtt{x_1^T}}\\{\mathtt{x_2^T}}\\{\vdots}\\{\mathtt{x_N^T}}\\\end{bmatrix}\] \(\mathtt X\in R^{N*(d+1)}\)
- \(\mathtt{1}=\begin{bmatrix}{1}\\{1}\\{\vdots}\\{1}\\\end{bmatrix}\) \(\mathtt 1\in R^{N*1}\)
Transformed input \(\mathtt{z_n = x_n - \bar{x}}\)
Or, in matrix form, \(\mathtt{Z = X - 1\bar{x}^T}\)
Proof that the transformed input has zero mean
\(\mathtt{\bar{z}=\frac{1}{N}Z^T1=\frac{1}{N}X^T1-\frac{1}{N}\bar{x}1^T1=\bar{x}-\frac{1}{N}\bar{x}N}=0\)
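The centering step above can be sketched in NumPy as follows. This is a minimal illustration, not a reference implementation: the function name `center` and the toy data are assumptions made here, and `X` is assumed to hold only the feature columns, one example per row.

```python
import numpy as np

def center(X):
    """Return Z = X - 1 x_bar^T together with the mean vector x_bar."""
    N = X.shape[0]
    ones = np.ones((N, 1))        # the all-ones vector 1, shape (N, 1)
    x_bar = X.T @ ones / N        # x_bar = (1/N) X^T 1, shape (d, 1)
    Z = X - ones @ x_bar.T        # subtract the mean from every row
    return Z, x_bar

# Toy data (hypothetical): 3 examples, 2 features
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 60.0]])
Z, x_bar = center(X)
z_bar = Z.T @ np.ones((X.shape[0], 1)) / X.shape[0]   # mean of Z, should be 0
```

In practice `X - X.mean(axis=0)` gives the same `Z` through broadcasting; the explicit all-ones vector is kept here only to mirror the formulas.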
Input standardization
Building on centering, standardization scales each feature of the input \(\mathtt{X}\) so that each feature of the transformed input \(\mathtt{Z}\) has standard deviation 1
The following discussion assumes the input has already been centered
Standard deviation in general: \(\sigma = \sqrt{\frac{1}{N}\sum_{n=1}^N (x_n - \bar{x})^2}\); because the input has already been centered (\(\bar{x} = 0\)), \(\sigma = \sqrt{\frac{1}{N}\sum_{n=1}^N x_n^2}\)
Standard deviation of feature \(i\): \(\sigma_i = \sqrt{\frac{1}{N}\sum_{n=1}^N x_{ni}^2}\)
Transformed input \(\mathtt{z_n} = \begin{bmatrix} {x_{n1}/\sigma_1}\\ {\vdots}\\ {x_{nd}/\sigma_d}\\ \end{bmatrix} = \mathtt{Dx_n}\)
- \(\mathtt{D}\) is a diagonal matrix with \(\mathtt{D}_{ii} = 1/\sigma_i\)
Or, in matrix form, \(\mathtt{Z = XD}\)
Proof that the transformed input has standard deviation 1
\(\sigma_i(\mathtt{z})=\sqrt{\frac{1}{N}\sum_{n=1}^Nz_{ni}^2}=\sqrt{\frac{1}{N}\sum_{n=1}^N\frac{x_{ni}^2}{\sigma_i^2}}=\sqrt{\frac{1}{\sigma_i^2}*(\frac{1}{N}\sum_{n=1}^Nx_{ni}^2)}=1\)
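The standardization step can be sketched the same way. The function name `standardize` and the toy data are assumptions made for illustration, and the sketch assumes no feature is constant (every \(\sigma_i > 0\)), since otherwise \(1/\sigma_i\) is undefined.

```python
import numpy as np

def standardize(X):
    """Center X, then scale each feature to standard deviation 1 via Z = X D."""
    X_centered = X - X.mean(axis=0)                   # centering comes first
    sigma = np.sqrt((X_centered ** 2).mean(axis=0))   # sigma_i, dividing by N
    D = np.diag(1.0 / sigma)                          # D_ii = 1 / sigma_i
    Z = X_centered @ D
    return Z, sigma

# Toy data (hypothetical): 3 examples, 2 features on very different scales
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 60.0]])
Z, sigma = standardize(X)
sigma_z = np.sqrt((Z ** 2).mean(axis=0))   # per-feature std of Z, should be 1
```

Building the full diagonal matrix `D` mirrors the formula; dividing by `sigma` with broadcasting (`X_centered / sigma`) is equivalent and cheaper for large `d`.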
Input whitening
If the input features are highly correlated, it is hard to penalize the different features separately during regularization. Whitening reduces the correlation between features and gives all features the same variance
Whitening makes every input dimension equally important, whereas dimensionality reduction measures the importance of the input dimensions and then discards the unimportant ones, so whitening should not be applied after dimensionality reduction
The following discussion assumes the input has already been centered
Covariance matrix \(\mathtt{C = \frac{1}{N}\sum_{n=1}^{N} x_nx_n^T = \frac{1}{N}X^TX}\)
- \(C_{ij} = cov(x_i, x_j)\): the covariance describes the correlation between \(x_i\) and \(x_j\)
- \(cov(x, y) = E(xy) - E(x)E(y) = E(xy)\), because the input has already been centered, so \(E(x) = E(y) = 0\)
- \(\mathtt{x_nx_n^T}=\begin{bmatrix}{x_1x_1}&{x_1x_2}&{\cdots}&{x_1x_d}\\{x_2x_1}&{x_2x_2}&{\cdots}&{x_2x_d}\\{\vdots}&{\vdots}&{\ddots}&{\vdots}\\{x_dx_1}&{x_dx_2}&{\cdots}&{x_dx_d}\\\end{bmatrix}\) (written out like this, the structure should be clear)
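Transformed input \(\mathtt{z_n = C^{-\frac{1}{2}}x_n}\) (I do not fully understand how the square root of a matrix is taken here)
Or, in matrix form, \(\mathtt{Z = XC^{-\frac{1}{2}}}\) (\(\mathtt{C}\) is symmetric, so \(\mathtt{C^{-\frac{1}{2}}}\) is symmetric as well)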
Covariance matrix of the whitened input
\(\mathtt{\frac{1}{N}Z^TZ=C^{-\frac{1}{2}}(\frac{1}{N}X^TX)C^{-\frac{1}{2}}=C^{-\frac{1}{2}}CC^{-\frac{1}{2}}=(C^{-\frac{1}{2}}C^{\frac{1}{2}})(C^{\frac{1}{2}}C^{-\frac{1}{2}})=E}\)
The result is the identity matrix, i.e. for the whitened features \(cov(z_i, z_j) = \begin{cases} 1, & i = j \\ 0, & i \neq j \end{cases}\), which shows that each feature is correlated only with itself and not with any other feature
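One common way to obtain \(\mathtt{C^{-\frac{1}{2}}}\), assuming \(\mathtt{C}\) is symmetric positive definite, is through its eigendecomposition \(\mathtt{C = U \Lambda U^T}\), which gives \(\mathtt{C^{-\frac{1}{2}} = U \Lambda^{-\frac{1}{2}} U^T}\). The sketch below uses that route; it is one possible choice rather than the method of the notes above, and the function name `whiten` and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def whiten(X):
    """Center X and return Z = X C^{-1/2}, plus the covariance matrix C."""
    N = X.shape[0]
    X_centered = X - X.mean(axis=0)
    C = X_centered.T @ X_centered / N                 # C = (1/N) X^T X
    eigvals, U = np.linalg.eigh(C)                    # eigh: C is symmetric
    C_inv_sqrt = U @ np.diag(eigvals ** -0.5) @ U.T   # U Lambda^{-1/2} U^T
    Z = X_centered @ C_inv_sqrt
    return Z, C

# Synthetic correlated data (hypothetical): mix two independent Gaussian features
rng = np.random.default_rng(0)
A = np.array([[2.0, 0.0],
              [1.5, 0.5]])
X = rng.normal(size=(500, 2)) @ A
Z, C = whiten(X)
cov_Z = Z.T @ Z / Z.shape[0]   # should be close to the identity matrix
```

If some eigenvalue of `C` is zero or very close to zero (for example with duplicated features), `eigvals ** -0.5` blows up; a small constant is often added to the eigenvalues in that case.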