Derivation of the Normal Equation

The Normal Equation formula:

θ = (XᵀX)⁻¹XᵀY


The derivation process:

Start:

Xθ = Y

Step 1: Left-multiply both sides by Xᵀ
XᵀXθ = XᵀY

Step 2: Left-multiply both sides by (XᵀX)⁻¹
(XᵀX)⁻¹XᵀXθ = (XᵀX)⁻¹XᵀY

where:
(XᵀX)⁻¹(XᵀX) = I

Therefore:
θ = (XᵀX)⁻¹XᵀY
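
One point worth noting: left-multiplying both sides assumes Xθ = Y holds exactly, which is rarely the case for real data. The same formula also follows from minimizing the least-squares cost, sketched briefly here:

J(θ) = ½ (Xθ − Y)ᵀ(Xθ − Y)

∇θJ(θ) = Xᵀ(Xθ − Y) = 0

XᵀXθ = XᵀY, and therefore θ = (XᵀX)⁻¹XᵀY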

Gradient Descent method (see the sketch after this list):

  • Requires choosing a learning rate α
  • Requires many iterations
  • Works well even when the number of features is large
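
A minimal sketch of batch gradient descent for linear regression, assuming a NumPy design matrix X of shape (m, n) that already includes a column of ones for the intercept, a target vector y of shape (m,), and illustrative values for the learning rate and iteration count:

```python
import numpy as np

def gradient_descent(X, y, alpha=0.01, num_iters=1000):
    """Batch gradient descent for least-squares linear regression.

    X: (m, n) design matrix, including a column of ones for the intercept.
    y: (m,) target vector.
    alpha: learning rate (must be chosen by hand).
    num_iters: number of iterations (must also be chosen by hand).
    """
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(num_iters):
        # Gradient of the cost (1/2m) * ||X @ theta - y||^2
        grad = X.T @ (X @ theta - y) / m
        theta -= alpha * grad
    return theta
```

With a suitably small α and enough iterations, the result approaches the normal-equation solution above.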

Normal Equation method (see the sketch after this list):

  • No learning rate α to choose
  • No iterations required
  • Not suitable when the number of features is very large, because computing (XᵀX)⁻¹ is expensive (roughly cubic in the number of features)
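
A minimal sketch of the closed-form solution with NumPy, under the same assumptions about X and y as above; solving the linear system XᵀXθ = XᵀY with np.linalg.solve is equivalent to applying (XᵀX)⁻¹ and is the usual numerically safer choice:

```python
import numpy as np

def normal_equation(X, y):
    """Closed-form least-squares solution: θ = (XᵀX)⁻¹XᵀY.

    X: (m, n) design matrix, including a column of ones for the intercept.
    y: (m,) target vector.
    """
    return np.linalg.solve(X.T @ X, X.T @ y)

# Tiny made-up example: fit y = 1 + 2x exactly.
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x
print(normal_equation(X, y))  # approximately [1. 2.]
```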

Source: blog.csdn.net/michael_f2008/article/details/78403637