Basics of Neural Network

1. Binary classification: logistic regression

Logistic regression model:

  • Problem: given \(x\), predict \(\hat{y}=P\left(y=1|x\right)\), where \(0 \leq \hat{y} \leq 1\)
  • Parameters: \(\omega\in R^{n_x},\ b \in R\)
  • Output: \(\hat{y}=\sigma(\omega^Tx+b)\), where \(\sigma(z)=\frac{1}{1+e^{-z}}\) is the sigmoid function (see the sketch below)
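A minimal NumPy sketch of this model (the names `sigmoid` and `predict` are illustrative, not from the original notes):

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation: 1 / (1 + e^{-z})."""
    return 1 / (1 + np.exp(-z))

def predict(w, b, x):
    """y_hat = sigmoid(w^T x + b) for a single (n_x, 1) input vector x."""
    return sigmoid(w.T @ x + b)
```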

Training set (\(m\) is the number of training examples):

\(\left\{(x^{(1)},y^{(1)}),(x^{(2)},y^{(2)}),…,(x^{(m)},y^{(m)})\right\}\)

the \(i\)-th input vector \(x^{(i)}\) is an \((n_x,1)\) column vector

the input matrix: \(X=[x^{(1)},x^{(2)},…,x^{(m)}]\), of shape \((n_x,m)\)

the output matrix: \(Y=[y^{(1)},y^{(2)},…,y^{(m)}]\), of shape \((1,m)\)
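As an illustration of this layout (the toy sizes and values are assumptions; `sigmoid` is the sketch above):

```python
n_x, m = 3, 4                  # assumed toy dimensions
X = np.random.randn(n_x, m)    # columns are the x^{(i)}, shape (n_x, m)
Y = np.array([[0, 1, 1, 0]])   # labels y^{(i)}, shape (1, m)
w = np.zeros((n_x, 1))         # parameters, shape (n_x, 1)
b = 0.0

Y_hat = sigmoid(w.T @ X + b)   # vectorized predictions, shape (1, m)
```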

2. Cost function: to train parameters \(\omega\) and \(b\)

loss function (measures the error on a single training example): \(L(\hat y,y)=-(y\log\hat y+(1-y)\log(1-\hat y))\)

cost function: \(J(\omega,b)=\frac{1}{m}\sum\limits_{i=1}^m{L(\hat y^{(i)},y^{(i)})}=-\frac{1}{m}\sum\limits_{i=1}^m{\left[y^{(i)}\log \hat y^{(i)}+(1-y^{(i)})\log(1-\hat y^{(i)})\right]}\)

The cost function measures how well you are doing on the entire training set.
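A hedged NumPy sketch of this cost (reusing `sigmoid` from above; `cost` is an illustrative name):

```python
def cost(w, b, X, Y):
    """Cross-entropy cost J(w, b), averaged over the m columns of X."""
    m = X.shape[1]
    Y_hat = sigmoid(w.T @ X + b)
    return -np.sum(Y * np.log(Y_hat) + (1 - Y) * np.log(1 - Y_hat)) / m
```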

Note: the squared-error loss \(L(\hat y,y)=\frac{1}{2}{(\hat y-y)}^2\) is not a good choice here: it makes the optimization problem non-convex, so gradient descent can get stuck in local optima.

3. Gradient descent: to make the cost function \(J\) as small as possible

Repeat: \(\omega := \omega - \alpha \frac{\partial J(\omega,b)}{\partial \omega}\) and \(b := b - \alpha \frac{\partial J(\omega,b)}{\partial b}\)

(\(\alpha\) is the learning rate)
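A minimal sketch of one update step, assuming the standard logistic-regression gradients \(\frac{\partial J}{\partial \omega}=\frac{1}{m}X(\hat Y-Y)^T\) and \(\frac{\partial J}{\partial b}=\frac{1}{m}\sum_i(\hat y^{(i)}-y^{(i)})\) (these are not derived in the notes above):

```python
def gradient_descent_step(w, b, X, Y, alpha):
    """One update: w := w - alpha*dw, b := b - alpha*db."""
    m = X.shape[1]
    Y_hat = sigmoid(w.T @ X + b)    # forward pass, shape (1, m)
    dw = X @ (Y_hat - Y).T / m      # dJ/dw, shape (n_x, 1)
    db = np.sum(Y_hat - Y) / m      # dJ/db, scalar
    return w - alpha * dw, b - alpha * db
```

Repeating this step drives \(J(\omega,b)\) downhill; \(\alpha\) controls the step size.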


Reposted from www.cnblogs.com/mticket/p/9177734.html