Supervised learning
Unsupervised learning: learning structured knowledge
Reinforcement learning
Supervised Learning:
Linear regression model: the output y is continuous
Logistic regression model (despite the name, a classification problem rather than a regression problem): the output y is discrete, 0 or 1
Logistic regression model:
Sigmoid function: maps any real input to an output between 0 and 1, which can also be interpreted as a probability
Softmax function: maps multiple inputs to outputs that sum to 1, giving a probability distribution over classes
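A minimal sketch of these two functions in NumPy (function names are illustrative; the max subtraction in softmax is a standard numerical-stability trick, not something from the notes above):

```python
import numpy as np

def sigmoid(z):
    # squashes any real input into the interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # subtract the max before exponentiating for numerical stability;
    # the outputs are positive and sum to 1
    e = np.exp(z - np.max(z))
    return e / e.sum()

print(sigmoid(0.0))                      # 0.5
probs = softmax(np.array([2.0, 1.0, 0.1]))
print(probs.sum())                       # 1.0
```

Sigmoid is typically used for the single output of binary classification, softmax for the multiple outputs of multi-class classification.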
The cost function also changes: for binary classification it becomes the cross-entropy loss, which measures how similar two distributions are
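A minimal sketch of the binary cross-entropy loss described above (the function name and the `eps` clipping are my own additions; clipping avoids taking log of 0):

```python
import numpy as np

def binary_cross_entropy(y, p, eps=1e-12):
    # y: true label (0 or 1); p: predicted probability of class 1
    p = np.clip(p, eps, 1 - eps)  # keep p away from 0 and 1 so log() is finite
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

print(binary_cross_entropy(1, 0.9))  # small loss, prediction matches the label
print(binary_cross_entropy(1, 0.1))  # large loss, prediction contradicts the label
```

The loss is near 0 when the predicted distribution matches the true one and grows without bound as they diverge, which is what makes it a similarity measure between distributions.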
Gradient descent method: w = w - α * (∂cost/∂w), which makes the cost smaller
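A minimal sketch of the update rule above on a toy one-parameter cost, cost(w) = (w - 3)^2 (the cost function and learning rate are illustrative choices, not from the notes):

```python
# gradient descent: w = w - alpha * d(cost)/dw
alpha = 0.1   # learning rate
w = 0.0       # initial parameter

for _ in range(100):
    grad = 2 * (w - 3)     # derivative of (w - 3)**2
    w = w - alpha * grad   # step opposite the gradient, so the cost decreases

print(round(w, 4))  # converges toward the minimum at w = 3
```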
Backpropagation Algorithm (Chain Rule)
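A minimal sketch of the chain rule for a single sigmoid neuron (all values here are made-up toy numbers; the point is how ∂cost/∂w factors into three local derivatives multiplied together):

```python
import math

# forward pass: y = sigmoid(w*x + b), cost = (y - t)**2
x, t = 2.0, 1.0   # input and target
w, b = 0.5, 0.0   # parameters
z = w * x + b
y = 1 / (1 + math.exp(-z))
cost = (y - t) ** 2

# backward pass via the chain rule:
# dcost/dw = (dcost/dy) * (dy/dz) * (dz/dw)
dcost_dy = 2 * (y - t)
dy_dz = y * (1 - y)   # derivative of the sigmoid
dz_dw = x
dcost_dw = dcost_dy * dy_dz * dz_dw
```

Backpropagation applies this factoring layer by layer, reusing each intermediate derivative instead of recomputing it for every parameter.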
There are various optimization algorithms, but all of them first require the gradient of each parameter to be computed. The next section introduces these optimization algorithms.