First, the advanced optimization algorithms:
Write a function that can return the cost function and gradient values:
Second, multivariate classification: one to many
Consider a training set as shown below, to classify the data set will be a binary classification problem into three separate,
The formation of new 'pseudo' training set:
The triangle as 'positive' category, the square as a 'positive' category, as the star 'n' category, fitting three classifiers.
: When a given value of θ and x, y = 1, 2, the probability is much
When we predict a new x, three classifiers are input x, and then select the largest category h, corresponding to get maximum h y is the value of what we need.