Machine Learning Week 3

Classification problems

y=0 or 1

Logistic regression:

Goal: keep the hypothesis h_θ(x) within [0, 1]

Logistic function / sigmoid function:

[Figure: the sigmoid curve] At x = 0, y = 0.5; as x → +∞, y → 1; as x → −∞, y → 0.
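Written out (these are the standard definitions from the course, restated here because the image did not survive):

g(z) = 1 / (1 + e^(−z)),  h_θ(x) = g(θᵀx) = 1 / (1 + e^(−θᵀx))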

Probabilistic view: P(y = 0 | x; θ) + P(y = 1 | x; θ) = 1, where P(y = 1 | x; θ) is the probability that y = 1 for a given x, parameterized by θ.

As the figure shows, predicting y = 1 means requiring h_θ(x) ≥ 0.5, which holds exactly when z = θᵀx ≥ 0. The parameters θ therefore define the decision boundary, and θ is obtained by fitting to the training set.
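As a concrete illustration (the numbers here are hypothetical, not from the original notes): with θ = (−3, 1, 1)ᵀ, the model predicts y = 1 whenever −3 + x₁ + x₂ ≥ 0, so the decision boundary is the straight line x₁ + x₂ = 3.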

Multiclass classification (one-vs-all)

For an n-class problem, split it into n simple 1/0 (binary) classification problems. In the i-th problem, class i is labeled 1 and the remaining n − 1 classes are labeled 0.

Each binary classifier estimates the probability that y = 1; the final prediction is the class whose classifier reports the highest probability, as in the sketch below.
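A minimal Matlab/Octave sketch of this prediction step; allTheta (an n-by-(d+1) matrix whose i-th row holds the fitted parameters of the i-th classifier) and X (an m-by-(d+1) design matrix with a leading column of ones) are assumed names, not from the original notes:

% probability that y = 1 under each of the n classifiers
probs = 1 ./ (1 + exp(-X * allTheta'));   % m x n matrix of sigmoid outputs
% for every example, pick the class whose classifier is most confident
[~, prediction] = max(probs, [], 2);      % m x 1 vector of labels in 1..n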

Advanced optimization algorithm (Advanced Optimization)

For example: BFGS (a variable metric / quasi-Newton method), L-BFGS (limited-memory BFGS), Conjugate gradient

Advantages

  • No need to manually pick α: an internal line search algorithm automatically tries different values of α
  • Usually faster than gradient descent
  • Only drawback: the algorithms are more complex

Using advanced optimization algorithms in Matlab

function [jVal, gradient] = costFunction(theta)
  jVal = [... code to compute J(theta) ...];
  gradient = [... code to compute the derivative of J(theta) ...];
end
options = optimset('GradObj', 'on', 'MaxIter', 100);  % option structure: 'GradObj','on' tells the solver the objective also returns its gradient; 'MaxIter',100 caps the number of iterations
initialTheta = zeros(2, 1);                           % initial guess for theta
[optTheta, functionVal, exitFlag] = fminunc(@costFunction, initialTheta, options);  % fminunc = unconstrained minimization; @costFunction is a function handle

exitFlag = 1 means the run converged; functionVal is then close to 0.

Note: θ must be a column vector of dimension 2 or more.
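For concreteness, a minimal sketch of what the body of costFunction could look like for (unregularized) logistic regression; passing X and y as extra arguments and wrapping the call in an anonymous function is one common pattern, not something the original notes specify:

function [jVal, gradient] = costFunction(theta, X, y)
  % X: m x n design matrix, y: m x 1 labels in {0,1}
  m = length(y);
  h = 1 ./ (1 + exp(-X * theta));                          % sigmoid hypothesis
  jVal = -(1/m) * (y' * log(h) + (1 - y)' * log(1 - h));   % cross-entropy cost J(theta)
  gradient = (1/m) * X' * (h - y);                         % gradient of J w.r.t. theta
end

With this signature the call becomes fminunc(@(t) costFunction(t, X, y), initialTheta, options).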

Overfitting

Figure 1 shows underfitting (high bias); Figure 3 shows overfitting (high variance).

Definition: with too many features, the learned curve may fit the training data very well, but it cannot generalize to new examples.

Solutions:

1. Reduce the number of features; specifically, select which variables to keep either manually or with an automatic model selection algorithm.

2. Regularization:

Keep all the features, but reduce the size or magnitude of the parameters θ_j.

Regularization

By 'penalizing' certain parameters, the fitted curve can be pushed toward a more suitable (smoother) curve. For regularization to work well, the parameter λ must be chosen appropriately.
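For logistic regression, the regularized cost is the standard course formula below (by convention the penalty term skips θ_0):

J(θ) = −(1/m) Σ_{i=1..m} [ y⁽ⁱ⁾ log h_θ(x⁽ⁱ⁾) + (1 − y⁽ⁱ⁾) log(1 − h_θ(x⁽ⁱ⁾)) ] + (λ/2m) Σ_{j=1..n} θ_j²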

Likewise, regularization changes J(θ), so the gradient descent update for θ and the normal equation both change accordingly. In the normal equation method, as long as λ > 0, the matrix is guaranteed to be invertible.
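For linear regression, the regularized normal equation is the standard one below, where L is the (n+1)×(n+1) identity matrix with its top-left entry set to 0 so that θ_0 is not penalized; for any λ > 0 the matrix in parentheses is invertible:

θ = (XᵀX + λL)⁻¹ Xᵀy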




Side note: after almost two years of using Chrome and Evernote, I only discovered today how useful the hidden Evernote Web Clipper extension for Chrome is. Text on Coursera can basically be captured with one click, so no more painfully transferring formulas into pictures when taking notes.

That probably means I'll write a lot fewer notes from now on (lazy in every way).
