## First, the principle

**Algorithm type:** supervised learning, classification

**Input:** numeric or nominal features (nominal features must be one-hot encoded)

### V1.0

Logistic regression solves binary classification in a regression-like manner: a Sigmoid function maps the intermediate real-valued output y onto one of two categories.
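The Sigmoid mapping described above can be sketched in a few lines of Python (a minimal illustration, not code from the original notes):

```python
import math

def sigmoid(x):
    """Map any real x into (0, 1); the function is monotonically increasing."""
    return 1.0 / (1.0 + math.exp(-x))

# Values far below 0 map near 0, values far above 0 map near 1,
# and the midpoint x = 0 maps exactly to 0.5.
print(sigmoid(-5), sigmoid(0), sigmoid(5))
```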

## Second, algorithm selection

## Third, the algorithm flow

1. The Sigmoid function is monotonically increasing; its input x ranges over (-∞, +∞) and its output y over (0, 1);

2. A predicted value y > 0.5 is assigned to class 1 and y < 0.5 to class 0; the value of y can also be interpreted as the probability of belonging to class 1;

3. As in "least squares" fitting, the best-fit equation is sought, which yields the objective function;

4. To minimize the objective function, "gradient descent" is used. The process is roughly as follows: on a surface resembling a mountain range, start from any point, compute the partial derivatives, and advance a certain distance (set by the "learning rate") in the direction of the negative gradient; repeat until the change between successive points is sufficiently small ("convergence").

**Metric:** least squares. **Objective function:** the least-squares objective function. **Solution of the objective function:** gradient descent.
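The gradient-descent loop above can be sketched for a one-feature logistic regression. Note that while the notes mention a least-squares objective, logistic regression is conventionally trained with the log (cross-entropy) loss, whose gradient is used in this sketch; the toy data and hyperparameters are illustrative assumptions:

```python
import math

def sigmoid(z):
    """Map any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def train(xs, ys, lr=0.5, epochs=2000):
    """Batch gradient descent for a 1-D logistic model p = sigmoid(w*x + b),
    minimizing the log loss. lr is the learning rate from step 4."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(w * x + b)
            gw += (p - y) * x   # partial derivative of the log loss w.r.t. w
            gb += (p - y)       # partial derivative w.r.t. b
        # Step along the negative gradient until the updates become small.
        w -= lr * gw / len(xs)
        b -= lr * gb / len(xs)
    return w, b

# Tiny separable toy data: class 0 for small x, class 1 for large x.
w, b = train([1.0, 2.0, 3.0, 4.0], [0, 0, 1, 1])
```

After training, `sigmoid(w * x + b)` gives the class-1 probability for a new x, and thresholding it at 0.5 (step 2 above) yields the predicted class.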

## Fourth, characteristics

**Advantages:** simple, easy to understand and implement; low computational cost, fast, and low storage requirements.

**Disadvantages:** prone to underfitting, so classification accuracy may be low; sensitive to outliers and missing values.

## Fifth, the code API
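In practice the algorithm is usually called through a library rather than hand-rolled; a minimal usage sketch with scikit-learn's `LogisticRegression` (assuming scikit-learn is installed; the toy dataset is made up for illustration):

```python
from sklearn.linear_model import LogisticRegression

# Tiny separable toy data: one feature, class 0 for small x, class 1 for large x.
X = [[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]]
y = [0, 0, 0, 1, 1, 1]

clf = LogisticRegression()
clf.fit(X, y)

print(clf.predict([[1.0], [3.8]]))    # predicted class labels
print(clf.predict_proba([[2.25]]))    # per-class probabilities, rows sum to 1
```

`predict_proba` exposes the Sigmoid output directly, while `predict` applies the 0.5 threshold from the algorithm flow above.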