scikit-learn Official Documentation 1.5.7

A translation of the scikit-learn official documentation.

1.5.7 Mathematical Formulation
Source: https://scikit-learn.org/stable/modules/sgd.html#mathematical-formulation

Given a set of training examples $(x_1, y_1), \ldots, (x_n, y_n)$, where $x_i \in \mathbb{R}^m$ and $y_i \in \{-1, 1\}$, the goal is to learn the scoring function $f(x) = w^T x + b$. Training finds the optimal parameters $w, b$ by minimizing the regularized training error $$E(w, b) = \frac{1}{n} \sum_{i=1}^{n} L(y_i, f(x_i)) + \alpha R(w)$$ where $L$ is a loss function measuring the empirical loss and $R$ is a regularization term (penalty term), weighted by $\alpha > 0$.
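To make the objective concrete, here is a minimal NumPy sketch (my own illustration, not part of the original document) that evaluates $E(w, b)$ for one particular choice, the hinge loss with an L2 penalty; both choices are introduced just below.

```python
import numpy as np

def objective(w, b, X, y, alpha=1e-4):
    """Regularized training error E(w, b) for hinge loss + L2 penalty.

    A sketch for illustration: X has shape (n, m), y holds labels in
    {-1, +1}, w has shape (m,), and alpha weights the penalty R(w).
    """
    scores = X @ w + b                          # f(x_i) = w^T x_i + b
    losses = np.maximum(0.0, 1.0 - y * scores)  # hinge loss L(y_i, f(x_i))
    penalty = 0.5 * np.dot(w, w)                # L2 penalty R(w)
    return losses.mean() + alpha * penalty
```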
Common choices of $L$ include:

  • Hinge loss: (soft-margin) support vector machines (SVM)
  • Log loss: logistic regression
  • Least squares: ridge regression
  • Epsilon-insensitive loss: (soft-margin) support vector regression (SVR)

All of the above loss functions can be regarded as upper bounds on the 0-1 (misclassification) loss, as shown in the figure below; a small numerical sketch follows it.
[Figure: hinge, log, squared, and epsilon-insensitive losses plotted against the margin $y f(x)$, alongside the 0-1 loss]
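Below is a minimal NumPy sketch of that relationship (again my own illustration; the epsilon value and the base-2 logarithm are my choices, base 2 being what makes the log loss pass through 1 at zero margin and stay above the 0-1 loss):

```python
import numpy as np

# Each loss as a function of the margin z = y * f(x); z <= 0 means a
# misclassification. Epsilon-insensitive loss is rewritten in terms of
# the margin using |y - f(x)| = |1 - y*f(x)| for y in {-1, +1}.
z = np.linspace(-3, 3, 601)

zero_one = (z <= 0).astype(float)                          # 0-1 loss
hinge = np.maximum(0.0, 1.0 - z)                           # SVM
log_loss = np.log2(1.0 + np.exp(-z))                       # logistic regression
squared = (1.0 - z) ** 2                                   # ridge regression
eps_insensitive = np.maximum(0.0, np.abs(1.0 - z) - 0.5)   # SVR, epsilon = 0.5

# Hinge, base-2 log, and squared losses upper-bound the 0-1 loss everywhere.
assert np.all(hinge >= zero_one)
assert np.all(log_loss >= zero_one)
assert np.all(squared >= zero_one)
```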
The regularization term $R$ typically takes one of three forms:
  • L2 norm: $R(w) = \frac{1}{2} \sum_{i=1}^{m} w_i^2$
  • L1 norm (yields sparse solutions): $R(w) = \sum_{i=1}^{m} |w_i|$
  • Elastic Net (between L1 and L2): $R(w) = \frac{\rho}{2} \sum_{i=1}^{m} w_i^2 + (1 - \rho) \sum_{i=1}^{m} |w_i|$

The following figure shows the contours of the three regularization terms in parameter space where $R(w) = 1$; a scikit-learn usage sketch follows it.
[Figure: contours of the L2, L1, and Elastic Net penalties in parameter space where $R(w) = 1$]
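In scikit-learn, these loss and penalty choices map directly onto SGDClassifier's loss and penalty parameters. A minimal usage sketch (parameter spellings per recent scikit-learn versions; note that l1_ratio corresponds to $1 - \rho$ in the Elastic Net formula above):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Hinge loss + L2 penalty: a soft-margin linear SVM trained by SGD.
svm = SGDClassifier(loss="hinge", penalty="l2", alpha=1e-4, random_state=0)
svm.fit(X, y)

# Log loss + Elastic Net penalty: logistic regression with a mixed
# L1/L2 penalty; l1_ratio plays the role of (1 - rho) above.
# (Older scikit-learn versions spell this loss "log" rather than "log_loss".)
logreg = SGDClassifier(loss="log_loss", penalty="elasticnet",
                       alpha=1e-4, l1_ratio=0.15, random_state=0)
logreg.fit(X, y)
```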
Postscript:
The loss function is the core of "strategy", one of the three elements of machine learning (model, strategy, algorithm). Minimizing the expected loss (also known as the risk function) is the goal of machine learning. The risk function comprises two parts: empirical risk and structural risk (the regularization term, also known as the penalty term).
I will gradually refine this section as my understanding deepens.


Original post: blog.csdn.net/houhuipeng/article/details/93770050