Parameters in Logistic Regression (Detailed Explanation)

LogisticRegression(penalty='l2', dual=False, tol=0.0001, C=1.0, fit_intercept=True, intercept_scaling=1, class_weight=None, random_state=None, solver='liblinear', max_iter=100, multi_class='ovr', verbose=0, warm_start=False, n_jobs=1)

Detailed parameter explanation:

1. penalty : str, the choice of regularization term. There are two options, l1 and l2, and the default is l2 regularization.

'liblinear' supports l1 and l2, but 'newton-cg', 'sag' and 'lbfgs' only support l2 regularization.
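
As a quick illustration of that pairing rule, here is a minimal sketch (the synthetic dataset from make_classification and the parameter values are arbitrary choices for the example):

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    # toy data just for the illustration
    X, y = make_classification(n_samples=200, n_features=20, random_state=0)

    # l1 must be paired with a solver that supports it, e.g. liblinear
    l1_model = LogisticRegression(penalty='l1', solver='liblinear').fit(X, y)

    # newton-cg, sag and lbfgs only accept the default l2 penalty
    l2_model = LogisticRegression(penalty='l2', solver='lbfgs', max_iter=1000).fit(X, y)

    # l1 tends to drive some weights exactly to zero (sparse solution)
    print((l1_model.coef_ == 0).sum(), "weights zeroed by l1")
    print((l2_model.coef_ == 0).sum(), "weights zeroed by l2")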

2. dual : bool (True or False), default: False

If True, the dual formulation is solved; the dual form only exists for penalty='l2' with solver='liblinear'. When the number of samples is greater than the number of features, prefer the default False and solve the primal formulation.

3. tol : float, default: 1e-4

Tolerance for the stopping criterion: once the optimization error no longer exceeds this value (1e-4 by default), iteration stops.

4. C : float, default: 1.0

The inverse of the regularization coefficient λ; must be a positive float. As in SVM, the smaller the value, the stronger the regularization; the default is 1.0.
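
A minimal sketch of the effect of C (the values of C and the toy dataset below are arbitrary choices for illustration): the smaller C is, the more the weights are shrunk toward zero.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    X, y = make_classification(n_samples=200, n_features=10, random_state=0)

    # smaller C = stronger regularization = smaller average weight magnitude
    for C in (0.01, 1.0, 100.0):
        model = LogisticRegression(C=C, penalty='l2', solver='lbfgs', max_iter=1000).fit(X, y)
        print(C, np.abs(model.coef_).mean())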

5. fit_intercept : bool (True or False), default: True

Whether to fit an intercept term; the default is True.

6. intercept_scaling : float, default: 1.0

Only useful when solver='liblinear' and fit_intercept=True. In this case, x becomes [x, intercept_scaling]: a "synthetic" feature with constant value equal to intercept_scaling is appended to each instance vector. The intercept then becomes intercept_scaling * synthetic_feature_weight. Note that the synthetic feature weight is subject to l1/l2 regularization like all other features, so to reduce the effect of regularization on it (and therefore on the intercept), intercept_scaling must be increased. In other words, this is equivalent to adding an artificial feature that always equals this constant, whose weight plays the role of the intercept b.
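
A conceptual sketch of that idea (this only illustrates the augmented design matrix; it is not the library's internal code):

    import numpy as np

    X = np.array([[1.0, 2.0],
                  [3.0, 4.0]])
    intercept_scaling = 1.0

    # liblinear effectively appends a constant "synthetic" column to X,
    # so the intercept is learned as an ordinary (regularized) weight
    X_augmented = np.hstack([X, np.full((X.shape[0], 1), intercept_scaling)])
    print(X_augmented)
    # intercept = intercept_scaling * weight_of_that_last_column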

7. class_weight : dict or 'balanced', default: None

The class_weight parameter specifies the weights of the classes in the model. It can be omitted, in which case all classes receive the same weight. Otherwise, you can pass 'balanced' to let the library compute the class weights itself, or supply the weight of each class yourself. For example, for a binary model with classes 0 and 1, class_weight={0:0.9, 1:0.1} gives class 0 a weight of 90% and class 1 a weight of 10%.

If class_weight is 'balanced', the library computes the weights from the training sample counts: the more samples a class has, the lower its weight, and the fewer samples it has, the higher its weight. The weights are computed as n_samples / (n_classes * np.bincount(y)), where n_samples is the number of samples, n_classes is the number of classes, and np.bincount(y) gives the number of samples in each class.
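
A minimal sketch of both options (the 90/10 imbalance and random features below are invented for the example); sklearn.utils.class_weight.compute_class_weight reproduces the 'balanced' formula above:

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.utils.class_weight import compute_class_weight

    # imbalanced toy labels: 90 samples of class 0, 10 samples of class 1
    rng = np.random.RandomState(0)
    X = rng.randn(100, 5)
    y = np.array([0] * 90 + [1] * 10)

    # option 1: pass the weights explicitly
    manual = LogisticRegression(class_weight={0: 0.9, 1: 0.1}).fit(X, y)

    # option 2: let the library apply n_samples / (n_classes * np.bincount(y))
    print(compute_class_weight('balanced', classes=np.array([0, 1]), y=y))
    # -> [0.5555..., 5.0]
    balanced = LogisticRegression(class_weight='balanced').fit(X, y)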

8. random_state : int, default: None

Random number seed; optional, default None. It only has an effect when the solver is 'sag' or 'liblinear'.

9. solver : 'newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga', default: 'liblinear'

liblinear: implemented with the open-source LIBLINEAR library, which uses coordinate descent internally to iteratively optimize the loss function.

lbfgs: a quasi-Newton method that iteratively optimizes the loss function using an approximation of its second-derivative (Hessian) matrix.

newton-cg: also a member of the Newton family; it uses the second-derivative (Hessian) matrix of the loss function to iteratively optimize it.

sag: stochastic average gradient descent, a variant of gradient descent. Unlike ordinary gradient descent, each iteration uses only a subset of the samples to compute the gradient, which makes it suitable when there are many training samples.

saga: a linearly convergent stochastic optimization algorithm, a variant of sag.

For small datasets 'liblinear' can be chosen, while 'sag' and 'saga' are faster for large datasets.

For multi-class problems, only 'newton-cg', 'sag', 'saga' and 'lbfgs' can optimize the true multinomial loss; 'liblinear' is limited to the one-vs-rest scheme (that is, with liblinear a multi-class problem is handled by taking one class as positive and all remaining classes as negative, and repeating this for every class).

The three optimization algorithms 'newton-cg', 'lbfgs' and 'sag' only handle the L2 penalty (these algorithms require the loss function to have continuous first or second derivatives), while 'liblinear' and 'saga' can handle both L1 and L2 penalties.
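
A small sketch of the solver/penalty pairing (the dataset size and max_iter values are arbitrary; feature scaling is added because sag and saga converge much faster on standardized data):

    from sklearn.datasets import make_classification
    from sklearn.preprocessing import StandardScaler
    from sklearn.linear_model import LogisticRegression

    X, y = make_classification(n_samples=5000, n_features=50, random_state=0)
    X = StandardScaler().fit_transform(X)

    # saga handles both l1 and l2 and scales to large datasets
    saga_l1 = LogisticRegression(penalty='l1', solver='saga', max_iter=5000).fit(X, y)

    # sag is also fast on large data but only supports the l2 penalty
    sag_l2 = LogisticRegression(penalty='l2', solver='sag', max_iter=5000).fit(X, y)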

10. max_iter : int, default: 100

Useful only for the newton-cg, sag and lbfgs solvers: the maximum number of iterations allowed for the solver to converge.

11. multi_class : str, {'ovr', 'multinomial'}, default: 'ovr'

'ovr': use the one-vs-rest strategy; 'multinomial': directly use the multinomial (softmax) logistic regression strategy.

If you choose ovr, all four optimization methods liblinear, newton-cg, lbfgs and sag can be used. If you choose multinomial, only newton-cg, lbfgs and sag are available.
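
A short sketch written against the older scikit-learn API documented here (the iris dataset is an arbitrary example; note that multi_class has been deprecated in recent scikit-learn releases and removed in the newest ones, where the multinomial loss is used by default for the non-liblinear solvers):

    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression

    X, y = load_iris(return_X_y=True)

    # one-vs-rest: one binary classifier per class
    ovr = LogisticRegression(multi_class='ovr', solver='liblinear').fit(X, y)

    # multinomial (softmax) loss: requires newton-cg, lbfgs or sag
    softmax = LogisticRegression(multi_class='multinomial', solver='lbfgs',
                                 max_iter=1000).fit(X, y)

    print(ovr.score(X, y), softmax.score(X, y))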

12. verbose : int, default: 0

Log verbosity, used to turn the log output during the iterations on or off.

13. warm_start : bool (True or False), default: False

Warm start parameter: if True, the previous training result is reused as the starting point for further training; otherwise training starts from scratch. It has no effect with the liblinear solver.
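
A minimal sketch (the low max_iter is deliberate, just to make the continuation visible; a ConvergenceWarning on the first call is expected):

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

    # with warm_start=True, each fit() resumes from the previous coefficients
    model = LogisticRegression(warm_start=True, solver='lbfgs', max_iter=20)
    model.fit(X, y)   # first pass, probably not fully converged yet
    model.fit(X, y)   # continues from the previous solution instead of restarting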

14. n_jobs : int, default: 1

Number of parallel jobs, default 1. With 1, the program runs on one CPU core; with 2, it uses two CPU cores; with -1, all CPU cores are used.

Summary:

The purpose of logistic regression is to find the best-fitting parameters of the nonlinear sigmoid function, and the solving process can be completed by an optimization algorithm.
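
Putting the main parameters together, a minimal end-to-end sketch (the breast cancer dataset and the particular parameter values are only example choices):

    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler
    from sklearn.pipeline import make_pipeline
    from sklearn.linear_model import LogisticRegression

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # scaling helps the iterative solvers converge; C and penalty are the main knobs to tune
    clf = make_pipeline(
        StandardScaler(),
        LogisticRegression(C=1.0, penalty='l2', solver='lbfgs', max_iter=1000),
    )
    clf.fit(X_train, y_train)
    print("test accuracy:", clf.score(X_test, y_test))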

Origin blog.csdn.net/hu_666666/article/details/127457070