sklearn.svm.OneClassSVM User Manual

class sklearn.svm.OneClassSVM(kernel='rbf', degree=3, gamma='scale', coef0=0.0, tol=0.001, nu=0.5, shrinking=True, cache_size=200, verbose=False, max_iter=-1)

Unsupervised outlier detection.
Estimates the support of a high-dimensional distribution.
This implementation is based on libsvm. You can learn more about anomaly detection
in the user manual.

Parameters (kernel, gamma, nu and tol are more important)

kernel: string, optional (default='rbf')
Specifies the kernel type to be used in the algorithm. It must be one of 'linear', 'poly', 'rbf', 'sigmoid', 'precomputed' or a callable. If none is given, 'rbf' (the Gaussian kernel) will be used. If a callable is given, it is used to precompute the kernel matrix.

degree: int, optional (default=3)
Degree of the polynomial kernel function ('poly'). Ignored by all other kernels.

gamma: {'scale', 'auto'} or float, optional (default='scale')
Kernel coefficient for 'rbf', 'poly' and 'sigmoid'.
If gamma='scale' (the default) is passed, it uses 1 / (n_features * X.var()) as the gamma value;
if 'auto', 1 / n_features is used.
Changed in version 0.22: the default value of gamma changed from 'auto' to 'scale'.
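The two resolved gamma values can be computed by hand from the formulas above; this sketch uses the same data as the Example section:

```python
import numpy as np

X = np.array([[0.0], [0.44], [0.45], [0.46], [1.0]])
n_features = X.shape[1]

gamma_scale = 1.0 / (n_features * X.var())  # value used when gamma='scale'
gamma_auto = 1.0 / n_features               # value used when gamma='auto'
print(gamma_scale, gamma_auto)
```

For this one-feature data, gamma='auto' resolves to 1.0 while gamma='scale' adapts to the variance of X.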

coef0: float, optional (default=0.0)
Independent term in the kernel function. It is only significant for 'poly' and 'sigmoid'.

tol: float, optional
Tolerance for the stopping criterion.

nu: float, optional
An upper bound on the fraction of training errors and a lower bound on the fraction of support vectors. Should be in the interval (0, 1]. The default is 0.5.
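A sketch of nu's role on synthetic, roughly Gaussian data (the data here is illustrative, not from the source): the fraction of support vectors stays at or above nu, while the training outlier fraction stays near or below it.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(0)
X = rng.randn(200, 2)  # 200 synthetic, roughly Gaussian samples

for nu in (0.05, 0.5):
    clf = OneClassSVM(nu=nu, gamma='scale').fit(X)
    outlier_frac = np.mean(clf.predict(X) == -1)  # bounded above by roughly nu
    sv_frac = len(clf.support_) / len(X)          # bounded below by roughly nu
    print(nu, round(outlier_frac, 3), round(sv_frac, 3))
```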

shrinking: bool, optional
Whether to use the shrinking heuristic.

cache_size: float, optional
Specifies the size of the kernel cache (in MB).

verbose: bool, default: False
Enables verbose output. Note that this setting takes advantage of a per-process runtime setting in libvm which, if enabled, may not work properly in a multithreaded context.

max_iter: int, optional (default=-1)
Hard limit on iterations within the solver, or -1 for no limit.

Attributes

support_: ndarray of shape (n_SV,)
Indices of the support vectors.

support_vectors_: ndarray of shape (n_SV, n_features)
The support vectors.

dual_coef_: ndarray of shape (1, n_SV)
Coefficients of the support vectors in the decision function.

coef_: ndarray of shape (1, n_features)
Weights assigned to the features (coefficients in the primal problem). Only available when the kernel is linear.
coef_ is a read-only attribute derived from dual_coef_ and support_vectors_.

intercept_: ndarray of shape (1,)
Constant in the decision function.

offset_: float
Offset used to define the decision function from the raw scores. The relationship is: decision_function = score_samples - offset_. The offset is the opposite of intercept_ and is provided for consistency with other outlier detection algorithms.
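Both stated relationships (decision_function = score_samples - offset_, and offset_ being the opposite of intercept_) can be checked directly on the data from the Example section:

```python
import numpy as np
from sklearn.svm import OneClassSVM

X = np.array([[0.0], [0.44], [0.45], [0.46], [1.0]])
clf = OneClassSVM(gamma='auto').fit(X)

# decision_function(X) == score_samples(X) - offset_
ok_offset = np.allclose(clf.decision_function(X),
                        clf.score_samples(X) - clf.offset_)

# offset_ is the negative of intercept_
ok_intercept = np.allclose(clf.offset_, -clf.intercept_)

print(ok_offset, ok_intercept)  # True True
```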

fit_status_: int
0 if the model was correctly fitted, 1 otherwise (a warning will be raised).
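A short sketch inspecting the fitted attributes listed above, again on the data from the Example section:

```python
import numpy as np
from sklearn.svm import OneClassSVM

X = np.array([[0.0], [0.44], [0.45], [0.46], [1.0]])
clf = OneClassSVM(gamma='auto', nu=0.5).fit(X)

print(clf.support_)               # indices of the support vectors into X
print(clf.support_vectors_.shape) # (n_SV, n_features)
print(clf.dual_coef_.shape)       # (1, n_SV)
print(clf.fit_status_)            # 0 means the fit succeeded
```

Note that support_vectors_ is simply X indexed by support_.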

Example

from sklearn.svm import OneClassSVM
X = [[0], [0.44], [0.45], [0.46], [1]]
clf = OneClassSVM(gamma='auto').fit(X)
clf.predict(X)
##array([-1,  1,  1,  1, -1])
clf.score_samples(X)  # doctest: +ELLIPSIS
##array([1.7798..., 2.0547..., 2.0556..., 2.0561..., 1.7332...])

Methods

decision_function(self, X):
Signed distance to the separating hyperplane.
The distance is positive for samples inside the separating hyperplane and negative otherwise.
Parameters:
X: array of shape (n_samples, n_features)
Returns:
dec: array of shape (n_samples,)
The decision function of the samples.
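A sketch of the sign convention on synthetic data (the training set and test points here are illustrative): a point near the bulk of the training data gets a positive value, a far-away point a negative one.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(0)
X_train = rng.randn(100, 2)  # synthetic training data centered at the origin
clf = OneClassSVM(nu=0.1, gamma='scale').fit(X_train)

inlier = np.array([[0.0, 0.0]])   # near the bulk of the training data
outlier = np.array([[6.0, 6.0]])  # far from the training data

print(clf.decision_function(inlier))   # positive: inside the boundary
print(clf.decision_function(outlier))  # negative: outside the boundary
```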

fit(self, X, y=None, sample_weight=None, **params):
Detect the soft boundary of the sample set X.
Parameters:
X: array of shape (n_samples, n_features)
The sample set, where n_samples is the number of samples and n_features is the number of features.

sample_weight: array of shape (n_samples,)
Per-sample weights. Rescales C per sample. Higher weights force the classifier to put more emphasis on these points.

y: Ignored
Not used, present for API consistency by convention.

Returns:
self: object

Note:
If X is not a C-ordered contiguous array, it is copied.
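A sketch of passing sample_weight to fit; the weight values here are hypothetical, chosen only to show the call signature:

```python
import numpy as np
from sklearn.svm import OneClassSVM

X = np.array([[0.0], [0.44], [0.45], [0.46], [1.0]])

# hypothetical weights: emphasize the last sample
w = np.array([1.0, 1.0, 1.0, 1.0, 10.0])
clf = OneClassSVM(gamma='auto').fit(X, sample_weight=w)
print(clf.predict(X))
```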

fit_predict(self, X, y=None):
Perform fitting on X and return labels for X.
Returns -1 for outliers and 1 for inliers.

Parameters:
X: array of shape (n_samples, n_features)
The input data.

y: Ignored
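On the same data, fit_predict is equivalent to calling fit and then predict on the training set:

```python
import numpy as np
from sklearn.svm import OneClassSVM

X = np.array([[0.0], [0.44], [0.45], [0.46], [1.0]])

labels = OneClassSVM(gamma='auto').fit_predict(X)
# equivalent to fitting first and then predicting on the same data
same = OneClassSVM(gamma='auto').fit(X).predict(X)

agree = np.array_equal(labels, same)
print(agree)  # True
```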

get_params(self, deep=True):
Get the parameters of this estimator.

Parameters:
deep: bool, default=True
If True, the parameters of this estimator and of its contained sub-objects that are estimators are returned.

Returns: params, a dict mapping parameter names to their values.

predict(self, X):
Perform classification on samples in X. For a one-class model, +1 or -1 is returned.

Parameters:
X: array of shape (n_samples, n_features)
For kernel='precomputed', the expected shape of X is (n_samples_test, n_samples_train).

Returns:
The class labels of the samples in X.
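A sketch of the precomputed-kernel shapes described above, using a linear kernel (a Gram matrix of dot products) as the example; the train/test split here is illustrative:

```python
import numpy as np
from sklearn.svm import OneClassSVM

X_train = np.array([[0.0], [0.44], [0.45], [0.46], [1.0]])
X_test = np.array([[0.2], [0.9]])

# with kernel='precomputed', fit takes the (n_train, n_train) Gram matrix
gram_train = X_train @ X_train.T
clf = OneClassSVM(kernel='precomputed').fit(gram_train)

# predict takes the (n_test, n_train) matrix of test-vs-train kernel values
gram_test = X_test @ X_train.T
print(clf.predict(gram_test))
```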

score_samples(self, X):
Raw scoring function of the samples.

Parameters:
X: array of shape (n_samples, n_features)

Returns:
The (unshifted) scoring function of the samples.

set_params(self, **params):
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines).

Parameters:
**params: dict of estimator parameters.

Returns:
self: the estimator instance.
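A short round trip through get_params and set_params:

```python
from sklearn.svm import OneClassSVM

clf = OneClassSVM()
print(clf.get_params()['nu'])  # 0.5, the default

clf.set_params(nu=0.1, kernel='linear')  # returns the estimator itself
print(clf.get_params()['nu'], clf.get_params()['kernel'])
```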

Origin blog.csdn.net/comli_cn/article/details/103898791