How do you choose an appropriate kernel function when using a support vector machine (SVM) for classification?

Andrew Ng, theory 1: When there is enough data and there are enough features, all classification algorithms end up performing about the same. In other words, if the training set is large enough, the choice of kernel makes little difference. In terms of classification accuracy alone, a nonlinear kernel generally does better than a linear one. A linear kernel can still classify very well, though, and it costs much less to compute, so the choice has to be made case by case.

Andrew Ng, theory 2: Andrew's practical advice on how to choose the right SVM kernel.
Case 1: When the training set is small and there are many features, use a linear kernel. With that many features, a linear kernel already has enough flexibility (variance) to fit the training set.
Case 2: When the training set is fairly large and there are relatively few features, use a nonlinear kernel, because the model needs more flexibility (variance) to fit the training set.
Case 3: When there are few features and the training set is very large, use a linear kernel, because nonlinear kernels require too much computation. A huge training set can by itself go a long way toward good classification performance.
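The three cases above can be written down as a small rule of thumb. This is only a sketch: the function name `suggest_kernel`, both thresholds, and the use of `"rbf"` as the stand-in for "a nonlinear kernel" are assumptions made for illustration, not values given in the answer.

```python
def suggest_kernel(n_samples: int, n_features: int,
                   very_large_n_samples: int = 50_000,
                   many_features_ratio: float = 1.0) -> str:
    """Hypothetical helper that encodes the three cases as a rule of thumb.

    The thresholds are illustrative assumptions; tune them for your own data.
    """
    # Case 3: very large training set, nonlinear kernels get too expensive.
    if n_samples >= very_large_n_samples:
        return "linear"
    # Case 1: many features relative to the number of training examples.
    if n_features >= many_features_ratio * n_samples:
        return "linear"
    # Case 2: fairly large training set with relatively few features.
    return "rbf"


print(suggest_kernel(n_samples=100, n_features=10_000))    # Case 1
print(suggest_kernel(n_samples=5_000, n_features=20))      # Case 2
print(suggest_kernel(n_samples=1_000_000, n_features=20))  # Case 3
```

A rule like this is only a starting point; cross-validation on the actual data should still decide between the candidates.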

Author: Qiu Baqi
Link: https://www.zhihu.com/question/33268516/answer/57436016
Source: Zhihu
The copyright belongs to the author. For commercial reprints, please contact the author for authorization, and for non-commercial reprints, please indicate the source.


During an internship, I put this question to a top student from the mathematics school of a Project 985 university (one of China's elite universities). His answer was:

When they face the problem of choosing a kernel function for an SVM, they have a fairly mature method for determining which kernel function suits the current data set. More often than not, that "kernel function suited to the current data set" is not something available off the shelf, and they have to implement it themselves.
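The answer does not say what that method is or what such a hand-written kernel looks like. Purely as an illustration of plugging a self-implemented kernel into an SVM, here is a sketch using scikit-learn, whose `SVC` accepts a callable that returns the Gram matrix. The kernel `linear_plus_rbf`, its `gamma` value, and the synthetic data are all assumptions, not the method described above.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC


def linear_plus_rbf(A, B, gamma=0.1):
    """Hypothetical custom kernel: linear kernel plus RBF kernel.

    The sum of two valid kernels is itself a valid kernel.
    Returns the Gram matrix of shape (len(A), len(B)).
    """
    return A @ B.T + rbf_kernel(A, B, gamma=gamma)


# Synthetic data stands in for a real data set.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

clf = SVC(kernel=linear_plus_rbf)
clf.fit(X, y)
train_accuracy = clf.score(X, y)
print(f"training accuracy: {train_accuracy:.3f}")
```

Training accuracy only confirms that the kernel plugs in and runs; judging whether it suits a data set needs held-out data.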

Hearing that, I was instantly in awe!

Later, through practice, I gradually found that in business settings people tend not to choose sophisticated machine learning algorithms such as SVM. They behave like black boxes: powerful, but hard to explain, for example which variables in the feature set matter most, or how each one drives changes in the dependent variable. By contrast, two algorithms, logistic regression (LR) and decision trees, are quite popular.
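To make the interpretability point concrete, here is a minimal scikit-learn sketch of the per-feature information that LR and a decision tree expose directly. The synthetic data and the feature names `f0` to `f3` are assumptions for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with hypothetical feature names.
X, y = make_classification(n_samples=300, n_features=4, n_informative=2,
                           n_redundant=0, random_state=0)
feature_names = ["f0", "f1", "f2", "f3"]

lr = LogisticRegression().fit(X, y)
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# LR: the sign and size of each coefficient show how a feature shifts the prediction.
# Tree: feature_importances_ shows how much each feature contributes to the splits.
for name, coef, importance in zip(feature_names, lr.coef_[0],
                                  tree.feature_importances_):
    print(f"{name}: LR coefficient={coef:+.3f}, tree importance={importance:.3f}")
```

An SVM with a nonlinear kernel has no equally direct per-feature summary, which is the black-box concern raised above.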

If the original poster is not using SVM for academic purposes, there is no need to dig into which kernel function is most suitable.
The first reason is the one given above.
The second reason is that SVM has only a handful of ready-made kernel functions, so you can find the relatively best one simply by trying them one by one. That said, as I understand it, SVM is heavily dependent on its settings: besides depending on a suitable kernel function, it also depends heavily on the specific parameters of whichever kernel is chosen.
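Trying the ready-made kernels one by one, together with each kernel's own parameters, can be automated with a cross-validated grid search. The sketch below uses scikit-learn. The synthetic data, the feature scaling step, and the specific values of `C`, `gamma`, and `degree` are assumptions for illustration, not recommendations from the answer.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic data stands in for a real data set.
X, y = make_classification(n_samples=300, n_features=10, n_informative=5,
                           random_state=0)

# Scaling is included because SVMs are sensitive to feature scale.
pipeline = make_pipeline(StandardScaler(), SVC())

# One dict per kernel, each with the parameters that matter for that kernel.
param_grid = [
    {"svc__kernel": ["linear"], "svc__C": [0.1, 1, 10]},
    {"svc__kernel": ["rbf"], "svc__C": [0.1, 1, 10],
     "svc__gamma": ["scale", 0.01, 0.1]},
    {"svc__kernel": ["poly"], "svc__C": [0.1, 1, 10], "svc__degree": [2, 3]},
]

search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X, y)

best_params = search.best_params_
best_score = search.best_score_
print(best_params, round(best_score, 3))
```

Searching kernels and parameters together matters because, as noted above, a kernel with poorly chosen parameters can look worse than it really is.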

Author: Zhihu User
Link: https://www.zhihu.com/question/33268516/answer/57579127
Source: Zhihu
The copyright belongs to the author. For commercial reprints, please contact the author for authorization, and for non-commercial reprints, please indicate the source.
