"Data Mining Concepts and Techniques" Chapter 9 Classification: Advanced Methods

This chapter builds on the basic classification methods introduced in the previous chapter and covers the following advanced methods:

Bayesian network

Unlike naive Bayes, Bayesian networks do not assume independence between variables; they explicitly model the dependencies between variables (attributes).
Given a network topology diagram, the following concepts apply:
If an arc goes from node Y to node Z, then Y is a parent (immediate predecessor) of Z, and Z is a descendant of Y.
Given its parents, each variable is conditionally independent of its non-descendants in the graph.
This also means that for each variable, the probability of the variable taking each value must be specified for every combination of values of its parent nodes,
that is, P(Y | Parents(Y)).
Each variable (attribute) therefore has an associated conditional probability table (CPT).

FamilyHistory (family disease history)
LungCancer (lung cancer)
Smoker (smoking status)
Emphysema (emphysema)
PositiveXRay (abnormal X-ray)
Dyspnea (difficulty breathing)

The above is the dependency diagram over the attributes of lung cancer patients.
For example, the CPT of the LungCancer variable is:

Here B = LungCancer, A = FamilyHistory, and E = Smoker (following the textbook example), and "-" denotes negation:

|     | A, E | A, -E | -A, E | -A, -E |
|-----|------|-------|-------|--------|
| B   | 0.8  | 0.5   | 0.7   | 0.1    |
| -B  | 0.2  | 0.5   | 0.3   | 0.9    |

Then we can read off:
P(B = yes | A = yes, E = yes) = 0.8
P(B = no | A = no, E = no) = 0.9
Note that when a Bayesian network is used for classification, it does not return a single class label; it returns a probability distribution, giving the probability of each class.
Using the above example, we can also answer probabilistic inference queries (given a person with an abnormal X-ray and difficulty breathing, what is the probability that he has lung cancer?) and most-likely-explanation queries (which group of people is most likely to have an abnormal X-ray and difficulty breathing?).
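As a minimal sketch of how such a CPT can be queried, the table for the example above can be stored as a dictionary keyed by the parents' values (the variable labels B = LungCancer, A = FamilyHistory, E = Smoker are assumed from the example):

```python
# Minimal sketch of a CPT lookup for the example above.
# B = LungCancer, parents A = FamilyHistory, E = Smoker (assumed labels).
CPT_B = {
    # (A, E): P(B = yes | A, E)
    (True, True): 0.8,
    (True, False): 0.5,
    (False, True): 0.7,
    (False, False): 0.1,
}

def p_lung_cancer(a: bool, e: bool, b: bool = True) -> float:
    """Return P(B = b | A = a, E = e) from the table."""
    p_yes = CPT_B[(a, e)]
    return p_yes if b else 1.0 - p_yes

print(p_lung_cancer(True, True))             # P(B=yes | A=yes, E=yes) -> 0.8
print(p_lung_cancer(False, False, b=False))  # P(B=no  | A=no,  E=no)  -> 0.9
```

Each row of the table is a valid distribution over B, so only P(B = yes | ...) needs to be stored.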
Steps to train a Bayesian network (gradient-based, when the topology is given but some CPT entries must be learned):

  1. Compute the gradients
  2. Take a small step in the direction of the gradient
  3. Renormalize the weights
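The steps above can be sketched on a single CPT column; note the gradient values here are placeholder numbers, not computed from real training data:

```python
# Hedged sketch of one gradient-ascent update on a CPT column (steps 1-3
# above). The gradient is a placeholder; a real implementation computes it
# from the likelihood of the training tuples.
def gradient_step(weights, gradient, lr=0.1):
    """Move weights along the gradient, then renormalize to sum to 1."""
    updated = [max(w + lr * g, 1e-9) for w, g in zip(weights, gradient)]
    total = sum(updated)  # step 3: keep the column a valid distribution
    return [w / total for w in updated]

# One CPT column P(B = yes / B = no | A = yes, E = yes) and a toy gradient:
col = gradient_step([0.8, 0.2], [0.5, -0.5])
print(col)  # still sums to 1
```

The renormalization in step 3 is what keeps each CPT entry a legal probability after the unconstrained gradient move.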

backward propagation

Backpropagation is presented as an introduction to neural networks: a feedforward pass computes the network's outputs, and backpropagation then propagates the error backward to update the weights and biases.
When backpropagation is applied to classification, whether the task is binary or multiclass depends on the activation function of the output layer (typically sigmoid for binary classification, softmax for multiclass).
Advantages
The method can be used when little is known about the relationships between attributes and classes, and it is well suited to continuous-valued inputs and outputs.
Disadvantages
Training may converge to a local minimum.
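A minimal sketch of the feedforward-then-backward update, reduced to a single sigmoid neuron (the toy AND dataset and learning rate are made up for illustration; a full network applies the same chain-rule update through hidden layers):

```python
import math

# Minimal sketch: one logistic neuron trained by gradient descent on a toy
# AND problem. Real backpropagation extends this chain-rule update through
# hidden layers.
def train(data, epochs=2000, lr=0.5):
    w1, w2, b = 0.0, 0.0, 0.0
    for _ in range(epochs):
        for (x1, x2), y in data:
            z = w1 * x1 + w2 * x2 + b
            out = 1.0 / (1.0 + math.exp(-z))  # forward pass (sigmoid)
            err = out - y                     # dLoss/dz for log-loss
            w1 -= lr * err * x1               # backward pass: propagate
            w2 -= lr * err * x2               # the error to each
            b -= lr * err                     # parameter
    return w1, w2, b

def predict(params, x1, x2):
    w1, w2, b = params
    return 1.0 / (1.0 + math.exp(-(w1 * x1 + w2 * x2 + b)))

data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]  # AND
params = train(data)
print([round(predict(params, x1, x2)) for (x1, x2), _ in data])  # [0, 0, 0, 1]
```

Because AND is linearly separable, this single neuron reaches a good solution; the local-minimum problem mentioned above appears once hidden layers make the loss surface non-convex.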

SVM Support Vector Machine

The detailed interpretation and derivation are usually covered in a machine learning course.
In classification, SVM is a method for classifying both linear and nonlinear data.
SVM was originally designed for linearly separable data, for which it finds a "best" separating line; in higher dimensions, it finds the best separating hyperplane.
And what is "best"? It is the hyperplane with the largest margin.
This introduces the concept of support vectors: the training tuples that lie closest to, and equally distant from, the maximum margin hyperplane (MMH).
The problem is then transformed into how to solve for the support vectors and the MMH.
Because mapping low-dimensional data into a high-dimensional space leads to many dot-product computations, which are expensive, kernel functions were proposed: a kernel computes the high-dimensional dot product directly from the original inputs.
Training an SVM yields a global solution.
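The kernel trick mentioned above can be demonstrated concretely: for the degree-2 polynomial kernel K(x, y) = (x·y)², the kernel value computed in the original 2-D space equals the dot product of explicit quadratic feature maps in 3-D, without ever building those maps (the input vectors here are arbitrary examples):

```python
import math

# Sketch of the kernel trick: the degree-2 polynomial kernel
# K(x, y) = (x . y)^2 equals the dot product of explicit quadratic
# feature maps, so the high-dimensional dot product never has to be
# computed directly.
def poly_kernel(x, y):
    return sum(a * b for a, b in zip(x, y)) ** 2

def phi(x):
    """Explicit feature map for 2-D input: (x1^2, x2^2, sqrt(2)*x1*x2)."""
    x1, x2 = x
    return (x1 * x1, x2 * x2, math.sqrt(2) * x1 * x2)

x, y = (1.0, 2.0), (3.0, 4.0)
implicit = poly_kernel(x, y)                           # computed in 2-D
explicit = sum(a * b for a, b in zip(phi(x), phi(y)))  # computed in 3-D
print(implicit, explicit)  # both 121.0
```

The savings grow quickly: for degree-d kernels on n-dimensional inputs, the explicit feature space has O(n^d) coordinates, while the kernel still costs only one n-dimensional dot product.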

Classification using frequent patterns

  • Apriori → CBA (Classification Based on Associations)
  • FP-growth → CMAR (Classification based on Multiple Association Rules)
  • DDPMine mines a collection of highly discriminative frequent patterns directly from the FP-tree.

lazy learning

K-nearest neighbor classification

The classification result depends on the distance computation (e.g., Euclidean distance) and on the choice of k.
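A minimal k-nearest-neighbor sketch using Euclidean distance; the 2-D points and labels below are illustrative, not from the text:

```python
from collections import Counter

# Minimal k-nearest-neighbor sketch: classify a query point by majority
# vote among its k nearest training points (Euclidean distance).
def knn_classify(train, query, k=3):
    """train: list of ((x, y), label); returns the majority label of the
    k training points closest to query."""
    nearest = sorted(train, key=lambda p: (p[0][0] - query[0]) ** 2
                                        + (p[0][1] - query[1]) ** 2)
    votes = Counter(label for _, label in nearest[:k])
    return votes.most_common(1)[0][0]

train = [((0, 0), "A"), ((0, 1), "A"), ((1, 0), "A"),
         ((5, 5), "B"), ((5, 6), "B"), ((6, 5), "B")]
print(knn_classify(train, (0.5, 0.5)))  # "A"
print(knn_classify(train, (5.5, 5.5)))  # "B"
```

Note there is no training phase at all, which is what makes the method "lazy": all work is deferred to query time.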

case-based reasoning

Other classification methods

Genetic Algorithm

In a genetic algorithm, each rule is represented by a binary string. Suppose the samples have two Boolean attributes A1 and A2, and there are two classes C1 and C2. Then the rule
IF A1 AND NOT A2 THEN C2
can be encoded as the binary string "100", where the two leftmost bits encode the values of A1 and A2 and the rightmost bit encodes the class.
The population of rules then "evolves" through crossover and mutation operations until every rule in the population satisfies a specified fitness threshold.
For the concepts of crossover, mutation, evolution, and fitness measurement in genetic algorithm, you can refer to the following:
https://zhuanlan.zhihu.com/p/33042667
Genetic algorithms can also be used to evaluate the fitness of other algorithms.
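The crossover and mutation operators can be sketched directly on rule strings encoded as bits, as in the "100" example above (the crossover point and mutated position are chosen arbitrarily here; a real GA picks them at random):

```python
# Toy sketch of the two genetic operators on bit-string rule encodings.
def crossover(a, b, point):
    """Swap the tails of two bit strings at the given crossover point."""
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(s, pos):
    """Flip one bit of the rule string."""
    flipped = "1" if s[pos] == "0" else "0"
    return s[:pos] + flipped + s[pos + 1:]

child1, child2 = crossover("100", "011", 1)
print(child1, child2)    # "111" "000"
print(mutate("100", 2))  # "101"
```

A full GA would wrap these operators in a loop that selects parents by fitness, applies crossover and mutation, and repeats until the fitness threshold is met.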

rough set method

Rough set theory can be used for classification to discover structural relationships within imprecise or noisy data. It applies to discrete-valued attributes, so continuous-valued attributes must be discretized before use.
It can likewise be used for attribute subset selection (feature reduction: identifying and removing attributes that do not contribute to the classification of the given training data)
and for relevance analysis (assessing the contribution or significance of each attribute).
Rough sets are used to approximately define classes that are indistinguishable based on the available attributes.

Fuzzy set method

Fuzzy set theory is very useful for certain numeric attributes. In classification and association rule mining, the same sample is allowed to belong to more than one class, and subsequent mining proceeds from the membership degrees of the classification result.
Membership functions replace the "brittle" thresholds of continuous-valued attributes.
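A minimal sketch of such a membership function: instead of a crisp cutoff like "income ≥ 50k is high", membership in "high income" rises gradually (the breakpoints 40 and 60 below are made-up values for illustration):

```python
# Sketch of a fuzzy membership function replacing a brittle threshold.
# The breakpoints (40 and 60, in thousands) are illustrative.
def high_income_membership(income_k):
    """Degree of membership in the fuzzy set 'high income', in [0, 1]."""
    if income_k <= 40:
        return 0.0
    if income_k >= 60:
        return 1.0
    return (income_k - 40) / 20.0  # linear ramp between the breakpoints

print(high_income_membership(39))  # 0.0 -- clearly not high
print(high_income_membership(50))  # 0.5 -- partially high
print(high_income_membership(65))  # 1.0 -- clearly high
```

With a crisp threshold at 50, incomes of 49.9 and 50.1 would land in different classes; the ramp lets a sample belong partially to both sets.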

classification problem

Multi-classification problem

From the SVM section we know that SVM is mainly designed for binary classification; what about multiclass problems?
SVM can use a one-versus-rest or a one-versus-one strategy.

  1. One-versus-rest: in each round, the samples of one class are marked as positive and all remaining samples as negative, and a classifier is trained; repeat until every class has served as the positive class.
  2. One-versus-one: for a sample set with M classes, construct M(M-1)/2 SVM classifiers, one per pair of classes. This is more sensitive to errors in individual classifiers and has higher overhead.
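The one-versus-one voting scheme can be sketched with stub classifiers (here each pairwise "classifier" just compares a 1-D feature against made-up class centers; a real system would plug in M(M-1)/2 trained SVMs):

```python
from collections import Counter
from itertools import combinations

# Sketch of one-versus-one multiclass voting with stub pairwise
# classifiers. CENTERS holds made-up 1-D class centers.
CENTERS = {"A": 0.0, "B": 5.0, "C": 10.0}

def pairwise_vote(x):
    votes = Counter()
    for c1, c2 in combinations(CENTERS, 2):  # M(M-1)/2 = 3 pairs for M = 3
        winner = c1 if abs(x - CENTERS[c1]) < abs(x - CENTERS[c2]) else c2
        votes[winner] += 1
    return votes.most_common(1)[0][0]

print(pairwise_vote(1.0))  # "A"
print(pairwise_vote(9.0))  # "C"
```

The quadratic growth in the number of pairwise classifiers is exactly the overhead the text mentions: for M = 10 classes you already need 45 SVMs.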

Semi-supervised classification

  • Self-training
    The model labels unlabeled samples with its own predictions, which is prone to reinforcing its own errors.
  • Co-training
    Two classifiers trained on different feature sets label samples for each other, but it is difficult to split the features into mutually exclusive, class-conditionally independent sets.
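Self-training can be sketched on toy 1-D data: a nearest-labeled-neighbor "classifier" pseudo-labels the unlabeled point it is most confident about, then repeats (the data, the confidence radius, and the 1-NN base classifier are all illustrative choices):

```python
# Toy self-training sketch: a nearest-labeled-neighbor classifier
# pseudo-labels the closest unlabeled point first, then repeats.
def self_train(labeled, unlabeled, confidence_radius=2.0):
    labeled = dict(labeled)  # {x: label}
    pool = list(unlabeled)
    while pool:
        # pick the unlabeled point closest to any labeled point
        best = min(pool, key=lambda u: min(abs(u - x) for x in labeled))
        dist = min(abs(best - x) for x in labeled)
        if dist > confidence_radius:
            break  # not confident enough about anything; stop
        nearest = min(labeled, key=lambda x: abs(best - x))
        labeled[best] = labeled[nearest]  # pseudo-label it
        pool.remove(best)
    return labeled

result = self_train({0.0: "neg", 10.0: "pos"}, [1.5, 3.0, 8.5])
print(result)
```

Note how the point at 3.0 inherits its label from the pseudo-labeled 1.5 rather than a true label: this chaining is exactly how self-training can reinforce an early mistake.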

active learning

Active learning is an iterative form of supervised learning that deliberately asks the user for the class labels of selected tuples in order to train the model.
Most research in active learning concerns how to choose the tuples to ask about.
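One common selection rule, sketched here with made-up model outputs, is uncertainty sampling: ask the user about the tuple whose predicted class probability is closest to 0.5:

```python
# Sketch of uncertainty sampling for choosing the next tuple to label.
# The probabilities below are made-up model outputs, not real predictions.
def most_uncertain(pool):
    """pool: {sample_id: P(positive)}; return the id to query next."""
    return min(pool, key=lambda s: abs(pool[s] - 0.5))

pool = {"t1": 0.95, "t2": 0.52, "t3": 0.10, "t4": 0.70}
print(most_uncertain(pool))  # "t2" -- the model is least sure about it
```

The intuition is that labels for confidently classified tuples (like t1 and t3) add little information, so the annotator's effort is spent where the model is most unsure.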

transfer learning

Extract knowledge from one or more source tasks and apply that knowledge to target tasks. TrAdaBoost is an instance-based transfer learning method that reweights certain tuples from the source task and uses them to learn the target task, thus requiring only a few labeled target task tuples.
Transfer learning will be studied further using the Jupyter notebook resources below:
https://github.com/jindongwang/transferlearning#0latest
https://github.com/dipanjanS/hands-on-transfer-learning-with-python/tree/master/notebooks
