Machine Learning Foundations

0 Preface

  This blog records some of what I gained while learning machine learning, stumbling forward along the way. I am grateful to everyone who helped me during this process; thank you! I also read many blogs while studying and learned a great deal from them, and I am very thankful to their authors for sharing so selflessly. Because I have been studying for quite a while and only recently began to organize these notes into a summary, I try to indicate the source of each piece of knowledge wherever possible; if a source is missing, I apologize, and I will add the link.

1. The Basic Workflow

  A basic machine learning process can be summarized as follows:

(Figure: data collection → data cleaning → feature engineering → data modeling → performance evaluation)

  Data collection is easy to understand: we need to obtain a data set. Data cleaning refers to the fact that collected data are often "dirty": they contain noise, missing values, and so on, and therefore need some preprocessing; a later post on linear regression will use examples to particularly stress the importance of data cleaning. Feature engineering refers to extracting, from the data, features that reflect the true underlying distribution. In theory the cleaned data could be used directly for modeling, so why do we need feature engineering? Mainly because skipping it easily leads to the "curse of dimensionality" and poor model performance; moreover, the cleaned features are not independent, and there is redundancy among them, so a further feature-engineering step extracts features that are more favorable for modeling. Data modeling then builds a regression model if the task is regression, or a classification model if it is classification. Performance evaluation refers to using some metric to judge how well the trained model performs; common metrics include the mean squared error, recall, precision, the PR curve, the AUC value, and so on.

  The basic machine-learning workflow can be likened to cooking scrambled eggs with tomatoes. Data collection corresponds to buying eggs and tomatoes at the market; because we did not pick carefully when buying, some are bad and have to be cleaned up, which corresponds to deleting noisy or abnormal records in data cleaning. Feature engineering corresponds to slicing the cleaned tomatoes and beating the eggs, a further processing step. Once everything is ready we can start frying the dish, which corresponds to data modeling in machine learning; the different ways of cooking are like modeling with different models, algorithms, and optimization methods. Before serving, we need to taste the dish, just as performance evaluation judges whether the model performs well or poorly.
The basic workflow and its example are from [1].
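To make the workflow concrete, here is a toy end-to-end sketch in plain Python. The data, the cleaning rule (dropping missing values), and the centering step are all invented for illustration; a real project would use a library such as scikit-learn.

```python
# Toy run of the workflow: collect -> clean -> feature engineering
# -> model -> evaluate.

# 1. "Collected" data: (size, price) pairs; one record is corrupted (None).
raw = [(50, 100), (60, 120), (None, 999), (80, 160), (90, 180)]

# 2. Cleaning: drop records with missing values.
clean = [(x, y) for x, y in raw if x is not None]

# 3. Feature engineering: here, simply centering the input feature.
mean_x = sum(x for x, _ in clean) / len(clean)
data = [(x - mean_x, y) for x, y in clean]

# 4. Modeling: closed-form least squares for y = w*x + b.
n = len(data)
mean_y = sum(y for _, y in data) / n
w = sum(x * (y - mean_y) for x, y in data) / sum(x * x for x, _ in data)
b = mean_y  # inputs are centered, so the intercept is the mean of y

# 5. Evaluation: mean squared error on the training data.
mse = sum((w * x + b - y) ** 2 for x, y in data) / n
print(w, b, mse)  # this toy data happens to fit exactly, so mse is 0
```

Each numbered step mirrors one stage of the workflow figure above.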

2. Basic Assumptions

  In traditional machine learning, to ensure that the trained model has high accuracy and reliability, there are two basic assumptions:

    (1) There must be sufficient clean training samples available to learn a good classification model.

    (2) The training samples used for learning and the new test samples are independent and identically distributed.

  How should we understand these two assumptions? Machine learning is a data-driven approach that relies on data to build mathematical models. A saying circulates in the machine learning community: "Data and features determine the upper limit of machine learning; models and algorithms merely approach that limit." Before designing a machine learning algorithm, we should fully consider the characteristics of the data, such as whether samples have missing values, whether the classes are balanced, whether noise is present, and whether the features need to be normalized. It is worth noting that both assumptions above are assumptions about the data, which again shows the importance of good data.
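As a small illustration of inspecting these data characteristics before modeling, the following sketch counts missing feature values and checks the class balance. The data set and its field names are invented for illustration.

```python
# Quick sanity checks on a toy data set: missing values and class balance.
from collections import Counter

samples = [
    {"f1": 0.5,  "f2": 1.2,  "label": "pos"},
    {"f1": None, "f2": 0.7,  "label": "neg"},
    {"f1": 0.9,  "f2": None, "label": "pos"},
    {"f1": 0.1,  "f2": 0.3,  "label": "pos"},
]

# Count missing feature values across all samples.
missing = sum(1 for s in samples for k in ("f1", "f2") if s[k] is None)

# Count samples per class to reveal imbalance (here 3 "pos" vs 1 "neg").
balance = Counter(s["label"] for s in samples)

print(missing)
print(balance)
```

Checks like these are cheap, and they tell us early whether the data even come close to satisfying the two assumptions above.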

3. Basic Elements

  Machine learning means learning (or "guessing") general rules from a limited set of observed data, and generalizing those rules to unobserved samples. Machine learning methods can be roughly decomposed into three basic elements: the model, the learning criterion, and the optimization method. [2]

  The model refers to what kind of mathematical model we build according to the actual problem: a regression model or a classification model. The learning criterion refers to what function we set up as the objective, also called the loss function; common examples are the mean squared error, the 0-1 loss, and the cross-entropy loss often used for classification in deep learning. The optimization method refers to how we solve for the minimum (or maximum) of the objective function; the most common one is gradient descent, but because in many cases the objective function is not convex, gradient descent does not necessarily converge to the global optimum.
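The three elements can be seen together in a tiny sketch: a linear model, the mean squared error as the learning criterion, and gradient descent as the optimization method. The data and the learning rate below are arbitrary toy choices.

```python
# Minimal gradient descent minimizing the MSE of the model y = w * x.

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # true relation: y = 2x

w = 0.0    # model parameter (the "model" element)
lr = 0.05  # learning rate (step size of the "optimization" element)

for _ in range(200):
    # Gradient of the MSE ("learning criterion"): (2/n) * sum(x * (w*x - y))
    grad = 2 * sum(x * (w * x - y) for x, y in data) / len(data)
    w -= lr * grad

print(round(w, 4))  # converges to roughly 2.0
```

For this convex toy objective gradient descent reaches the optimum; as noted above, on non-convex objectives it may only find a local one.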

4. Basic Theorems

(1) No Free Lunch Theorem [2]

  There is no machine learning algorithm that suits every domain or task; specific problems require specific analysis. For iterative optimization algorithms, no single algorithm is effective for all problems (over a finite search space). If an algorithm works well on certain problems, it must perform worse than pure random search on some other problems. In other words, we cannot judge the merits of an algorithm apart from a concrete problem; every algorithm has its limitations.

(2) Occam's Razor

  "Entities should not be multiplied beyond necessity." If two models perform similarly, we should choose the simpler one.

(3) Ugly Duckling Theorem

  "The difference between an ugly duckling and a white swan is as great as the difference between two white swans." Without assumptions or prior knowledge, we have no reason to prefer any one set of feature representations and ignore the others.

(4) Minimum Description Length

  This principle requires minimizing the sum of the model's complexity and the length of the description of the training data under that model:

    minimize  L(model) + L(data | model)

  Take the decision tree model as an example. The number of nodes in the tree can be viewed as the model's complexity, and the weighted sum of the entropies of the data at all leaf nodes can be used to describe the length of describing the training data with the model. The minimum description length principle requires the sum of these two to be minimal. In other words, when performance metrics are nearly the same, the more "compact" the decision tree and the smaller (purer) the weighted entropy of its leaf data, the better that decision tree model may be.
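As a rough sketch of the quantity described above, the following toy code computes the weighted sum of leaf-node entropies for two invented sets of decision-tree leaves; a purer tree yields a smaller value. Each leaf is represented simply as the list of class labels of the training samples that fall into it.

```python
# Weighted sum of leaf-node entropies for a decision tree (toy leaves).
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    probs = [labels.count(c) / n for c in set(labels)]
    return -sum(p * log2(p) for p in probs)

def weighted_entropy(leaves):
    """Entropy of each leaf, weighted by the fraction of samples it holds."""
    total = sum(len(leaf) for leaf in leaves)
    return sum(len(leaf) / total * entropy(leaf) for leaf in leaves)

# A "pure" tree: each leaf holds a single class -> weighted entropy 0.
pure_leaves = [["a", "a"], ["b", "b", "b"]]
# A mixed tree: leaves still contain both classes -> weighted entropy > 0.
mixed_leaves = [["a", "b"], ["a", "a", "b"]]

print(weighted_entropy(pure_leaves))
print(weighted_entropy(mixed_leaves))
```

Comparing trees of similar accuracy, the principle favors the one whose node count and weighted leaf entropy sum to the smaller value.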

5. Summary

  The above is my summary and understanding of machine learning. For easy memorization, I condense it into "one workflow, two assumptions, three elements, four theorems." With these understood, they may offer some help when choosing machine learning models and algorithms, rather than choosing purely by experience.

References

[1] 邹博. Machine Learning (online course)

[2] 邱锡鹏. Neural Networks and Deep Learning

[3] 周志华. Machine Learning [M]. Beijing: Tsinghua University Press.


Origin www.cnblogs.com/chen-hw/p/11525739.html