Machine Learning [Zhou Zhihua, the "Watermelon Book", notes] Day 1: Introduction


I love this book because the cover looks good. PS: it is a key planned textbook! It requires some background knowledge, and genuine copies are sold on Jingdong ~

1.1 Introduction

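Machine learning: for example, we humans have eaten and seen many watermelons, so from a few features such as color, root (pedicle), and knocking sound we can make quite good judgments.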

1.2 Related Terms

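Here are a few of the important ones. All of these terms are a kind of abstraction; work through the rest on your own.
The process of learning a model from data is called "learning" or "training"; it is carried out by executing some learning algorithm.
The data used during training is called "training data"; each sample in it is a "training sample", and the set of training samples is the "training set".
A learned model corresponds to some underlying regularity in the data, so it is also called a "hypothesis";
the underlying regularity itself is called the "truth" or "ground truth", and the learning process aims to find or approximate it.
A model is sometimes called a "learner", which can be seen as an instantiation of the learning algorithm on given data and a given parameter space.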

1.3 Hypothesis Space

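Induction and deduction are the two basic means of scientific reasoning. The former is a process of generalization from the specific to the general, that is, deriving general rules from concrete facts; the latter is a process of specialization from the general to the specific, that is, deducing concrete situations from basic principles.

Inductive learning has a narrow and a broad sense. In the broad sense it is roughly equivalent to learning from examples, while in the narrow sense it requires learning a concept from the training data, and is therefore also called "concept learning" or "concept formation".

The most basic form of concept learning is Boolean concept learning, that is, learning target concepts such as "is" / "is not" that can be expressed as a 0/1 Boolean value.

We can view the learning process as a search through the space made up of all hypotheses. The goal of the search is to find a hypothesis that "matches" the training set, that is, one that judges the melons in the training set correctly. Once the representation of hypotheses is fixed, the hypothesis space and its size are fixed as well.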
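
As a rough illustration of how fixing the representation fixes the size of the hypothesis space, the sketch below enumerates every hypothesis of the form (color = ?) ∧ (root = ?) ∧ (knocking sound = ?). It assumes that each of the three attributes takes three possible values (the value names are illustrative translations, not taken from these notes), that `*` means "any value", and that one extra empty hypothesis stands for "no melon is a good melon".

```python
from itertools import product

# Assumed attribute values (illustrative, not from the notes above).
ATTRIBUTES = {
    "color": ["green", "black", "white"],
    "root": ["curled", "slightly curled", "stiff"],
    "sound": ["dull", "crisp", "muffled"],
}

WILDCARD = "*"  # matches any value of the attribute
EMPTY = None    # the hypothesis "no melon is a good melon"


def build_hypothesis_space(attributes):
    """Return every conjunctive hypothesis plus the empty hypothesis."""
    choices = [values + [WILDCARD] for values in attributes.values()]
    hypotheses = [tuple(h) for h in product(*choices)]
    hypotheses.append(EMPTY)
    return hypotheses


hypothesis_space = build_hypothesis_space(ATTRIBUTES)
print(len(hypothesis_space))  # (3 + 1) ** 3 + 1 = 65
```

Under these assumptions each attribute contributes its three values plus the wildcard, so changing the number of attributes or values changes the size of the space accordingly.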


The hypothesis space of the watermelon problem is shown below:
[Figure: hypothesis space of the watermelon problem]
The hypothesis space can be searched with many strategies, for example top-down (from general to specific) or bottom-up (from specific to general). During the search, hypotheses that are inconsistent with the positive examples, and/or consistent with the negative examples, are repeatedly deleted. What remains are the hypotheses consistent with the training set (that is, those that judge every training sample correctly), and this is the result of learning.
Note that real problems often involve a very large hypothesis space, while learning is based on a finite training set. Several hypotheses may therefore be consistent with the training set; this set of consistent hypotheses is called the "version space". For the watermelon problem, the version space corresponding to the training set of Table 1.1 is shown in the figure below.
[Figure: version space of the watermelon problem]
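
The elimination idea described above can be sketched in a few lines. This is a brute-force filter over all hypotheses rather than a top-down or bottom-up search, and both the attribute values and the four-sample training set are assumptions made for illustration (a hypothetical stand-in for Table 1.1, which is not reproduced in these notes). The empty hypothesis is left out because it cannot cover any positive example.

```python
from itertools import product

WILDCARD = "*"

# Assumed attribute values, in the order (color, root, knocking sound).
ATTRIBUTES = [
    ["green", "black", "white"],
    ["curled", "slightly curled", "stiff"],
    ["dull", "crisp", "muffled"],
]

# Hypothetical training set: (sample, is_good_melon).
TRAINING_SET = [
    (("green", "curled", "dull"), True),
    (("black", "curled", "dull"), True),
    (("green", "stiff", "crisp"), False),
    (("black", "slightly curled", "muffled"), False),
]


def covers(hypothesis, sample):
    """A hypothesis labels a sample 'good melon' if every condition matches."""
    return all(h == WILDCARD or h == x for h, x in zip(hypothesis, sample))


def is_consistent(hypothesis, training_set):
    """Consistent: covers every positive example and no negative example."""
    return all(covers(hypothesis, x) == label for x, label in training_set)


def version_space(attributes, training_set):
    """Keep only the hypotheses that agree with every training sample."""
    candidates = product(*[values + [WILDCARD] for values in attributes])
    return [h for h in candidates if is_consistent(h, training_set)]


for hypothesis in version_space(ATTRIBUTES, TRAINING_SET):
    print(hypothesis)
```

With this assumed training set, three hypotheses survive, which shows how a finite training set can leave more than one consistent hypothesis.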
1.4 Inductive Preference

A concrete learning algorithm must produce a single model. Here the "preference" of the learning algorithm itself plays a key role. For example, if our algorithm prefers models that are "as specific as possible", it will choose "good melon <-> (color = *) ∧ (root = curled) ∧ (knocking sound = dull)"; but if our algorithm prefers models that are "as general as possible", and for some reason "trusts" the root more, it will choose "good melon <-> (color = *) ∧ (root = curled) ∧ (knocking sound = *)". The preference of a machine learning algorithm for a certain type of hypothesis during learning is called its "inductive preference", or simply its "preference".
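
One way to picture such a preference is as a rule for picking one hypothesis out of several consistent candidates. In the minimal sketch below, hypotheses are (color, root, knocking sound) tuples; the first two candidates are the ones named above, the third is an assumed extra candidate, and the helper names (`generality`, `most_specific`, `most_general`, `trusted_index`) are hypothetical.

```python
WILDCARD = "*"

# Candidate hypotheses as (color, root, knocking sound) tuples.
CANDIDATES = [
    ("*", "curled", "dull"),  # named in the text: as specific as possible
    ("*", "curled", "*"),     # named in the text: general, trusts the root
    ("*", "*", "dull"),       # assumed extra candidate for illustration
]


def generality(hypothesis):
    """Count wildcards: more wildcards means a more general hypothesis."""
    return sum(1 for condition in hypothesis if condition == WILDCARD)


def most_specific(candidates):
    """Preference for 'as specific as possible': fewest wildcards."""
    return min(candidates, key=generality)


def most_general(candidates, trusted_index):
    """Preference for 'as general as possible', breaking ties in favor of
    hypotheses that keep a concrete condition on the trusted attribute."""
    return max(
        candidates,
        key=lambda h: (generality(h), h[trusted_index] != WILDCARD),
    )


print(most_specific(CANDIDATES))                  # ('*', 'curled', 'dull')
print(most_general(CANDIDATES, trusted_index=1))  # ('*', 'curled', '*')
```

Counting wildcards is only one possible way to measure how general a hypothesis is; a different measure would encode a different preference.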

[Figure: several curves consistent with a finite training set]
In this figure, several different curves are consistent with the same finite training set.
Inductive preference can be seen as the heuristic, or "values", by which a learning algorithm selects hypotheses from a possibly very large hypothesis space. So is there a general principle that guides an algorithm toward the "right" preference? "Occam's razor" is a commonly used one, and a basic principle in natural science research: "if several hypotheses are consistent with the observations, choose the simplest one."
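
To make the idea concrete, here is a small sketch under stated assumptions: a hypothetical training set of three points, two hand-written curves that both pass through all of them, and polynomial degree used as a stand-in for "simplicity". None of these choices come from the book; they only illustrate how a simplicity preference selects among consistent hypotheses.

```python
# Hypothetical finite training set: three points lying on the line y = x.
TRAINING_POINTS = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]


def curve_a(x):
    """Simple hypothesis: a straight line (degree 1)."""
    return x


def curve_b(x):
    """More complex hypothesis: a cubic that also hits every training point."""
    return x + x * (x - 1.0) * (x - 2.0)


# name -> (model, complexity); complexity here is just the polynomial degree.
MODELS = {"A": (curve_a, 1), "B": (curve_b, 3)}


def is_consistent(model, points, tol=1e-9):
    """A model is consistent if it reproduces every training point."""
    return all(abs(model(x) - y) <= tol for x, y in points)


consistent = {
    name: (model, degree)
    for name, (model, degree) in MODELS.items()
    if is_consistent(model, TRAINING_POINTS)
}

# Occam's razor as a preference: among consistent models, pick the lowest degree.
chosen = min(consistent, key=lambda name: consistent[name][1])

print(sorted(consistent))          # ['A', 'B']
print(chosen)                      # A
print(curve_a(3.0), curve_b(3.0))  # 3.0 9.0
```

Both curves agree on the training points but disagree at x = 3, so the training set alone cannot decide between them. What counts as "simplest" is itself a design choice; degree is only one option.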

In fact, an inductive preference corresponds to the assumption that the learning algorithm itself makes about "what kind of model is better". In a concrete real-world problem, whether this assumption holds, that is, whether the algorithm's inductive preference matches the problem itself, most of the time directly determines whether the algorithm can achieve good performance.

Suppose learning algorithm ξa, based on one inductive preference, produces the model corresponding to curve A, and learning algorithm ξb, based on another inductive preference, produces the model corresponding to curve B.
1.5 Development

Machine learning is the inevitable product of artificial intelligence research reaching a certain stage. In fact, Turing already mentioned the possibility of machine learning in his 1950 article on the Turing test.
Machine learning has now developed into a fairly large field.

1.6 Application Status

Today, machine learning techniques can be found in many branches of computer science, whether multimedia, graphics, network communication, software engineering, or even computer architecture and chip design.
Machine learning also provides important technical support for a number of interdisciplinary fields.

1.7 Readings

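Foreign-language databases contain many papers and journals related to machine learning (Chinese databases are not recommended); interested readers can read and study them.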

Origin blog.csdn.net/weixin_43838785/article/details/104180380