Is machine learning the same thing as algorithms?

Many of today's recommendation systems use machine learning, and some use deep learning. So is machine learning just a pile of algorithms?

The answer is: machine learning ≠ algorithms.

Machine learning ≠ algorithms

When we open a textbook or a university syllabus, we usually see a long list of algorithms.

This creates a common misunderstanding: that machine learning is about mastering a series of algorithms. In fact, machine learning does not stop at algorithms; it is better seen as a comprehensive approach to solving problems. The individual algorithms are only the tip of the iceberg. The rest of the challenge is applying those algorithms correctly.

Why is machine learning so magical?

Machine learning is the practice of teaching computers to analyze data and discover patterns, so that people can use those patterns to make predictions or decisions.

For machine learning in the true sense, the computer must be able to learn patterns from data that it was not explicitly programmed to recognize.

Example:

A child is playing at home when he suddenly sees a candle. He walks slowly toward it.

Out of curiosity, he sticks a finger into the candle flame;
"Ow!" he yells, and pulls his hand back;
"Hmm... that red, glowing thing really hurts!"
Two days later, he is in the kitchen and sees the stove. Again, he is very curious.

He is very, very curious, and he is about to reach out and touch it;
suddenly, he notices that this thing is glowing, and it is red!
"Ah..." he says to himself, "I don't want that pain again!"
He remembers that red, glowing things mean "pain", so he leaves the stove and goes somewhere else.
To be clear, we call this "machine learning" only because the child inferred a pattern on his own from his experience with the candle.

The pattern is: "red and glowing" means "pain".

If the child had stayed away from the stove because his parents warned him, that would be "explicit programming" rather than machine learning.
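The contrast between the two cases can be sketched in a few lines of Python. This is only an illustration: the feature encoding (`is_red`, `is_bright`), the list of experiences, and the helper names `explicit_rule` and `learn_rule` are all assumptions made up for this sketch, and the "learning" is nothing more than tallying past outcomes.

```python
from collections import Counter, defaultdict

def explicit_rule(is_red, is_bright):
    # Explicit programming: a person (the parent) writes the rule down.
    return is_red and is_bright

def learn_rule(experiences):
    # Learning: tally the outcomes seen for each feature combination
    # and keep the most common outcome as the prediction.
    outcomes = defaultdict(Counter)
    for features, hurt in experiences:
        outcomes[features][hurt] += 1
    table = {f: counts.most_common(1)[0][0] for f, counts in outcomes.items()}

    def predict(is_red, is_bright):
        # Unseen combinations default to "no pain" (an arbitrary choice here).
        return table.get((is_red, is_bright), False)

    return predict

# Hypothetical experiences: ((is_red, is_bright), did_it_hurt)
experiences = [
    ((True, True), True),     # candle flame
    ((False, False), False),  # toy block
    ((True, False), False),   # red ball
]

learned_rule = learn_rule(experiences)
print(learned_rule(True, True))  # the stove: red and glowing
```

Both functions give the same answer for the stove, but only `learned_rule` got there from data; nobody told it that red and glowing means pain.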

Key terms

Model - a set of patterns learned from data;

Algorithm - a specific ML process used to train a model;

Training data - the dataset an algorithm uses to train the model;

Test data - a new dataset used to evaluate the model's performance objectively;

Features - the variables in the dataset that are used to train the model;

Target variable - the specific variable we are trying to predict;

Example:

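Suppose we have a dataset with information on 150 primary school students, and we want to predict their height from their age, gender, and weight.

We have 150 data points, 1 target variable (height), and 3 features (age, gender, weight). Next, we split the data into two subsets:

120 points are used to train different models (the training set), and the remaining 30 are used to pick the best model (the test set).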
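A minimal sketch of that 120/30 split, using only the Python standard library. The student records are randomly generated placeholders rather than real data, and the variable names (`students`, `X_train`, `y_train`, and so on) are assumptions for illustration.

```python
import random

random.seed(0)  # fixed seed so the shuffle is reproducible

# Placeholder records standing in for the 150 students:
# (age, gender, weight_kg, height_cm). The values are synthetic.
students = [
    (
        random.randint(6, 12),
        random.choice(["F", "M"]),
        round(random.uniform(20, 50), 1),
        round(random.uniform(110, 160), 1),
    )
    for _ in range(150)
]

random.shuffle(students)     # shuffle so the split does not depend on row order
train_set = students[:120]   # 120 points for training
test_set = students[120:]    # the remaining 30 are held out

# Separate the 3 features from the target variable (height).
X_train = [row[:3] for row in train_set]
y_train = [row[3] for row in train_set]
X_test = [row[:3] for row in test_set]
y_test = [row[3] for row in test_set]

print(len(X_train), len(X_test))  # 120 30
```

Shuffling before slicing matters when the rows are stored in some meaningful order (for example, sorted by age); otherwise the test set would not resemble the training set.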

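Machine learning tasks

In academia, machine learning starts with, and stays focused on, individual algorithms. In industry, however, we first have to pick the right machine learning task for the job.

· A task is a specific objective for an algorithm.

· As long as we pick the right task, algorithms can be swapped in and out to accomplish it.

· In practice, we try several different algorithms, because we usually do not know at the start which one will suit the dataset best.

The two most common categories of machine learning tasks are supervised learning and unsupervised learning.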
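The idea of swapping algorithms in and out for a fixed task can be sketched as follows. The task here is predicting height from age; the two candidate algorithms (a mean baseline and a least-squares line), the toy numbers, and all function names are assumptions chosen for this sketch, not something the original text prescribes.

```python
def fit_mean(xs, ys):
    # Candidate 1: ignore the feature and always predict the average target.
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_line(xs, ys):
    # Candidate 2: ordinary least-squares line through the training points.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

def mean_abs_error(model, xs, ys):
    return sum(abs(model(x) - y) for x, y in zip(xs, ys)) / len(xs)

# Toy data (made up): age in years -> height in cm.
train_x, train_y = [6, 7, 8, 9, 10, 11], [115, 121, 127, 133, 138, 144]
test_x, test_y = [7, 10, 12], [122, 139, 150]

# Same task, same data; only the algorithm changes.
candidates = {"mean baseline": fit_mean, "least-squares line": fit_line}
scores = {
    name: mean_abs_error(fit(train_x, train_y), test_x, test_y)
    for name, fit in candidates.items()
}
best = min(scores, key=scores.get)
print(best, scores)
```

Because every candidate exposes the same "fit, then predict" shape, adding a third algorithm is a one-line change to `candidates`.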

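Supervised learning

Supervised learning covers tasks on "labeled" data (in other words, we have a target variable).

· In practice, it is often used as an advanced form of predictive modeling.

· Every data point must be labeled correctly.

· Only then can we build a predictive model, because we have to tell the algorithm what is "correct" during training (this is the "supervision").

· Regression is the task of modeling a continuous target variable.

· Classification is the task of modeling a categorical target variable.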
(Figure: logistic regression)
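To make the regression/classification distinction concrete, here is a small sketch that applies the same nearest-neighbour idea to both tasks. The only difference is the target: a number in one case, a category in the other. The data, the "short"/"tall" labels, and the function names are invented for illustration.

```python
from collections import Counter

def nearest(train, x, k=3):
    # The k labeled points whose feature value is closest to x.
    return sorted(train, key=lambda pair: abs(pair[0] - x))[:k]

def knn_regress(train, x, k=3):
    # Regression: continuous target, so average the neighbours' values.
    neighbours = nearest(train, x, k)
    return sum(y for _, y in neighbours) / k

def knn_classify(train, x, k=3):
    # Classification: categorical target, so take a majority vote.
    neighbours = nearest(train, x, k)
    return Counter(y for _, y in neighbours).most_common(1)[0][0]

# Made-up labeled data: weight_kg -> height_cm (continuous target)
height_data = [(22, 118), (25, 122), (30, 130), (35, 138), (40, 145)]
# Same feature, but the target is a category.
label_data = [(22, "short"), (25, "short"), (30, "short"), (35, "tall"), (40, "tall")]

print(knn_regress(height_data, 33))   # a number
print(knn_classify(label_data, 33))   # a category
```

In both cases every training point carries a label, which is what makes the task supervised.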

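Unsupervised learning

Unsupervised learning covers tasks on "unlabeled" data (in other words, there is no target variable).

· In practice, it is often used as a form of automated data analysis or automated signal extraction.

· Unlabeled data has no predetermined "correct answer".

· The algorithm is allowed to learn patterns directly from the data (that is, without "supervision").

· Clustering is the most common unsupervised learning task; it finds groups within the data.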

(Figure: clustering)
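As one possible illustration of clustering, the sketch below runs a bare-bones one-dimensional k-means. k-means is just one clustering algorithm among many; the data, the starting centers, and the function name `kmeans_1d` are assumptions for this example.

```python
def kmeans_1d(values, centers, iterations=10):
    # Repeat: assign each value to its closest center, then move each
    # center to the mean of the values assigned to it.
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            idx = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[idx].append(v)
        # Keep the old center if a group ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Unlabeled toy data: no target variable, just measurements.
values = [1, 11, 2, 12, 3, 10]
centers, groups = kmeans_1d(values, centers=[0, 15])
print(centers)  # [2.0, 11.0]
print(groups)   # [[1, 2, 3], [11, 12, 10]]
```

Nothing told the algorithm which values belong together; the two groups come out of the data alone.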

The three elements of machine learning

How to consistently build effective models that deliver the best results.

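#1: A skilled chef (human guidance)

First, even though we are "teaching computers to learn on their own", human guidance plays a large role in the process.

As we have seen, countless decisions have to be made along the way.

In fact, the very first major decision is how to plan the project so that it is set up for success.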

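#2: Fresh ingredients (clean and relevant data)

The second essential element is the quality of the data.

Whatever algorithm we use, garbage in = garbage out.

Professional data scientists spend most of their time understanding the data, cleaning it, and engineering new features.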

#3: Don't overcook it (avoid overfitting)

One of the most dangerous pitfalls in machine learning is overfitting. An overfit model has "memorized" the noise in the training set instead of learning the true underlying patterns.

· An overfit model at a hedge fund can cost millions of dollars.
· An overfit model in a hospital could cost thousands of lives.

In most applications, overfitting is a mistake that must be avoided.
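A small sketch of what "memorizing the noise" looks like in practice. One model simply looks up the closest training point (so it reproduces the training set perfectly, noise included); the other fits a straight line. The data are made up: the underlying relation is assumed to be y = 2x, with a little noise added to the training targets. All names here are hypothetical.

```python
def fit_memorizer(xs, ys):
    # Returns the target of the closest training point: zero training error,
    # but every bit of noise in the training set is reproduced.
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def fit_line(xs, ys):
    # Ordinary least-squares line: cannot follow the noise point by point.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return lambda x: slope * x + (my - slope * mx)

def mean_abs_error(model, xs, ys):
    return sum(abs(model(x) - y) for x, y in zip(xs, ys)) / len(xs)

# Training targets: y = 2x plus noise. Test targets: y = 2x exactly.
train_x, train_y = [1, 2, 3, 4, 5, 6], [2.5, 3.5, 6.6, 7.4, 10.5, 11.5]
test_x = [1.4, 2.6, 3.4, 4.6, 5.4]
test_y = [2 * x for x in test_x]

memorizer = fit_memorizer(train_x, train_y)
line = fit_line(train_x, train_y)

for name, model in [("memorizer", memorizer), ("line", line)]:
    train_err = mean_abs_error(model, train_x, train_y)
    test_err = mean_abs_error(model, test_x, test_y)
    print(f"{name}: train error {train_err:.2f}, test error {test_err:.2f}")
```

The memorizer looks perfect on the training set and noticeably worse on new data, while the line does slightly worse on training data and much better on the test set. That gap between training and test error is the usual warning sign of overfitting, and it is why the test set is kept separate.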


Origin blog.51cto.com/13945147/2440246