Getting Started with PaddlePaddle, Part 1

(Understanding deep learning in my own words. Note: the material below is drawn largely from Baidu's PaddlePaddle training course.)


1) The concepts of artificial intelligence, machine learning, and deep learning, and how they relate

Artificial intelligence, machine learning, and deep learning have been very hot concepts in recent years, yet many practitioners struggle to explain how they relate to one another, and to outsiders the whole thing is smoke and mirrors. To learn deep learning, we need to start by getting these three concepts straight from the ground up.

The three cover progressively narrower technical territory. Artificial intelligence is the broadest concept; machine learning is one way to achieve artificial intelligence, and currently a comparatively effective one. Deep learning is the hottest branch of machine learning: it has made significant progress in recent years and has replaced most traditional machine learning algorithms. The relationship among the three can therefore be pictured as: artificial intelligence > machine learning > deep learning.

Figure 1: The conceptual scope of artificial intelligence, machine learning, and deep learning

 

Artificial intelligence, as the name suggests, is the science of researching and developing theories, methods, techniques, and applications for simulating, extending, and expanding human intelligence. Because this definition only states a goal without restricting the means, many methods and branches of AI coexist, which makes it something of a "catch-all" discipline.

By contrast, machine learning, and supervised learning in particular, refers to something more specific. Machine learning studies how computers can simulate or implement human learning behavior in order to acquire new knowledge or skills and to reorganize existing knowledge structures so as to keep improving their own performance. That sentence is a bit of a mouthful, so let us put it in plain language.

Below we use the case of "a machine learning Newton's second law from experiments" to give readers a deeper understanding of what kind of technique machine learning (supervised learning) actually is. Machine learning proceeds in much the same way as human scientific research. A common statement of Newton's second law is: the acceleration of an object is directly proportional to the force acting on it and inversely proportional to the object's mass (that is, proportional to the reciprocal of the mass). Isaac Newton put the law forward in 1687 in his book "Mathematical Principles of Natural Philosophy". Together with the first and third laws, it forms Newton's laws of motion, which explain the basic laws of motion in classical mechanics.

Secondary school textbooks describe two ways of designing an experiment for Newton's second law: the inclined sliding method and the horizontal string-pulling method.

Figure 2: Two experimental methods for Newton's second law

 

I believe many readers have fond memories of doing this experiment with pulleys and small wooden blocks. Over many trials, the acceleration of the block under different forces is recorded, as in the table below.

 

Table 1: A large number of data samples and observations obtained from the experiments

Trial    Force X    Acceleration Y
1st      4          2
2nd      5          2.5
...      ...        ...
N-th     6          3

 

Observing the experimental data above, it is easy to guess that the acceleration of an object is linearly related to the force acting on it. We therefore hypothesize a = w * F, where a is the acceleration, F is the force, and w is the parameter to be determined.

Training on a large amount of experimental data determines that the parameter w is the reciprocal of the object's mass (1 / m), which gives the complete model a = F * (1 / m). Once the force acting on an object is known, the model readily predicts the object's acceleration. For example, if a rocket's engine thrust is F = 10 and the rocket's mass is m = 2, the rocket's acceleration is a = 5.
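As a quick check of the arithmetic, the sketch below writes the model a = F * (1 / m) as a small Python function. The function name `acceleration` is made up for illustration, and the mass m = 2 for the wooden block is an assumption inferred from the ratio of acceleration to force in the three rows shown in Table 1.

```python
# The three rows shown in Table 1: (force X, acceleration Y).
table_1 = [(4.0, 2.0), (5.0, 2.5), (6.0, 3.0)]


def acceleration(force, mass):
    """Newton's second law as a model: a = w * F with w = 1 / m."""
    w = 1.0 / mass
    return w * force


# Assumption: the block has mass m = 2 (so w = 0.5), which is what the
# table rows imply. With that value the model reproduces every row.
for force, observed in table_1:
    assert abs(acceleration(force, mass=2.0) - observed) < 1e-12

# The rocket example from the text: F = 10, m = 2.
rocket_a = acceleration(force=10.0, mass=2.0)
print(rocket_a)  # 5.0
```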

This simple case demonstrates the basic machine learning process, but one key point remains unclear: how is the model parameter (w = 1 / m) actually determined?

Determining parameters by learning resembles the way scientists propose hypotheses: a reasonable hypothesis must at least explain all existing observations. If future observations do not fit the hypothesis, people will try to propose a new one. In the history of astronomy, computing the orbits of celestial bodies with combinations of large and small circles fit the observational data available in the Middle Ages. But as European mechanical industry advanced, astronomical instruments improved, and more and more observations could not be explained by the existing theory. This prompted the hypothesis that celestial orbits should be computed with ellipses.

The basic requirement for an effective model is therefore that it can fit the known samples, and this gives us a way to implement learning. Let the model be H, a function of the parameters θ and the input X, written H(θ, X). The optimization goal is for H(θ, X) to agree as closely as possible with the true output Y. The degree of difference between the two serves as the evaluation function of the model (the smaller the difference, the better). Learning the parameters then means reducing the evaluation function (the gap between H(θ, X) and Y) on the known samples until we find the parameter values θ that minimize it. This evaluation function, which measures the gap between the model's predictions and the true values, is also called the loss function (Loss). The parameter optimization process is expressed by the formula in the figure below.

Figure 3: The method of learning to determine the parameters
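To make the idea of an evaluation (loss) function concrete, here is a minimal sketch that scores several candidate values of θ against the three rows shown in Table 1. The text does not say which loss is used, so mean squared error is an assumption here, and the names `hypothesis`, `mse_loss`, and `candidates` are illustrative only.

```python
# The three rows shown in Table 1: (force X, acceleration Y).
samples = [(4.0, 2.0), (5.0, 2.5), (6.0, 3.0)]


def hypothesis(theta, x):
    """H(theta, X): the linear hypothesis a = w * F, with theta playing the role of w."""
    return theta * x


def mse_loss(theta, data):
    """Mean squared difference between H(theta, X) and Y over the samples.

    Assumption: squared error is just one common choice of loss.
    """
    return sum((hypothesis(theta, x) - y) ** 2 for x, y in data) / len(data)


# Score a handful of hand-picked candidate parameter values.
candidates = [0.3, 0.4, 0.5, 0.6, 0.7]
losses = {theta: mse_loss(theta, samples) for theta in candidates}
best_theta = min(losses, key=losses.get)

for theta in candidates:
    print(theta, round(losses[theta], 4))
print("best:", best_theta)
```

Among these candidates the loss is smallest at θ = 0.5, where the hypothesis reproduces every sample exactly. That value corresponds to 1 / m for a block of mass 2.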

 

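By analogy, a machine is like a mechanical student: it can only learn knowledge (the model parameter w) by trying to answer correctly (minimizing the loss) a large number of exercises (the known samples), in the hope that the learned knowledge w completes the model H(θ, X) so that it can answer exam questions whose answers it has not seen (unknown samples). Minimizing the loss is the model's optimization objective. The method used to minimize the loss is called the optimization algorithm, also known as the solver (it finds the parameter values that minimize the loss function). The basic structure of the formula formed by the parameters θ and the input X is called the hypothesis. In the Newton's second law case, based on observing the data, we proposed a linear hypothesis: force and acceleration are linearly related and can be expressed by a linear equation. The model hypothesis, the evaluation function (loss / optimization objective), and the optimization algorithm are thus the three components that make up a model.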
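The three components can be sketched in a few lines of Python. The text does not name a specific optimization algorithm at this point, so plain gradient descent is used here as one possible choice. The learning rate, step count, and function names are assumptions made for illustration.

```python
# Known samples: the three rows shown in Table 1, as (force X, acceleration Y).
samples = [(4.0, 2.0), (5.0, 2.5), (6.0, 3.0)]


def loss(w, data):
    """Evaluation function: mean squared error of the linear hypothesis a = w * F."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def loss_gradient(w, data):
    """Derivative of the loss with respect to w: mean of 2 * x * (w * x - y)."""
    return sum(2.0 * x * (w * x - y) for x, y in data) / len(data)


def gradient_descent(data, w=0.0, learning_rate=0.01, steps=200):
    """Optimization algorithm: repeatedly step against the gradient of the loss."""
    for _ in range(steps):
        w -= learning_rate * loss_gradient(w, data)
    return w


w = gradient_descent(samples)
print(round(w, 6), round(1.0 / w, 6))  # learned w and the implied mass 1 / w
```

Starting from w = 0, the updates settle at w ≈ 0.5, which corresponds to a block of mass 2. A larger learning rate could make the updates diverge on this data, so the value 0.01 is a choice that happens to work for these samples, not a general recommendation.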

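Machine learning theory matured in the 1990s and achieved practical results in many fields. But after a quiet period lasting until around 2010, the sudden rise of deep learning models dramatically changed the landscape of machine learning applications. Today, most machine learning tasks can be solved with deep learning models. In speech, computer vision, natural language processing, and other fields, deep learning models perform significantly better than traditional machine learning algorithms.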

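So how does deep learning improve on the algorithmic structure of machine learning? In fact the two share the same theoretical structure: both have a model hypothesis, an evaluation function, and an optimization algorithm. The fundamental difference lies in the complexity of the hypothesis.

Figure 4: The complexity of the function that maps raw image pixels to the high-level semantic concept "beautiful woman" is hard to imagine!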

 

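As the figure above shows, not every task is as simple and intuitive as Newton's second law. Given a picture, the human brain receives optical signals of many colors, while a computer receives a matrix of numbers. The brain recognizes almost instantly that the picture shows a beautiful woman, and even that she is the type programmers like. That result is a very high-level semantic concept, and it is hard to imagine how complex the transformations of information must be on the way from pixels to such a concept! The transformation is too complex to write down as a mathematical formula, so researchers borrowed from the structure of neurons in the human brain and designed the neural network model.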

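An artificial neural network consists of multiple layers (convolutional layers, fully connected layers, LSTM layers, and so on), and each layer contains many neurons. Any nonlinear neural network with more than three layers can be called a deep neural network. Put simply, a deep learning model can be viewed as a mapping function from input to output, for example a mapping from Chinese to English. A sufficiently deep neural network can in theory fit any complex function, so neural networks are well suited to learning the underlying patterns and levels of representation in sample data, and they apply well to text, image, and audio tasks. Because tasks in these areas are the basic building blocks of artificial intelligence, it is no surprise that deep learning is called the foundation for achieving artificial intelligence.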
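To illustrate "a network as a mapping function built from layers of neurons", here is a minimal sketch of a tiny network with one nonlinear hidden layer, far shallower than what the text calls deep. The weights are hand-picked for illustration rather than learned, and they are chosen so that the network computes XOR (exclusive or), a function that no single linear layer can represent. All names are hypothetical.

```python
def relu(values):
    """Nonlinear activation applied to each neuron's output."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output neuron."""
    return [
        sum(w * x for w, x in zip(neuron_weights, inputs)) + b
        for neuron_weights, b in zip(weights, biases)
    ]


# Hand-picked parameters (illustrative, not learned).
HIDDEN_W = [[1.0, 1.0], [1.0, 1.0]]  # two hidden neurons, two inputs each
HIDDEN_B = [0.0, -1.0]
OUTPUT_W = [[1.0, -2.0]]             # one output neuron, two hidden inputs
OUTPUT_B = [0.0]


def xor_network(x1, x2):
    """Map the input (x1, x2) to an output through two stacked layers."""
    hidden = relu(dense([x1, x2], HIDDEN_W, HIDDEN_B))
    return dense(hidden, OUTPUT_W, OUTPUT_B)[0]


for a in (0.0, 1.0):
    for b in (0.0, 1.0):
        print(a, b, xor_network(a, b))
```

Removing the `relu` call would collapse the two layers into a single linear map, which is why the nonlinearity between layers matters.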

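Machine learning is carried out in two steps: training and prediction. These two technical terms resemble induction and deduction. Induction abstracts a general rule from specific cases, and so does "training" in machine learning: from a certain number of samples (with known model input X and model output Y), it learns the relationship between the output Y and the input X (which can be thought of as some kind of expression). Deduction derives the outcome of a specific case from a general rule, and so does prediction in machine learning: based on the relationship between Y and X obtained in training, it computes the output Y for a newly encountered input X. If the outputs computed by the model agree, most of the time, with the outputs in the real scenario, the model is effective.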


2) The history and development of deep learning

 

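So how exactly is a neural network designed? No rush: the next chapter uses a "house price prediction" case to demonstrate the details of implementing a neural network model in Python. Before getting into implementation details, let us review deep learning's long history and its vigorous development today.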

 

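The idea of the neural network was first proposed some 75 years ago, and the design theory behind today's neural networks and deep learning was refined step by step. Over this long history there were shining moments of key breakthroughs. There was the golden age of the 1960s, once the basic network structure had been worked out, and there was the dark age after the XOR problem was raised in 1969 (people were surprised to find that neural network models could not solve even the simple XOR problem), when neural networks were shelved. Although newly proposed multilayer neural networks solved the XOR problem in 1986, neural networks still received little attention, because machine learning models such as SVM, with more complete theory and better practical results, rose to prominence from the 1990s onward. The real rise came around 2010, when techniques based on improved neural network models shone in speech and computer vision tasks and were gradually shown to be effective on more tasks (natural language processing and tasks involving massive data). With that, neural network models gained a new lease on life, along with a more resounding name: deep learning.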

 

Figure 5: Deep learning has a long history, but only gradually matured after 2010

 

Why did neural networks only come back to life after 2010? The answer lies in the prerequisites on which deep learning's success depends.

 

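  1. Big data is the precondition for its effectiveness. Neural networks and deep learning are very powerful models, but they need training data of sufficient scale. Even today, many traditional machine learning algorithms and hand-crafted features remain perfectly adequate solutions, because in many scenarios there is not enough labeled data to support a model as powerful as deep learning. Deep learning's ability calls to mind the famous boast attributed to Archimedes: "Give me a lever long enough, and I can move the Earth!" It could make a similar boast: "Give me enough data, and I can learn any complex relationship." In reality, though, a long enough lever, like enough data, is often just a beautiful vision. Only in recent years, as industries have become more IT-driven and accumulated data has grown explosively, has applying deep learning models become feasible.
  2. It relies on advances in hardware and on algorithmic optimization. Today, thanks to more powerful computers, GPUs, autoencoder pretraining, parallel computing, and similar techniques, the difficulties of training deep networks are gradually being overcome. Of these factors, data volume and hardware matter most: without them, scientists would have had no way even to begin improving the algorithms.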

 

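As early as 1998, some scientists were already using neural network models to recognize images of handwritten characters. But the rise of deep learning in computer vision came with the 2012 ImageNet competition, where AlexNet was used for image classification. Comparing the 1998 and 2012 models shows that the two network structures are very similar, with optimizations only in some details. Over those fourteen years, the huge increase in computing performance and the explosive growth in data volume drove the leap from "simple character recognition" to "complex image classification".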

 

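Despite its long history, deep learning is still developing vigorously today: basic research is advancing quickly, and industrial applications keep emerging.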

 

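As the figure below shows, according to statistics from ICLR (International Conference on Learning Representations), a top conference on deep learning, the number of deep-learning-related papers has been increasing year by year. And it is not only deep learning conferences: a large share of the papers at ICML and KDD, which cover data and modeling techniques, at CVPR, which focuses on vision, and at EMNLP, which focuses on natural language processing, also involve deep learning. Research in this field and adjacent fields is still on the rise, and the technology continues to see innovation and breakthroughs.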

 

Figure 6: The number of deep-learning-related papers is rising year by year

 

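On the other hand, AI technology built on deep learning is upgrading and transforming many traditional industries, and its range of application scenarios is extremely broad. The figure below is taken from a research report by iResearch (艾瑞咨询): AI technology can not only be deployed across many industries (breadth), but in some industries, such as security, it has already been commercialized and is growing rapidly (depth).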

 

Figure 7: AI technology built on deep learning is widely applied across industries and generates enormous economic value

 

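Beyond its wide applicability, deep learning is also pushing artificial intelligence into a stage of industrial mass production: the generality of the algorithms has given rise to standardized, automated, and modular frameworks. Previously, different schools of machine learning algorithms differed in both theory and implementation, so each algorithm had to be implemented independently, for example random forests and support vector machines (SVM). Under deep learning, however, many algorithm structures have a great deal in common. For example, convolutional neural networks (CNN), commonly used in computer vision, and long short-term memory models (LSTM), commonly used in natural language processing, can both be divided into a network-building module, a gradient-descent optimization module, a prediction module, and so on. This makes it possible to abstract a unified framework and greatly reduces the cost of writing modeling code. Relatively general modules, such as implementations of the basic network operators and the various optimization algorithms, can all be provided by the framework. The modeler only needs to focus on data processing, on configuring how the network is built, and on tying the training and prediction flows together with a small amount of code.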
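The modular split described above can be sketched in plain Python. This is not PaddlePaddle's API: the class and function names (`LinearNetwork`, `SGD`, `train`, `predict`) are hypothetical stand-ins for the kind of modules a framework would supply, and the data is the three rows shown in Table 1.

```python
class LinearNetwork:
    """Network-building module: holds the parameter and defines the forward pass."""

    def __init__(self):
        self.w = 0.0

    def forward(self, x):
        return self.w * x

    def gradient(self, x, y):
        # Gradient of the squared error (w * x - y) ** 2 with respect to w.
        return 2.0 * x * (self.forward(x) - y)


class SGD:
    """Optimization module: one gradient-descent update per sample."""

    def __init__(self, learning_rate):
        self.learning_rate = learning_rate

    def step(self, network, grad):
        network.w -= self.learning_rate * grad


def train(network, optimizer, data, epochs):
    """Training flow: the small amount of glue code left to the modeler."""
    for _ in range(epochs):
        for x, y in data:
            optimizer.step(network, network.gradient(x, y))


def predict(network, x):
    """Prediction module: apply the trained network to a new input."""
    return network.forward(x)


data = [(4.0, 2.0), (5.0, 2.5), (6.0, 3.0)]  # (force X, acceleration Y)
net = LinearNetwork()
train(net, SGD(learning_rate=0.01), data, epochs=100)
print(round(predict(net, 10.0), 6))  # acceleration predicted for F = 10
```

Under this split, swapping in a different network or optimizer would leave `train` and `predict` unchanged, which is the kind of reuse the text attributes to deep learning frameworks.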

 

Figure 8: Deep learning models are general-purpose and can be standardized, automated, and modularized

 

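Before deep learning frameworks appeared, machine learning engineers lived in an era of handicraft workshops. To build a model, an engineer needed a large store of mathematical knowledge and had to accumulate extensive industry knowledge for feature engineering. Every model was highly individual, and modelers, like craftsmen, turned their accumulated experience into a "personal signature" on the model. Today, "deep learning engineers" have entered the era of industrial mass production. With a small but necessary amount of deep learning theory and a command of Python programming, one can implement highly effective models in a deep learning framework, even models on par with the leading implementations in the field. Modeling, a field long dominated by "senior scientists", is facing disruption, and that is an opportunity for newcomers.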

 

Figure 9: Deep learning engineers work in an era of industrial mass production, and the advantage that "senior scientists" built up over many years is no longer secure

 

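Everyone's life is precious, and we often say that limited time should be spent on things of value. So why learn deep learning, and why learn it through this book? On the one hand, deep learning has broad application prospects and is an excellent direction for development and career choice. On the other hand, this book uses the Chinese-developed deep learning framework PaddlePaddle (飞桨) to write its practical cases, and programming on top of a framework makes deep learning easy to learn and easy to use.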

 


Origin blog.csdn.net/zaihuan_yu/article/details/104105754