Programmers, AI is here, the opportunity is here, and the crisis is here

This article is reprinted from http://blog.csdn.net/myhaspl/article/details/74928747?ref=myread



1. Artificial intelligence is really here


        Throughout computing history, few technologies enjoy a long life; most are short-lived, for example DOS, Windows 3.2, FoxPro, Delphi, and 80x86 assembly. Many others struggle on, such as VB, PB, and SQL Server. Even Microsoft's .NET was pushed into a corner by powerful open-source forces from around the world, so that it had to go open source and extend an olive branch to Linux in order to keep growing.

        In the late autumn of 2013, a crisp clear day, big data in China had already begun to stir: Hadoop had become popular as a platform for offline big-data computing, and machine learning was beginning to take root in China, though only inside large companies and universities; artificial intelligence had not yet spread across the country. At the invitation of China Machine Press, I wrote "A Practical Guide to Machine Learning" (published in 2014 and currently in its second edition). After writing that book I had a strong hunch: artificial intelligence and machine learning would definitely become popular, and would become the longest-lived technology in the history of computing, possibly lasting hundreds or even thousands of years; even if humans migrate to other planets, artificial intelligence will still prosper. Yet I never imagined that, less than three years after the book was published, AI would already be blossoming everywhere in China; its development speed is beyond my imagination. Andrew Ng's public resignation letter speaks to the current state of AI development in China.

       "Just as electricity transformed many industries a hundred years ago, artificial intelligence is now changing major industries such as healthcare, transportation, entertainment, and manufacturing, enriching the lives of countless people. As for where AI will lead us, I am more excited and hopeful than ever. I am honored to have learned from the AI communities of China and the United States, the two AI powerhouses. The United States excels at creating new technologies and ideas, while China excels at using AI technology to build good products. I am very happy to have had the opportunity to work for and contribute to the rise of AI in both China and the United States."

                                                                                                                 -- Andrew Ng, former Chief Scientist of Baidu

      

         Note the message in the letter above: China is an AI powerhouse. For programmers, the opportunity has arrived to contribute to the development of China's AI technology and, at the same time, to open up a new career.


     Machine learning has been widely applied and developed abroad. In November 2015, Google open-sourced its new TensorFlow machine learning system, which is faster, smarter, and more flexible than its predecessor. In January 2015, the machine learning platform GraphLab renamed itself Dato and raised $18.5 million in new financing (from Vulcan Capital, Opus Capital, New Enterprise Associates, and Madrona Venture Group), on top of $6.8 million raised earlier. In August 2015, Facebook launched "M"; Facebook believes that humans will not only answer the questions AI cannot, but will also help improve AI technology in the long run. Beyond basic functions such as answering questions and looking up information, "M" can help users purchase goods, find restaurants, and arrange travel plans. At the Neural Information Processing Systems (NIPS) 2015 conference in December 2015, Microsoft researchers and engineers published more than 20 papers on new machine learning results. Microsoft also announced that machine learning is becoming part of Windows 10: Skype Translator translates spoken language into other languages in near real time, much like the Universal Translator in Star Trek, enabling face-to-face communication; the Cortana personal digital assistant continuously learns and improves as it interacts with users, helping them manage calendars, track deliveries, and even chat and tell jokes for a truly personalized experience; and Clutter, part of Microsoft Office 2016, keeps inboxes tidy by learning which emails matter to the user and automatically routing unimportant ones to a separate folder. In September 2015, Major General Steve Jones, commander of the U.S. Army Medical Center, said at an Army meeting that in the future intelligent robots could replace humans in carrying the wounded off the battlefield; the rescuer may not be a human but a robot, because armies of intelligent robots will go out in place of human beings.


       At the end of this section, I also look ahead to the future of artificial intelligence:

In the near future, humans may well ask: in the world to come, what role will robots play, and will they replace humans? How should humans and intelligent machines get along?

Humans have begun to study how best to realize the following three laws (Asimov's Laws of Robotics):

First, a robot may not harm a human;

Second, a robot must obey human orders;

Third, a robot may protect itself, as long as doing so does not violate the above principles.


2. Artificial Intelligence and Big Data


The era of big data has arrived; that much is fact. But with the wide adoption of Hadoop, Spark, and Storm, big data technology has become an ordinary field of software development, the technical threshold keeps dropping, and more and more people are pouring into the industry. History shows that a field with no room for long-term technical development is bound to be short-lived. Will big data technology go the same way? The answer is no, and the reason is the participation of artificial intelligence: big data applications rest on data analysis and mining, combining machine learning algorithms with data access technology, where the access layer provides efficient reading and writing of data.


                                        


Note: two concepts are worth defining here:

1. Data mining is the non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in vast amounts of data.

2. Data analysis refers to using appropriate statistical methods to analyze collected first-hand and second-hand data, in order to put the data to maximum use and extract useful information. Whether data analysis or data mining, both help people collect and examine data, turn it into information, and support judgments.

3. Artificial Intelligence and Pattern Recognition

  Programmers: artificial intelligence is not just a programming technique, nor just data structures and algorithms. Artificial intelligence is a combination of programming, data structures, algorithms, and mathematics.

  Let's start with pattern recognition. It originated in engineering, while machine learning originated in computer science, and the combination of these two disciplines has driven the adjustment and development of the pattern recognition field. Pattern recognition research concentrates on two aspects: first, how organisms (including humans) perceive objects, which belongs to cognitive science; second, given a task, how to realize the theory and methods of pattern recognition on a computer, which is machine learning's strength and one of its research topics.

 Pattern recognition has broad applications, including computer vision, medical image analysis, optical character recognition, natural language processing, speech recognition, handwriting recognition, biometric recognition, document classification, and search engines, and these fields are precisely the stage on which machine learning shows its strength.


4. Mathematics Arrives! So Does the Crisis

      We have been talking about opportunity all along; now it is time to talk about the crisis. You could already sense it in Section 3: pattern recognition has arrived!
      Artificial intelligence and machine learning are wonderful. Programmers, march into the AI field! Ready? Really ready? Then let's continue...
      The entry barrier to AI and machine learning is high, and the demands on a researcher's mathematical understanding are especially high. You surely have objections: I know data structures and computer algorithms, I have developed software for many years, I am a systems architect, a certified project manager, thoroughly senior.
       But let me tell you: machine learning is a brand-new field and a brand-new level. Understanding a machine learning algorithm usually starts with understanding the mathematical formulas and knowledge it involves, and every AI practitioner must climb the mountain of mathematics to enter machine learning step by step, so laying a solid mathematical foundation is essential. Once you have mastered mathematical analysis, linear algebra, probability and statistics, discrete mathematics, abstract algebra, mathematical modeling, and related theory, machine learning algorithms become much easier to understand; you will no longer dread the tiresome symbols and formulas, and may even grow fond of them and try the derivations yourself.
      Mastering machine learning algorithms requires at least the following branches of basic mathematics.

1) Calculus

The birth of calculus was a major theoretical achievement following the establishment of Euclidean geometry; its creation and development have been called "one of the keys to the birth of modern technological civilization, introducing ideas of extraordinary success that proved decisive for much of later mathematics." Calculus is widely applied in science, economics, and engineering, solving problems that algebra alone cannot handle effectively. Built on algebra, trigonometry, and analytic geometry, it comprises the two branches of differential and integral calculus, covering continuity, limits, multivariable calculus, Gauss's theorem, and more.

Calculus finds ever wider use in astronomy, mechanics, chemistry, biology, engineering, economics, and computer science. In medicine, calculus can compute the optimal branching angle of blood vessels to maximize blood flow; in economics, it determines maximum profit by computing marginal cost and marginal revenue; it can find approximate solutions of equations; and by solving differential equations it supports applications such as a spacecraft using Euler's method to approximate its trajectory in a zero-gravity environment.
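As a small illustration of that last point, here is a hedged sketch of Euler's method approximating the solution of a differential equation. The ODE dy/dx = y with y(0) = 1, whose exact solution is e^x, and the step size are purely illustrative choices:

```python
import math

# Euler's method: approximate y(x) for dy/dx = f(x, y) by stepping
# y_{n+1} = y_n + h * f(x_n, y_n) from the initial condition.
def euler(f, x0, y0, h, steps):
    x, y = x0, y0
    for _ in range(steps):
        y += h * f(x, y)
        x += h
    return y

# dy/dx = y with y(0) = 1 has exact solution e^x; approximate y(1).
approx = euler(lambda x, y: y, x0=0.0, y0=1.0, h=0.001, steps=1000)
print(round(approx, 4), round(math.e, 4))
```

Shrinking the step size h drives the approximation closer to the exact value, which is the essence of the method.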

In machine learning and data analysis, calculus is the theoretical foundation of many algorithms, for example the multilayer perceptron (MLP) neural network algorithm. The MLP is a feed-forward artificial neural network model whose training algorithm has two phases: forward propagation of signals and backward propagation of errors.

The forward phase is where the network learns from a sample: the input is fed into the input layer, computed through each hidden layer, and passed to the output layer, where each unit's actual value is computed and the resulting error is apportioned among the units of each layer. In the backward phase, the error between the network output and the target output is used to revise the network: the output error is propagated back through the layers to adjust the weights, until the error is minimized or a prescribed number of iterations is reached. Calculus theory is used heavily throughout the MLP model.
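The two phases above can be sketched in a few dozen lines. This is a minimal illustration learning XOR; the layer sizes, learning rate, and epoch count are illustrative choices, not values from the article:

```python
import numpy as np

# A minimal two-layer perceptron trained by backpropagation, learning XOR.
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(0, 1, (2, 4)), np.zeros(4)   # input -> hidden
W2, b2 = rng.normal(0, 1, (4, 1)), np.zeros(1)   # hidden -> output

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X):
    h = sigmoid(X @ W1 + b1)          # forward phase: hidden activations
    return h, sigmoid(h @ W2 + b2)    # ... then the output layer

loss_before = float(((forward(X)[1] - y) ** 2).mean())

lr = 1.0
for epoch in range(5000):
    h, out = forward(X)
    # Backward phase: apportion the output error to each layer (chain rule).
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0)

loss_after = float(((forward(X)[1] - y) ** 2).mean())
print(round(loss_before, 4), round(loss_after, 4))
```

The derivative terms like `out * (1 - out)` are exactly where calculus enters: they are the derivative of the sigmoid, applied via the chain rule.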


2) Linear Algebra

Linear algebra is a mature foundational subject of higher mathematics. Its content is broad, covering not only elementary parts such as determinants, matrices, and systems of linear equations, but also deeper theory such as linear spaces, Euclidean spaces, unitary spaces, linear transformations and linear functionals, λ-matrices, and matrix eigenvalues. Linear algebra is also widely applied in mathematics, physics, the social sciences, and engineering.

Linear algebra underpins computing technology and occupies an important place in machine learning, data analysis, and mathematical modeling, fields that routinely apply the theory of linear systems, matrices, and determinants, with the computation carried out by computer.
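For example, fitting a straight line y = wx + b to data reduces to solving a small linear system, the least-squares normal equations. The data points below are made up for illustration:

```python
import numpy as np

# Least-squares line fit via linear algebra: solve (A^T A) p = A^T y.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])       # roughly y = 2x + 1

A = np.column_stack([x, np.ones_like(x)])      # design matrix [x, 1]
w, b = np.linalg.solve(A.T @ A, A.T @ y)       # normal equations
print(round(w, 2), round(b, 2))
```

The same pattern of forming a matrix equation and solving it underlies far larger models, such as linear regression on millions of rows.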

         

3) Probability Theory

Probability theory is the branch of mathematics studying random or uncertain phenomena, used to model experiments that produce different outcomes under identical conditions. The following concepts form the foundations of probability theory.

 (1) Classical probability

 (2) Conditional probability

 (3) The probability axioms

Axiom 1: 0 ≤ P(A) ≤ 1 for any event A in the sample space S

Axiom 2: P(S) = 1

Axiom 3: P(A∪B) = P(A) + P(B) whenever A∩B = ∅

(4) Probability distributions

These include the binomial, geometric, Bernoulli, Poisson, uniform, normal, and exponential distributions. The probability distribution of a random variable over a sample space can be analyzed through its cumulative distribution function and probability density function.
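As a quick check of theory against simulation, the binomial distribution just mentioned can be sampled directly; the parameters (n = 10 trials, p = 0.3, 20000 experiments) are illustrative:

```python
import random

# Empirical binomial distribution: count successes in n Bernoulli trials,
# repeated many times; the sample mean should approach n * p = 3.
random.seed(42)

n, p, trials = 10, 0.3, 20000
samples = [sum(1 for _ in range(n) if random.random() < p) for _ in range(trials)]
empirical_mean = sum(samples) / trials
print(round(empirical_mean, 2))  # close to the theoretical mean 3.0
```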

Probability theory plays a pivotal role in machine learning and data analysis, for example through the theory of Markov chains. Markov chains explain many real-world phenomena: the Poisson process is a continuous-time, discrete-state Markov chain, and Brownian motion is a continuous-time, continuous-state one.

Markov chains have important applications in computational mathematics, finance and economics, machine learning, and data analysis. A Markov chain is a discrete-time stochastic process with the Markov property: given the current state or knowledge, only the present is used to predict the future. At each step, the system moves from one state to another, or stays where it is, according to a probability distribution.
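The Markov property can be seen in a tiny example. The two weather states and transition probabilities below are made up for illustration; repeatedly applying the transition matrix drives the state distribution to its stationary value:

```python
import numpy as np

# A two-state Markov chain: row i gives P(next state | current state i).
P = np.array([[0.9, 0.1],    # sunny -> sunny 0.9, sunny -> rainy 0.1
              [0.5, 0.5]])   # rainy -> sunny 0.5, rainy -> rainy 0.5

# The long-run (stationary) distribution pi satisfies pi = pi @ P.
pi = np.array([1.0, 0.0])    # start certain it is sunny
for _ in range(100):
    pi = pi @ P              # one step of the chain
print(np.round(pi, 4))
```

For this matrix the stationary distribution works out to (5/6, 1/6): in the long run the starting state is forgotten, which is exactly the "only the present predicts the future" property.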

4) Statistics

Statistics is the science of collecting, analyzing, presenting, and interpreting data. As an effective tool of data analysis, statistical methods are widely used across the social and natural sciences. Statistics is closely tied to probability theory, which provides its theoretical foundation. It divides mainly into descriptive statistics and inferential statistics. Descriptive statistics portrays or summarizes the concentration and dispersion of observations, with basic measures such as the mean and standard deviation. Inferential statistics models the data, computes probabilities, and draws inferences about the population; it covers hypothesis testing, estimation of numerical characteristics, prediction of future observations, correlation, regression, analysis of variance, time series, data mining, and more.

Both descriptive and inferential statistics are foundations of data analysis. With descriptive methods, analysts can visualize data, turn summaries into charts, and study the features of the distribution, as well as examine how concentrated or dispersed the observations of each variable are. With inferential methods, they make probability-based statements about unknown characteristics of the data and, on the basis of random sampling, draw conclusions about the numerical characteristics of the population.
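The basic descriptive measures mentioned above come straight from Python's standard library; the sample values here are made up:

```python
import statistics

# Descriptive statistics: concentration (mean, median) and
# dispersion (sample standard deviation) of a small sample.
sample = [4.8, 5.1, 5.4, 4.9, 5.0, 5.2, 4.7, 5.3]

mean = statistics.mean(sample)
median = statistics.median(sample)
stdev = statistics.stdev(sample)   # uses the n - 1 denominator
print(round(mean, 3), round(median, 3), round(stdev, 3))
```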

5) Discrete Mathematics

Discrete mathematics is the collective name for several branches of mathematics that study structures over discrete rather than continuous spaces. Its scope is very broad, covering mathematical logic, set theory, information theory, number theory, combinatorics, graph theory, abstract algebra, theoretical computer science, topology, operations research, game theory, decision theory, and more.

Discrete mathematics is widely applied in machine learning, algorithm design, information security, and data analysis. For example, mathematical logic and set theory underpin expert systems, a class of intelligent programs with specialized knowledge and experience that use the knowledge representation and reasoning techniques of AI to tackle complex problems normally solved only by domain experts. Information theory, number theory, and abstract algebra serve information security; coding theory, closely related to information theory, is used to design efficient and reliable data transmission and storage; number theory is widely used in cryptography and cryptanalysis, and modern cryptographic algorithms such as DES and RSA (involving factorization, discrete logarithms, and primality testing) rest on number theory and abstract algebra. Operations research, game theory, and decision theory provide practical methods for many problems in economics, finance, and other data-analysis fields, including resource allocation, risk control, decision evaluation, and supply-demand analysis.
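As a taste of the number theory behind RSA, here is the textbook toy example, with tiny primes that are completely insecure and shown only to illustrate the modular arithmetic (the three-argument and modular-inverse forms of `pow` need Python 3.8+):

```python
# Textbook RSA with toy primes (insecure; for illustration only).
p, q = 61, 53
n = p * q                 # public modulus: 3233
phi = (p - 1) * (q - 1)   # Euler's totient: 3120
e = 17                    # public exponent, coprime with phi
d = pow(e, -1, phi)       # private exponent: modular inverse of e mod phi

message = 65
cipher = pow(message, e, n)   # encryption: m^e mod n
plain = pow(cipher, d, n)     # decryption: c^d mod n
print(cipher, plain)
```

Real RSA uses primes hundreds of digits long; the security rests precisely on the number-theoretic hardness of factoring n back into p and q.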

The above is the core mathematics machine learning needs, but not all of it. As human research into machine learning deepens, more branches of mathematics will enter the field. University-level mathematics alone is therefore not enough; you must push to a higher level. Readers who did not major in mathematics should also study further branches such as functional analysis, complex analysis, partial differential equations, abstract algebra, constrained optimization, fuzzy mathematics, and numerical computation.

I recommend buying the following mathematics books and keeping them at hand for reference.

Finney, Weir, Giordano. Thomas' Calculus. 10th ed. Translated by Ye Qixiao, Wang Yaodong, and Tang Jing. Beijing: Higher Education Press, 2003.

Steven J. Leon. Linear Algebra with Applications. 8th ed. Translated by Zhang Wenbo and Zhang Lijing. Beijing: China Machine Press.

William Mendenhall et al. Statistics. 5th ed. Translated by Liang Fengzhen and Guan Jing. Beijing: China Machine Press.

Dimitri P. Bertsekas et al. Introduction to Probability. 2nd ed. Translated by Zheng Zhongguo and Tong Xingwei. Beijing: Posts & Telecom Press.

Kenneth H. Rosen. Discrete Mathematics and Its Applications. 6th ed. Translated by Yuan Chongyi, Qu Wanling, and Zhang Guiyun. Beijing: China Machine Press.

Eberhard Zeidler et al. Oxford Users' Guide to Mathematics. Translated by Li Wenlin. Beijing: Science Press.

These are the classic mathematics books that machine learning draws on. Consider keeping them on your desk alongside classic computing books such as Design Patterns, Introduction to Algorithms, and Computer Systems: A Programmer's Perspective.

      And all of this is still only the tip of the mathematical iceberg that artificial intelligence requires. The crisis is indeed here; but programmers, for the sake of AI, give it your best and study mathematics well.

An MIT researcher once wrote on his blog that the mathematics never seems to be enough: to solve and study engineering problems, he had to go back to the library after starting work and pick up the mathematics textbooks again. He felt deeply that, although the mathematics he had learned in class and on his own from university through his career was by no means little, in machine learning he kept discovering new mathematics that needed filling in. Evidently, to master machine learning you must study mathematics again, again, and again, and all of it is hard. To learn machine learning well, be prepared for hard work and persist in the pursuit of mathematical knowledge.

       


5. The Programmer's Own Skills Take the Stage


       Since it is programmers marching into AI, programmers naturally bring their own advantages. The following are the technologies a programmer entering AI should learn (they need no explanation, so only the list is given):
        (1) Python
        (2) R
        (3) SQL
        (4) Linux shell and basic operations knowledge
        (5) Java (exciting, isn't it? Its shadow is everywhere; it makes this list thanks to Hadoop, Spark, and Storm)
        (6) Scala
        Next come the platforms and frameworks:
         1. Hadoop
         2. Spark
         3. Storm
         4. TensorFlow, PaddlePaddle, Caffe

6. The Core of AI

       Having read all of the above, you probably feel the pressure already. The pressure is about to get bigger.
       AI, artificial intelligence by its full name, requires programming, algorithms, data structures, and mathematics, and above all a machine learning way of thinking.
       The core of AI lies in three things:
       (1) Feature extraction. For example, to classify a pile of apples, the features might include weight, shape, growing period, and so on.
       (2) Model and algorithm selection, for example: SVMs, neural networks, decision trees, semantic trees, knowledge bases, various vision algorithms, and so on.
       (3) Construction of the machine learning network. Deep learning, for example, can extract features automatically, but its parameters and concrete network structure still have to be defined and trained.

7. Conclusion

      Let's relax for a moment. AI, a technology with great room for growth and a long development cycle, really is the best place for programmers to march into, but it is also very hard. Yes, admittedly, there are frameworks and ready-made libraries, and life as a parameter-tuner is comfortable; but if that is all you do, you certainly cannot go far in the AI field. You must understand the principles behind the algorithms and, when you have time, implement them yourself to deepen your understanding; otherwise you cannot even call an AI library with well-chosen parameters. Naive library calls will not do.
       Opportunity and crisis always coexist. I wish you all an early entrance through the door of machine learning and artificial intelligence.
       This article is offered so that we can learn and progress together; please forgive its shortcomings.

8. A Final Dessert

       Wasn't that the end? Hee hee, no, there is one more dessert: a development path for AI practitioners, so that you can enter the artificial intelligence industry with less pain.
       Step 0. Why call it step 0? Because this step only requires familiarity and hands-on experience, not mastery. And what is it? Being a big-data coder (the code farmer of the AI era; the technology in this area updates fast, it is the easiest route for an ordinary programmer to switch into AI, and also the easiest place to be eliminated by rapidly evolving technology):

       Learn Hadoop setup and basics, ZooKeeper, MapReduce, HBase, Hive, Flume, Kafka, Spark DataFrame, Spark SQL, Spark Streaming, and basic Scala.
       Why is it easy to be eliminated if you stay at step 0? Because Hadoop used to be the fashion, yet in under two years Spark became the fashion, and now TensorFlow and PaddlePaddle are the fashion.
       Then, just as you are struggling to learn Spark, Hadoop 3.0 comes out of beta, claiming its computing speed will exceed Spark's.
       The technology refresh cycle is now measured in single years.
       Step 1: learn data structures and algorithms, and call AI libraries from your programs.
       Step 2: use a big-data platform to access data, and call AI libraries.
       Step 3: learn to tune parameters, and call AI libraries.
       Step 4: study mathematics hard, and still call AI libraries.
       Step 5: learn the machine learning algorithms, and still call AI libraries.
       Step 6: call AI libraries, haha, but with a difference: now you know each algorithm's principles and key points, you understand which algorithm suits which data and which scenario, and you know how to tune the parameters so the library gives its best results.
       Step 7: dissect the AI library itself so as to use it better. The way to do this is to program the machine learning algorithms yourself, understand the library's internal implementation, and, when necessary, write AI algorithms directly to work alongside the existing library.
       Step 8: study mathematics at a deeper level, to meet the continuing development and challenges of AI technology.
       Step 9: ... once the robots arrive, study the Three Laws.
       
       Finally, let me share some experience on improving the accuracy of a machine learning algorithm:
         First, change the algorithm;

         Second, improve the feature extraction;

         Third, tune the parameters;

         Fourth, improve the sample data and post-process the output.
