Things you should know before learning machine learning

Introduction

Over the past few years, machine learning has attracted renewed interest. This revival seems to be driven by strong fundamentals: devices around the world are emitting large amounts of data, storing that data is very cheap, and the cost of computation is the lowest it has ever been!

However, not everyone understands what machine learning is. Here are a few examples of the questions people ask:

  • What is machine learning? How is it different from big data and business analytics?

  • What is the difference between machine learning, data analysis, data mining, data science and AI?

We recently posted an interesting (but very real) theme.

 

That said, given the amount of confusion on this topic, we decided to write an introductory article about machine learning. The idea is to strip away all the jargon that has scared people off in the past and create something a 5-year-old could understand (hmm... well, sorry, you may need to have finished high school)!

 

What is machine learning? A little experiment of mine...

To make sure I would not overestimate (or underestimate) the ability of the target audience, I found 10 people who were complete strangers to analytics. None of them had heard of machine learning before (yes, such people really exist!). Here is what they said:

  • I don't know, maybe learning from machines?

  • Making machines learn something, that is, programming the machine's software.

  • Learning with the help of a computer?

  • Online courses (!!!)

That was interesting! These answers show exactly what they thought machine learning was. Here is how I then explained the concept of machine learning to these people:

  • Machine learning refers to techniques for handling large amounts of data in the most intelligent way (by developing algorithms) in order to derive actionable insights.

They looked at me as if I were speaking Martian! So I stopped explaining with jargon and instead asked them questions they could relate to more easily:

  • KJ: When you search for something on Google, what do you think happens?

  • Group: Google shows the web pages related to the search.

  • KJ: Good! But what is it that lets Google show these relevant pages to you?

This time it looked like they needed to think a bit more. Then someone in the group spoke up:

  • Group: Google looks at users' past clicks to learn which pages are more relevant to those searches, and then serves those pages in the search results.

That was a good attempt. I had to control my urge to lecture them on how Google does this in a far more sophisticated way than this simple concept. But I thought I had found a better way to explain machine learning, so I went on:

  • KJ: OK, that sounds good. But how many searches, and what kinds of searches, would Google handle on a regular basis?

  • Group: It must be a really big number, perhaps a trillion searches a year.

  • KJ: So how do you think Google serves so many requests so accurately? Do you think there are people sitting in Google's offices, continually deciding which search results are relevant to which search?

  • Group: We haven't really thought about it, but no, that doesn't sound like something humans could handle.

  • KJ: You are right. This is where machine learning comes into play. Machine learning is a set of techniques for handling large amounts of data in the most intelligent way (by developing algorithms or sets of logical rules) to get actionable results (in our case, serving relevant search results to users).

This time the group nodded as expected. It looked like I had done it... yeah! But wait...

Some common questions: how is machine learning different from X?

The moment you start reading about machine learning, you will see all sorts of terms flying at you like rockets. These are terms widely used in the industry. Here are some of them: artificial intelligence, deep learning, data mining and statistics.

To give you a clearer understanding, I explain these terms below in simple language. You will also learn how each of them matters in machine learning:

What is artificial intelligence (AI)?

It refers to the practice of programming a computer (machine) so that it behaves rationally. Ah! What is rational? Rationality is the basis for making a decision.

I said "rational" rather than "intelligent" (as you might have expected) because we humans tend to make decisions that are highly rational and feasible rather than explicitly intelligent. This is because not every intelligent decision is rational and feasible (my hypothesis). Hence, the central motive behind using AI is to get the computer (machine) to behave in a polished way, rather than being guided by human folly!

AI may include programs that check whether certain parameters of a running program behave normally. For example, the machine may raise an alert if a parameter, say "X", crosses a certain threshold, and that in turn may affect the outcome of the related process.
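
The threshold check described above can be sketched in a few lines of Python. This is only an illustration: the function name `check_parameter`, the readings and the threshold of 1.0 are all made up for the example and do not come from any real system.

```python
def check_parameter(name, value, threshold):
    """Return an alert message if value exceeds threshold, otherwise None."""
    if value > threshold:
        return f"ALERT: {name}={value} exceeds threshold {threshold}"
    return None


# Hypothetical readings of parameter "X" taken while a program is running.
readings = [0.4, 0.7, 1.3, 0.9]

# Keep only the readings that triggered an alert.
alerts = [
    message
    for message in (check_parameter("X", r, threshold=1.0) for r in readings)
    if message is not None
]
print(alerts)
```

A rule like this is fixed by the programmer. The point of the sections that follow is that machine learning tries to learn such rules from data instead.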

Use of artificial intelligence in machine learning

Machine learning is a subset of artificial intelligence in which the machine is trained to learn from its past experience. That past experience is built up from the data collected. It is then combined with algorithms such as Naive Bayes and support vector machines (SVM) to deliver the final results.

What is statistics?

At this high level, I assume you already know about statistics. If not, here is a quick definition: statistics is the branch of mathematics that uses data, either from the entire population or from a sample drawn from the population, to carry out analysis and make inferences. Some of the statistical techniques used are regression, variance, standard deviation, conditional probability and so on.
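
As a small illustration of the population/sample distinction and of variance and standard deviation, here is a sketch using Python's standard `statistics` module. The eight numbers are invented for the example.

```python
import statistics

# Hypothetical measurements; treat them first as a whole population,
# then as a sample drawn from a larger population.
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(values)
population_variance = statistics.pvariance(values)  # divides by n
population_std = statistics.pstdev(values)          # square root of the above
sample_std = statistics.stdev(values)               # divides by n - 1

print(mean, population_variance, population_std, sample_std)
```

The sample standard deviation comes out slightly larger than the population one because it divides by n - 1 rather than n.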

Use of statistics in machine learning

To understand this, suppose I need to separate the mail in my inbox into two categories: "spam" and "important". To identify spam, I can use a machine learning algorithm called Naive Bayes, which checks the frequency of past spam mail in order to identify new mail as spam. Naive Bayes uses the statistical technique of Bayes' theorem (commonly known as conditional probability). Hence, we can say that machine learning algorithms use statistical concepts to carry out machine learning.
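
To make the spam example concrete, here is a minimal sketch of a word-count Naive Bayes classifier in plain Python. The six messages, the function names `train_naive_bayes` and `predict`, and the use of add-one smoothing are assumptions made for illustration; a real spam filter would be trained on far more mail.

```python
import math
from collections import Counter


def train_naive_bayes(messages):
    """messages: list of (text, label). Returns the counts needed to predict."""
    label_counts = Counter()
    word_counts = {}  # label -> Counter of the words seen under that label
    vocabulary = set()
    for text, label in messages:
        words = text.lower().split()
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(words)
        vocabulary.update(words)
    return label_counts, word_counts, vocabulary


def predict(text, model):
    """Pick the label with the highest log P(label) + sum of log P(word | label)."""
    label_counts, word_counts, vocabulary = model
    total_messages = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total_messages)  # prior from past frequency
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one smoothing so an unseen word does not zero out the score.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical past mail, already labelled.
past_mail = [
    ("win money now", "spam"),
    ("cheap money offer", "spam"),
    ("win a free prize now", "spam"),
    ("meeting agenda for monday", "important"),
    ("project report attached", "important"),
    ("lunch on monday", "important"),
]
model = train_naive_bayes(past_mail)
print(predict("free money now", model))
print(predict("monday project meeting", model))
```

The "naive" part is the assumption that words occur independently of each other given the label, which is what lets the per-word probabilities simply be multiplied (added, in log form).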

PS: The main difference between machine learning and statistical models comes from where they were born. Machine learning originated in computer science departments, while statistical modelling came from mathematics departments. In addition, any statistical model makes a number of assumptions about distributions, whereas machine learning algorithms are generally agnostic about the distribution of the attributes.

What is deep learning?

Deep learning is associated with a machine learning algorithm, the artificial neural network (ANN), which borrows the concept of the human brain to make it possible to model arbitrary functions. ANNs require a large amount of data, and the algorithm is highly flexible when it comes to modelling multiple outputs at the same time. Neural networks are a more complex topic, which we may cover in a completely separate article.
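
To give a feel for what "modelling a function with multiple outputs" means, here is a tiny hand-built network in plain Python. Note the big assumption: the weights below were picked by hand for the example, whereas a real ANN learns its weights from data. The network has two inputs, two hidden neurons and two outputs, and computes XOR and AND of its inputs in a single pass.

```python
def step(x):
    """Simplest possible activation: fire (1) if the input is positive."""
    return 1 if x > 0 else 0


def forward(x1, x2):
    """Forward pass through a 2-2-2 network with hand-picked weights."""
    # Hidden layer: one neuron behaves like OR, the other like AND.
    h_or = step(x1 + x2 - 0.5)
    h_and = step(x1 + x2 - 1.5)
    # Output layer: two outputs are produced at the same time.
    out_xor = step(h_or - h_and - 0.5)
    out_and = step(h_and - 0.5)
    return out_xor, out_and


for a in (0, 1):
    for b in (0, 1):
        print(a, b, forward(a, b))
```

XOR cannot be computed by a single neuron of this kind, which is why the hidden layer is needed. Stacking many such layers is where the "deep" in deep learning comes from.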

What is data mining?

During my early days as a data analyst, I always used to confuse the two terms: machine learning and data mining. But later I learned that data mining is about searching for specific information, while machine learning focuses on performing a given task. Let me share the example that helped me remember the difference: teaching someone how to dance is machine learning, and using someone to find the best dance centres in the city is data mining. Simple, isn't it?

But how exactly do we teach machines?

Teaching a machine involves a structured process in which every stage builds a better version of the machine. For simplicity, the process of teaching machines can be broken down into three parts:

 

I will cover each of these three steps in detail in subsequent articles. For now, you should understand that these three steps together ensure the machine learns what it needs to perform the given task, and that each step is equally important. The success of the machine depends on two factors:

  1. How well the abstraction of the data generalizes.

  2. How well the machine can put what it has learned to practical use in predicting the future.

What are the steps in machine learning?

There are five basic steps in performing a machine learning task:

  1. Data collection: whether the raw data comes from Excel, Access, text files, etc., this step (gathering past data) forms the foundation of future learning. The greater the variety, density and volume of relevant data, the better the learning prospects for the machine.

  2. Data preparation: any analytical process depends on the quality of the data used. You need to spend time determining the quality of the data and then take steps to fix issues such as missing data and the treatment of outliers. Exploratory analysis is one way to study the nuances of the data in detail, and so to improve the quality of the data quickly.

  3. Training a model: this step involves choosing the appropriate algorithm and a representation of the data in the form of a model. The cleaned data is split into two parts, training and test (the proportion depends on the prerequisites). The first part (training data) is used to develop the model. The second part (test data) is used as a reference.

  4. Evaluating the model: to test the accuracy, the second part of the data (holdout / test data) is used. This step determines how accurate the chosen algorithm is, based on the outcome. A better test of a model's accuracy is to look at its performance on data that was not used at all during model building.

  5. Improving the performance: this step may involve choosing a different model altogether or introducing more variables to improve efficiency. That is why a significant amount of time needs to be spent on data collection and preparation.

Whatever the model, these 5 steps can be used to structure the technique, and when we discuss the algorithms you will see how these five steps appear in every model!
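
The five steps above can be sketched end to end in plain Python. Everything here is an assumption made for illustration: the toy records, the 70/30 split, and the very simple "model" (a single threshold placed midway between the two class averages). It is meant to show the shape of the workflow, not a recommended algorithm.

```python
import random

# 1. Data collection: hypothetical (measurement, label) records.
#    None marks a missing measurement.
raw = [(1.0, 0), (1.5, 0), (2.0, 0), (None, 0), (2.5, 0), (3.0, 0),
       (7.0, 1), (7.5, 1), (8.0, 1), (8.5, 1), (None, 1), (9.0, 1)]

# 2. Data preparation: here we simply drop rows with missing values.
clean = [(x, y) for x, y in raw if x is not None]

# 3. Training: shuffle with a fixed seed, split 70/30, then fit the model.
rng = random.Random(42)
rng.shuffle(clean)
split = len(clean) * 7 // 10
train, test = clean[:split], clean[split:]


def fit_threshold(rows):
    """Place the decision threshold midway between the two class means."""
    zeros = [x for x, y in rows if y == 0]
    ones = [x for x, y in rows if y == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2


threshold = fit_threshold(train)

# 4. Evaluation: accuracy on the held-out test rows only.
predictions = [1 if x > threshold else 0 for x, _ in test]
accuracy = sum(p == y for p, (_, y) in zip(predictions, test)) / len(test)
print(f"threshold={threshold:.2f} accuracy={accuracy:.2f}")

# 5. Improving performance: if accuracy were poor, one would go back and
#    collect more data, add more variables, or try a different model.
```

The two classes in this toy data are far apart, so the held-out accuracy is perfect. Real data is rarely this clean, which is exactly why step 5 exists.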

What are the types of machine learning algorithms?

 

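Supervised learning / predictive models:

As the name suggests, a predictive model is used to predict a future outcome based on historical data. Predictive models are normally given clear instructions right from the beginning about what needs to be learned and how it needs to be learned. This class of learning algorithms is called supervised learning.

For example, supervised learning is used when a marketing company is trying to find out which customers are likely to churn. We can also use it to predict the likelihood of hazards such as earthquakes and tornadoes, with the aim of determining the total insured value. Some examples of the algorithms used are nearest neighbour, Naive Bayes, decision trees, regression, etc.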
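
As an illustration of the churn example, here is a minimal nearest-neighbour sketch in plain Python (it needs Python 3.8+ for `math.dist`). The customer history, the two features and the function name `knn_predict` are invented for the example, and the features are deliberately left unscaled to keep the code short.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Majority vote among the k labelled rows closest to the query point."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical history: (monthly_spend, support_calls) -> what the customer did.
history = [
    ((20, 5), "churn"), ((25, 4), "churn"), ((22, 6), "churn"),
    ((80, 0), "stay"), ((75, 1), "stay"), ((90, 1), "stay"),
]

print(knn_predict(history, (24, 5)))
print(knn_predict(history, (85, 0)))
```

The labels in `history` are the "supervision": the algorithm is told the right answer for past customers and uses it to guess the answer for new ones.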

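Unsupervised learning / descriptive models:

It is used to train descriptive models where no target is set and no single feature is more important than another. A case of unsupervised learning could be a retailer wishing to find out which combinations of products customers tend to buy together most often. Furthermore, in the pharmaceutical industry, unsupervised learning may be used to predict which diseases are likely to occur along with diabetes. An example of an algorithm used here is k-means clustering.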
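
Here is a bare-bones sketch of k-means in plain Python (Python 3.8+ for `math.dist`). The six customer points are invented, and the initialisation simply takes the first k points, which is an assumption made to keep the example deterministic; real implementations usually choose starting centroids more carefully.

```python
import math


def kmeans(points, k, iterations=10):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its assigned points."""
    centroids = list(points[:k])  # naive initialisation: the first k points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical customers: (visits per month, average basket size).
customers = [(1, 2), (9, 8), (1.5, 1.8), (8, 8), (1, 0.6), (9, 11)]
centroids, clusters = kmeans(customers, k=2)
print(clusters)
```

No labels are supplied anywhere: the algorithm finds the two groups (occasional small shoppers and frequent big shoppers) purely from how the points sit relative to each other.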

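Reinforcement learning (RL):

It is a type of machine learning in which the machine is trained to take specific decisions based on business requirements, with the sole motto of maximizing efficiency (performance). The idea behind reinforcement learning is that the machine/software agent trains itself continually based on the environment it is exposed to, and applies its enriched knowledge to solve business problems. This continual learning process reduces the involvement of human expertise, which in turn saves a lot of time!

A Markov decision process (MDP) is an example of the kind of model used in RL.

PS: There is a subtle difference between supervised learning and reinforcement learning (RL). RL essentially involves learning by interacting with an environment. An RL agent learns from its past experience, that is, from its continual process of trial and error, as opposed to supervised learning, where an external supervisor provides examples.

A good example for understanding the difference is the self-driving car. Self-driving cars use reinforcement learning to make decisions continuously: which route to take? What speed to drive at? These are questions decided after interacting with the environment. A simple manifestation of supervised learning would be predicting the fare of a cab going from one place to another.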
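
The trial-and-error idea can be sketched with tabular Q-learning, one common RL algorithm, on a toy environment. Everything below is an assumption made for illustration: a corridor of five positions, a reward of 1 for reaching the last one, and an agent that explores by picking actions at random. It is a long way from a self-driving car, but the learning loop has the same shape.

```python
import random

N_STATES = 5  # positions 0..4 on a line; reaching position 4 ends an episode
ACTIONS = ("left", "right")


def step(state, action):
    """Toy environment: returns (next_state, reward)."""
    next_state = max(0, state - 1) if action == "left" else state + 1
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def q_learning(episodes=200, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values purely from interaction with the environment."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.choice(ACTIONS)  # explore by trial and error
            next_state, reward = step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            # Nudge the estimate towards reward + discounted future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q


q = q_learning()
# The learned policy: the best-valued action in each non-terminal position.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

Nobody tells the agent that "right" is the correct action. It discovers this from the rewards it collects, which is the difference from supervised learning described above.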

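What are the applications of machine learning?

It is very interesting to learn about the applications of machine learning. Google and Facebook use ML extensively to push their respective ads to the relevant users. Here are a few ML applications you should know about:

  • Banking and financial services: ML can be used to predict which customers are likely to default on loan repayments or credit card bills. This is of paramount importance, as machine learning helps banks identify the customers who can be granted loans and credit cards.

  • Healthcare: it is used to diagnose deadly diseases (e.g. cancer) based on patients' symptoms, tallying them against past data from similar patients.

  • Retail: it is used to identify products that sell frequently (fast moving) and slow-moving products, which helps retailers decide which products to introduce or remove from the shelf. Machine learning algorithms can also be used to find which two, three or more products sell together. This is done to design customer loyalty programmes, which in turn help retailers develop and maintain loyal customers.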
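
As a rough illustration of the retail case, the simplest way to see which products sell together is to count how often each pair appears in the same basket. The baskets below are invented, and plain counting is an assumption chosen for brevity; it is only the first step of what a real market-basket analysis would do.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, one set of products per purchase.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "cereal"},
    {"bread", "butter", "cereal"},
]

pair_counts = Counter()
for basket in baskets:
    # Sort so that ("bread", "butter") and ("butter", "bread") count as one pair.
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

top_pair, top_count = pair_counts.most_common(1)[0]
print(top_pair, top_count)
```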

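These examples are just the tip of the iceberg. Machine learning has extensive applications in practically every domain. You can look at some Kaggle problems to learn more; the examples above are easy to understand and at least give a taste of how much machine learning can do.

With the boom in artificial intelligence, people have gradually become interested in machine learning, and that interest is worldwide. Yet despite this great interest, people do not seem to truly understand machine learning. While explaining machine learning to a group of complete beginners from outside the data science industry, the author of the article introduces, in very plain language, what machine learning is, some of its technical terms, the steps involved, and its types and applications, and uses small examples to explain what the various algorithms do. In my view, machine learning is a good stepping stone into the field of artificial intelligence; at the very least, it will keep us from losing our way in the waves to come.

Translated from: Machine Learning basics for a newbie

Original link: https://www.analyticsvidhya.com/blog/2015/06/machine-learning-basics/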

Origin www.cnblogs.com/juanjiang/p/11112668.html