Making machine learning accessible: a detailed introduction (part one of a Machine Learning series)

What is Machine Learning?

[Machine learning is] the field of study that gives computers the ability to learn without being explicitly programmed. (Arthur Samuel, 1959)

A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E. (Tom Mitchell, 1997)

In layman's terms, machine learning is the science of programming computers so that they can learn from data.

The spam filter in your email is a machine learning program: users flag examples of spam and of regular mail, and the machine learns from those examples how to flag spam.

The examples flagged by users and used for learning are called the training set, and each example is called a training instance (or training sample). The part of a machine learning system that learns and makes predictions is called the model.

Neural networks and random forests are examples of models.

In this case, the task T is to flag new emails as spam, the experience E is the training data, and the performance measure P needs to be defined; for example, you can use the ratio of correctly classified emails. This particular performance measure is called accuracy and is often used for classification tasks.
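As a minimal sketch of the accuracy measure described above, the snippet below computes the ratio of correctly classified emails. The function name `accuracy` and the label lists are hypothetical and chosen only for illustration.

```python
def accuracy(predicted, actual):
    """Fraction of emails whose predicted label matches the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one example")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical labels: True means "spam", False means "regular mail".
actual = [True, False, True, True, False]
predicted = [True, False, False, True, False]

print(accuracy(predicted, actual))  # 4 of 5 correct, so 0.8
```

Accuracy is only one possible choice for P; it is used here because it is the measure the text names.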

But if you just download a copy of all the Wikipedia articles, your computer holds a lot more data, yet it is not suddenly better at any task. This is not machine learning.

Why use machine learning?

Suppose we set machine learning aside and write a spam filter the traditional programming way.

  • 1. First, you have to examine what spam typically looks like. For example, mall promotions, real estate information, stock recommendations, etc.
  • 2. Then, you have to write a detection algorithm for each of the patterns you noticed.
  • 3. Finally, you test your program and repeat steps 1 and 2 until it is good enough.

In the end your program is likely to become a long list of complex rules - difficult to maintain.
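The traditional approach can be sketched as a list of hand-written rules. Everything here is an assumption for illustration: the patterns in `RULES` and the function `is_spam_by_rules` are hypothetical, and a real filter would need far more rules than this.

```python
import re

# Hypothetical hand-written rules, one per spam pattern noticed in step 1.
RULES = [
    re.compile(r"mall .*promotion", re.IGNORECASE),
    re.compile(r"real estate", re.IGNORECASE),
    re.compile(r"stock (tip|pick|recommendation)", re.IGNORECASE),
]


def is_spam_by_rules(email_text):
    """Return True if any hand-written rule matches the email text."""
    return any(rule.search(email_text) for rule in RULES)


print(is_spam_by_rules("Big mall weekend promotion, click here"))  # True
print(is_spam_by_rules("Lunch at noon tomorrow?"))  # False
```

Every new kind of spam means another entry in `RULES`, which is how the list grows long and hard to maintain.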

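In contrast, a spam filter based on machine learning techniques automatically learns which words and phrases are good predictors of spam by detecting patterns that are unusually frequent in the spam examples compared with the regular mail examples. Such a program is shorter, easier to maintain, and likely more accurate.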
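To make the idea concrete, here is a deliberately simple sketch that learns which words appear in a larger share of spam than of regular mail. It is a toy, not a production technique: the functions `tokenize`, `train`, and `predict`, the `threshold` parameter, and the example emails are all assumptions made for illustration.

```python
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def train(examples):
    """Learn spam-indicator words from (text, is_spam) pairs.

    A word is kept if it appears in a larger share of the spam examples
    than of the regular mail examples.
    """
    spam_counts, ham_counts = Counter(), Counter()
    n_spam = n_ham = 0
    for text, is_spam in examples:
        words = set(tokenize(text))  # count each word once per email
        if is_spam:
            spam_counts.update(words)
            n_spam += 1
        else:
            ham_counts.update(words)
            n_ham += 1
    return {
        word
        for word, count in spam_counts.items()
        if count / n_spam > ham_counts[word] / max(n_ham, 1)
    }


def predict(spam_words, text, threshold=2):
    """Flag the email if it contains at least `threshold` learned words."""
    return len(spam_words & set(tokenize(text))) >= threshold


# Hypothetical training set flagged by users.
examples = [
    ("mall promotion big discount today", True),
    ("stock recommendation big profit today", True),
    ("meeting notes for today", False),
    ("lunch with the team today", False),
]
spam_words = train(examples)

print(predict(spam_words, "big mall discount this weekend"))  # True
print(predict(spam_words, "team meeting today"))  # False
```

Note that no rule about malls or stocks was written by hand; the word list comes entirely from the flagged examples, and a word such as "today" that is equally common in both kinds of mail is not learned as a spam indicator.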

What if spammers notice that all emails containing promotional links for mall events are blocked? They will most likely replace those links with short links.

With a traditionally programmed spam filter, you would then have to add new rules to flag emails that hide mall promotions behind short links. As long as spammers keep working around the filter, you will have to keep updating your rules forever.

In contrast, a spam filter based on machine learning automatically notices that mall promotion emails using short links have become unusually frequent in user-flagged spam, and starts flagging them without your intervention.
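The adaptation described above can be illustrated with a toy filter that updates its counts every time a user flags an email. The class `FlagCountingFilter`, its methods, and the domain `short.example` are hypothetical; links are reduced to the part before the first slash so that different short links from the same service count as one token.

```python
from collections import Counter


def tokens(text):
    """Lowercase words; a link like 'short.example/a1' becomes 'short.example'."""
    return {word.split("/")[0] for word in text.lower().split()}


class FlagCountingFilter:
    """Toy filter that keeps per-token counts from user-flagged mail."""

    def __init__(self):
        self.spam = Counter()
        self.ham = Counter()

    def learn(self, text, is_spam):
        """Update the counts with one email the user has labeled."""
        target = self.spam if is_spam else self.ham
        target.update(tokens(text))

    def is_spam(self, text):
        """Flag the email if its tokens were seen more often in spam."""
        seen = tokens(text)
        spam_hits = sum(self.spam[t] for t in seen)
        ham_hits = sum(self.ham[t] for t in seen)
        return spam_hits > ham_hits


f = FlagCountingFilter()
f.learn("mall promotion at mall.example/sale", True)
f.learn("see you at lunch", False)

probe = "come visit short.example/c3"
before = f.is_spam(probe)  # short links have not been flagged yet

# Users flag a couple of the new short-link emails as spam.
f.learn("big savings short.example/a1", True)
f.learn("last chance short.example/b2", True)
after = f.is_spam(probe)

print(before, after)  # False True
```

The code of the filter never changes between `before` and `after`; only the flagged data does.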

Another area where machine learning shines is problems that are too complex for traditional methods, or that have no known algorithm.

Take speech recognition as an example. If you want to start simple and write a program that can distinguish between the words "one" and "two", perhaps you can hardcode an algorithm that measures the intensity of high-pitched and low-pitched sounds and uses that to tell them apart.

Obviously, this hard-coding technique cannot be extended to thousands of Chinese words spoken by hundreds of millions of people.

Nor can it cope with different people speaking dozens of languages in noisy environments. The best solution so far is to write an algorithm that learns by itself, given many example recordings of each word, so that it eventually learns much as a human would.

All in all, machine learning is very suitable for the following scenarios:

  • Problems whose existing solutions require a lot of fine-tuning or long lists of rules (machine learning models can often simplify the code and perform better than traditional methods)
  • Complex problems that cannot be solved using traditional methods (the best machine learning techniques may be able to find a solution)
  • Fluctuating environments (machine learning systems can be retrained on new data, so they stay up to date)
  • Gaining insight into complex problems and large volumes of data

Origin blog.csdn.net/coco2d_x2014/article/details/131741482