(Repost) Why choose a machine learning strategy

Reposted from Andrew Ng (Wu Enda), deeplearningai

 

Machine learning is the foundation of countless important applications, including web search, spam detection, speech recognition, and product recommendations. If you and your team are developing a machine learning application and want to make rapid progress, some of the content in this book will be helpful.

Let’s say you’re building a startup that will provide cat lovers with tons of pictures of cats. At the same time, you decide to use neural network technology to build a computer vision system to recognize cats in pictures.

Your team has many ideas for how to improve the system, such as:

  • Get more data, i.e. collect more pictures of cats

  • Collect a more diverse training set, such as pictures of cats in unusual locations, pictures of cats with unusual coloring, and pictures of cats taken with different camera settings

  • Make the algorithm train longer by increasing the number of gradient descent iterations

  • Try a larger neural network with more layers/hidden units/parameters

  • Try adding regularization (e.g. L2 regularization)

  • Change the architecture of the neural network (activation function, number of hidden units, etc.)

  • ...

 

If you can make the right choice among the above possible directions, then you will build a leading cat image recognition platform and lead your company to success. But if you choose a bad direction, you can waste months or even years of development time.

Supervised learning refers to using labeled training examples (x, y) to learn a function that maps from x to y. Supervised learning algorithms include linear regression, logistic regression, and neural networks. Although there are many forms of machine learning, most of the machine learning algorithms that have practical value today come from supervised learning.
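To make the idea of learning a mapping from x to y concrete, here is a minimal sketch of logistic regression trained by batch gradient descent. Everything in it is an assumption made for illustration: the single made-up feature, the toy data, the function names, and the hyperparameter values. The `iterations` and `l2` parameters loosely correspond to two of the ideas listed earlier (training longer, adding L2 regularization).

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(examples, iterations=3000, learning_rate=0.3, l2=0.0):
    """Fit p(y=1 | x) = sigmoid(w*x + b) on (x, y) pairs by batch gradient descent.

    l2 is the strength of an optional L2 penalty on w (0.0 disables it).
    """
    w, b = 0.0, 0.0
    n = len(examples)
    for _ in range(iterations):
        # Gradient of the mean log loss, plus the gradient of (l2 / 2) * w**2.
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in examples) / n + l2 * w
        grad_b = sum(sigmoid(w * x + b) - y for x, y in examples) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def predict(w, b, x):
    """Return the predicted label (0 or 1) for input x."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0


# Hypothetical labeled training examples (x, y): x is one made-up feature,
# y is 1 for "cat" and 0 for "not cat".
training_examples = [
    (0.5, 0), (1.0, 0), (1.5, 0), (2.0, 0),
    (3.0, 1), (3.5, 1), (4.0, 1), (4.5, 1),
]

w, b = train_logistic_regression(training_examples)
print("prediction for x=1.0:", predict(w, b, 1.0))
print("prediction for x=4.0:", predict(w, b, 4.0))
```

A real cat recognizer would take many pixel features as input rather than one number, but the structure is the same: labeled pairs go in, and a function from x to y comes out.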

I will often refer to neural networks (also known as "deep learning"), but you only need a basic understanding of them to follow the rest of this text.

 

If you are not familiar with some of the concepts mentioned above, you can watch the first three weeks of videos in the Machine Learning course on Coursera (course address: http://ml-class.org ).

A lot of ideas about deep learning (neural networks) have been around for decades, so why are these ideas only becoming popular now?

There are two main factors driving the recent development:

  • Data availability: People are spending more and more time on digital devices (laptops, mobile devices, etc.), and their digital activities generate massive amounts of data that we can feed to our learning algorithms for training.

  • Computational scale: It was only a few years ago that we became able to train neural networks large enough to take advantage of the massive datasets now available.

 

Specifically, even as you accumulate more data, the performance of older learning algorithms such as logistic regression usually "plateaus". This means the learning curve "flattens out", and the algorithm stops improving even when it is given more data.

It is as if older learning algorithms do not know what to do with the amount of data available today.
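The plateau effect can be reproduced on a small scale. The sketch below is an illustration only, under stated assumptions: the task is synthetic (y = x^2 plus a little noise), and a straight-line least-squares fit stands in for an "older" low-capacity algorithm. Because a line cannot represent the curve, its test error stops improving once the training set is large enough, no matter how much more data it sees.

```python
import random


def fit_line(points):
    """Ordinary least-squares fit of y = a*x + b (closed form)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b


def mean_squared_error(a, b, points):
    return sum((a * x + b - y) ** 2 for x, y in points) / len(points)


def sample(n, rng):
    """Draw n points from the synthetic task y = x^2 + noise (an assumption)."""
    points = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        points.append((x, x * x + rng.gauss(0.0, 0.05)))
    return points


rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
test_set = sample(2000, rng)

# Train on progressively larger datasets and record the test error each time.
learning_curve = []
for n_train in (10, 100, 1000, 10000):
    a, b = fit_line(sample(n_train, rng))
    learning_curve.append((n_train, mean_squared_error(a, b, test_set)))

for n_train, error in learning_curve:
    print(f"{n_train:>6} training examples -> test MSE {error:.4f}")
```

Going from 1,000 to 10,000 training examples barely changes the test error here, which is the flat part of the learning curve. A model with more capacity would be needed to benefit from the extra data.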

 

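If you train a small neural network (NN) on the same supervised learning task, you may get somewhat better performance.

Therefore, to obtain the best performance, you can:

(i) Train a very large neural network, so that its performance follows the green curve in the figure above;

(ii) Have a huge amount of data.

Many other details are also important when training an algorithm, such as the architecture of the neural network. For now, however, the more reliable way to improve an algorithm's performance is still to train a bigger network and get more data.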

 

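Accomplishing (i) and (ii) is surprisingly complex, and this book discusses the details at length. We will start with general strategies that work for both traditional learning algorithms and neural networks, and build up step by step to the most modern strategies for building deep learning systems.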
