An Introduction to Deep Learning Optimization Methods and Deep Learning Frameworks

1. Deep Learning Optimization Methods

(1) SGD (stochastic gradient descent)

The mini-batch version of gradient descent updates the parameters by ΔX_t = -η g_t

Pros: when there is a large amount of training data, the mini-batch approach eases the load on the machine and converges faster; when the training set is redundant, it also converges faster than full-batch gradient descent.

Cons: the update depends entirely on the direction of the current batch, so updates are unstable.
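As a minimal sketch (not from the original post; the toy objective and all names are illustrative), the SGD update on f(x) = x² looks like:

```python
# Minimal SGD sketch on the toy objective f(x) = x^2, whose gradient is g = 2x.
# In real training, the gradient would come from a randomly sampled mini-batch.
def sgd_step(x, grad, lr=0.1):
    """Apply one SGD update: delta_x = -lr * g."""
    return x - lr * grad

x = 5.0
for _ in range(100):
    x = sgd_step(x, 2 * x)  # gradient of x^2 is 2x
# x has moved close to the minimizer at 0
```

With a fixed, deterministic gradient the iterate simply contracts toward the minimum; the instability mentioned above only appears once each step uses a different random batch.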

(2) Momentum method

The momentum update is ΔX_t = ρ ΔX_{t-1} - η g_t

It retains the previous update direction to some extent while fine-tuning the final direction with the current batch's gradient.

Advantages: increases stability to some extent, learns faster, and gains some ability to escape local optima.
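A minimal sketch of the momentum update above on the same toy quadratic (names illustrative, not from the original post):

```python
# Momentum sketch on f(x) = x^2: delta_x_t = rho * delta_x_{t-1} - lr * g_t.
# The velocity v carries over part of the previous update direction.
def momentum_step(x, v, grad, lr=0.1, rho=0.9):
    v = rho * v - lr * grad  # blend previous direction with current gradient
    return x + v, v

x, v = 5.0, 0.0
for _ in range(300):
    x, v = momentum_step(x, v, 2 * x)  # gradient of x^2 is 2x
```

Note the only change from plain SGD is the extra state `v`; with ρ = 0 this reduces exactly to the SGD step.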

(3)Nesterov Momentum

An improvement on the traditional momentum method:

ΔX_t = ρ ΔX_{t-1} - η ∇f(X_t + ρ ΔX_{t-1})
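The only change from standard momentum is where the gradient is evaluated, which a small sketch (illustrative names, same toy quadratic) makes explicit:

```python
# Nesterov momentum sketch on f(x) = x^2: the gradient is evaluated at the
# look-ahead point x + rho * v rather than at the current point x.
def nesterov_step(x, v, grad_fn, lr=0.1, rho=0.9):
    v = rho * v - lr * grad_fn(x + rho * v)  # gradient at the look-ahead point
    return x + v, v

grad = lambda x: 2 * x  # gradient of x^2
x, v = 5.0, 0.0
for _ in range(300):
    x, v = nesterov_step(x, v, grad)
```

Evaluating the gradient at the look-ahead point gives the method a corrective "peek" at where momentum is about to carry it, which typically damps overshoot.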

(4) Adagrad

Constrains the learning rate per parameter:

η_t = η / √(Σ_{i=1}^{t} g_i² + ε)

Well suited to handling sparse gradients.
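A minimal sketch of the constrained learning rate above on a toy quadratic (not from the original post; names illustrative):

```python
import math

# Adagrad sketch on f(x) = x^2: the accumulated sum of squared gradients
# shrinks the effective learning rate over time, per parameter.
def adagrad_step(x, accum, grad, lr=1.0, eps=1e-8):
    accum += grad ** 2                           # sum_{i=1..t} g_i^2
    x -= lr / math.sqrt(accum + eps) * grad      # constrained learning rate
    return x, accum

x, accum = 5.0, 0.0
for _ in range(500):
    x, accum = adagrad_step(x, accum, 2 * x)  # gradient of x^2 is 2x
```

Because `accum` only ever grows, the effective step size decays monotonically; this is what makes Adagrad friendly to sparse gradients (rarely updated parameters keep a large learning rate) but can stall training late on.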

(5) Adadelta

An extension of Adagrad.

It simplifies the computation by accumulating squared gradients over a fixed-size window (approximated with an exponentially decaying average) instead of over the entire history.

(6) RMSprop

A special case of Adadelta.

The root mean square of the gradient: RMS[g]_t = √(E[g²]_t + ε)

ΔX_t = -(η / RMS[g]_t) g_t
Works well on non-stationary objectives and performs well on RNNs.
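A minimal sketch of the two formulas above on a toy quadratic (illustrative names, not from the original post):

```python
import math

# RMSprop sketch on f(x) = x^2, following the update in the text:
# RMS[g]_t = sqrt(E[g^2]_t + eps),  delta_x_t = -(lr / RMS[g]_t) * g_t.
def rmsprop_step(x, eg2, grad, lr=0.01, rho=0.9, eps=1e-8):
    eg2 = rho * eg2 + (1 - rho) * grad ** 2       # running average E[g^2]_t
    x -= lr / math.sqrt(eg2 + eps) * grad         # divide by RMS[g]_t
    return x, eg2

x, eg2 = 5.0, 0.0
for _ in range(2000):
    x, eg2 = rmsprop_step(x, eg2, 2 * x)  # gradient of x^2 is 2x
```

Because the running average tracks recent gradient magnitudes, the step size adapts quickly when the objective shifts, which is why it suits non-stationary targets.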

 

(7)Adam

Essentially a momentum term combined with RMSprop.

It uses first- and second-moment estimates of the gradient to dynamically adjust the learning rate of each parameter. After bias correction, the learning rate at each iteration stays within a definite range, which makes the parameter updates more stable.
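A sketch of these ingredients on a toy quadratic (not from the original post; names illustrative, β defaults as commonly used):

```python
import math

# Adam sketch on f(x) = x^2: a momentum-style first-moment estimate plus an
# RMSprop-style second-moment estimate, each with bias correction.
def adam_step(x, m, v, grad, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    m = b1 * m + (1 - b1) * grad             # first-moment (mean) estimate
    v = b2 * v + (1 - b2) * grad ** 2        # second-moment estimate
    m_hat = m / (1 - b1 ** t)                # bias correction: early m, v
    v_hat = v / (1 - b2 ** t)                # start at 0 and would be biased low
    x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x, m, v

x, m, v = 5.0, 0.0, 0.0
for t in range(1, 2001):
    x, m, v = adam_step(x, m, v, 2 * x, t, lr=0.01)
```

The bias correction is what keeps the effective step bounded in the first iterations, when the raw moving averages are still close to their zero initialization.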

(8) Coordinate descent

A gradient-free optimization method: at each iteration it searches along one coordinate direction, cycling through the coordinates in turn to reach a local minimum of the objective function.
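A minimal sketch (illustrative, not from the original post) on a coupled quadratic, where each one-dimensional subproblem is solved exactly:

```python
# Coordinate descent sketch on the coupled quadratic
# f(x, y) = x^2 + y^2 + x*y, minimized one coordinate at a time.
# Each inner step solves its 1-D subproblem in closed form.
def coordinate_descent(x, y, sweeps=30):
    for _ in range(sweeps):
        x = -y / 2.0   # argmin over x with y fixed: df/dx = 2x + y = 0
        y = -x / 2.0   # argmin over y with x fixed: df/dy = 2y + x = 0
    return x, y

x, y = coordinate_descent(4.0, -3.0)
# (x, y) converges to the global minimizer (0, 0)
```

No gradient of the full objective is ever formed; progress comes entirely from cycling exact (or line-search) minimizations along each axis.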

2. Deep Learning Frameworks

 

PaddlePaddle

PaddlePaddle is an open-source deep learning platform developed by Baidu. It was the first open-source deep learning platform in China and is currently the only fully featured domestic one. Honed by Baidu's long-running business scenarios, PaddlePaddle offers the most comprehensive set of officially supported industrial models, covering natural language processing, computer vision, and recommendation engines; it has also released several leading pretrained Chinese models, and models built on it have won a number of international algorithm competitions.
PaddlePaddle supports large-scale parallel deep learning training with both dense and sparse parameters, efficiently training models with hundreds of billions of parameters across hundreds of nodes; it was the first deep learning framework to provide parallel training technology of this strength. It also has strong multi-device deployment capabilities, supporting high-speed inference on servers, mobile devices, and other heterogeneous hardware, with a notable performance advantage in prediction. PaddlePaddle now offers a stable, backward-compatible API and complete bilingual documentation, making it easy to use, simple, and efficient.
Version 3.0 upgraded PaddlePaddle into a full deep learning development kit: in addition to the core framework, it has open-sourced VisualDL, PARL, AutoDL, EasyDL, AI Studio, and a series of other deep learning tools, components, and service platforms to better meet the needs of developers at every level. It provides strong support for industrial applications, is widely used in Chinese enterprises, and has a vibrant developer community.

TensorFlow

TensorFlow is open-source mathematical software developed by Google in C++ that performs its computation with data flow graphs: nodes in the graph represent mathematical operations, while the edges represent the multidimensional data arrays (tensors) exchanged between them. Its flexible architecture lets it be deployed on one or more CPUs or GPUs in desktops and servers, or on mobile devices, all through a single API. TensorFlow was originally developed by researchers and engineers on the Google Brain team for machine learning and deep neural network research, but since being open-sourced it has been applied in almost every field.
TensorFlow has the largest user base and the biggest community of any framework worldwide. Because it is produced by Google, it is maintained and updated frequently; it ships with Python and C++ interfaces and excellent tutorials. Many first reproductions of new papers are written in TensorFlow, making it the de facto leader among deep learning frameworks.

Caffe

Caffe, a deep learning framework as famous as TensorFlow, was developed by Jia Yangqing during his PhD at UC Berkeley. Its name stands for Convolutional Architecture for Fast Feature Embedding. It is a clear and efficient open-source deep learning framework maintained by the Berkeley Vision and Learning Center (BVLC).
As the name suggests, its support for convolutional networks is particularly good. It is written in C++ and provides a C++ interface, along with Python and MATLAB interfaces.
Caffe became popular largely because many of the networks in earlier ImageNet competitions were written in Caffe, so anyone who wanted to reuse those competition models had to use Caffe, which drew many people directly to the framework.
Caffe's drawbacks are that it is not flexible enough, its memory usage is high, and its core offers only a C++ interface. Its upgraded version, Caffe2, is already open source; it fixes some of these issues and raises the engineering quality further.

Theano

Theano was born in 2008 at the University of Montreal and has spawned a variety of Python deep learning packages, most famously Blocks and Keras. At its core, Theano is a compiler for mathematical expressions: it knows how to take your symbolic structure and turn it into efficient code that uses NumPy, fast native libraries such as BLAS, and native (C++) code to run as fast as possible on the CPU or GPU. It was designed specifically for the large-scale computation required by deep neural network algorithms and was one of the first libraries of its kind (development started in 2007); it is regarded as an industry standard for deep learning research and development.
However, most of Theano's developers later joined Google to work on TensorFlow, so to some extent TensorFlow is Theano's descendant.

MXNet

MXNet, whose lead author is Mu Li, started out as a project a few people built purely out of enthusiasm for the technology, and it has since become Amazon's official framework. It has excellent distributed support, notably good performance, and a low memory footprint. Beyond Python and C++, it also offers interfaces for R, MATLAB, Scala, JavaScript, and more, so users of almost any language are covered.
But MXNet's drawbacks are equally obvious: the tutorials are incomplete, few people use it so the community is small, and few competition entries or papers are implemented in MXNet, all of which keep its adoption and visibility low.

Torch

Torch is a scientific computing framework that supports a large number of machine learning algorithms. It was born ten years ago, but its real potential was unlocked only after Facebook open-sourced many Torch deep learning modules and extensions. Torch is notable for being particularly flexible, but it uses the relatively niche programming language Lua. In a deep learning ecosystem where most work happens in Python, a framework tied to a minority language raises the cost of learning and using it, which is a clear disadvantage for Torch.

PyTorch

PyTorch's predecessor is Torch: it shares Torch's underlying framework but rewrites much of it in Python, making it more flexible, adding dynamic-graph support, and providing a Python interface. Developed by the Torch7 team, it is a Python-first deep learning framework that not only provides powerful GPU acceleration but also supports dynamic neural networks, something many mainstream frameworks such as TensorFlow did not support at the time.
PyTorch can be seen both as NumPy with GPU support and as a powerful deep learning framework with automatic differentiation. Besides Facebook, it has been adopted by Twitter, CMU, Salesforce, and other institutions.


Origin www.cnblogs.com/zhenpengwang/p/11266206.html