Three deep learning frameworks compared: Caffe, TensorFlow, and Keras

The wave of artificial intelligence is sweeping the world, and terms such as artificial intelligence, machine learning, and deep learning are constantly in our ears. The concept of "artificial intelligence" was proposed as early as 1956: as the name suggests, the aim is to use computers to construct complex machines with the same essential characteristics as human intelligence. After decades of development, artificial intelligence began to take off after 2012, thanks to the growth of data volume, the increase in computing power, and the emergence of new machine learning algorithms (deep learning). However, current research focuses on weak artificial intelligence, that is, giving machines the ability to observe and perceive and to understand and reason to a certain degree; it is in this area that major breakthroughs are expected. Most of the artificial intelligence depicted in movies is strong artificial intelligence, that is, machines that acquire adaptive abilities and can solve problems they have never encountered before, and this is difficult to truly realize in the real world today.

If artificial intelligence is indeed making breakthroughs, how were they achieved, and where does the "intelligence" come from? The answer lies mainly in one method of achieving artificial intelligence: machine learning.

1. The concept of machine learning

Machine learning is a method of achieving artificial intelligence.

The most basic approach in machine learning is to use algorithms to analyze data and learn from it, and then to make decisions and predictions about events in the real world. Unlike traditional software that is hard-coded to solve a specific task, machine learning is "trained" on large amounts of data, using various algorithms to learn from that data how to complete the task. Machine learning originated in the early artificial intelligence field; traditional algorithms include decision trees, clustering, Bayesian classification, support vector machines, EM, AdaBoost, and so on. In terms of learning style, machine learning algorithms can be divided into supervised learning (such as classification problems), unsupervised learning (such as clustering problems), semi-supervised learning, ensemble learning, deep learning, and reinforcement learning.
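As a concrete, purely illustrative example of this "train on data, then predict" workflow, the following minimal sketch uses scikit-learn (one of the machine learning frameworks mentioned later in this article) to train a decision tree, one of the traditional algorithms listed above, on the bundled iris dataset; the dataset and hyper-parameters are just placeholders for illustration.

# Minimal supervised-learning sketch with scikit-learn (assumes scikit-learn is installed).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)                      # features and labels
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

clf = DecisionTreeClassifier(max_depth=3)              # "train" the model from labelled data
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)                           # make predictions on unseen data
print("accuracy:", accuracy_score(y_test, y_pred))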

Traditional machine learning algorithms applied to fingerprint recognition, Haar-based face detection, and HoG-feature-based object detection had basically reached the requirements for commercialization, or commercial quality in specific scenarios, but every further step forward was extremely difficult, until the emergence of deep learning algorithms.

2. The concept of deep learning

Deep learning is a technology that implements machine learning.

It is not an independent learning method in itself; supervised and unsupervised learning methods are also used to train deep neural networks. However, because the field has developed rapidly in recent years and some unique techniques have been proposed (such as residual networks), more and more people now regard it as a learning method in its own right.

Deep learning originally referred to using deep neural networks to learn feature representations. The deep neural network itself is not a new concept; it can be roughly understood as a neural network structure containing multiple hidden layers. To improve the training of deep neural networks, people adjust how the neurons are connected and which activation functions they use. In fact, many of these ideas existed years ago, but because of insufficient training data and limited computing power, the results were not satisfactory. Deep learning is currently the hottest machine learning method, but that does not mean it is the end of machine learning; at least the following problems remain:

1. Deep learning models need a lot of training data to show their remarkable effects, but real-life problems often involve only small samples; in such cases deep learning cannot be applied, while traditional machine learning methods can handle the problem;

2. In some fields, traditional and simple machine learning methods already solve the problem well, and there is no need to use complex deep learning methods;

3. The ideas of deep learning are inspired by the human brain, but it is by no means a simulation of the human brain.

Therefore, there is also a difference between machine learning frameworks and deep learning frameworks. Essentially, a machine learning framework covers various learning methods for classification, regression, clustering, anomaly detection, and data preparation, and may also include neural network methods. A deep learning or deep neural network (DNN) framework covers various neural network topologies with many hidden layers, involving a multi-step pattern-recognition process; the more layers in the network, the more complex the features that can be extracted for clustering and classification. Caffe, CNTK, Deeplearning4j, Keras, MXNet, and TensorFlow are common deep learning frameworks; scikit-learn and Spark MLlib are machine learning frameworks; Theano straddles both categories.

The rest of this article focuses on three deep learning frameworks: Caffe, TensorFlow, and Keras. If you only need traditional machine learning algorithms, scikit-learn or Spark MLlib is the more appropriate choice.

3. Comparison of deep learning frameworks

Neural networks generally involve two major stages: training and testing. Training is the process of using a CPU or GPU to extract model parameters from the training data, given a neural network model (AlexNet, an RNN, etc.) and a training framework (Caffe, etc.). Testing means running test data through the trained model (the neural network model plus its parameters) and examining the results. Caffe, Keras, and TensorFlow unify and abstract the data and operations involved in this process to form usable frameworks.

(1) Caffe

1. Concept

Caffe is a clear and efficient deep learning framework and a widely used open-source one. Before TensorFlow appeared, it was the project with the most GitHub stars in the deep learning field. Its main advantages are that it is easy to get started (the network structure is defined in configuration files, so no code is required to design a network), training is fast, and its components are modular, so it can easily be extended to new models and learning tasks. However, Caffe was initially designed only for images and did not consider text, speech, or time-series data; as a result, it supports convolutional neural networks very well, but its support for time-series models such as RNN and LSTM is not particularly strong. The models folder of the Caffe project contains many commonly used network models, such as LeNet, AlexNet, ZFNet, VGGNet, GoogLeNet, ResNet, etc.

2. Caffe's module structure

Caffe abstracts the data in a network into Blobs, each layer of the network into a Layer, the entire network into a Net, and the solving method of the network model into a Solver; a minimal Python sketch of these four abstractions follows the list below.

1. Blob represents the data in the network. The training data, the parameters of each layer, and the data passed between layers are all realized through Blobs. A Blob's data can be stored on both the CPU and the GPU and synchronized between the two.

2. Layer is an abstraction of the various layers in a neural network, including convolutional layers and down-sampling (pooling) layers, as well as fully connected layers and various activation-function layers. Each Layer implements forward and backward propagation and transmits data through Blobs.

3. Net is the representation of the entire network, composed of the various Layers connected front to back; it is the constructed network model.

4. Solver defines the solving method for the Net model: it records the network training process, saves the network model parameters, and interrupts and restores training. Custom Solvers can implement different solving methods.
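To make these four abstractions more tangible, here is a minimal sketch using Caffe's Python interface (pycaffe). It assumes a working pycaffe installation; the LMDB path and the solver.prototxt file name are hypothetical placeholders rather than files shipped with Caffe.

# Illustrative pycaffe sketch (assumes pycaffe is installed; paths are hypothetical).
import caffe
from caffe import layers as L, params as P

# Net: Layers are declared, connected through named Blobs, and serialized to a .prototxt file.
n = caffe.NetSpec()
n.data, n.label = L.Data(batch_size=64, backend=P.Data.LMDB,
                         source='train_lmdb', ntop=2)           # hypothetical LMDB path
n.conv1 = L.Convolution(n.data, kernel_size=5, num_output=20,
                        weight_filler=dict(type='xavier'))
n.pool1 = L.Pooling(n.conv1, kernel_size=2, stride=2, pool=P.Pooling.MAX)
n.fc1 = L.InnerProduct(n.pool1, num_output=10)
n.loss = L.SoftmaxWithLoss(n.fc1, n.label)

with open('train.prototxt', 'w') as f:
    f.write(str(n.to_proto()))                                   # the Net as a config file

# Solver: the solving method is described in a solver.prototxt; loading it drives training.
solver = caffe.SGDSolver('solver.prototxt')                      # hypothetical solver file
solver.step(100)                                                 # run 100 training iterations
print(solver.net.blobs['conv1'].data.shape)                      # a Blob holding conv1's output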

3. Installation method

Caffe needs to install more dependencies in advance, such as CUDA, snappy, leveldb, gflags, glog, szip, lmdb, OpenCV, hdf5, BLAS, boost, ProtoBuffer, etc.;

Caffe official website: http://caffe.berkeleyvision.org/;

Caffe GitHub: https://github.com/BVLC/caffe; Caffe installation tutorials:

http://caffe.berkeleyvision.org/installation.html,

http://blog.csdn.net/yhaolpz/article/details/71375762;

Caffe installation is divided into CPU and GPU versions; the GPU version requires graphics card support and the installation of CUDA.

4. Use Caffe to build a neural network

[Flow chart of building a neural network in Caffe]

In the above process, step 2 (defining the network structure) is the core operation and also the most troublesome part of using Caffe. Keras provides a higher-level abstraction over this part, allowing users to quickly write the model they want to implement.

(2) TensorFlow

1. Concept

TensorFlow is an open-source software library that uses data flow graphs for numerical computation. The nodes in the graph represent mathematical operations, and the edges represent the multi-dimensional data arrays (tensors) passed between nodes. Its flexible architecture allows computation to be deployed through a single API to one or more CPUs or GPUs in a server or a mobile device. The related concepts in TensorFlow are explained below:

1) Symbolic calculation

Symbolic computation first defines various variables and then builds a "computation graph" that specifies the computational relationships between those variables. Symbolic computation is also called a data flow graph; the process is shown in Figure 2-1 below, where the data flow along the black arrowed lines.

[Figure 2-1: Data flow graph example]

A data flow graph describes mathematical computation with a directed graph of "nodes" and "edges"; a minimal code sketch follows the list below.

① "Node" is generally used to represent the applied mathematical operation, but it can also represent the start point of data input (feed in)/the end point of output (push out), or the end point of reading/writing a persistent variable (persistent variable).

② "Line" represents the input/output relationship between "nodes".

③ The multidimensional data arrays flowing along the edges are called "tensors".
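A minimal sketch of such a graph, written with the TensorFlow 1.x API used elsewhere in this article, is shown below; the node names are arbitrary, and nothing is computed until the graph is run inside a session.

# Minimal data-flow-graph sketch with the TensorFlow 1.x API.
import tensorflow as tf

a = tf.placeholder(tf.float32, name='a')   # node: input feed point
b = tf.placeholder(tf.float32, name='b')   # node: input feed point
c = tf.multiply(a, b, name='c')            # node: math op; edges a->c and b->c carry tensors
d = tf.add(c, b, name='d')                 # another op node downstream of c

with tf.Session() as sess:                 # the graph only executes inside a session
    print(sess.run(d, feed_dict={a: 2.0, b: 3.0}))   # 2*3 + 3 -> 9.0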

2) Tensor

A tensor can be regarded as a natural extension of vectors and matrices, used to represent a wide range of data; the order of a tensor is also called its dimension.

A tensor of order 0, i.e. a scalar, is a single number. A tensor of order 1, i.e. a vector, is an ordered set of numbers. A tensor of order 2, i.e. a matrix, is an ordered arrangement of vectors. A tensor of order 3, i.e. a cube, is a stack of matrices, and so on.
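For illustration, the following sketch (again using the TensorFlow 1.x API assumed throughout this article, with made-up values) builds tensors of order 0 through 3 and prints their shapes.

# Tensors of order 0 to 3 and their shapes.
import tensorflow as tf

scalar = tf.constant(3.0)                          # order 0: a single number
vector = tf.constant([1.0, 2.0, 3.0])              # order 1: an ordered list of numbers
matrix = tf.constant([[1.0, 2.0], [3.0, 4.0]])     # order 2: vectors stacked into a matrix
cube = tf.constant([[[1.0], [2.0]],
                    [[3.0], [4.0]]])               # order 3: matrices stacked into a cube

for t in (scalar, vector, matrix, cube):
    print(t.shape)                                 # (), (3,), (2, 2), (2, 2, 1)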

3) Data format (data_format)

There are currently two main ways to represent tensors:

① th mode, or channels_first mode: Theano and Caffe use this layout.

② tf mode, or channels_last mode: TensorFlow uses this layout.

To illustrate the difference between the two modes: for 100 color images of size 16×32 (height 16, width 32) with 3 RGB channels, the th representation is (100, 3, 16, 32) and the tf representation is (100, 16, 32, 3); the only difference is the position of the channel dimension (3).
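The difference can be checked with a small NumPy sketch; the arrays below are zero-filled placeholders rather than real images.

# The same batch of 100 RGB images of height 16 and width 32 in both layouts.
import numpy as np

batch_channels_first = np.zeros((100, 3, 16, 32))    # th / channels_first (Theano, Caffe)
batch_channels_last = np.zeros((100, 16, 32, 3))     # tf / channels_last (TensorFlow)

# Converting between the two is just a transpose that moves the channel axis.
converted = np.transpose(batch_channels_first, (0, 2, 3, 1))
print(converted.shape == batch_channels_last.shape)  # True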

2. TensorFlow's module structure

The tensorflow/core directory contains the TF core module code; the specific structure is shown in Figure 2-2:

[Figure 2-2 tensorflow code module structure]

3. Installation method

1. Download Anaconda from the official website: https://www.anaconda.com/download/;

2. In the Anaconda Prompt console, enter the commands in the following 5 steps one by one to install:

  1. Install Python 3.6 in a new environment, cmd: conda create -n py3.6 python=3.6 anaconda;

  2. Activate the virtual environment, cmd: activate py3.6;

  3. Create and activate the TensorFlow environment, cmd: conda create -n tensorflow python=3.6; activate tensorflow;

  4. Install TensorFlow, cmd: pip install --ignore-installed --upgrade tensorflow (or, for the GPU version, pip install --ignore-installed --upgrade tensorflow-gpu);

  5. Exit the virtual environment, cmd: deactivate py3.6.

4. Use TensorFlow to build a neural network

Using TensorFlow to build a neural network mainly involves the following 7 steps:

  1. Define a function that adds a neural layer;

  2. Prepare training data;

  3. Define the placeholder nodes that will receive the data;

  4. Define the neural layer: hidden layer and prediction layer;

  5. Define loss expression;

  6. Choose optimizer to minimize loss;

  7. Initialize all variables, then run the optimizer via sess.run and iterate many times to learn.

5. Sample code

The following example uses TensorFlow to build a small neural network that fits a noisy quadratic curve. The specific code is as follows:

import tensorflow as tf
import numpy as np

# Define a function that adds a neural layer
def add_layer(inputs, in_size, out_size, activation_function=None):
    # add one more layer and return the output of this layer
    Weights = tf.Variable(tf.random_normal([in_size, out_size]))
    biases = tf.Variable(tf.zeros([1, out_size]) + 0.1)
    Wx_plus_b = tf.matmul(inputs, Weights) + biases
    if activation_function is None:
        outputs = Wx_plus_b
    else:
        outputs = activation_function(Wx_plus_b)
    return outputs

# 1. Prepare training data: make up some real data
x_data = np.linspace(-1, 1, 300)[:, np.newaxis]
noise = np.random.normal(0, 0.05, x_data.shape)
y_data = np.square(x_data) - 0.5 + noise

# 2. Define the nodes that will receive data (placeholders for inputs to the network)
xs = tf.placeholder(tf.float32, [None, 1])
ys = tf.placeholder(tf.float32, [None, 1])

# 3. Define the neural layers: a hidden layer and a prediction layer
# Hidden layer: the input is xs, with 10 neurons
l1 = add_layer(xs, 1, 10, activation_function=tf.nn.relu)
# Output layer: the input is the hidden layer l1, and the prediction layer outputs 1 value
prediction = add_layer(l1, 10, 1, activation_function=None)

# 4. Define the loss expression (the error between prediction and real data)
loss = tf.reduce_mean(tf.reduce_sum(tf.square(ys - prediction), reduction_indices=[1]))

# 5. Choose an optimizer to minimize loss; the learning rate is 0.1
train_step = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

# Important step: initialize all variables
init = tf.initialize_all_variables()
sess = tf.Session()
# Nothing defined above runs until sess.run is called
sess.run(init)

# Iterate 1000 times; train_step and loss depend on placeholders, so feed in the data
for i in range(1000):
    sess.run(train_step, feed_dict={xs: x_data, ys: y_data})
    if i % 50 == 0:
        # print the loss to see the step-by-step improvement
        print(sess.run(loss, feed_dict={xs: x_data, ys: y_data}))

(3) Keras

1. Concept

Keras is written in pure Python and runs on top of TensorFlow, Theano, or CNTK backends; it is effectively a high-level interface to them and is known for letting you build a neural network in 10 lines of code. It is simple to operate, easy to use, well documented, and easy to configure, which lowers the difficulty of writing neural network code. It currently packages algorithms such as fully connected networks, convolutional neural networks, RNN, and LSTM.

Keras has two types of models: the sequential model (Sequential) and the functional model (Model). The functional model is more widely applicable, and the sequential model is a special case of it. A minimal functional-model sketch follows the list below.

  1. Sequential model (Sequential): single input and single output, one path from start to end; only adjacent layers are connected, with no cross-layer connections. This kind of model compiles quickly and is relatively simple to use.

  2. Functional model (Model): multiple inputs and outputs, with arbitrary connections between layers. This kind of model compiles more slowly.
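Since the sample code later in this section uses only the Sequential model, the following minimal sketch shows a similar small classifier written with the functional model (Model); the layer sizes are illustrative only.

# Minimal functional-API sketch: layers are called on tensors, so arbitrary graphs are possible.
from keras.layers import Input, Dense
from keras.models import Model

inputs = Input(shape=(784,))                  # explicit input tensor
x = Dense(500, activation='tanh')(inputs)     # hidden layer applied to the input tensor
outputs = Dense(10, activation='softmax')(x)  # output layer with 10 classes

model = Model(inputs=inputs, outputs=outputs)
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
model.summary()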

2. The module structure of Keras

Keras is mainly composed of 5 modules. The relationship between the modules and the function of each module are shown in Figure 3-1:

[Figure 3-1 keras module structure diagram]

3. Installation method

The installation of Keras consists of the following steps:

  1. Install Anaconda, a Python distribution for scientific computing that supports Linux, Mac, and Windows, provides package management and environment management, and easily handles the coexistence and switching of multiple Python versions and the installation of third-party packages;

  2. Use pip or conda to install numpy, keras, pandas, tensorflow, and other libraries;

Download link: https://www.anaconda.com/what-is-anaconda/.

4. Use Keras to build a neural network

Building a neural network with Keras involves 5 steps: model selection, network layer construction, compilation, training, and prediction. The Keras modules used in each step are shown in Figure 3-2:

[Figure 3-2: Steps to build a neural network with Keras]

5. Sample code

Keras builds a neural network to recognize handwritten digits. The specific code is as follows:

from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.optimizers import SGD
from keras.datasets import mnist
import numpy

'''
Step 1: Choose a model
'''
model = Sequential()

'''
Step 2: Build the network layers
'''
model.add(Dense(500, input_shape=(784,)))  # Input layer, 28*28=784
model.add(Activation('tanh'))              # The activation function is tanh
model.add(Dropout(0.5))                    # Use 50% dropout
model.add(Dense(500))                      # 500 hidden layer nodes
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(10))                       # The output has 10 categories, so the dimension is 10
model.add(Activation('softmax'))           # The last layer uses softmax as the activation function

'''
Step 3: Compile
'''
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)  # Optimizer: set the learning rate (lr) and other parameters
model.compile(loss='categorical_crossentropy', optimizer=sgd)  # Use cross entropy as the loss function

'''
Step 4: Train
Some parameters of .fit:
batch_size: the number of samples in each group when the total samples are divided into groups
epochs: the number of training rounds
shuffle: whether to randomly shuffle the data before training
validation_split: the fraction of the data held out for validation
verbose: screen display mode (0: no output, 1: output progress, 2: output each training result)
'''
(X_train, y_train), (X_test, y_test) = mnist.load_data()  # Use the mnist tool that comes with Keras to read the data (needs an Internet connection the first time)

# The mnist input data has shape (num, 28, 28), so each image is flattened into a 784-dimensional vector.
X_train = X_train.reshape(X_train.shape[0], X_train.shape[1] * X_train.shape[2])
X_test = X_test.reshape(X_test.shape[0], X_test.shape[1] * X_test.shape[2])
Y_train = (numpy.arange(10) == y_train[:, None]).astype(int)  # one-hot encode the labels
Y_test = (numpy.arange(10) == y_test[:, None]).astype(int)

model.fit(X_train, Y_train, batch_size=200, epochs=50, shuffle=True, verbose=0, validation_split=0.3)
model.evaluate(X_test, Y_test, batch_size=200, verbose=0)

'''
Step 5: Output
'''
print("test set")
scores = model.evaluate(X_test, Y_test, batch_size=200, verbose=0)
print("")
print("The test loss is %f" % scores)
result = model.predict(X_test, batch_size=200, verbose=0)

result_max = numpy.argmax(result, axis=1)  # predicted class for each test sample
test_max = numpy.argmax(Y_test, axis=1)    # true class for each test sample

result_bool = numpy.equal(result_max, test_max)
true_num = numpy.sum(result_bool)
print("")
print("The accuracy of the model is %f" % (true_num / len(result_bool)))

(4) Comparison of the frameworks' advantages and disadvantages

The comparison below is organized by dimension, with the assessment for Caffe, TensorFlow, and Keras listed under each one.

Difficulty of getting started

Caffe: 1. No code needs to be written; the network structure is defined in a .prototxt configuration file to complete model training. 2. Installation is complicated, designing the network structure inside the .prototxt file is relatively limited, and it lacks the convenience and freedom of designing a network in Python. 3. The configuration file cannot be programmed to adjust hyper-parameters, and it does not easily support operations such as cross-validation or hyper-parameter grid search.

TensorFlow: 1. Easy to install and rich in teaching resources; a basic model can be built quickly from the examples. 2. There is a certain threshold for use: the programming paradigm and the mathematics and statistics involved make it hard for users without a machine learning or data science background. 3. Because of its flexibility it is a relatively low-level framework, so using it requires writing a lot of code and reinventing the wheel.

Keras: 1. Simple to install; designed to let users run prototype experiments as fast as possible, so the path from idea to result is the shortest, which suits cutting-edge research well. 2. The API is easy to use; users only need to put high-level modules together to design a neural network, which lowers the cost of understanding when writing code and reading other people's code.

Framework maintenance

Caffe: A GitHub project maintained by the Berkeley Vision and Learning Center (BVLC).

TensorFlow: Regarded as the most popular and recognized open-source deep learning framework, with an excellent framework structure and product-level, high-quality code; developed and maintained by the Google team, which has the resources to support it.

Keras: Also developed and supported by the Google team; the API is packaged into TensorFlow as tf.keras. Microsoft maintains its CNTK backend, Amazon AWS is developing MXNet support, and other supporting companies include NVIDIA, Uber, and Apple (via CoreML).

Supported languages

Caffe: C++/CUDA.

TensorFlow: C++, Python (plus Go, Java, Lua, JavaScript, and R).

Keras: Python.

Encapsulated algorithms

Caffe: 1. Support for convolutional neural networks (CNNs) is very good, with a large number of well-trained classic models (AlexNet, VGG, Inception) and state-of-the-art models (ResNet, etc.) stored in its Model Zoo. 2. Support for time-series models such as RNN and LSTM is not particularly sufficient.

TensorFlow: 1. Supports CNNs and RNNs, as well as deep reinforcement learning and other computation-intensive scientific calculations (such as solving partial differential equations). 2. The computation graph must be constructed as a static graph, which makes many computations difficult to implement, especially the beam search often used in sequence prediction.

Keras: 1. Supports CNNs and recurrent networks, supports cascaded models and models with arbitrary graph structures, and can switch from CPU computation to GPU acceleration without any code changes. 2. There is no reinforcement learning toolbox, and modifying the implementation yourself is troublesome; the encapsulation is so high-level that training details cannot be modified and loss or penalty details are hard to change.

Model deployment

Caffe: 1. Stable operation and high code quality, suitable for production environments with strict stability requirements; it was the first mainstream industrial-grade deep learning framework. 2. The bottom layer is based on C++, so it can be compiled in various hardware environments with good portability; it supports Linux, Mac, and Windows, and can also be compiled and deployed to mobile systems such as Android and iOS.

TensorFlow: 1. Good performance; it can run multiple large-scale deep learning models at the same time, supports model life-cycle management and algorithm experiments, and uses GPU resources efficiently, so trained models can be moved into production more quickly and conveniently. 2. Flexible portability: the same code can be deployed without modification to PCs, servers, or mobile devices with any number of CPUs or GPUs.

Keras: 1. Easy to deploy; it uses TensorFlow, CNTK, or Theano as the backend, which simplifies programming and saves time when trying new network structures. 2. The more complex the model, the greater the benefit, especially for models that rely heavily on weight sharing, multi-model combination, and multi-task learning.

Performance

Caffe: Currently supports only single-machine multi-GPU training and does not support distributed training.

TensorFlow: Supports distributed computing, enabling GPU or TPU (Tensor Processing Unit) clusters to compute in parallel and jointly train a model; however, communication between different devices is not well optimized, so distributed performance has not yet reached its optimum.

Keras: Cannot use multiple GPUs directly, and its processing speed on large-scale data is slower than that of frameworks with multi-GPU and distributed support; with the TensorFlow backend it is much slower than pure TensorFlow.

Author: Old Boys_Misaya
Link: https://www.jianshu.com/p/a507c3287e75
Source: Jianshu
The copyright belongs to the author. For commercial reprints, please contact the author for authorization; for non-commercial reprints, please indicate the source.
