Neural Networks vs. Deep Neural Networks, and Graph Neural Networks vs. Neural Networks

What is the difference between a neural network and a deep neural network?


What is the difference between deep learning and a neural network?

The main difference between deep learning and earlier neural networks comes down to this: in the original multi-layer neural network pipeline, the steps are feature -> value, and the features are hand-picked. In deep learning, the steps are signal -> feature -> value, and the features are chosen by the network itself.

In addition, deep learning is a newer research direction within machine learning; it moves machine learning closer to its original goal, artificial intelligence.

Deep learning mainly learns the internal regularities and representation hierarchies of sample data, and the information obtained in this learning process is of great help in interpreting data such as text, images, and sound.

Its ultimate goal is to give machines the ability to analyze and learn in the way humans do, recognizing data such as text, images, and sound. Deep learning is a complex family of machine learning algorithms that has achieved results in speech and image recognition far exceeding earlier techniques.

Deep learning has achieved many results in search technology, data mining, machine learning, machine translation, natural language processing, multimedia learning, speech, recommendation and personalization technology, and other related fields.

Deep learning enables machines to imitate human activities such as seeing, hearing, and thinking, solving many complex pattern-recognition problems and driving great progress in AI-related technologies. Neural networks can be divided into two types: biological neural networks and artificial neural networks.

A biological neural network consists of the neurons of a biological brain, mainly cell bodies and their connections; its main function is to give a living being consciousness and to support thinking and action.

An Artificial Neural Network (ANN), also referred to as a neural network (NN) or connectionist model, is an algorithmic mathematical model that imitates the behavioral characteristics of animal neural networks and performs distributed, parallel information processing.

Such a network processes information by adjusting the interconnections among a large number of internal nodes, depending on the complexity of the system. In other words, an artificial neural network is a mathematical model that processes information using a structure similar to the synaptic connections of the brain.

In engineering and academia it is often referred to simply as a "neural network".

What is the difference between deep learning and a neural network?

The relationship between deep learning and neural networks (2017-01-10): I recently started learning deep learning, mostly from articles written by the blogger zouxy09, which are very well written and comprehensive; I have condensed and reorganized them according to my own understanding.

5. The basic idea of Deep Learning. Suppose we have a system S with n layers (S1, ..., Sn), whose input is I and whose output is O, represented as I => S1 => S2 => ... => Sn => O. If the output O equals the input I, then the input I has passed through the system without any loss of information.

(Strictly speaking, experts will tell you this is impossible. Information theory says that "information is lost layer by layer" (the data processing inequality): if b is obtained by processing a, and c is obtained by processing b, then the mutual information between a and c cannot exceed the mutual information between a and b. Processing cannot add information, and most processing loses some. Of course, it does no harm if what is lost is useless information.)

If the input is preserved in this sense, then I passes through each layer Si without losing information; in other words, each layer Si is simply another representation of the original information (the input I).

Now back to Deep Learning: we want to learn features automatically. Suppose we have a set of inputs I (for example a collection of images or text) and we design a system S with n layers. If we adjust the parameters of the system so that its output is still the input I, then we automatically obtain a series of hierarchical features of I, namely S1, ..., Sn.

The idea of deep learning is to stack many layers, with the output of one layer serving as the input of the next; in this way a hierarchical representation of the input information can be realized.

In addition, the assumption above was that the output is strictly equal to the input. This is too strict; we can relax it slightly and only require the difference between input and output to be as small as possible. This relaxation leads to another class of deep learning methods.
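
As a rough illustration of this relaxed criterion, here is a minimal autoencoder sketch in PyTorch; the layer sizes, dummy batch, and training settings are placeholder assumptions, not values from the text. The network is trained so that its output reconstructs its input, and the intermediate activation serves as a learned feature representation (one of the Si).

```python
import torch
import torch.nn as nn

# A minimal autoencoder: the encoder produces a compressed representation
# (the learned "feature"), and the decoder tries to reconstruct the input.
# Layer sizes and training settings are illustrative placeholders.
encoder = nn.Sequential(nn.Linear(784, 64), nn.ReLU())
decoder = nn.Sequential(nn.Linear(64, 784), nn.Sigmoid())
model = nn.Sequential(encoder, decoder)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

x = torch.rand(32, 784)          # a dummy batch standing in for real data
for step in range(100):
    x_hat = model(x)             # reconstruction of the input
    loss = loss_fn(x_hat, x)     # make output as close to input as possible
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

features = encoder(x).detach()   # the learned representation of the input
```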

The above is the basic idea of Deep Learning. 6. Shallow Learning and Deep Learning. Shallow learning was the first wave of machine learning.

In the late 1980s, the backpropagation (BP) algorithm for artificial neural networks brought new hope to machine learning and set off a wave of machine learning based on statistical models, a wave that continues to this day.

With the BP algorithm, an artificial neural network can learn statistical regularities from a large number of training samples and then make predictions about unseen cases. This statistics-based approach is superior in many respects to earlier systems built on hand-written rules.

Although the artificial neural networks of that era were called multi-layer perceptrons (Multi-layer Perceptron), they were in fact shallow models, typically containing only a single layer of hidden nodes.

In the 1990s, various shallow machine learning models were proposed one after another, such as Support Vector Machines (SVM), Boosting, and maximum-entropy methods (e.g. Logistic Regression, LR).

Structurally, these models can be regarded as having one layer of hidden nodes (SVM, Boosting) or no hidden nodes at all (LR). They have achieved great success both in theoretical analysis and in applications.
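
For concreteness, here is a hedged scikit-learn sketch of such shallow models, using made-up hand-crafted features: logistic regression has no hidden layer at all, while the kernel SVM can loosely be viewed as having a single layer of hidden nodes (its support-vector expansion).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# Hand-crafted feature vectors (stand-in data): each row is one sample.
X = np.random.rand(200, 10)
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)

# LR: no hidden layer; SVM: roughly one layer of hidden nodes.
lr = LogisticRegression().fit(X, y)
svm = SVC(kernel="rbf").fit(X, y)

print(lr.score(X, y), svm.score(X, y))
```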

In contrast, because their theoretical analysis is difficult and their training requires a lot of experience and skill, shallow artificial neural networks were relatively quiet during this period. Deep learning is the second wave of machine learning.

In 2006, Geoffrey Hinton, a professor at the University of Toronto and a leading figure in machine learning, together with his student Ruslan Salakhutdinov, published an article in Science that opened a wave of deep learning in both academia and industry.

The article makes two main points: 1) an artificial neural network with many hidden layers has excellent feature-learning ability, and the learned features give a more essential description of the data, which helps visualization and classification; 2) the difficulty of training deep neural networks can be effectively overcome by "layer-wise pre-training", which in that article is carried out through unsupervised learning.

Most current learning methods for classification and regression are shallow-structured algorithms; with limited samples and computing units, their ability to represent complex functions is limited, and their generalization on complex classification problems is restricted to some extent.

By learning a deep, nonlinear network structure, deep learning can approximate complex functions, characterize distributed representations of the input data, and show a powerful ability to learn the essential characteristics of a dataset from relatively few samples.

(The benefit of multiple layers is that complex functions can be expressed with fewer parameters.) The essence of deep learning is to learn more useful features by building a model with many hidden layers and training it on massive data, ultimately improving the accuracy of classification or prediction.

Therefore, "deep model" is the means, and "feature learning" is the purpose.

Deep learning differs from traditional shallow learning in two ways: 1) it emphasizes the depth of the model structure, usually with 5, 6, or even 10 layers of hidden nodes; 2) it explicitly highlights the importance of feature learning: through layer-by-layer feature transformations, the representation of a sample in the original space is mapped into a new feature space, making classification or prediction easier.

Compared with constructing features by hand-crafted rules, learning features from large amounts of data better captures the rich internal information of the data.

7. Deep Learning and Neural Networks. Deep learning is a newer field within machine learning research. Its motivation is to build and simulate neural networks that analyze and learn in the manner of the human brain, interpreting data such as images, sound, and text by imitating the brain's mechanisms.

Deep learning in this early, pre-training-based form relies heavily on unsupervised learning. The concept of deep learning originated from the study of artificial neural networks: a multi-layer perceptron with several hidden layers is a deep learning structure. Deep learning combines low-level features to form more abstract high-level representations (attribute categories or features) in order to discover distributed feature representations of the data.

Deep learning itself is a branch of machine learning and can be loosely understood as the further development of neural networks.

Twenty or thirty years ago, neural networks were a particularly hot direction in the ML field, but they gradually faded, for several reasons: 1) they overfit relatively easily, their parameters are hard to tune, and quite a few tricks are needed; 2) training is relatively slow, and with few layers (three or fewer) the results are no better than those of other methods. So for roughly twenty years neural networks received little attention; that period belonged to SVMs and boosting algorithms.

However, one devoted researcher, Hinton, persisted, and eventually (together with Bengio, Yann LeCun, and others) proposed a practical deep learning framework.

There are many similarities and differences between Deep learning and traditional neural networks.

The similarity between the two lies in the fact that deep learning adopts a layered structure like that of a neural network: the system is a multi-layer network with an input layer, (multiple) hidden layers, and an output layer; only nodes in adjacent layers are connected, with no connections across layers, and each layer can be regarded as a logistic regression model. This layered structure is fairly close to the structure of the human brain.

In order to overcome the problems in neural network training, DL adopts a very different training mechanism from neural networks.

A traditional neural network (here mainly meaning a feed-forward network) is trained with backpropagation. Simply put, an iterative algorithm trains the whole network: the parameters are initialized randomly, the current output of the network is computed, and then, according to the difference between that output and the label, the parameters of the earlier layers are adjusted until convergence (the whole procedure is gradient descent).
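
The procedure just described (random initial values, forward pass, compare output with the label, adjust earlier layers by gradient descent) can be sketched in a few lines of NumPy. The network size, data, learning rate, and the choice of a mean-squared-error loss below are arbitrary assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.random((64, 5))                            # dummy inputs
y = (X.sum(axis=1, keepdims=True) > 2.5) * 1.0     # dummy labels

# Random initial values for a network with one hidden layer.
W1, b1 = rng.normal(0, 0.5, (5, 8)), np.zeros(8)
W2, b2 = rng.normal(0, 0.5, (8, 1)), np.zeros(1)
lr = 0.5

for step in range(1000):
    # Forward pass: compute the current output of the network.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backward pass: propagate the output error to earlier layers
    # and update the parameters (plain gradient descent on an MSE loss).
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out / len(X); b2 -= lr * d_out.mean(axis=0)
    W1 -= lr * X.T @ d_h / len(X);   b1 -= lr * d_h.mean(axis=0)
```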

Deep learning, by contrast, uses an overall layer-wise training mechanism.

The reason is that if backpropagation alone is used on a deep network (say more than 7 layers), the error signal propagated back to the front layers becomes too small, and the so-called gradient diffusion (vanishing gradient) problem appears.

We discuss this issue next.

8. The Deep Learning training process. 8.1 Why can't traditional neural network training methods be used directly on deep neural networks?

The ubiquitous local minima in non-convex objective cost functions of deep architectures (involving multiple layers of nonlinear processing units) are the main source of training difficulties.

Problems with the BP algorithm: (1) the gradients become increasingly sparse and weak: the error-correction signal gets smaller and smaller as it propagates down from the top layer; (2) convergence to a local minimum, especially when the starting point is far from the optimal region (random initialization tends to cause this); (3) in general, training requires labeled data, but most data is unlabeled, whereas the brain can learn from unlabeled data.

8.2 The deep learning training process. If all layers are trained at the same time, the time complexity is too high; if only one layer is trained at a time, the errors are passed on and accumulate layer by layer.

Training layer by layer in isolation faces the opposite problem from the supervised case above: severe underfitting (because a deep network has so many neurons and parameters).

In 2006, Hinton proposed an effective way to build a multi-layer neural network from unlabeled data. Simply put, it has two steps: first, train one layer of the network at a time; second, tune the whole stack so that the high-level representation r generated upward from the original representation x, and the x' generated downward from that high-level representation r, are as consistent as possible.

Concretely: 1) build single layers of neurons one at a time, so that only a single-layer network is trained at each step; 2) once all layers are trained, use the wake-sleep algorithm for tuning.

Make the weights between all layers except the topmost one bidirectional, so that the topmost layer remains a single-layer neural network while the other layers become graphical models. The upward weights are used for "cognition" and the downward weights for "generation"; all weights are then adjusted with the wake-sleep algorithm.

The goal is for cognition and generation to reach agreement, that is, to ensure that the generated top-level representation can regenerate the low-level nodes as accurately as possible.

For example, if a node at the top layer represents a human face, then images of all faces should activate that node, and the image generated downward from it should look roughly like a face. The wake-sleep algorithm has two phases: wake and sleep.

1) Wake phase: the cognitive process. External inputs and the upward (cognitive) weights generate an abstract representation (node states) at each layer, and gradient descent is used to modify the downward (generative) weights between layers.

In other words: "if reality differs from what I imagined, change my generative weights so that what I imagine matches reality." 2) Sleep phase: the generative process. The top-level representation (the concepts learned while awake) and the downward weights generate the lower-level states, and the upward weights between layers are modified at the same time.

In other words: "if the scene in my dream does not correspond to the concept in my mind, change my cognitive weights so that this scene would evoke that concept."

The deep learning training process is therefore as follows.

1) Bottom-up unsupervised learning (start from the bottom layer and train layer by layer toward the top): use unlabeled data (labeled data can also be used) to train the parameters of each layer in turn. This step can be regarded as unsupervised training, and it is the biggest difference from traditional neural networks; it can be seen as a feature-learning process. Concretely, first train the first layer on the data, learning its parameters (this layer can be viewed as the hidden layer of a three-layer network that minimizes the difference between output and input). Because of limits on model capacity and sparsity constraints, the resulting model is forced to learn the structure of the data itself and therefore yields features with more representational power than the raw input. After the (n-1)-th layer has been trained, its output is used as the input of the n-th layer, which is trained in the same way; in this manner the parameters of every layer are obtained.

2) Top-down supervised learning (train with labeled data, propagating the error from the top downward to fine-tune the network): starting from the per-layer parameters obtained in the first step, fine-tune the parameters of the whole multi-layer model; this step is supervised training. The first step plays the role that random initialization plays in a traditional neural network, except that in DL the initial values are not random but are learned from the structure of the input data, so they lie closer to the global optimum and better results can be achieved. The good performance of deep learning is thus largely due to the feature learning in the first step.
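
The two-step recipe above can be sketched roughly as follows. This is only an approximation under stated assumptions: it uses simple autoencoders for the unsupervised layer-wise step rather than the RBM/wake-sleep machinery Hinton actually used, and all layer sizes, data, and training settings are placeholders.

```python
import torch
import torch.nn as nn

sizes = [784, 256, 64]                     # made-up layer widths
x = torch.rand(128, sizes[0])              # stand-in (unlabeled) data
y = torch.randint(0, 10, (128,))           # labels, used only in step 2

# Step 1: bottom-up unsupervised pre-training, one layer at a time.
layers, inp = [], x
for d_in, d_out in zip(sizes[:-1], sizes[1:]):
    enc = nn.Sequential(nn.Linear(d_in, d_out), nn.ReLU())
    dec = nn.Linear(d_out, d_in)
    opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
    for _ in range(200):                   # train this single layer to reconstruct its input
        recon = dec(enc(inp))
        loss = nn.functional.mse_loss(recon, inp)
        opt.zero_grad(); loss.backward(); opt.step()
    layers.append(enc)
    inp = enc(inp).detach()                # output of layer n-1 becomes input of layer n

# Step 2: top-down supervised fine-tuning of the whole stack.
model = nn.Sequential(*layers, nn.Linear(sizes[-1], 10))
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
for _ in range(200):
    loss = nn.functional.cross_entropy(model(x), y)
    opt.zero_grad(); loss.backward(); opt.step()
```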

What is the difference between Convolutional Neural Networks and Deep Neural Networks?

The main differences lie in how the layers are defined and how depth is handled. A deep neural network (in the pre-training style described above) imitates the human brain's way of processing: it builds single layers of neurons one at a time, so that only a single-layer network is trained at each step, and once all layers are trained it uses the wake-sleep algorithm for tuning.

A convolutional neural network, by contrast, works through "convolution kernels": the same kernel is shared across the whole image, and after the convolution operation the image still retains its original spatial relationships.

What is the difference between deep learning and neural networks?

These two concepts actually overlap. For example, a convolutional neural network (CNN) is a deep machine learning model typically trained with supervision, while a Deep Belief Net (DBN) is a deep model trained without supervision.

The concept of deep learning originated in the study of artificial neural networks: a multi-layer perceptron with several hidden layers is a deep learning structure. Deep learning combines low-level features to form more abstract high-level representations (attribute categories or features) in order to discover distributed feature representations of the data.

The concept of deep learning was proposed by Hinton et al. in 2006. Based on the deep belief network (DBN), they proposed an unsupervised greedy layer-by-layer training algorithm, which brought hope for solving the optimization problems of deep structures; multi-layer autoencoder architectures were proposed afterward.

In addition, the convolutional neural network proposed by LeCun et al. was the first true multi-layer structure-learning algorithm; it exploits spatial relationships to reduce the number of parameters and improve training performance.

What is the difference between Convolutional Neural Networks and Deep Neural Networks?

At the heart of a convolutional neural network is the convolution kernel. The real power of computer image processing is that, once an image is stored in a computer, all kinds of effective operations can be performed on it.

For example, reducing pixel values can correct overexposure, blurry images can be sharpened, and sharp images can be blurred to simulate the soft-focus effect of a camera filter. With image-processing software such as Photoshop, the magic that can be performed is almost endless.

Four basic image-processing effects are blur, sharpen, emboss, and watercolor. These effects are not hard to achieve; the magic ingredient is a small matrix called a convolution kernel. A 3x3 kernel contains nine coefficients.

To transform a pixel, multiply its value by the coefficient at the centre of the kernel, multiply the eight surrounding pixels by the other eight coefficients, and finally add up the nine products; the result becomes the new value of that pixel.

This process is repeated for every pixel in the image, filtering the whole image; different kernels produce different effects. With Photoshop CS6 you can apply such processing easily.

Blur: the blur kernel is made up of coefficients that are each less than 1 but sum to exactly 1. Each pixel absorbs some of the colour of its neighbours and spreads some of its own colour to them, so harsh edges are softened in the resulting image.

Sharpen: the centre coefficient of the sharpening kernel is greater than 1, and the absolute value of the sum of the eight surrounding (negative) coefficients is exactly 1 less than the centre coefficient. This exaggerates the difference between a pixel and its neighbours, so the final image looks crisper than the original.

Emboss: the coefficients of the emboss kernel sum to zero, so flat background regions come out as zero while non-background (edge) pixels come out non-zero. The resulting pattern looks like a relief carved into a metal surface, with outlines that seem to protrude.

Watercolor: first smooth the colours in the image. For each pixel, take its colour value and the values of its twenty-four surrounding pixels, sort them from small to large, and use the value in the middle of the list (the median) as the new value of that pixel.

Each pixel is then processed with a sharpening kernel to make the outlines more prominent, and the resulting image resembles a watercolor painting. By combining such techniques one can also produce less common optical effects such as halos. I hope this clears up your doubts.
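
The filters described above are easy to sketch in NumPy. The kernels and the random image below are illustrative assumptions: a generic 3x3 kernel filter, a blur kernel whose nine coefficients sum to 1, a sharpen kernel whose centre exceeds 1 with surrounding coefficients summing to -(centre - 1), and the 5x5 median step used for the watercolor smoothing.

```python
import numpy as np

def apply_kernel(img, kernel):
    """Filter a grayscale image with a small kernel (no padding at the border).
    For the symmetric kernels used here this matches convolution."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

# Blur: nine coefficients, each below 1, summing exactly to 1.
blur = np.full((3, 3), 1.0 / 9.0)

# Sharpen: centre coefficient greater than 1; the surrounding (negative)
# coefficients sum to -(centre - 1), exaggerating local differences.
sharpen = np.array([[ 0, -1,  0],
                    [-1,  5, -1],
                    [ 0, -1,  0]], dtype=float)

# Watercolor smoothing step: replace each pixel by the median of its
# 5x5 neighbourhood (the pixel and its 24 neighbours).
def median_smooth(img):
    h, w = img.shape
    out = img.copy()
    for i in range(2, h - 2):
        for j in range(2, w - 2):
            out[i, j] = np.median(img[i - 2:i + 3, j - 2:j + 3])
    return out

img = np.random.rand(64, 64)          # stand-in grayscale image
blurred, sharpened = apply_kernel(img, blur), apply_kernel(img, sharpen)
smoothed = median_smooth(img)
```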

Why are there graph convolutional neural networks?

In essence, much of the data in the world has a topological, that is, network, structure. Genuinely collecting and integrating such network data is the first step toward realizing AI.

Therefore, how to use deep learning to process such complex topological data, and how to create new intelligent algorithms for graph data and knowledge graphs, is an important direction for AI.

The success of deep learning in many fields is mainly attributed to the rapid development of computing resources (such as GPUs), the availability of large amounts of training data, and the effectiveness of deep learning at extracting latent representations from Euclidean data (such as images, text, and video).

However, while deep learning has achieved great success on Euclidean data, data generated in non-Euclidean domains is finding ever wider application and also needs to be analyzed effectively.

For example, in the field of e-commerce, a graph-based learning system can exploit the interaction between users and products to achieve highly accurate recommendations. In the field of chemistry, molecules are modeled as graphs, and the development of new drugs requires the determination of their biological activity.

In the paper citation network, papers are connected to each other through citation relations, and they need to be divided into different categories. Since 2012, deep learning has achieved great success in computer vision and natural language processing.

Suppose an image needs to be classified. The traditional method manually extracts features such as texture, colour, or some higher-level descriptors, then feeds those features into a classifier such as a random forest, which outputs a label saying which category the image belongs to.

With deep learning, the image itself is fed into the neural network, which directly outputs a label. Feature extraction and classification happen in a single step, avoiding manual feature engineering or hand-written rules; automatically extracting features from raw data in this way is end-to-end learning.
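
To make the contrast concrete, here is a hedged sketch of both routes on stand-in data: per-channel means and standard deviations serve as crude hand-crafted features for a random forest, while a tiny convolutional network consumes the raw pixels end to end. The feature choice, network shape, and data are assumptions for illustration only.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.ensemble import RandomForestClassifier

# Stand-in data: 100 tiny RGB "images" with dummy labels.
images = np.random.rand(100, 3, 32, 32).astype(np.float32)
labels = np.random.randint(0, 2, 100)

# Traditional route: hand-crafted features (per-channel mean and std)
# followed by a classic classifier.
feats = np.concatenate([images.mean(axis=(2, 3)), images.std(axis=(2, 3))], axis=1)
rf = RandomForestClassifier(n_estimators=50).fit(feats, labels)

# End-to-end route: raw pixels in, label out; the convolutional layers
# learn the features themselves.
cnn = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(8, 2),
)
opt = torch.optim.Adam(cnn.parameters(), lr=1e-3)
x, y = torch.from_numpy(images), torch.from_numpy(labels).long()
for _ in range(50):
    loss = nn.functional.cross_entropy(cnn(x), y)
    opt.zero_grad(); loss.backward(); opt.step()
```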

Compared with traditional methods, deep learning can learn more effective features and patterns. The complexity of graph data, however, poses significant challenges to existing machine learning algorithms, because graph data is irregular.

Graphs vary in size, their nodes have no natural ordering, and each node may have a different number of neighbours, so important operations that are easy to compute on images (such as convolution) can no longer be applied to graphs directly. Furthermore, a core assumption of many existing machine learning algorithms is that instances are independent of one another.

In graph data, however, each instance is related to the instances around it through complex connection information that captures dependencies such as citations, friendships, and interactions. Recently, more and more studies have begun to apply deep learning methods to graph data.

Driven by advances in deep learning, researchers have borrowed ideas from convolutional networks, recurrent networks, and deep autoencoders when designing graph neural network architectures. To cope with the complexity of graph data, generalized definitions of these important operations have developed rapidly over the past few years.
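
As a sketch of the kind of operation graph neural networks generalize, here is a single graph-convolution-style layer in NumPy, in the normalized-adjacency form ReLU(Â X W) used by many GCN formulations; the graph, node features, and weight matrix below are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny undirected graph: 5 nodes, adjacency matrix A, 4 features per node.
A = np.array([[0, 1, 0, 0, 1],
              [1, 0, 1, 0, 0],
              [0, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [1, 0, 0, 1, 0]], dtype=float)
X = rng.random((5, 4))          # node feature matrix
W = rng.normal(size=(4, 2))     # learnable weights (random placeholder)

# Symmetrically normalized adjacency with self-loops: A_hat = D^(-1/2) (A + I) D^(-1/2).
A_self = A + np.eye(5)
D_inv_sqrt = np.diag(1.0 / np.sqrt(A_self.sum(axis=1)))
A_norm = D_inv_sqrt @ A_self @ D_inv_sqrt

# One graph-convolution layer: each node's new representation mixes its own
# features with those of its (variable number of) neighbours.
H = np.maximum(A_norm @ X @ W, 0.0)     # ReLU(A_norm X W)
print(H.shape)                          # (5, 2): a 2-dimensional embedding per node
```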

What is the difference between machine learning and deep learning?

With traditional machine learning, once the data is obtained, one first has to define feature templates by hand and then combine them with classifiers such as logistic regression or SVM for training and prediction.

The prediction accuracy of such machine learning depends mainly on how the feature templates are defined, and good templates often require domain experts to spend a great deal of time observing and summarizing the data. The biggest difference between deep learning and traditional machine learning is that deep learning can automatically distil structural features from the data.

Deep learning offers many different network structures, such as convolutional neural networks, long short-term memory networks, and graph convolutional networks. These architectures can automatically extract high-level features from text, images, and speech, and the automatically learned features are often better than human-defined ones.

Traditional machine learning requires feature templates to be defined manually, and because human understanding of the world is limited, defining them inevitably loses information and hurts the model's prediction accuracy.

Deep learning can automatically learn better features from large amounts of data, so when enough training data is available, the performance ceiling of deep learning models is often much higher than that of traditional machine learning methods.

 
