Basic concepts related to deep learning (more to be added)

1. Related to Deep Learning

From Deep Learning - Baidu Encyclopedia

1.1 Deep Learning

Deep learning (DL) is a relatively new research direction in the field of machine learning (ML). It was introduced into machine learning to bring it closer to its original goal: artificial intelligence (AI).
Deep learning learns the inherent regularities and representation levels of sample data. The information obtained during this learning process is of great help in interpreting data such as text, images, and sound. Its ultimate goal is to give machines the ability to analyze and learn like humans and to recognize data such as text, images, and sound.
Deep learning is a complex family of machine learning algorithms that has achieved results in speech and image recognition far exceeding earlier related techniques.

Deep learning has produced many results in search, data mining, machine learning, machine translation, natural language processing, multimedia learning, speech, recommendation and personalization, and other related fields. It enables machines to imitate human activities such as seeing, hearing, and thinking, and it solves many complex pattern recognition problems, driving great progress in artificial intelligence and related technologies.

Introduction:
Deep learning is a general term for a class of pattern analysis methods. In terms of specific research content, it mainly involves three types of methods:
(1) Neural network systems based on convolution operations, i.e. convolutional neural networks (CNNs).
(2) Autoencoder neural networks based on multiple layers of neurons, including autoencoders and sparse coding, which have received widespread attention in recent years.
(3) Deep belief networks (DBNs), which are pre-trained in the form of multi-layer autoencoder neural networks and then use discriminative information to further optimize the network weights.

Through multi-layer processing, the initial "low-level" feature representation is gradually converted into a "high-level" feature representation, after which complex learning tasks such as classification can be completed with a "simple model". Deep learning can therefore be understood as "feature learning" or "representation learning". In the past, when machine learning was applied to real-world tasks, the features describing the samples usually had to be designed by human experts, which is known as "feature engineering". As is well known, the quality of the features has a crucial impact on generalization performance, and it is not easy for human experts to design good features. Feature learning (representation learning) generates good features through machine learning itself, which moves machine learning one step closer to "automatic data analysis".

In recent years, researchers have gradually combined these types of methods, for example performing unsupervised pre-training of a supervised convolutional neural network with an autoencoder neural network, and then using discriminative information to fine-tune the network parameters, forming a convolutional deep belief network. Compared with traditional learning methods, deep learning methods have many more model parameters, so model training is more difficult. According to the general law of statistical learning, the more parameters a model has, the more data is needed for training.

In the 1980s and 1990s, because of the limited computing power of computers and the limitations of related technologies, the amount of data available for analysis was too small, and deep learning did not show outstanding recognition performance in pattern analysis. Since Hinton et al. proposed the CD-k algorithm for quickly computing the weights and biases of restricted Boltzmann machine (RBM) networks in 2006, RBMs have become a powerful tool for increasing the depth of neural networks, leading to the emergence of deep networks such as the widely used DBN (developed by Hinton et al. and applied by companies such as Microsoft to speech recognition). At the same time, sparse coding and related methods have also been applied to deep learning because they can automatically extract features from data. Convolutional neural network methods based on local data regions have likewise been studied extensively in recent years.

Definition:

Deep learning is a kind of machine learning, and machine learning is the necessary path to realizing artificial intelligence. The concept of deep learning originates from research on artificial neural networks: a multi-layer perceptron with multiple hidden layers is a deep learning structure. Deep learning combines low-level features to form more abstract high-level representations (attribute categories or features) in order to discover distributed feature representations of data. The motivation for studying deep learning is to build neural networks that simulate the human brain for analysis and learning, imitating the mechanisms by which the human brain interprets data such as images, sounds, and text.

The computation involved in producing an output from an input can be represented by a flow graph: a graph in which each node represents a basic computation and the value that results from it, and the result of each computation is passed on as the values of that node's child nodes. The set of computations allowed at each node together with the possible graph structures defines a family of functions. Input nodes have no parents, and output nodes have no children.

A special property of such flow graphs is depth: the length of the longest path from an input to an output. A traditional feed-forward neural network can be viewed as having depth equal to the number of layers (e.g., the number of hidden layers plus one for the output layer). SVMs have depth 2 (one level corresponding to the kernel output or feature space, and the other corresponding to a linear combination of the resulting outputs).
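To make this notion of depth concrete, here is a minimal Python sketch (my own illustration, not from the source): a tiny flow graph is stored as a dictionary of made-up node names, and depth is computed as the longest input-to-output path.

import sys

# Minimal sketch: a flow graph as an adjacency list from each node to its children.
# Node names ("x", "h1", ...) are hypothetical, chosen only for illustration.
graph = {
    "x": ["h1", "h2"],   # input node feeds two intermediate computations
    "h1": ["y"],
    "h2": ["y"],
    "y": [],             # output node has no children
}

def depth(node):
    """Length of the longest path from `node` down to an output (leaf) node."""
    children = graph[node]
    if not children:
        return 0
    return 1 + max(depth(child) for child in children)

# Depth of the whole graph = longest path starting from any parentless (input) node.
all_children = {c for cs in graph.values() for c in cs}
inputs = [n for n in graph if n not in all_children]
print(max(depth(n) for n in inputs))  # -> 2 for this tiny example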

One direction of artificial intelligence research is represented by the so-called "expert system", which is defined by a large number of "if-then" rules and embodies top-down thinking. The artificial neural network (Artificial Neural Network) marks another, bottom-up, way of thinking. A neural network has no strict formal definition; its basic characteristic is that it tries to imitate the way the brain transmits and processes information between neurons.

1.2 Features

Deep learning differs from traditional shallow learning in that:
(1) It emphasizes the depth of the model structure, usually with 5, 6, or even more than 10 hidden layers;
(2) It makes the importance of feature learning explicit. That is, through layer-by-layer feature transformation, the representation of a sample in the original feature space is transformed into a new feature space, which makes classification or prediction easier. Compared with constructing features by hand-crafted rules, learning features from big data can better capture the rich internal information of the data.
By designing an appropriate number of neuron computing nodes and a multi-layer computing hierarchy, choosing suitable input and output layers, and learning and tuning the network, a functional relationship from input to output is established. Although the true input-output relationship cannot be recovered 100%, it can be approximated as closely as possible. With a successfully trained network model, we can automate the handling of complex tasks. A small sketch of such a layer-by-layer network follows below.
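The following is a minimal sketch of such a multi-layer network written with PyTorch; the layer sizes, data, and learning rate are arbitrary assumptions made only to illustrate layer-by-layer feature transformation, not code from the source.

import torch
import torch.nn as nn

# Minimal multi-layer perceptron sketch: each Linear + ReLU pair transforms
# the features of the previous layer into a new feature space.
model = nn.Sequential(
    nn.Linear(20, 64), nn.ReLU(),   # raw features -> hidden features
    nn.Linear(64, 64), nn.ReLU(),   # hidden features -> higher-level features
    nn.Linear(64, 3),               # final features -> 3-class scores
)

x = torch.randn(8, 20)              # a batch of 8 samples with 20 raw features (fake data)
y = torch.randint(0, 3, (8,))       # fake labels

loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for step in range(100):             # tiny training loop tuning the input-to-output mapping
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()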

1.3 Typical models of deep learning

Typical deep learning models include the convolutional neural network (CNN), the DBN, and the stacked auto-encoder network. These models are described below.

Convolutional Neural Network Models
Before the advent of unsupervised pre-training, training deep neural networks was usually very difficult; one special case was the convolutional neural network. Convolutional neural networks are inspired by the structure of the visual system. The first convolutional neural network computing model was proposed in Fukushima's neocognitron: based on local connections between neurons and hierarchically organized image transformations, neurons with the same parameters are applied to different positions of the previous layer, yielding a translation-invariant network structure. Later, building on this idea, LeCun et al. designed and trained convolutional neural networks using the error gradient and obtained superior performance on several pattern recognition tasks. To date, pattern recognition systems based on convolutional neural networks are among the best implemented systems, showing extraordinary performance in particular on handwritten character recognition tasks.
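A minimal sketch of such a convolutional network is shown below, written with PyTorch; the architecture and sizes (28x28 grayscale inputs, as in handwritten digit recognition) are illustrative assumptions, not taken from the text.

import torch
import torch.nn as nn

# Minimal CNN sketch: convolutions share the same parameters across positions,
# which gives the translation-invariant structure described above.
class SmallCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 28x28 -> 14x14
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 14x14 -> 7x7
        )
        self.classifier = nn.Linear(32 * 7 * 7, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

model = SmallCNN()
out = model(torch.randn(4, 1, 28, 28))  # a batch of 4 fake images
print(out.shape)                         # torch.Size([4, 10])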
Deep belief network model
A DBN can be interpreted as a Bayesian probabilistic generative model composed of multiple layers of stochastic hidden variables. The top two layers have undirected symmetric connections, the lower layers receive top-down directed connections from the layer above, and the state of the bottom-most units is the visible input data vector. A DBN is composed of a stack of structural units, which are usually RBMs (Restricted Boltzmann Machines). The number of neurons in the visible layer of each RBM in the stack equals the number of neurons in the hidden layer of the previous RBM. Following the deep learning mechanism, the input samples are used to train the first RBM, whose output is then used to train the second RBM, and so on; stacking RBMs improves model performance by adding layers. During unsupervised pre-training, after the DBN has encoded the input up to the top-level RBM, the state of the top layer is decoded back down to the bottom units, reconstructing the input. As the structural unit of the DBN, each RBM shares its parameters with the corresponding layer of the DBN.
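The snippet below is a minimal NumPy sketch of one contrastive-divergence (CD-1) update for a single RBM of such a stack; the layer sizes, learning rate, and training vector are made-up assumptions for illustration, not the source's algorithm listing.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: 6 visible units, 4 hidden units.
n_visible, n_hidden = 6, 4
W = 0.01 * rng.standard_normal((n_visible, n_hidden))
b_v = np.zeros(n_visible)   # visible biases
b_h = np.zeros(n_hidden)    # hidden biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, lr=0.1):
    """One CD-1 step: positive phase, one Gibbs step, negative phase."""
    global W, b_v, b_h
    # Positive phase: hidden probabilities and a sample given the data.
    p_h0 = sigmoid(v0 @ W + b_h)
    h0 = (rng.random(n_hidden) < p_h0).astype(float)
    # One Gibbs step: reconstruct the visible units, then the hidden units.
    p_v1 = sigmoid(h0 @ W.T + b_v)
    p_h1 = sigmoid(p_v1 @ W + b_h)
    # Update the parameters from the difference between the two phases.
    W += lr * (np.outer(v0, p_h0) - np.outer(p_v1, p_h1))
    b_v += lr * (v0 - p_v1)
    b_h += lr * (p_h0 - p_h1)

v = rng.integers(0, 2, n_visible).astype(float)  # a fake binary training vector
for _ in range(100):
    cd1_update(v)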
Stacked autoencoder network model
The structure of a stacked autoencoder network is similar to that of a DBN: it consists of a stack of structural units, but the structural unit is an autoencoder (auto-encoder) rather than an RBM. An autoencoder is a two-layer neural network: the first layer is called the encoding layer and the second the decoding layer.
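A minimal sketch of one such encoder-decoder unit, again in PyTorch with arbitrary sizes (my example, not the source's):

import torch
import torch.nn as nn

# One autoencoder unit: an encoding layer followed by a decoding layer.
# Sizes (784 -> 128 -> 784) are arbitrary, chosen only for illustration.
encoder = nn.Sequential(nn.Linear(784, 128), nn.ReLU())
decoder = nn.Linear(128, 784)

x = torch.rand(16, 784)                       # a fake batch of inputs
optimizer = torch.optim.Adam(
    list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3
)
loss_fn = nn.MSELoss()

for step in range(200):                       # train the unit to reconstruct its input
    optimizer.zero_grad()
    loss = loss_fn(decoder(encoder(x)), x)
    loss.backward()
    optimizer.step()

# Stacking: the encoder's output would become the input of the next
# autoencoder in the stack, analogous to stacking RBMs in a DBN.
codes = encoder(x).detach()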

1.4 Deep learning training process

In 2006, Hinton proposed an effective way to build multi-layer neural networks on unsupervised data. It has two steps: first, build the network one layer at a time, so that each pass trains a single-layer network; then, once all layers are trained, use the wake-sleep algorithm for tuning. The weights between all layers except the topmost are made bidirectional, so that the topmost layer remains a single-layer neural network while the other layers become graphical models. Upward weights are used for "cognition" and downward weights for "generation". The wake-sleep algorithm then adjusts all the weights so that cognition and generation reach agreement, i.e. so that the representation generated at the top can reconstruct the low-level nodes as accurately as possible. For example, if a node at the top layer represents a human face, then images of all faces should activate this node, and the image generated downward from it should be able to depict a rough face. The wake-sleep algorithm has two phases: wake and sleep.

Wake phase: the cognitive process. It generates an abstract representation of each layer from the external features and the upward weights, and uses gradient descent to modify the downward weights between layers.
Sleep phase: the generative process. It generates the states of the lower layers from the top-level representation and the downward weights, while modifying the upward weights between layers.

Bottom-up unsupervised learning
starts from the bottom layer and trains layer by layer toward the top. Unlabeled data (labeled data may also be used) is used to train the parameters of each layer in turn. This step can be viewed as an unsupervised training process, and it is the part that most distinguishes deep learning from traditional neural networks; it can be regarded as a feature learning process. Concretely, the first layer is trained with unlabeled data, learning that layer's parameters; this layer can be viewed as the hidden layer of a three-layer neural network that minimizes the difference between output and input. Because of the limits on model capacity and the sparsity constraints, the resulting model learns the structure of the data itself and thereby obtains features that are more expressive than the raw input. After the (n-1)-th layer has been learned, its output is used as the input of the n-th layer, which is then trained; in this way, the parameters of each layer are obtained in turn. [6]
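A minimal sketch of this greedy layer-wise idea, using stacked autoencoders in PyTorch (my own illustrative code with made-up sizes and data; the source does not prescribe a specific implementation):

import torch
import torch.nn as nn

# Greedy layer-wise pre-training sketch with autoencoders.
layer_sizes = [784, 256, 64]                  # illustrative only
x = torch.rand(32, 784)                       # fake unlabeled data
encoders = []
inputs = x

for d_in, d_out in zip(layer_sizes[:-1], layer_sizes[1:]):
    enc = nn.Sequential(nn.Linear(d_in, d_out), nn.ReLU())
    dec = nn.Linear(d_out, d_in)
    opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
    for step in range(200):                   # train this layer to reconstruct its own input
        opt.zero_grad()
        loss = nn.functional.mse_loss(dec(enc(inputs)), inputs)
        loss.backward()
        opt.step()
    encoders.append(enc)
    inputs = enc(inputs).detach()             # output of layer n-1 becomes input of layer n

pretrained_stack = nn.Sequential(*encoders)   # to be fine-tuned with labels in the next step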

Top-down supervised learning
trains with labeled data, propagating the error from the top down to fine-tune the network. Starting from the per-layer parameters obtained in the first step, the parameters of the whole multi-layer model are further optimized; this is a supervised training process. The first step plays a role analogous to the random initialization of an ordinary neural network, but because the initial values are obtained by learning the structure of the input data rather than chosen at random, they are closer to the global optimum and lead to better results. Much of the effectiveness of deep learning is therefore attributed to the feature learning of the first step.
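Continuing the sketch above, supervised fine-tuning might look like the following; the "pretrained" stack is built fresh here (with made-up sizes) only so the snippet runs on its own, and stands in for layers initialized by the unsupervised step.

import torch
import torch.nn as nn

# Supervised fine-tuning sketch: a classifier is added on top of layers that
# would have been initialized by unsupervised pre-training.
pretrained_stack = nn.Sequential(             # placeholder for the pre-trained layers
    nn.Linear(784, 256), nn.ReLU(),
    nn.Linear(256, 64), nn.ReLU(),
)
model = nn.Sequential(pretrained_stack, nn.Linear(64, 10))  # 10 classes, illustrative

x = torch.rand(32, 784)                       # fake labeled data
labels = torch.randint(0, 10, (32,))

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

for step in range(100):                       # the error propagates top-down through all layers
    optimizer.zero_grad()
    loss = loss_fn(model(x), labels)
    loss.backward()
    optimizer.step()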

1.5 Deep Learning Applications

Computer Vision
The Multimedia Laboratory of the Chinese University of Hong Kong was the first Chinese team to apply deep learning to computer vision research. On the LFW (Labeled Faces in the Wild) face recognition benchmark, the laboratory beat Facebook and took first place, making machine recognition in this field surpass human performance for the first time.
Speech recognition
Microsoft researchers, in cooperation with Hinton, were the first to introduce RBMs and DBNs into acoustic model training for speech recognition and achieved great success on large-vocabulary speech recognition systems, reducing the speech recognition error rate by 30% relative. However, DNNs still lack an effective parallel fast training algorithm, and many research institutions are using large-scale corpora and GPU platforms to improve the training efficiency of DNN acoustic models.
Internationally, companies such as IBM and Google quickly began research on DNN-based speech recognition, and progress has been rapid.
Domestically, companies or research units such as Alibaba, HKUST Xunfei, Baidu, and the Institute of Automation of the Chinese Academy of Sciences are also conducting research on deep learning in speech recognition.
Natural language processing and other fields
Many institutions are conducting research here as well. In 2013, Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean published the paper "Efficient Estimation of Word Representations in Vector Space", establishing the word2vec model. Compared with the traditional bag-of-words model, word2vec better captures grammatical (syntactic) information. In natural language processing, deep learning is mainly applied to machine translation, semantic mining, and related areas.
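As a small illustration of training word vectors, here is a sketch using the gensim library on a toy corpus; the sentences and parameters are made up for illustration and assume gensim is installed, and this is not the original paper's code.

from gensim.models import Word2Vec

# Toy corpus: each "sentence" is a list of tokens (made up for illustration).
sentences = [
    ["deep", "learning", "learns", "representations"],
    ["word", "vectors", "capture", "syntactic", "information"],
    ["deep", "learning", "uses", "word", "vectors"],
]

# Train a small skip-gram word2vec model (sg=1 selects skip-gram).
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1, epochs=50)

print(model.wv["deep"].shape)                 # a 50-dimensional word vector
print(model.wv.most_similar("deep", topn=2))  # nearest neighbors in the vector space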
In 2020, deep learning began to accelerate innovation in semiconductor packaging and testing. In reducing repetitive labor, improving yield, controlling accuracy and efficiency, and lowering inspection costs, deep-learning-driven AOI (automated optical inspection) has broad market prospects, although it is not easy to get right.
On April 13, 2020, in a study on medicine and artificial intelligence (AI) published in the British journal Nature Machine Intelligence, Swiss scientists introduced an artificial intelligence system that can scan cardiovascular blood flow within seconds. This deep learning model is expected to let clinicians observe changes in blood flow in real time while a patient undergoes an MRI scan, optimizing the diagnostic workflow.

2. Downloads

1. Download Anaconda: https://www.anaconda.com/download/
2. Download PyCharm: https://www.jetbrains.com/pycharm/
3. Download CUDA: https://developer.nvidia.com/cuda-toolkit-archive
4. Download cuDNN (note: access can be slow, and you need to register an NVIDIA developer account before downloading)
CUDA Toolkit Documentation v12.0
CUDA Toolkit Documentation 12.1 Update 1
cuDNN Developer Guide: https://docs.nvidia.com/deeplearning/cudnn/developer-guide/index.html
NVIDIA Developer Program: Supporting the Community That's Changing the World
cuDNN Installation Guide: https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html

3. Introduction to the tool software packages

3.1 Anaconda

Anaconda (the name means "boa constrictor" in its Chinese rendering) is an open-source Python distribution that includes conda, Python, and more than 180 scientific packages and their dependencies.
Anaconda ships with Conda, Python, and a large number of pre-installed toolkits such as numpy and pandas.
Miniconda includes only Conda and Python.
conda is an open-source package and environment manager. It can install different versions of software packages and their dependencies on the same machine and switch between different environments.

# Basic usage:
conda clean      # remove unused packages and caches
conda config     # modify conda configuration (.condarc)
conda create     # create a new environment
conda help       # display help for conda commands
conda info       # display information about the current conda installation
conda install    # install packages into an environment

conda list       # list all installed packages and their dependencies

conda package    # low-level conda package utility
conda remove     # remove packages from an environment
conda search     # search for packages
conda uninstall  # alias for conda remove
conda update     # update packages to the latest compatible versions
conda upgrade    # alias for conda update

3.2 CUDA

CUDA (Compute Unified Device Architecture) is the parallel computing framework that NVIDIA provides for its own GPUs. CUDA runs only on NVIDIA GPUs, and it pays off only when the problem to be solved involves a large amount of parallel computation.

CUDA is a platform and API (Application Programming Interface) for parallel computing that lets developers program CUDA-enabled GPUs in parallel. The GPU cannot perform computation on its own; it must be connected to the CPU through the PCI Express bus and work together with it. Parallel computing on the GPU can therefore be seen as a heterogeneous CPU-GPU computing architecture: the CPU handles the serial parts with complex logic, while the GPU handles the data-intensive parallel parts.
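A quick way to check whether such a CPU-GPU setup is visible to a deep learning framework is, for example, the following PyTorch sketch (assuming PyTorch was installed with CUDA support; this is an illustration, not part of the source):

import torch

# Check whether a CUDA-capable NVIDIA GPU is visible to PyTorch.
print(torch.cuda.is_available())          # True if a usable GPU and CUDA runtime are found
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # name of the first GPU
    print(torch.version.cuda)             # CUDA version PyTorch was built against

# Typical heterogeneous pattern: keep control logic on the CPU and move the
# heavy tensor math to the GPU when one is available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
a = torch.randn(1024, 1024, device=device)
b = torch.randn(1024, 1024, device=device)
c = a @ b                                 # the matrix multiply runs on the GPU if present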

3.3 cuDNN

The NVIDIA CUDA® Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks. It implements standard routines such as forward and backward convolution, pooling, normalization, and activation layers.

In other words, cuDNN is NVIDIA's acceleration library for deep neural networks on the GPU. It is not strictly required for training models on a GPU, but this acceleration library is almost always used.

Deep learning researchers and framework developers around the world rely on cuDNN for high-performance GPU acceleration . With cuDNN, researchers and developers can focus on training neural networks and developing software applications instead of low-level GPU performance tuning. cuDNN accelerates widely used deep learning frameworks, including Caffe2, Chainer, Keras, MATLAB, MxNet, PaddlePaddle, PyTorch, and TensorFlow. To get NVIDIA-optimized deep learning framework containers with cuDNN integrated in the framework, visit NVIDIA GPU CLOUD to learn more and get started.
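To confirm that a framework is actually using cuDNN, one possibility is the following PyTorch sketch (again an illustrative assumption, not something prescribed by the source):

import torch

# Inspect the cuDNN backend that PyTorch sees.
print(torch.backends.cudnn.is_available())  # True if cuDNN can be used
print(torch.backends.cudnn.version())       # e.g. an integer such as 8902 for cuDNN 8.9.2

# Common switches: cuDNN is enabled by default; benchmark mode lets cuDNN
# choose the fastest convolution algorithms for fixed input shapes.
torch.backends.cudnn.enabled = True
torch.backends.cudnn.benchmark = True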

3.4 CPU&GPU

The central processing unit (CPU) is the computing and control core of a computer system and the final execution unit for information processing and program execution. Since its invention, the CPU has made great progress in logical structure, operating efficiency, and functionality.
The main functions of the CPU include instruction control, operation control, timing control, data processing, and interrupt handling.

The CPU consists of the ALU (arithmetic logic unit), the CU (control unit), registers (the PC program counter, the IR instruction register, the PSW program status word, the DR data register, general-purpose registers, etc.), and an interrupt system.

A graphics processing unit (GPU), also known as a display core, visual processor, or display chip, is a microprocessor designed to perform image- and graphics-related computation on personal computers, workstations, game consoles, and some mobile devices (such as tablets and smartphones).

The GPU reduces the graphics card's dependence on the CPU and takes over part of the work that used to be done by the CPU, especially in 3D graphics processing. Core technologies used by GPUs include hardware T&L (geometric transform and lighting), cubic environment mapping and vertex blending, texture compression and bump mapping, and a dual-texture four-pixel 256-bit rendering engine; hardware T&L can be said to be the hallmark of the GPU. The main GPU manufacturers are NVIDIA and ATI.

4. Welcome to the NVIDIA cuDNN developer community

Below are resources to get you started.

4.1 Installation

Support Matrix: Platform and software version compatibility
Installation Guide: Step-by-step instructions for installation and upgrade

4.2 Documentation

Developer Guide: Overview of programming model, features and formats supported
cuDNN API: Reference for cuDNN datatypes and APIs

4.3 Community

Developer Forum: Browse introductory “how-to” questions or discuss advanced tips-and-tricks for your application with the community
Bug Reporting: Help improve cuDNN by reporting bugs and filing enhancement requests for the cuDNN team
Have feedback? Send us an email [email protected].

Origin blog.csdn.net/qyfx123456/article/details/130368957