Chapter 1 Introduction
1. General introduction
In the early days of artificial intelligence, problems that are very difficult for human intelligence but relatively simple for computers were solved quickly, namely those that can be described by a set of formal mathematical rules. The real challenge for AI lies in tasks that are easy for people to perform but hard to describe formally, such as recognizing spoken words or faces in images. We humans can often solve these problems easily and intuitively.
Letting the computer acquire knowledge from experience avoids the need for humans to formally specify all the knowledge the computer requires. A hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones. If we draw a graph showing how these concepts build on top of each other, the graph is "deep", with many layers. For this reason, we call this approach to AI deep learning.
Ironically, abstract and formal tasks are among the most difficult mental tasks for humans, yet among the easiest for computers. Computers have long been able to beat the best human chess players, but only recently have they reached average human performance at recognizing objects or speech. A person's daily life requires an enormous amount of knowledge about the world. Much of this knowledge is subjective and intuitive, and therefore difficult to articulate in a formal way. Computers need to acquire this same knowledge in order to behave intelligently, so a key challenge for AI is how to get this informal knowledge into a computer.
AI approaches to representing knowledge have generally gone through four stages: hard coding -> machine learning -> representation learning -> deep learning. The representative products of each stage are as follows:
1. Hardcoding: Knowledge Base
2. Machine learning: extract a suitable feature set and feed it to a suitable machine learning algorithm
3. Representation learning: Autoencoders
4. Deep learning: e.g., feedforward deep networks or multilayer perceptrons (MLPs)
These stages give rise to different AI disciplines, as shown in the following diagram:
An example helps illustrate the learning process of deep learning: from the first layer to the last, feature extraction proceeds from the concrete to the abstract, as shown in the following figure:
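As a concrete illustration of the representation-learning stage listed above, here is a minimal sketch of a linear autoencoder in NumPy. The hidden code it learns is a compressed feature representation of the input, in the spirit of the autoencoder example. All sizes, data, and hyperparameters below are invented for illustration and are not from the text.

```python
import numpy as np

# Minimal sketch of representation learning: a linear autoencoder.
# A linear autoencoder learns a PCA-like subspace of the input as
# its internal representation. Everything here is hypothetical.
rng = np.random.default_rng(0)

n_samples, n_in, n_code = 200, 8, 3
# Data with hidden low-dimensional structure: rank-3 inputs in 8 dims.
Z = rng.normal(size=(n_samples, n_code))
M = rng.normal(size=(n_code, n_in))
X = Z @ M

W_enc = 0.1 * rng.normal(size=(n_in, n_code))   # encoder weights
W_dec = 0.1 * rng.normal(size=(n_code, n_in))   # decoder weights
lr = 0.01

loss_before = np.mean((X @ W_enc @ W_dec - X) ** 2)
for step in range(3000):
    H = X @ W_enc                  # encode: the learned representation
    err = H @ W_dec - X            # reconstruction error
    # Gradients of the mean squared reconstruction error.
    g_dec = H.T @ err / n_samples
    g_enc = X.T @ (err @ W_dec.T) / n_samples
    W_dec -= lr * g_dec
    W_enc -= lr * g_enc

loss_after = np.mean((X @ W_enc @ W_dec - X) ** 2)
print(loss_before, loss_after)     # reconstruction error decreases
```

Because the data truly lie in a 3-dimensional subspace, a 3-unit code suffices; the network discovers that representation from the data rather than having it hand-specified.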
2. Who this book is for
(Omitted.)
3. Historical Trends of Deep Learning
Key trends of deep learning:
1. Deep learning is not a recent phenomenon; it has a long history under names other than "deep learning" that embody the same ideas;
2. As data volumes and model sizes have grown, deep learning has become increasingly feasible and increasingly effective.
1. The many names and fates of neural networks
Summarized in the following table:
| Stage | Name | Remarks |
|---|---|---|
| 1940s to 1960s | Cybernetics | Arose in the context of neuroscience; typical products: the adaptive linear unit and stochastic gradient descent |
| 1980s to 1990s | Connectionism / parallel distributed processing | Arose in the context of cognitive science; typical products: distributed representations and backpropagation |
| 2006 to present | Deep learning | The 2006 breakthrough: deep belief networks |
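The adaptive linear unit and stochastic gradient descent mentioned in the table can be sketched together in a few lines: an ADALINE-style linear unit trained one sample at a time. The synthetic data, learning rate, and epoch count below are hypothetical choices for illustration, not from the text.

```python
import numpy as np

# Sketch: training an adaptive linear unit with stochastic gradient
# descent. Data, learning rate, and epochs are hypothetical.
rng = np.random.default_rng(0)

# Synthetic regression task: y = 2*x1 - 3*x2 + 1, plus small noise.
X = rng.normal(size=(200, 2))
true_w, true_b = np.array([2.0, -3.0]), 1.0
y = X @ true_w + true_b + 0.01 * rng.normal(size=200)

w, b = np.zeros(2), 0.0
lr = 0.05
for epoch in range(50):
    for i in rng.permutation(len(X)):   # one sample at a time: "stochastic"
        err = X[i] @ w + b - y[i]       # gradient of 0.5 * err**2
        w -= lr * err * X[i]
        b -= lr * err

print(w, b)  # recovers weights close to [2, -3] and bias close to 1
```

The same update rule, scaled to minibatches and deep architectures, still drives most deep learning training today.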
2. Growing data volumes
As the amount of training data increases, the skill required to apply deep learning decreases. A major current trend is that the size of benchmark datasets has grown dramatically over time.
As of 2016, a rough rule of thumb is that a supervised deep learning algorithm will generally achieve acceptable performance given around 5,000 labeled examples per category, and will match or exceed human performance when trained on a dataset of at least 10 million labeled examples. Beyond that, succeeding on smaller datasets is an important research area, with a particular focus on exploiting large quantities of unlabeled examples via unsupervised or semi-supervised learning.
3. Increasing model size
Initially, the number of connections between neurons in artificial neural networks was limited by hardware capabilities. Now, the number of connections is mostly a design choice. Some artificial neural networks have nearly as many connections per neuron as a cat, and it is quite common for others to have as many connections per neuron as smaller mammals such as mice. Even the human brain does not have an exorbitant number of connections per neuron.
In terms of total number of neurons, neural networks were astonishingly small until recently. Since the introduction of hidden units, artificial neural networks have doubled in size roughly every 2.4 years. This growth is driven by larger memory, faster computers, and larger available datasets. Larger networks achieve higher accuracy on more complex tasks. This trend looks set to continue for decades. Unless new technologies enable much faster scaling, artificial neural networks will not have the same number of neurons as the human brain until at least the 2050s.
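The "at least the 2050s" claim can be checked with back-of-envelope arithmetic from the doubling rate given above. The 2016 starting size of about 10^6 neurons is an assumed round figure for illustration, and 10^11 is a common approximation of the human brain's neuron count; neither number comes from the text.

```python
import math

# Back-of-envelope projection of the "doubling every 2.4 years" trend.
# start_neurons (~1e6 in 2016) is an assumed round figure; ~1e11
# approximates the neuron count of a human brain.
start_year, start_neurons = 2016, 1e6
target_neurons = 1e11
doubling_period = 2.4  # years per doubling

doublings = math.log2(target_neurons / start_neurons)  # ~16.6 doublings
year = start_year + doublings * doubling_period
print(round(year))  # lands in the mid-2050s
```

Under these assumptions the trend line crosses human-brain scale around the mid-2050s, consistent with the claim in the text.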
It is thus no surprise that a neural network with fewer neurons than a leech cannot solve sophisticated AI problems. Even today's networks, which are quite large from a computing-systems perspective, are smaller than the nervous systems of relatively primitive vertebrates such as frogs.
4. Increasing accuracy, complexity and real-world impact
- Models are becoming increasingly complex and reaching remarkable accuracy;
- Deep learning is applied ever more widely, with significant impact on image processing, speech recognition, pedestrian detection, machine translation, and so on. This trend toward increasing complexity has been pushed to its logical conclusion with the introduction of the neural Turing machine (Graves et al., 2014);
- It is also expanding into the field of reinforcement learning;
- ...