[Artificial Intelligence] The History of Artificial Intelligence and Artificial Neural Networks

Artificial Intelligence and Machine Learning

Artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and to build machines that can respond in ways similar to human intelligence. Research in this field includes robotics, speech recognition, image recognition, natural language processing, expert systems, and more. Since the birth of artificial intelligence, its theory and technology have grown increasingly mature, and its fields of application have continued to expand. One can imagine that the technological products artificial intelligence brings in the future will become "containers" of human wisdom. Artificial intelligence can simulate the information processes of human consciousness and thinking. It is not human intelligence, but it can think in human-like ways, and it may one day surpass human intelligence.

The development history of artificial intelligence:

(figure: timeline of the development of artificial intelligence)

Logical reasoning: In the ten years after the 1956 Dartmouth conference, artificial intelligence saw its first peak. Most early researchers distilled rules from human experience, grounded in logic or facts, and then programmed those rules so that a computer could complete a task.

Knowledge engineering: In the 1970s, researchers realized how important knowledge is to artificial intelligence systems; complex tasks in particular require experts to build a knowledge base. An expert system can be understood simply as "knowledge base + inference engine": a computer program endowed with specialized knowledge and experience.

Machine learning: For many human intelligent behaviors, such as language understanding and image understanding, it is hard for us to articulate the underlying principles or describe the "knowledge" behind them, which makes it difficult to realize these behaviors through knowledge and reasoning alone. To solve such problems, researchers turned to letting computers learn from data on their own.

The progression of artificial intelligence technology by level:

(figure: the progressive levels of artificial intelligence technology)

The origins of machine learning: the perceptron mathematical model of the 1950s.

  • Development: Since the mid-1990s, machine learning has developed rapidly, gradually replacing traditional expert systems as the mainstream core technology of artificial intelligence and ushering AI into the era of machine learning.

The difference between machine learning and deep learning: Machine learning is a method to achieve artificial intelligence, and deep learning is a technology to achieve machine learning.

Terms in the field of artificial intelligence

AI

Artificial intelligence development history

(figure: the development history of artificial intelligence)

Definition of Machine Learning

Definition of Machine Learning 1

The father of machine learning, Arthur Samuel, coined the term "machine learning" and defined it as the field of study that gives a machine the ability to acquire a skill without being deterministically programmed.

"Machine learning is the field of study that gives computers the ability to learn without being explicitly programmed."

——Arthur Samuel

So, what is explicit programming? How does it differ from non-explicit programming?

What is explicit programming?

The computer executes a program through predetermined logic: input enters the code's logic layer, is processed through a predetermined sequence of steps, and the result is returned at the output layer.

For example:

  1. A robot goes outside the classroom and makes us a cup of coffee:

    First, we send the robot instructions such as "turn left" and "take a few steps", and the robot executes each step according to its pre-set program.

  2. We make the computer recognize chrysanthemums and roses:

    We tell the computer in advance, through code: yellow means chrysanthemum, red means rose. Then, whenever the computer sees yellow it reports a chrysanthemum, and whenever it sees red it reports a rose.

From this we can conclude that explicit programming requires us to describe the execution environment to the program in advance, and to let it execute step by step according to the logic we laid down.

What is non-explicit programming?

The computer learns and generalizes automatically from data and experience, and completes the tasks humans assign to it.

During execution, such a program often uses a "reward function" and an "activation function" to weigh the current situation and decide what to do next.

Taking the examples above:

  1. When the robot makes coffee for us:

    Humans specify a set of behaviors the robot may take, and the value a behavior produces in a given environment is called the "reward function". For example, if the robot falls over, the reward is negative; if it successfully fetches the coffee, the reward is positive.

  2. When the computer identifies chrysanthemums and roses:

    We let the computer itself summarize the differences between chrysanthemums and roses: if the petals are long and yellow, it is probably a chrysanthemum; if they are round and red, it is probably a rose. That is, from a large number of pictures the computer can generalize rules such as "chrysanthemums are yellow, roses are red".

    While these rules are being formed, we do not restrict in advance which rules the computer must arrive at; we let it pick out whichever rules best distinguish chrysanthemums from roses.

    Then, when recognizing an image, if the program sees yellow, the reward toward "chrysanthemum" is positive; if it also observes long petals, that reward is positive too. Once the accumulated reward reaches the activation threshold for "chrysanthemum", the program concludes that the flower is a chrysanthemum.

So in non-explicit programming, after we specify the allowed behaviors and the reward function, we let the computer search for the behavior that maximizes the reward. The computer initially behaves randomly; as long as our program is good enough, it can find a behavior pattern that maximizes the reward function, with the "activation function" serving as the execution condition for a given behavior.
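
The search described above can be sketched in a few lines of Python. This is a hypothetical illustration, not an actual robot program: the behavior names and reward values are invented for the example.

```python
import random

def best_behavior(behaviors, reward, trials=1000):
    """Random search: try behaviors at random, keep the highest-reward one."""
    best, best_reward = None, float("-inf")
    for _ in range(trials):
        b = random.choice(behaviors)  # the computer adopts randomized behavior
        r = reward(b)                 # score it with the reward function
        if r > best_reward:
            best, best_reward = b, r
    return best

# Hypothetical rewards: falling is penalized, fetching coffee is rewarded.
REWARDS = {"fall over": -1.0, "stand still": 0.0, "fetch coffee": 1.0}
found = best_behavior(list(REWARDS), REWARDS.get)
```

With enough trials, the behavior with the highest reward is found almost surely; real systems replace the blind random search with smarter optimization.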

Comparison of two programming methods

As we can see, non-explicit programming lets computers learn automatically from data and experience in order to complete the tasks we assign.

It is exactly this non-explicit programming that machine learning focuses on.

Definition of Machine Learning 2

A definition from Tom Mitchell's book "Machine Learning" (1997):

"A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E."

"A computer program is said to be able to learn if, for some task T and some performance measure P, its performance on T, as measured by P, improves with experience E."

——Tom Mitchell, "Machine Learning"

How should we understand this definition? Using the rose-and-chrysanthemum recognition program above, let us identify the task T, experience E, and performance measure P that Tom Mitchell describes:

  • Task T: write a computer program that recognizes chrysanthemums and roses;

  • Experience E: a collection of pictures of chrysanthemums and roses (the data set);

  • Performance measure P: the program's accuracy on a test set, i.e. the probability of correct identification within the allotted running time. A machine learning algorithm is designed around task T and experience E; through this algorithm, the knowledge in experience E is used to derive the model that best fits task T, and improving performance P with experience E is a typical optimization problem. Performance is reflected in the trained model's ability to handle new data (accuracy as the performance measure).

    The machine learning algorithms designed for different tasks T will differ.

    Test set: after the model has been trained, certain samples are set aside as a test set; the finished model is run on these images, and the fraction recognized correctly is the test-set verification result. [Like taking out a practice exam and testing yourself to see how many questions you can answer.]

Therefore, from Tom Mitchell's definition we can conclude that machine learning builds, for a given task T and experience E, the data model that best fits the task as measured by the performance index P.
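
In code, the performance measure P of the flower example can be computed as simple test-set accuracy. The toy model and samples below are invented for illustration:

```python
def performance_P(model, test_set):
    """Performance measure P: fraction of test samples the model labels correctly."""
    correct = sum(1 for x, label in test_set if model(x) == label)
    return correct / len(test_set)

# A deliberately crude model that only looks at color (an assumption for the demo).
model = lambda color: "chrysanthemum" if color == "yellow" else "rose"

test_set = [("yellow", "chrysanthemum"), ("red", "rose"), ("yellow", "rose")]
p = performance_P(model, test_set)  # 2 of 3 samples are labeled correctly
```

A learning algorithm is then anything that uses more experience E to raise this number.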

At the same time, an algorithm designed for task T and experience E has the property that as experience E accumulates, the performance measure P keeps improving (just as a student gets smarter the more they study; there may be fluctuations along the way, because things once learned can be forgotten, but with repeated training performance keeps rising).

Tom Mitchell's definition is likewise based on non-explicit programming, whereas explicit programming fixes the program's input and output from the start, so the recognition rate does not change as the number of samples grows.

So let us try to identify the task T, experience E, and performance measure P in the robot's coffee-making:

Task T: design a program for the robot to make coffee

Experience E: the robot's repeated attempts, and whether each attempt succeeded within the allotted time

Performance measure P: the fraction of attempts in which coffee is successfully brewed

Summary

Through the two definitions above, we can see that mathematics plays an important role in modern machine learning.

Machine learning is an important branch of artificial intelligence and the key to realizing intelligence.

Classic definition: machine learning improves the performance of a computer system (the model) through experience. Modern machine learning methods are mainly designed with reference to Definition 2.

Existing machine learning frameworks are mainly designed based on the following points:

  • **Experience:** in a computer system, this is the data (set), corresponding to historical data such as Internet data, scientific experiment data, etc.
  • **System:** corresponds to the data model, such as a decision tree or support vector machine.
  • **Performance:** the model's ability to process new data, e.g. classification or prediction performance.
  • **Main goal:** intelligent analysis and modeling of data: predict the unknown, understand the system.

Artificial Neural Network and Its Development

Before officially starting to introduce neural networks, let's look at a few simple math problems:

  1. Is 3 positive or negative?
  2. Which quadrant of the coordinate plane does the point (1,1) lie in?

After fully mobilizing our senses and the knowledge accumulated over nine years of compulsory education, answering these questions seems almost too easy, so easy that the questions themselves hardly feel meaningful.

But once you try to understand the mechanism by which the human brain solves these problems (how does the visual system take part? how does information flow between neurons?), and then to give a computer the same ability to answer them, things become far less simple.

A neural network is a bionic model: it imitates the way the human nervous system processes and analyzes information in order to find an optimal solution to a problem.

So let us first look at the structure and working principle of the biological neuron.

A neuron is a cell that receives and emits impulses; outside the cell body it has dendrites and an axon. The dendrites receive impulses from other neurons, and the axon transmits the neuron's output impulses to other neurons.

The output a neuron transmits to its various targets is the same. Countless biological neurons, exchanging information across their synapses, combine to form a biological neural network, which gives people the ability to process complex information.

Artificial neural network concept

The artificial neural network likewise tries to imitate the form and working principle of biological neurons. An Artificial Neural Network (ANN) is a theoretical mathematical model of the human brain's neural network: an information processing system built by imitating the structure and function of the brain's neural network. In practice it is a complex network composed of a large number of simple components connected to one another; it is highly nonlinear and capable of complex logic operations and of modeling nonlinear relationships.

Artificial neural networks are also known as "neural networks" or "artificial nervous systems". It is common to abbreviate them as "ANN" or simply "NN".

For a system to be considered a neural network, it must contain a labeled graph structure where each node in the graph performs some simple computation. From graph theory, we know that a graph consists of a set of nodes (i.e. vertices) and a set of connections (i.e. edges) that connect pairs of nodes together.

In the figure below we can see an example of such a NN graph.

(figure: a simple neural network with two hidden layers)

A simple neural network architecture. The input is presented to the network. Each connection carries a signal through two hidden layers in the network. The last function computes the output class labels.

Each node performs a simple computation. Each connection carries a signal (the output of a computation) from one node to another, marked with a weight indicating how much the signal is amplified or attenuated. Some connections have positive weights that amplify the signal, indicating the signal is important for classification; others have negative weights that reduce the signal's strength, so that node's output matters less in the final classification. We call such a system an artificial neural network.

Construction Principles of Artificial Neural Networks

  • It is composed of a certain number of basic unit hierarchical connections;
  • The input and output signals and comprehensive processing content of each unit are relatively simple;
  • The learning and knowledge storage of the network is reflected in the connection strength between each unit.


Application field and development history of artificial neural network

Speech recognition: Siri's intelligent voice recognition;

Autonomous driving: a mainstream research direction of artificial intelligence today;

Search engine optimization, recommendation algorithms, language translation: Baidu, Google;

Game-playing algorithms: AlphaGo's Go algorithm;

Smart home and security: face recognition, intelligent monitoring devices.

The history of artificial intelligence from the perspective of machine learning

(figure: the history of artificial intelligence from the machine-learning perspective)

Three schools of thought in machine learning

  1. Symbolists

    • Cognition is computation, predicting outcomes through deduction and inverse deduction of symbols

    • Representative algorithm: inverse deduction algorithm (Inverse deduction)

    • Representative Application: Knowledge Graph


  2. Connectionist

    • simulate the brain

    • Representative algorithms: Backpropagation algorithm (Backpropagation), deep learning (Deep learning)

    • Representative applications: machine vision, speech recognition


  3. Analogizers

    • reason by the similarity between old and new knowledge

    • Representative algorithms: Kernel machines, Nearest Neighbor

    • Representative application: Netflix recommendation system


There are also two schools of research:

  • Artificial intelligence bionics school:

    • Artificial intelligence simulates the human brain's understanding of the world. Studying the cognitive mechanism of the brain and summarizing the way the brain processes information is a prerequisite for the realization of artificial intelligence.
  • The mathematical school of artificial intelligence:

    • At present, and for the foreseeable future, we cannot fully understand the cognitive mechanisms of the human brain; moreover, computers and human brains have completely different physical properties and architectures, so artificial intelligence should take a mathematical route rather than copy the brain.

An overview of the development of neural networks

(figure: overview of the development of neural networks)

Deep learning has a long history of development, but it only gradually matured after 2010.

The troika of deep learning

(figure: the troika of deep learning)

Top experts in modern deep learning:

(figure: leading experts in modern deep learning)

  • Andrew Ng: In 2011, Andrew Ng founded the "Google Brain" project at Google, which used Google's distributed computing framework to train very large artificial neural networks. An important result of this project was that a neural network with one billion parameters, trained with deep learning algorithms on 16,000 CPU cores, learned to recognize high-level concepts just by watching unlabeled YouTube videos, without any prior knowledge.
    • In 2008, Andrew Ng was selected for "the MIT Technology Review TR35", the magazine's list of the 35 top innovators in the world under the age of 35.
    • Recipient of the "Computers and Thought Award".
    • In 2013, Andrew Ng was named one of the 100 most influential people in the world by Time magazine, one of 16 representatives of the technology industry.
  • Fei-Fei Li: creator of ImageNet and the ImageNet Challenge, which contributed to the latest developments in deep learning and AI. Beyond her technical contributions, she is a leading advocate for diversity in STEM (science, technology, engineering, and math) and AI.
  • Ian Goodfellow: famous for proposing Generative Adversarial Networks (GANs), he is known as the "father of GANs" and is regarded as a top expert in the field of artificial intelligence.

Deep learning dominates the third artificial intelligence boom thanks to three factors, "ABC":

  • Algorithm: deep neural network training algorithms have matured, and recognition accuracy keeps rising;
  • Big data: there is enough data available to train neural networks;
  • Computing: processor chips for deep learning now provide substantial computing power.

The history of neural network development

1943: The MP neuron model is proposed

The McCulloch-Pitts neuron model

As the origin of artificial neural networks, the MP model opened a new era and laid the foundation for later neural network models.

  • American psychologist McCulloch and mathematician Pitts proposed a mathematical model that simulates how human neuron networks process information

  • Characteristics of neurons: multiple inputs, single output; synapses (where nerve impulses are transmitted) can be either excitatory or inhibitory; inputs can be weighted over time and over space; neurons generate and transmit pulses; the response is nonlinear

  • A simple linear weighted sum simulates this process: each input is multiplied by a weight W, the results are summed, and the sum is passed through a threshold function to produce the output
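
A minimal sketch of the MP neuron in Python. Note that the weights and thresholds below are chosen by hand, as in the original model, which had no learning rule:

```python
def mp_neuron(inputs, weights, threshold):
    """McCulloch-Pitts neuron: a weighted sum followed by a hard threshold."""
    s = sum(x * w for x, w in zip(inputs, weights))
    return 1 if s >= threshold else 0

# With fixed weights, an MP neuron realizes simple logic gates:
AND = lambda a, b: mp_neuron([a, b], [1, 1], threshold=2)
OR = lambda a, b: mp_neuron([a, b], [1, 1], threshold=1)
```
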

1949: Hebb Hypothesis

In "The Organization of Behavior", published in 1949, Hebb presented his neuropsychological theory.

Hebb hypothesis: when the axon of cell A is close enough to excite cell B, and stimulates B repeatedly or persistently, a growth process or metabolic change takes place in one or both cells that increases A's efficiency in stimulating B.

Hebb's rule is consistent with the mechanism of the "conditioned reflex"; it laid the foundation for later neural network learning algorithms and has great historical significance.
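
Hebb's hypothesis translates directly into the simplest neural learning rule ("cells that fire together wire together"): a weight grows in proportion to the product of pre- and post-synaptic activity. A sketch, where the learning rate is an arbitrary choice:

```python
def hebbian_update(weights, pre, post, lr=0.1):
    """Hebb's rule: strengthen a connection when both sides are active together."""
    return [w + lr * x * post for w, x in zip(weights, pre)]

w = [0.0, 0.0]
for _ in range(5):                       # repeated co-activation of input 0
    w = hebbian_update(w, pre=[1, 0], post=1)
# only the co-active connection has grown: w is now about [0.5, 0.0]
```
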

1958: The Rosenblatt perceptron algorithm

  • In 1958, the psychologist Rosenblatt invented the perceptron, which later generations therefore named the **Rosenblatt perceptron**.

  • For the first time, Rosenblatt used the MP model to classify multidimensional input data, with an error-driven update rule that learns the weights automatically from training samples;

  • In 1962, the method was proved to converge, and its theoretical and practical success triggered the first wave of neural network research.

The perceptron aroused the interest of a great many scientists in artificial neural networks; it is a milestone in the development of neural networks.
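
The perceptron's learning rule can be sketched in a few lines: when a sample is misclassified, nudge the weights toward classifying it correctly. Below it learns the linearly separable OR function; the learning rate and epoch count are arbitrary choices for this sketch:

```python
def train_perceptron(samples, epochs=20, lr=0.1):
    """Rosenblatt's rule: w <- w + lr * (target - prediction) * x."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in samples:
            pred = 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0
            err = target - pred            # zero when already correct
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

OR_DATA = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
w, b = train_perceptron(OR_DATA)
predict = lambda x: 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0
```

On linearly separable data such as OR, this procedure provably converges, which is exactly the 1962 result mentioned above.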

1969: Questioning of the XOR problem

  • In 1969, the American mathematician and AI pioneer Minsky, together with Papert, proved in their book that the perceptron is essentially a linear model and cannot even solve the simple XOR (exclusive-or) problem, a "linearly inseparable" problem;

  • This pronounced a death sentence on the perceptron, and research on neural networks stagnated for nearly 20 years (the first winter for neural networks).
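
The XOR truth table cannot be separated by any single line, which is exactly Minsky and Papert's objection. With one hidden layer, however, hand-picked threshold units solve it easily. The weights below are chosen by hand for illustration; learning such weights automatically is what the later multi-layer perceptron provided:

```python
def step(s):
    """Hard threshold, as in the MP neuron."""
    return 1 if s > 0 else 0

def xor_net(a, b):
    """Two-layer threshold network: XOR = OR(a, b) AND NOT AND(a, b)."""
    h_or = step(a + b - 0.5)        # hidden unit computing OR
    h_and = step(a + b - 1.5)       # hidden unit computing AND
    return step(h_or - h_and - 0.5)
```
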

1982-1984: Development during the cold winter

  • The fruits of truth favor scientists who persist. Although research on artificial neural networks had fallen into an unprecedented trough, a few scholars remained devoted to ANN research.
  • In 1982, the famous physicist John Hopfield invented the Hopfield neural network, a recurrent network combining a memory system with a binary system that can also simulate human memory. Depending on the choice of activation function it comes in continuous and discrete variants, used respectively for optimization and associative memory. However, because it easily falls into local minima, the algorithm did not cause a great sensation at the time.
  • In 1984, Hinton, together with young scholars such as Sejnowski, proposed a large-scale parallel network learning machine and explicitly introduced the concept of hidden units. This learning machine was later called the Boltzmann machine. Using the concepts and methods of statistical physics, they first proposed a learning algorithm for multi-layer networks: the Boltzmann machine model.

1986-1989: The introduction of MLP and BP algorithms

  • In 1986, Rumelhart, Hinton, and others invented the error backpropagation (BP) algorithm for training the multi-layer perceptron (MLP), using the sigmoid function for nonlinear mapping, which effectively solved the problem of nonlinear classification and learning. The BP algorithm triggered the second wave of neural network research.

  • In 1989, Robert Hecht-Nielsen proved the universal approximation theorem for MLPs: any continuous function f on a closed interval can be approximated by a BP network with a single hidden layer. This result greatly encouraged neural network researchers.

  • In 1989, LeCun invented the convolutional neural network LeNet and applied it to digit recognition with good results, but it did not attract enough attention at the time.

  • After 1989, because no particularly outstanding method was proposed and neural networks still lacked rigorous mathematical theory to support them, the neural network craze receded.

  • The second winter arrived in 1991, when the BP algorithm was shown to suffer from the vanishing gradient problem: as the error gradient propagates backward, each layer's gradient is multiplied into the layers in front of it. Because the sigmoid function saturates, the rear layers' gradients are already small, and by the time the error gradient reaches the front layers it is almost zero, so the front layers cannot learn effectively. This discovery made things worse for neural networks and directly hindered the further development of deep learning.
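
The vanishing gradient is easy to see numerically: the derivative of the sigmoid never exceeds 0.25, and backpropagation multiplies in one such factor per layer, so even in the best case the gradient shrinks geometrically with depth. A sketch:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)        # peaks at 0.25, when x = 0

# Best case: every layer sits at the sigmoid's steepest point.
grad = 1.0
for _ in range(20):             # a 20-layer network
    grad *= sigmoid_grad(0.0)   # multiply in one layer's factor
# grad is now 0.25**20, about 9e-13: the front layers barely learn
```
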

  • In 1997, the LSTM model was invented. Although it has outstanding properties for sequence modeling, it arrived during the downturn and did not attract enough attention.

1986-2006: Statistical Learning Takes Mainstream

  • In 1986, improved decision tree methods such as ID3, ID4, and CART appeared one after another; decision trees remain a very commonly used machine learning method and are representative of symbolic learning.
  • In 1995, the SVM (support vector machine) algorithm was invented by the statisticians V. Vapnik and C. Cortes. The method has two notable features: it rests on very complete mathematical theory (statistics, convex optimization, etc.), and it matches human intuition (maximize the margin). Most importantly, it achieved the best results of its time on linear classification problems.
  • In 1997, AdaBoost was proposed. It is the representative of PAC (Probably Approximately Correct) theory in machine learning practice, and it gave rise to the family of ensemble methods: it combines a series of weak classifiers to achieve the effect of a strong classifier.
  • In 2000, the kernel SVM was proposed. By ingeniously using a kernel to map a linearly inseparable problem in the original space into a linearly separable problem in a high-dimensional space, the kernelized SVM successfully solved nonlinear classification, with very good results. With this, the first ANN era came to an end.
  • In 2001, the random forest was proposed, another representative ensemble method. Its theory is solid, it suppresses overfitting better than AdaBoost, and it performs very well in practice.
  • In 2001, a new unifying framework, graphical models, was proposed, attempting to bring the scattered methods of machine learning, such as naive Bayes, SVMs, and hidden Markov models, under a single descriptive framework.

Various shallow machine learning models, such as the support vector machine (SVM), were proposed in this period. The SVM is a supervised learning model applied to pattern recognition, classification, and regression analysis. Support vector machines are grounded in statistics and differ significantly from neural networks; the rise of algorithms such as the SVM once again held back the development of deep learning.

1995: Statistical Learning - SVM

  • V. Vapnik and C. Cortes invented the SVM.

  • The SVM is a binary classification model. Its basic form is a linear classifier with the largest margin in the feature space: the SVM's learning strategy is margin maximization, which can ultimately be reduced to solving a convex quadratic programming problem.

Advantages of Statistical Learning

  • The SVM uses an inner-product kernel function in place of an explicit nonlinear mapping to a high-dimensional space;

  • The SVM is a novel small-sample learning method with a solid theoretical foundation;

  • The final decision function of the SVM is determined by only a small number of support vectors, and the computational complexity depends on the number of support vectors rather than the dimensionality of the sample space, which avoids the "curse of dimensionality";

  • Decision-making by a small number of support vectors is simple, efficient, and robust (good stability).
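
The kernel idea above can be illustrated without a full quadratic-programming solver by using the kernel (dual-form) perceptron, which, like the SVM, touches the data only through kernel evaluations. Below, an RBF kernel lets it solve XOR, the very problem that defeats any linear classifier. The gamma value and the perceptron stand-in are choices made for this sketch, not part of the SVM itself:

```python
import math

def rbf(x, z, gamma=2.0):
    """RBF kernel: an inner product in an implicit high-dimensional feature space."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def kernel_perceptron(data, epochs=10):
    """Dual-form perceptron: explicit weights never appear, only kernel evaluations."""
    alpha = [0] * len(data)
    for _ in range(epochs):
        for j, (xj, yj) in enumerate(data):
            f = sum(a * yi * rbf(xi, xj) for a, (xi, yi) in zip(alpha, data))
            if yj * f <= 0:            # misclassified: emphasize this sample
                alpha[j] += 1
    return alpha

# XOR with labels in {-1, +1}: linearly inseparable in the original space.
XOR = [((0, 0), -1), ((0, 1), +1), ((1, 0), +1), ((1, 1), -1)]
alpha = kernel_perceptron(XOR)
predict = lambda x: 1 if sum(a * y * rbf(xi, x) for a, (xi, y) in zip(alpha, XOR)) > 0 else -1
```

The SVM adds margin maximization on top of this dual representation, keeping only a few samples (the support vectors) with nonzero coefficients.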

2006: Deep Belief Networks

  • G. Hinton et al. proposed the Deep Belief Network, a generative model that trains the weights between its neurons so that the whole network generates the training data with maximum probability.

  • It is pre-trained with an unsupervised, greedy layer-by-layer method to obtain the weights; instead of relying on hand-crafted features, it extracts and refines data features automatically through the lower layers.

2006-2017: Rising stage

  • **In 2006, Geoffrey Hinton and his student Ruslan Salakhutdinov formally proposed the concept of deep learning.** In a paper published in the top journal Science, they gave a detailed recipe for alleviating the vanishing gradient problem: pre-train the network layer by layer with an unsupervised learning method, then fine-tune it with the supervised backpropagation algorithm. The proposal immediately aroused great repercussions in academia; Stanford University, New York University, and the University of Montreal in Canada became important centers of deep learning research, starting a wave of deep learning in both academia and industry.
  • In 2011, the ReLU activation function was proposed, which effectively suppresses the vanishing gradient problem. In the same year, Microsoft made a major breakthrough by applying deep learning to speech recognition for the first time: speech recognition researchers at Microsoft Research and Google used deep neural network (DNN) technology to cut the speech recognition error rate by 20% to 30%, the biggest advance in the field in more than ten years.

  • In 2012, DNN technology achieved amazing results in the field of image recognition , reducing the error rate from 26% to 15% in the ImageNet evaluation. In this year, DNN was also applied to the DrugActivity prediction problem of pharmaceutical companies and achieved the best results in the world. In 2012, in the famous ImageNet image recognition competition, in order to prove the potential of deep learning, Jeffrey Hinton's research team participated in the ImageNet image recognition competition for the first time. It won the championship in one fell swoop through the constructed CNN network AlexNet, and crushed The classification performance of the second place (SVM method). ** It is also because of this competition that CNN has attracted the attention of many researchers. The emergence of deep learning algorithms in the world competition has once again attracted the attention of academia and industry in the field of deep learning.
  • With continued advances in deep learning technology and in data processing capability, in 2014 Facebook's deep-learning-based DeepFace project achieved a face recognition accuracy of over 97%, almost indistinguishable from human performance. This result once again demonstrated the advantage of deep learning algorithms in image recognition.
  • **In March 2016, AlphaGo (based on deep learning algorithms), developed by Google's DeepMind, defeated Go world champion and professional nine-dan player Lee Sedol in a human-machine match by a total score of 4 to 1.** At the end of 2016 and the beginning of 2017, the program, registered under the account "Master" on Chinese Go servers, played fast games against dozens of top Chinese, Japanese, and Korean Go players and won 60 consecutive games without a loss.
  • In 2017, AlphaGo Zero, an upgraded version of AlphaGo based on reinforcement learning, was released. Learning "from scratch" through self-play, with no human game records as a teacher, it easily defeated the previous AlphaGo by a score of 100:0. In the same year, deep learning algorithms achieved remarkable results in fields as varied as healthcare, finance, art, and autonomous driving, which is why some experts regard 2017 as the year of the most rapid development of deep learning, and even of artificial intelligence as a whole.


The Future of Artificial Neural Networks (ANNs)

Here forever, or destined to disappear?

  • Entering the 21st century, the research hotspots of machine learning can be briefly summarized as manifold learning from 2000 to 2006, sparse learning from 2006 to 2011, and deep learning from 2012 to the present. Which machine learning algorithm will be the next hotspot?

  • Andrew Ng once said, "After deep learning, transfer learning will lead the next wave of machine learning technology." But in the end, no one can say for sure what the next machine learning hotspot will be.

    Transfer learning: a machine learning method in which a model pre-trained on one task is reused as the starting point for another, related task.
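As a concrete illustration of this reuse pattern, the sketch below "pretrains" a feature extractor (stood in for here by a fixed random hidden layer, since real pretraining is out of scope), freezes it, and trains only a new linear head on a target task. The data, dimensions, and hyperparameters are all hypothetical; this is a minimal NumPy sketch of the pattern, not any specific system:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pretrained feature extractor: in real transfer learning,
# these weights would come from training on a large source task.
W_pre = rng.normal(size=(2, 16))

def features(X):
    # Frozen hidden layer: W_pre is never updated on the target task.
    return np.tanh(X @ W_pre)

# Small target-task dataset with binary labels.
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

# Train only a new logistic-regression "head" on the frozen features.
H = features(X)
w, b, lr = np.zeros(16), 0.0, 0.5

def loss():
    p = 1 / (1 + np.exp(-(H @ w + b)))
    return -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))

loss_before = loss()
for _ in range(500):
    p = 1 / (1 + np.exp(-(H @ w + b)))
    w -= lr * H.T @ (p - y) / len(y)   # gradient of the logistic loss
    b -= lr * np.mean(p - y)
loss_after = loss()

acc = ((1 / (1 + np.exp(-(H @ w + b))) > 0.5) == y).mean()
print(f"loss {loss_before:.3f} -> {loss_after:.3f}, accuracy {acc:.2f}")
```

Training only the head is cheap because the gradient never flows into the frozen extractor; fine-tuning, the other common flavor of transfer learning, would additionally update `W_pre` with a small learning rate.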

Five-year bets at Bell Labs

In 1995, two interesting bets were made at Bell Labs. The two parties were Larry Jackel, a former department head at Bell Labs, and Vladimir Vapnik, one of the creators of the support vector machine.

The first bet: Larry Jackel believed that by 2000 at the latest, we would have a mature theoretical explanation of why neural networks work.

The second bet: Vladimir Vapnik believed that by 2000, no one would be using neural network architectures anymore. (As one of the creators of the support vector machine, he naturally favored SVMs.)

And the result? Both of them lost. We still do not have a solid explanation of why neural networks work so well, and at the same time we are still using neural network architectures.

Thoughts on Artificial Intelligence

All in all, under the banner of artificial intelligence, different people are actually doing different things: some build brain models, some simulate human behavior, some develop computer applications, some design new algorithms, and some summarize the laws of thinking. These studies are all valuable and related, but they are not substitutes for one another, and confusing them leads to muddled thinking.

  • Special-purpose artificial intelligence has indeed made breakthrough progress, but on the other hand, the research and application of general artificial intelligence still have a long way to go. Achieving a major breakthrough in general artificial intelligence will require sustained effort.

  • At present, artificial intelligence has intelligence but not wisdom, IQ but not EQ; it can calculate but cannot scheme; it produces specialists but no generalists.

    ——Tan Tieniu, "Thoughts on the Development of Artificial Intelligence", 2016 China Artificial Intelligence Conference (CCAI 2016)

Afterword

If this series of articles has helped you, don't forget to like and follow the author! Your encouragement is my motivation to keep creating and sharing. May we meet at the top together. You are also welcome to visit the author's official account, "01 Programming Cabin": follow the cabin and learn programming without getting lost!


Origin blog.csdn.net/weixin_43654363/article/details/124586608