Understanding ChatGPT: An Introduction to Machine Learning

To follow this series coherently, please read the previous articles first:

AIGC: Advanced Usage Tips for ChatGPT

AIGC: An Introduction to Mainstream Products

What is AIGC

AIGC stands for AI-Generated Content. It contrasts with the two models of content production we knew before: UGC (User-Generated Content) and PGC (Professionally-Generated Content).

AIGC means the output content is generated by an AI system. The key difference is that, in the past, content was produced by ordinary users or by professional users (that is, by people) in a given field, whereas AIGC relies on artificial intelligence (non-humans) to generate the content. That is the core meaning of AIGC.

(A note on copyright: UGC and PGC carry a notion of copyright, which belongs to the person who created the content. Under current US rules, AIGC output is treated as having no copyright at all: the content belongs neither to the person who prompted it nor to the AI system, so the question of ownership simply does not arise.)

What content can AIGC generate

At present, AIGC mainly generates text and images (some video-generation products exist, but they are not yet as mature as text and image generation), so we will focus on AIGC for text and images.

For text, AIGC mainly interacts in a Q&A (question-answering) style: based on the "questions" a human asks, it produces output that matches the human's expectations.

Broadly, you can think of the AI as an all-knowing, all-capable "advanced human". With "text AIGC", you ask it a question (a prompt) and it answers accordingly. The questions and answers can cover almost anything, including but not limited to encyclopedic knowledge, creative copywriting, novels and scripts, programming, translation, thesis writing, education and teaching, guidance and advice, chat companionship, and so on. Whatever you can think of, you can treat it as a "baixiaosheng" (a know-it-all) holding the knowledge of the whole world, and ask it about, or discuss, anything.

For example, asking the famous ChatGPT a question:
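If you prefer code to a screenshot, here is a minimal sketch of the same kind of Q&A done programmatically. It assumes the OpenAI Python SDK in its pre-1.0 style; the API key and model name are placeholders, not details from this article:

```python
# Hypothetical sketch: asking a question via the OpenAI Python SDK (pre-1.0 style).
# The API key and model name below are placeholder assumptions.
import openai

openai.api_key = "sk-..."  # replace with your own key

resp = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Explain machine learning in one sentence."}],
)
print(resp.choices[0].message.content)
```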

For "Picture AIGC", you may have countless ideas in your mind, but you can't paint, and you can't turn the Ideas in your mind into real pictures. Then, "Picture AIGC" can help you follow what you want in your mind. You tell it what you want, and then it can help you draw it for you in the form of picture painting, allowing you to turn your "creativity" into picture reality at once.

For example, drawing with the very capable image-AIGC tool Midjourney:

Basic working principle of AIGC

At the bottom layer, AIGC relies mainly on AI technology. The essence of AI is to give machines intelligence like that of humans (Artificial Intelligence), which requires machines to learn and think the way humans do. Hence most of the underlying technology that implements AI is called "Machine Learning".

Machine learning has many application scenarios that are now commonplace, such as face recognition (phone unlocking, Alipay payments, door access), speech recognition (Xiao Ai, Xiaodu, Siri), face filters and beautification (livestream beauty filters, beauty cameras), map navigation, weather forecasting, search engines, NLP (Natural Language Processing), autonomous driving, robot control, AIGC, and so on.

How Machines Learn

Machine learning can be simply understood as the process of simulating human learning. Let's take a look at how machines simulate human learning.

Let's look at the so-called "machine learning":

In human learning, the things we see and experience are our "data" (a corpus); through "summarizing what we learn" (a learning algorithm) this becomes "knowledge, experience, and wisdom" (a model); and when we later encounter something, we call on that knowledge and experience to decide and act (prediction and inference).

In machine learning, a large "corpus" is fed in (the things seen and experienced); a machine learning algorithm summarizes and generalizes it (extracting the common patterns) to form a "model" (knowledge, experience, methodology); then, when some judgment or decision is needed, we hand the case to the "model", and it tells us the output (the result of its inference).

At this level of abstraction, we find that "human learning" and "machine learning" are intrinsically quite similar.

Let's take a look at the process of machine learning in a computer:

The core steps are: training data ➜ training algorithm ➜ model ➜ prediction ➜ output. The central artifact is the "model" (the model file); the pipeline divides into an earlier "model training" phase and a later "model prediction" phase, which produces the corresponding results.
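To make the pipeline concrete, here is a minimal sketch in Python. The library (scikit-learn) and the toy dataset are my own choices for illustration, not something prescribed by the article:

```python
# Minimal sketch of "training data -> training algorithm -> model -> prediction".
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)                   # training data (the "corpus")
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000)           # the training algorithm
model.fit(X_train, y_train)                         # training phase -> the "model"

print(model.predict(X_test[:5]))                    # prediction phase -> output
print("accuracy:", model.score(X_test, y_test))     # quality on held-out data
```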

We can simplify the process with an analogy: the "model" is a puppy and the "training algorithm" is its trainer. The dog learns some tricks (the model); once trained, the puppy can go out and perform, and performing is prediction.

So the more features (knowledge and experience) the "model" captures, the more accurate it will be at the "prediction" stage; if the model is smaller, or was built from less feature data, the accuracy of its final predictions will drop. (Likewise, the more things a person has been through, the more experience they can distill; the saying "no road in life is walked in vain, no pit is fallen into for nothing" captures roughly this logic.)
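A tiny experiment makes the point observable. The dataset, library, and sample sizes here are illustrative assumptions of mine, not from the article; the trend is what matters:

```python
# Same algorithm, different amounts of "experience": accuracy grows with data.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for n in (50, 200, 1000):              # train on progressively more samples
    model = LogisticRegression(max_iter=5000).fit(X_train[:n], y_train[:n])
    print(n, "training samples -> accuracy:", round(model.score(X_test, y_test), 3))
```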

The development of machine learning

From its rise through to deep learning, machine learning has passed through three major technological eras: first a rising era, then the era of traditional machine learning, and finally the era of neural-network-based deep learning. Below is a classification of the development stages based on my own understanding.

Rising stage: the MCP artificial-neuron model was born in 1943, when a psychologist and a mathematical logician proposed the concept of the artificial neural network and a mathematical model of the artificial neuron, opening the era of neural-network research. Then, from the 1960s to the 1980s, the concepts of machine learning and pattern recognition emerged. This was the rise-and-exploration stage of the field, with research pushing in many directions at once, a hundred flowers blooming.

The first stage: traditional machine learning. Dating from the first machine learning workshop in 1980 (neural networks were being studied in parallel), this stage can be understood as machine learning based mainly on mathematics and statistical analysis; between roughly 1990 and 2001 it saw great development from theory to practice. From then until about 2006, the traditional machine learning popular in the information industry included hidden Markov models (HMM), conditional random fields (CRFs), the maximum entropy model (MaxEnt), boosting, support vector machines (SVM), Bayesian methods, and so on; in practice this meant algorithms such as linear regression, logistic regression, SVM, decision trees, random forests, and naive Bayes. The causal logic and intermediate computations of these algorithms are clear and essentially interpretable and trustworthy; the drawback is that their final performance has a ceiling, and the achievable "intelligence" is sometimes not enough.
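One way to see the "clear and interpretable" point is that a traditional model's learned rules can often be printed and read directly. A small sketch (the dataset and tree depth are my illustrative choices):

```python
# A decision tree's learned rules are human-readable: traditional ML is a glass box.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(data.data, data.target)
print(export_text(tree, feature_names=list(data.feature_names)))
```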

The second stage, V1: deep learning. In 2006 Hinton, a godfather of machine learning, published a paper on deep neural networks, formally opening the neural-network-based "deep learning" stage. Deep learning can be viewed as an alternative route to traditional machine learning; the main difference lies in the "learning strategy". Traditional machine learning relies chiefly on methods of mathematical and statistical analysis, whose intermediate results can be derived step by step; deep learning instead relies on letting the computer imitate the human brain through neural-network-style connections.

The second stage, V2: the Transformer model. The attention mechanism was proposed in 2015, and in 2017 Google published the paper "Attention Is All You Need", which built on it to propose the Transformer architecture. The Transformer uses an encoder-decoder design, discards the traditional RNN and CNN building blocks, and is realized entirely through the attention mechanism; and because the encoder side computes in parallel, training time is greatly shortened. The Transformer is widely used in NLP: machine translation, text summarization, question-answering systems, and more. The mainstream BERT and GPT models of recent years are both based on the Transformer.
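The core computation named in that paper is scaled dot-product attention. Here is a minimal NumPy sketch of a single unmasked attention head (the shapes and toy data are my assumptions):

```python
# softmax(Q K^T / sqrt(d_k)) V -- the scaled dot-product attention of the Transformer.
import numpy as np

def attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # how much each query matches each key
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # weighted sum of the values

rng = np.random.default_rng(0)                      # toy example: 3 tokens, dim 4
Q, K, V = (rng.normal(size=(3, 4)) for _ in range(3))
print(attention(Q, K, V).shape)                     # -> (3, 4)
```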

Let's take a look at the basic development history of deep learning:

The difference between machine learning and deep learning

Conventional machine learning is usually called "traditional machine learning" or "shallow machine learning", mainly as a counterpart to the concept of "deep learning". Deep learning is not the same thing as traditional machine learning: it is defined by neural networks with different architectures and numbers of parameter layers, so there are many neural network structures, including unsupervised pre-trained networks, convolutional neural networks, recurrent neural networks, recursive neural networks, and so on.

These networks are called "deep" learning mainly because of the layer count: networks with 1-2 layers are called shallow neural networks, while networks with more than 5 layers are called deep neural networks, that is, deep learning.
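A sketch of that distinction purely in terms of layer count, using PyTorch (an assumed library choice; the layer sizes are arbitrary):

```python
# "Shallow" vs "deep" is mostly about how many layers are stacked.
import torch.nn as nn

shallow = nn.Sequential(                 # one hidden layer: a shallow network
    nn.Linear(784, 64), nn.ReLU(),
    nn.Linear(64, 10),
)

deep = nn.Sequential(                    # many stacked hidden layers: a deep network
    nn.Linear(784, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 128), nn.ReLU(),
    nn.Linear(128, 64),  nn.ReLU(),
    nn.Linear(64, 64),   nn.ReLU(),
    nn.Linear(64, 10),
)
```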

The main architectures are convolutional neural networks (CNN), recurrent neural networks plus recursive neural networks (both abbreviated RNN), long short-term memory networks (LSTM), and the Transformer framework, which adds the attention mechanism to fix some of the problems in LSTM/RNN.

Deep learning outperforms traditional machine learning in computer vision (CV, e.g. image recognition), natural language processing (NLP), autonomous driving, robot control, and similar areas.

When the training data is relatively small, traditional machine learning algorithms perform reasonably well; but as the data grows, their performance stops improving past a critical point. For deep learning, by contrast, more data generally means better results. So what we are seeing is also a process of "deep learning" gradually replacing "traditional machine learning".

Performance comparison chart of traditional machine learning and deep learning:

The difference in how traditional machine learning and deep learning process data (traditional machine learning uses explicit, hand-crafted features; deep learning's internal features are a black box):

The neural networks used in deep learning broadly simulate the working mechanism of the human brain, for example the process by which we recognize an object seen through our eyes:

Let's take a look at the learning process based on neural network "deep learning":

From the neural-network workflow above, we can see that the whole "deep learning" process based on neural networks is almost completely different from traditional machine learning.
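In code, that workflow is usually a loop of forward pass, loss, backward pass, and weight update. A minimal PyTorch sketch on toy data (the library, sizes, and hyperparameters are my assumptions):

```python
# The deep-learning training loop: forward -> loss -> backward -> update weights.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 3))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

X = torch.randn(100, 4)                  # toy inputs
y = torch.randint(0, 3, (100,))          # toy labels

for epoch in range(10):
    logits = model(X)                    # forward: signals flow through the layers
    loss = loss_fn(logits, y)            # how wrong the current weights are
    optimizer.zero_grad()
    loss.backward()                      # backward: propagate error gradients
    optimizer.step()                     # nudge the connection weights

print("final loss:", loss.item())
```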

Another difference: traditional machine learning can generally be trained on ordinary CPUs, whereas deep learning, with its many network layers and enormous amount of computation, usually has to run on GPUs or dedicated AI compute chips (AI cards). This is what we commonly call "compute power".
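In practice this shows up as a one-line idiom in most deep learning code, moving the model and data onto an accelerator when one is present (PyTorch shown here as an assumed example):

```python
# Run on a GPU when available, otherwise fall back to the CPU.
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
layer = torch.nn.Linear(784, 10).to(device)   # model weights live on the device
x = torch.randn(32, 784).to(device)           # a batch of data on the same device
print(layer(x).shape, "computed on", device)
```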

The computational cost of deep learning at large data scale is staggering. Take ChatGPT as an example: it is rumored that training it took about 10,000 NVIDIA A100 GPUs. With A100 cards currently selling on JD.com for roughly RMB 100,000 each, that is about 10,000 × ¥100,000 ≈ ¥1 billion in training hardware alone. According to publicly circulated figures, one large-model training run costs about 12 million US dollars. So beyond competing on algorithms, compute power is a decisive factor.

Classification of Neural Networks

Building on the deep-learning logic above, let's take a macro look at what deep learning's neural networks include:

Deep learning can be understood like this: it is the branch of "machine learning" whose learning strategy is the "neural network", combined with the learning methods suited to different scenarios, mainly supervised learning and unsupervised learning (and possibly reinforcement learning as well). That combination of approaches is what we call "deep learning".

This article has outlined the basic concepts of AIGC and machine learning, providing the grounding needed to understand the many applications and principles of artificial intelligence (AI) that rest on machine learning and deep learning.

What replaces you is not AI, but someone who knows AI better than you and can use AI better!

##End##

For more technical content, you can follow the "Dark Night Passerby Technology" WeChat official account.


Origin: blog.csdn.net/heiyeshuwu/article/details/130355937