"Wu Qiulin's Book Donation Event|Second Issue" "ChatGPT Principles and Practice"

The "war of a thousand models" is in full swing, and understanding ChatGPT thoroughly is the key to victory!


Note: This book giveaway is a cooperation between the blogger and the publisher, and is an exclusive benefit for followers.


Book in this issue: "ChatGPT Principles and Practice"
How to participate: follow the blogger, then like and bookmark this post
Leave a message in the comment area: "Embrace the new era of AI"
Activity deadline: September 18, 2023
Number of copies: 3 to 5 books


The list of winners will be posted in an update at 8 p.m. on the day after the deadline. Winners will be notified by the blogger via private message; failure to reply within three days will be treated as automatically giving up the prize.


The book explains the technical principles of ChatGPT along four dimensions, opening up ChatGPT's mysterious technical black box!

1. Foreword

After the ChatGPT model was released on November 30, 2022, it immediately caused an uproar around the world. AI practitioners and non-practitioners alike are talking about ChatGPT's impressive interactive experience and astonishing generated content, which has made the general public realize the potential and value of artificial intelligence once again. For AI practitioners, the ChatGPT model has opened up new lines of thinking: large models are no longer just leaderboard toys, everyone recognizes the importance of high-quality data, and the field firmly believes that "the amount of intelligence you get depends on the amount of human effort you put in".

The ChatGPT model performs remarkably well: on many tasks it can achieve SOTA results even in zero-shot or few-shot settings, which has led many people to turn to research on large models.

Not only has Google proposed the Bard model to benchmark against ChatGPT, but many large models have also emerged in China, such as Baidu's "Wenxin Yiyan" (ERNIE Bot), Alibaba's "Tongyi Qianwen", SenseTime's "RiRiXin", Zhihu's "Zhihaitu AI", Tsinghua University's "ChatGLM", and Fudan University's "MOSS"; abroad, Meta has released Llama 1 and Llama 2.

The advent of the Alpaca model showed that although a 7-billion-parameter model cannot reach the level of ChatGPT, it greatly reduces the computing cost of large models, making them usable by ordinary users and ordinary enterprises. The data problem emphasized earlier can now be addressed through the GPT-3.5 or GPT-4 API, and the resulting data quality is quite high. If you only need a model with baseline performance, it matters less whether the data has been carefully re-verified (of course, better results still require more accurate data).

2. Transformer architecture models

The essence of pre-trained language models is to learn universal representations of language from massive amounts of data in order to obtain better results on downstream subtasks. As parameter counts continue to grow, many pre-trained language models are now called large language models (LLMs). Different people define "large" differently, and it is hard to say at how many parameters a model becomes a large language model; usually no hard distinction is drawn between pre-trained language models and large language models.


According to the underlying network structure, pre-trained language models are generally divided into Encoder-only, Decoder-only, and Encoder-Decoder architecture models. Encoder-only models include but are not limited to BERT, RoBERTa, ERNIE, SpanBERT, ALBERT, etc.; Decoder-only models include but are not limited to GPT, CPM, PaLM, OPT, BLOOM, Llama, etc.; Encoder-Decoder models include but are not limited to MASS, BART, T5, etc.
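
To make the taxonomy concrete, here is a minimal sketch that loads one representative model from each family via the Hugging Face transformers library (an assumption of this example; the book may use different tooling). The model names are illustrative picks from the lists above.

```python
# A minimal sketch of the three architecture families, assuming the Hugging
# Face transformers library is installed. Model names are illustrative.
from transformers import (
    AutoModelForMaskedLM,   # Encoder-only models are trained with masked LM
    AutoModelForCausalLM,   # Decoder-only models are trained left-to-right
    AutoModelForSeq2SeqLM,  # Encoder-Decoder models map input text to output text
)

encoder_only = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")  # BERT
decoder_only = AutoModelForCausalLM.from_pretrained("gpt2")               # GPT family
encoder_decoder = AutoModelForSeq2SeqLM.from_pretrained("t5-small")       # T5
```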


3. ChatGPT principles

The overall ChatGPT training process is divided into three stages: the pre-training and prompt-learning stage, the result-evaluation and reward-modeling stage, and the reinforcement-learning self-evolution stage. The three stages have a clear division of labor, taking the model from an imitation period, through a discipline period, to an autonomy period.


In the first stage, the imitation period, the model focuses on learning various instruction-style tasks. At this stage the model has no sense of judgment of its own; it mainly imitates human behavior, acquiring a certain degree of intelligence by continuously learning from human-annotated results. However, mere imitation leaves the machine's learning at the level of a toddler.

In the second stage, the discipline period, the optimization goal changes direction: the focus shifts from teaching the machine what to answer to teaching it how to judge the quality of answers. In the first stage, the aim is for the machine to imitate and learn to produce an output Y' from an input X, striving to make Y' consistent with the originally labeled Y. In the second stage, the aim is for the model, given multiple outputs (Y1, Y2, Y3, Y4) for the same X, to judge the relative quality of those results on its own.
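
As an illustration of the training signal in this stage, the sketch below computes the pairwise ranking loss commonly used for reward modeling (in the spirit of InstructGPT; the book's exact formulation may differ). The reward values are placeholder numbers, and the reward model itself is omitted.

```python
import torch
import torch.nn.functional as F

# Placeholder scalar rewards the reward model assigns to a preferred answer
# and a less-preferred answer for the same input X (illustrative numbers).
reward_chosen = torch.tensor([1.2, 0.7])
reward_rejected = torch.tensor([0.3, -0.1])

# Pairwise ranking loss: push the chosen answer's reward above the rejected
# one's. Minimizing -log(sigmoid(r_chosen - r_rejected)) does exactly that.
loss = -F.logsigmoid(reward_chosen - reward_rejected).mean()
print(loss.item())
```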

When the model has acquired a certain ability to judge, it is considered to have completed the second stage of learning and can enter the third stage, the autonomy period. In the autonomy period, the model completes its self-evolution by playing both sides: on one hand it automatically generates multiple outputs, and on the other hand it judges the quality of those different results, then optimizes and updates the parameters of the generation model based on those quality judgments, thereby completing self-reinforcement learning. A toy sketch of this loop follows.
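
The toy below uses a plain REINFORCE update over four candidate answers with a stubbed-out reward, purely to show the generate-score-update shape of the loop; ChatGPT itself uses PPO over a full language model, which this does not reproduce.

```python
import torch

# Toy "policy": logits over 4 candidate answers. The stub rewards stand in
# for the stage-two reward model and prefer answer 2 (illustrative values).
logits = torch.zeros(4, requires_grad=True)
rewards = torch.tensor([0.1, 0.2, 1.0, 0.0])
optimizer = torch.optim.Adam([logits], lr=0.1)

for step in range(200):
    dist = torch.distributions.Categorical(logits=logits)
    action = dist.sample()                           # generate an output
    loss = -dist.log_prob(action) * rewards[action]  # reinforce by its reward
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Probability mass should have shifted toward the highest-reward answer.
print(logits.softmax(dim=-1))
```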

To sum up, the three stages of ChatGPT can be compared to three stages of human growth: the purpose of the imitation period is to "learn the ways of the world", the purpose of the discipline period is to "distinguish right from wrong", and the purpose of the autonomy period is to "understand all things".

4. Prompt learning and the emergence of large model capabilities

After the release of the ChatGPT model, it became popular all over the world for its fluent conversational expression, strong contextual memory, rich knowledge and creativity, and its ability to solve problems comprehensively, refreshing the public's understanding of artificial intelligence. Concepts such as Prompt Learning, In-Context Learning, and Chain of Thought (CoT) have also entered the public eye. There is even a new profession on the market, the prompt engineer, specializing in writing prompt templates for specified tasks.

Prompt learning is considered by most scholars to be the fourth paradigm of natural language processing, after feature engineering, deep learning, and pre-training + fine-tuning. As language model parameters continue to grow, models have exhibited emergent capabilities such as in-context learning and chain-of-thought reasoning: without any training of the model parameters, good results can be achieved on many natural language processing tasks with just a few demonstration examples.

4.1 Prompt learning

Prompt learning appends additional prompt information to the original input text as a new input, converts the downstream prediction task into a language modeling task, and then converts the language model's prediction back into the prediction result of the original downstream task.

Take the sentiment analysis task as an example. The original task is to determine the emotional polarity of a given input text, say "I love China". Prompt learning appends a prompt template to the original input, for example "The emotion of this sentence is {mask}.", yielding the new input "I love China. The emotion of this sentence is {mask}." The masked-language-model task is then used to predict the {mask} token, the predicted token is mapped to an emotional polarity label, and sentiment prediction is thereby achieved. A minimal sketch of this is shown below.
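
The sketch assumes the Hugging Face transformers library and an English masked language model; the template and the label-word mapping are illustrative assumptions, not the book's.

```python
# Prompt learning via a masked LM: append a template, predict the mask,
# then map the predicted label word back to a sentiment polarity.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

text = "I love China"
template = f"{text}. The emotion of this sentence is [MASK]."

# Illustrative label-word mapping (the "verbalizer").
label_words = {"positive": "positive", "good": "positive", "happy": "positive",
               "negative": "negative", "bad": "negative", "sad": "negative"}

for pred in fill_mask(template, top_k=10):
    token = pred["token_str"].strip()
    if token in label_words:
        print(f"predicted label word '{token}' -> {label_words[token]}")
        break
else:
    print("no label word among top predictions")
```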

4.2 In-context learning

In-context learning can be regarded as a special case of prompt learning: the demonstration examples are treated as part of a manually written (discrete) prompt template, and the model parameters are not updated.

The core idea of in-context learning is learning by analogy. For a sentiment classification task, first extract some demonstration examples from an existing sentiment analysis sample library, consisting of positive or negative texts with their corresponding labels; then splice the demonstration examples together with the text to be analyzed and feed them into a large language model; finally, the model obtains the emotional polarity of the text by analogy with the demonstration examples, as in the sketch below.
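
Here the demonstration examples and the prompt format are made up for illustration:

```python
# In-context learning: splice demonstration examples and the query into one
# prompt; the model parameters are never updated.
demonstrations = [
    ("The movie was fantastic.", "positive"),
    ("I hated every minute of it.", "negative"),
]
query = "I love China"

prompt = ""
for demo_text, demo_label in demonstrations:
    prompt += f"Text: {demo_text}\nSentiment: {demo_label}\n\n"
prompt += f"Text: {query}\nSentiment:"

# Send `prompt` to any large language model; it should complete the label
# by analogy with the demonstrations.
print(prompt)
```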


This learning method is also close to how humans make decisions after learning: by observing how others handle certain events, you can handle the same or similar events easily and well when you encounter them yourself.

4.3 Chain of thought

Large language models have completely changed the paradigm of natural language processing. As model parameters grow, System-1 tasks (tasks humans can complete quickly and intuitively), such as sentiment analysis and topic classification, achieve good results even under few-shot and zero-shot conditions. But for System-2 tasks (tasks that require humans to think slowly and deliberately), such as logical reasoning, mathematical reasoning, and commonsense reasoning, the results remain unsatisfactory even when model parameters grow to hundreds of billions; simply increasing the parameter count does not bring substantial performance improvement.

Google proposed the concept of chain of thought (CoT) in 2022 to improve the ability of large language models to perform various reasoning tasks. A chain of thought is essentially a discrete prompt template whose main purpose is to make large language models imitate the human thinking process and give a step-by-step reasoning basis from which the final answer is derived; the collection of sentences forming the reasoning basis for each step is the content of the chain of thought.

A chain of thought effectively helps a large language model decompose a multi-step problem into multiple intermediate steps that can be solved individually, instead of solving the entire multi-hop problem in a single forward pass, as sketched below.
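
The worked example in this sketch is in the style of Google's original CoT paper and is an illustrative assumption, not taken from the book:

```python
# A chain-of-thought prompt is a discrete template whose demonstration
# includes step-by-step reasoning before the answer.
cot_prompt = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 balls each is 6 balls. "
    "5 + 6 = 11. The answer is 11.\n\n"
    "Q: The cafeteria had 23 apples. If they used 20 to make lunch and "
    "bought 6 more, how many apples do they have?\n"
    "A:"
)

# Send `cot_prompt` to a large language model; it should produce step-by-step
# reasoning before the final answer (23 - 20 + 6 = 9).
print(cot_prompt)
```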


5. Industry insights and suggestions

5.1 Embrace change

Unlike other fields, AIGC is currently one of the most rapidly changing fields. Take the week from March 13 to March 19, 2023 as an example: Tsinghua University released the ChatGLM-6B open-source model, OpenAI released the GPT-4 API, Baidu held the Wenxin Yiyan press conference, and Microsoft launched the brand-new Copilot, combining Office with ChatGPT, among a series of other major events.

These events affect the direction of industry research and trigger further questions. For example: should the next technical route build on open-source models or pre-train new models from scratch? How many parameters should the model have? Now that Copilot is ready, how should developers of office-plugin AIGC applications respond?

Even so, practitioners are still advised to embrace change, adjust strategies quickly, and use cutting-edge resources to accelerate their own work.

5.2 Clear positioning

Be clear about your goals within your segment of the track: whether to work on the application layer or on base-model optimization, whether to target the consumer (C-end) or the business (B-end) market, whether to build industry-vertical applications or general-purpose tools. Don't be overly ambitious; seize the opportunity and "cut the cake precisely".

Having a clear positioning does not mean you will never hit a wall or never turn back; it means understanding your own purpose and the significance of what you do.

5.3 Compliance and controllability

The biggest problem with AIGC is the uncontrollability of its output. If this problem cannot be solved, its development will face a serious bottleneck, and it will not be widely used in the B-end and C-end markets. During product design, attention must be paid to integrating rule engines, strengthening reward and punishment mechanisms, and applying appropriate manual intervention. Practitioners should also pay close attention to the copyright, ethical, and legal risks involved in AIGC-generated content.

5.4 Experience accumulation

The purpose of accumulating experience is to build your own moat. Don't pin all your hopes on a single model. For example, we once designed a product around a plain-text format to integrate seamlessly with ChatGPT, but the latest GPT-4 already supports multimodal input. Rather than being discouraged, we should quickly embrace the change and use previously accumulated experience (in the data dimension, prompt dimension, and interaction design dimension) to complete product upgrades quickly and better cope with new scenarios and forms of interaction.

We hope practitioners will find the above suggestions useful.

Although there are many bubbles in the AIGC wave, as long as we are determined to embrace change, always keep our destination in sight, face the surrounding risks and crises seriously, and keep honing our abilities in practice, I believe that one day we will reach the destination we long for.

Written by senior AI experts and large model technology experts from BAT (Baidu, Alibaba, Tencent), and highly recommended by many experts, including MOSS system lead Qiu Xipeng! The book systematically organizes and deeply analyzes ChatGPT's core technologies, algorithm implementations, working principles, and training methods, with a large amount of annotated code. It not only teaches you how to migrate and privatize large models, but also walks you step by step through building your own ChatGPT from scratch!


Okay, it's time to say goodbye. Creation is not easy, so please leave a like before you go; your support is the driving force behind my writing. I hope to bring you more high-quality articles.


Source: blog.csdn.net/qiulin_wu/article/details/132851355