GPT-4 has arrived too. What can we expect from large language models in the future? Being part of it is key!

On Tuesday, March 14 (local time), the artificial intelligence research company OpenAI announced GPT-4, the latest version of its large language model. "GPT-4 outperforms the vast majority of humans on many specialized tests," the company said. In internal evaluations, GPT-4 was 40 percent more likely to generate correct responses than GPT-3.5. GPT-4 is also multimodal, accepting both text and image input.

OpenAI says GPT-4 is "bigger" than previous versions: it has been trained on more data and has more weights in the model file, which makes it more expensive to run.

 

OpenAI said it has worked with several companies to incorporate GPT-4 into their products, including Duolingo, Stripe, and Khan Academy. GPT-4 will also be available to subscribers of ChatGPT Plus, the paid tier of ChatGPT, and to developers through an API.

Admittedly, the past 10 years have been a golden age for NLP, with the fastest development yet of both the technology and its business scenarios. The NLP technical system itself has been rebuilt, and the business areas it affects keep expanding.

【1】Changes in the technical system

·Demise of intermediate tasks

Intermediate tasks are staged tasks that do not target the final goal directly but exist to help achieve it. With the development of neural networks, NLP-specific intermediate tasks such as syntactic parsing, part-of-speech tagging, and word segmentation now receive almost no attention.

·All you need is Money

Completing an NLP task now requires not only technology but also computing power, data, and other resources. Behind all of this is a huge financial investment.

【2】Expansion of usage scenarios

·Search, recommendation, advertising and other content-connecting fields

The most successful commercial applications of NLP are search, recommendation, and advertising, and the development of this technology has spawned a series of giants (Google, Baidu, ByteDance).

· Rich human-computer interaction

Various chatbots, voice assistants, and other human-computer interaction scenarios are becoming more mature.

· Changes in the content field

ChatGPT will upend the entire content ecosystem, and the focus of content companies (such as Zhihu and Weibo) will shift from content distribution to content production. In the future, the Internet will be filled with machine-generated content, which will pose great challenges for regulation!

【3】Future development

·Large model era

The emergence of ChatGPT shows that large models have broken through the technical ceiling, which firmly establishes them as a development route for this technology.

·Small model era

Large models undoubtedly demand a great deal of resources, more than most startups and small and medium-sized enterprises can spare. Lightweight pre-trained models offer them a new direction.

 

RLHF

RLHF (Reinforcement Learning from Human Feedback) uses reinforcement learning to optimize a language model directly on human feedback signals. It is the main reason for ChatGPT's excellent results.

·Long-term development

In the past few years, generative AI models based on the prompt paradigm have achieved great success, and many interesting AI applications have emerged, such as AI that writes novels, writes code, draws, and even makes videos.

·The problem

To describe the overall quality of a model's output (rather than a single word), people often use evaluation metrics such as BLEU or ROUGE to measure how close the output is to what humans prefer, but this happens only at the evaluation stage. During training, the model never sees these real human preferences.
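To make the idea of an overlap-based metric concrete, here is a minimal Python sketch of unigram overlap, the simplest ingredient of BLEU (precision-oriented) and ROUGE (recall-oriented). The function names and the whitespace tokenization are illustrative assumptions. Real BLEU and ROUGE add higher-order n-grams, a brevity penalty, and other details that are omitted here.

```python
from collections import Counter


def clipped_unigram_precision(candidate: str, reference: str) -> float:
    """Share of candidate words that also appear in the reference (BLEU-1 style).

    Each candidate word is counted at most as often as it occurs in the
    reference, so repeating a matching word does not inflate the score.
    """
    cand_tokens = candidate.split()
    if not cand_tokens:
        return 0.0
    cand_counts = Counter(cand_tokens)
    ref_counts = Counter(reference.split())
    overlap = sum(min(n, ref_counts[w]) for w, n in cand_counts.items())
    return overlap / len(cand_tokens)


def unigram_recall(candidate: str, reference: str) -> float:
    """Share of reference words that the candidate recovers (ROUGE-1 recall style)."""
    ref_tokens = reference.split()
    if not ref_tokens:
        return 0.0
    cand_counts = Counter(candidate.split())
    ref_counts = Counter(ref_tokens)
    overlap = sum(min(n, cand_counts[w]) for w, n in ref_counts.items())
    return overlap / len(ref_tokens)
```

For example, calling both functions with the candidate "the cat sat on the mat" and the reference "the cat is on the mat" scores five matching words out of six in each direction. Both scores are computed against a fixed reference text after the model has already generated its output, which is why they serve as evaluation metrics and not as a signal of human preference during training.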

·Solution

Use reinforcement learning to optimize the language model directly on human feedback signals.

 

Step①

·Select a pre-trained language model as the initial model. For example, OpenAI chose GPT-3; DeepMind chose the Gopher model.

·Write a corpus by hand to guide the model toward detoxification, truthfulness, and human preferences.

The model is then fine-tuned on this human-written corpus.

Step②

Build a reward model (either starting from a pre-trained model or randomly initialized) to learn subjective human preferences.

·Pick another dataset of prompts: Anthropic used data from a chat tool; OpenAI used prompts from users who call the GPT API.

· Have humans rank the outputs that the initial model (such as GPT-3) produces for these prompts.

The reward model is trained on these human rankings so that it learns human preferences.
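How a human ranking can become a training signal is sketched below in Python. This is a minimal illustration that assumes the common pairwise formulation: a ranking is split into (preferred, rejected) pairs, and the reward model is penalized with -log(sigmoid(r_preferred - r_rejected)) for each pair. The article does not state which loss is used, and the function names and the `reward_fn` callable are hypothetical stand-ins for a real reward model.

```python
import math
from itertools import combinations


def ranking_to_pairs(ranked_outputs):
    """Turn a best-to-worst ranking into (preferred, rejected) pairs."""
    # combinations() keeps the input order, so the first element of each
    # pair is always ranked higher than the second.
    return list(combinations(ranked_outputs, 2))


def pairwise_loss(reward_preferred: float, reward_rejected: float) -> float:
    """Compute -log(sigmoid(r_preferred - r_rejected)) in a numerically stable way."""
    diff = reward_preferred - reward_rejected
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


def ranking_loss(ranked_outputs, reward_fn) -> float:
    """Average pairwise loss over all pairs implied by one human ranking."""
    pairs = ranking_to_pairs(ranked_outputs)
    if not pairs:
        return 0.0
    total = sum(pairwise_loss(reward_fn(a), reward_fn(b)) for a, b in pairs)
    return total / len(pairs)
```

The loss is small when the reward model scores the human-preferred output higher than the rejected one, and large when it disagrees with the human ranking, so minimizing it pushes the reward model toward the labelers' preferences.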

Advantages and disadvantages of ChatGPT

The results of InstructGPT/ChatGPT are very impressive. With the introduction of human annotation, the correctness of the model's "values" and how "authentically" it follows human behavior patterns have both improved greatly.

Model advantages:

·Authenticity & harmlessness

InstructGPT/ChatGPT brings in different labelers to write prompts and rank the generated results, and it is fine-tuned on top of GPT-3, so the reward model learns to give higher rewards to more truthful and harmless outputs.

· Coding ability

GPT-3 already has strong coding capabilities, and the GPT-3-based API has accumulated a large amount of code-related data. In addition, some OpenAI employees took part in the data collection.

Given this large amount of coding-related data and human annotation, it is not surprising that the trained InstructGPT/ChatGPT is very good at coding.

· Relevance

Whether or not an answer is correct, it is almost always relevant to the question, which shows that ChatGPT is very successful at understanding human language.

 

Model problems:

· Over-explanation

The model tends to over-explain, because labelers tend to give higher rewards to longer outputs when comparing generated content.

·Easily misled

Harmful instructions may produce harmful answers. For example, InstructGPT/ChatGPT will also produce an action plan when a user asks for an "AI plan to destroy humanity".

· Absurdity

The model sometimes generates untrue content, probably because the correction data is limited or because the supervised task misleads it.

ChatGPT's impact on content production and content distribution, including AIGC and search engines, is disruptive.

 

The emergence of ChatGPT has greatly promoted the development of AIGC. In the future, a large amount of AIGC-produced content will appear on the Internet. On the one hand, this will create a new market and new opportunities. On the other hand, how to review, identify, and assign copyright to the generated content will be a new issue.

Impact on Search Engines

· Improved quality of search results. Traditional search engines display results by keyword matching, but this approach can yield some low-quality, irrelevant, or even harmful results. In contrast, ChatGPT can understand the user's intent and provide more precise and personalized results, thereby improving the quality of search results.

·Improved search experience. With ChatGPT technology, users can search in natural language and are no longer limited to simple keyword matching. This makes searching more direct, faster, and easier to use.

· Cross-language search enhancements. Since ChatGPT can handle multiple languages, it makes it easier for users to search across languages. This will allow users around the world to find the information they need more quickly.

· New search modes. Since ChatGPT technology uses conversational interaction, it will give rise to new search modes such as voice search, image search, and more. These new modes will make searching more convenient and will also make search engines more widely used.

——The above content is excerpted from

"ChatGPT's technical development path and impact" 2023.3

 

Origin blog.csdn.net/weixin_43802541/article/details/129583025