AIGC Industry Research Report 2023 - Language Generation

Analysys: Since the beginning of this year, as artificial intelligence technology has continued to break through and iterate, generative AI has repeatedly become a hot topic, and the industrial development, market response, and regulatory requirements of AI-generated content (AIGC) have received extensive attention. To better explore the feasibility and development trends of AIGC applications across industries, Analysys has studied the AIGC industry and will release a series of AIGC industry research reports.

Organized by content generation mode, the reports cover the technological development, key capabilities, and typical application scenarios of AIGC in language generation, image generation, audio generation, video generation, 3D generation, and molecular discovery and circuit design (graph generation), as well as the challenges and prospects of China's AIGC industry in the process of commercialization. By tracing the development of the AIGC industry, the reports aim to provide a reference for application developers and users in various fields.

Definition

Language generation means that a semantic probability model learned by a neural network generates language according to task requirements. The generated language includes natural language, programming languages, and logical languages.

Since the vast majority of knowledge and experience are recorded and stored in language, especially natural language, and language is also the basis of communication, language generation has a wide range of application methods and application scenarios.

Main types and fields of application

Language generation applications can be divided into general language generation applications and vertical language generation applications according to how targeted their capabilities are. General language generation applications have a large amount of general domain knowledge and can complete different types of language generation tasks on request, such as writing emails, simulating dialogue, and generating code. Compared with general language generation applications, vertical language generation applications have professional domain knowledge in addition to a certain amount of general domain knowledge, and their application mode is usually designed to better match the requirements of the professional field.

At present, language generation has been widely used in many industries. The financial industry uses language generation applications to analyze large volumes of financial materials, such as financial statements and periodic corporate reports, to generate summaries of key information and investment strategy recommendations, and also to generate data analysis reports from financial data. E-commerce uses language generation applications to generate product descriptions, analyze product reviews, and generate product recommendations for customers. News and media use language generation applications to automatically generate news reports and create content. Education uses language generation applications to assist teachers in preparing teaching plans and lesson plans, to assist teachers in correcting homework, and to provide learning guidance for students. Healthcare uses language generation applications to assist doctors in writing treatment plans and case records, and to help patients find matching medical resources.

Language generation has also been applied in several business functions. Marketing uses language generation applications to generate marketing content such as blog articles, social media posts, and advertising copy. Sales uses language generation applications to generate quotations, sales plans, and sales contracts, and to analyze market and sales data in order to produce sales forecasts and suggestions for sales plans. Product development uses language generation applications to assist in developing and testing IT products and to produce product documents, product manuals, and test reports. Customer service uses language generation applications to help staff analyze customer intentions and problems and to generate feedback and solutions. In office work, language generation applications can also be used to write official documents, summarize meeting materials and agendas, extract key action items, and keep teams in sync.

Language generation is currently applied mainly to generate content and to provide interaction. Generated content is typically written text that is factual, functional, or entertaining, such as blog posts, news, emails, novels, and code. In content generation, industries and scenarios such as news, media, marketing, advertising, and office work were the earliest adopters. Early content generation was template-based, that is, it could only produce fixed content according to templates, for example generating fixed-format contracts or extracting financial information from news and filling it into templates. The generated text is highly accurate and the generation process places low demands on infrastructure. However, this type of language generation can only be applied to highly patterned tasks, and the content lacks imagination and creativity, so its application is very limited. With the advancement of technology, language generation applications can now generate low-pattern content, the imagination and creativity of generated content have greatly improved, and language generation can be applied to more scenarios, such as generating advertising copy, product descriptions, blog posts, marketing plans, and business emails. This stronger imagination and creativity also enables language generation applications to assist in creating literary content. Language generation applications can also summarize and condense various types of content.
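The template-based generation described above can be illustrated with a minimal sketch. It assumes a hypothetical fixed-format contract; the template wording and the field names (date, seller, buyer, amount, goods) are illustrative and do not come from any real system.

```python
from string import Template

# Hypothetical fixed-format contract template; wording and field names are
# assumptions made for illustration only.
CONTRACT_TEMPLATE = Template(
    "This contract is made on $date between $seller and $buyer. "
    "The buyer agrees to pay $amount for $goods."
)


def fill_contract(fields: dict) -> str:
    """Fill the fixed template with the given fields.

    Raises KeyError if a required field is missing, which reflects how
    rigid template-based generation is.
    """
    return CONTRACT_TEMPLATE.substitute(fields)


contract = fill_contract(
    {
        "date": "2023-05-01",
        "seller": "Company A",
        "buyer": "Company B",
        "amount": "10,000 yuan",
        "goods": "100 units of product X",
    }
)
print(contract)
```

The output is exact and cheap to produce, but the template can only ever yield text of this one shape, which is the limitation the paragraph above points out.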

Compared with content generation, providing interaction requires language generation applications to understand the emotions contained in language more accurately and to respond appropriately. They also need to remember earlier context across multiple rounds of dialogue, and some application scenarios place higher demands on reasoning ability. For example, intelligent customer service is an important interaction scenario, but for a long time its level of intelligence was low: it struggled to understand customer intentions accurately, and converting customers effectively was harder still. Language generation has now greatly improved the intelligence of intelligent customer service. In addition to accurately understanding customer intentions, it can complete more complex tasks such as processing order status and querying shipping status and product information, and it can communicate with customers in a personalized way and intelligently recommend products and promotions to increase the customer conversion rate. Thanks to these improved capabilities, language generation has been applied to various interactive scenarios such as psychological counseling, tutoring, medical guidance, and virtual entertainment.
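One simple way an application might keep earlier context across multiple rounds of dialogue is to prepend the most recent turns to each new request. The sketch below is only an illustration under that assumption; the class name DialogueMemory, the "User:/Assistant:" prompt layout, and the fixed turn limit are hypothetical and are not taken from any product mentioned in this report.

```python
from collections import deque


class DialogueMemory:
    """Keeps the most recent dialogue turns (hypothetical helper)."""

    def __init__(self, max_turns: int = 3):
        # A bounded deque silently drops the oldest turn once it is full.
        self.turns = deque(maxlen=max_turns)

    def add_turn(self, user: str, assistant: str) -> None:
        self.turns.append((user, assistant))

    def build_prompt(self, new_user_message: str) -> str:
        """Concatenate remembered turns with the new user message."""
        lines = []
        for user, assistant in self.turns:
            lines.append(f"User: {user}")
            lines.append(f"Assistant: {assistant}")
        lines.append(f"User: {new_user_message}")
        lines.append("Assistant:")
        return "\n".join(lines)


memory = DialogueMemory(max_turns=2)
memory.add_turn("Where is my order?", "It shipped yesterday.")
memory.add_turn("Which carrier?", "It was sent by express courier.")
memory.add_turn("When will it arrive?", "Within two days.")
prompt = memory.build_prompt("Can I change the address?")
print(prompt)
```

With max_turns=2, the first exchange falls out of the window, so the model would no longer see it. Real applications may use more elaborate strategies, such as summarizing older turns.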

Key stages of technological development

● Before 2017

Due to hardware and technical limitations, semantic probability models based on recurrent neural network structures were weak at language understanding and generation, so language generation applications performed poorly and could only complete highly patterned language generation tasks, such as filling in documents and extracting key information from text files in specific formats.

● 2017: technology development period

The Transformer neural network structure proposed by Google in 2017 greatly enhanced the ability to build complex semantic probability models, and the language understanding and generation capabilities of models improved significantly. The Transformer laid a solid technical foundation for language generation applications, but applications at this stage could still only complete highly patterned language generation tasks.

● 2018-2019: model exploration period

Between 2018 and 2019, the complexity of semantic probability models continued to increase. According to their downstream tasks, language models can be divided into language understanding models and language generation models. Semantic probability models represented by the GPT series and the OPT models focus more on improving language generation capabilities, and they provide technical support for language generation applications to complete low-pattern tasks such as text summarization and text writing.

● 2020-2021: application exploration period

From 2020 to 2021, the complexity of semantic probability models continued to increase. The language understanding and generation capabilities of the models began to meet the requirements of low-pattern language generation tasks, and academia and industry began to explore the development of language generation applications. For example, the GPT-3 model proposed by OpenAI attracted widespread industry attention as soon as it was released, and companies such as Jarvis (now Jasper), Viable, and Fable actively cooperated with OpenAI to explore the development of language generation applications.

At this stage, semantic probability models approached human-level performance only on some downstream tasks, so exploration of the productization and commercialization of language generation applications was not yet extensive. Nevertheless, language generation applications could already complete low-pattern language generation tasks.

● 2022-present: application acceleration period

In 2022, academia and industry began adjusting the output of semantic probability models so that the generated content aligns with human judgment standards, which accelerated the commercialization of language generation applications. The generalization ability, reasoning ability, and instruction-following ability of semantic probability models also expanded the application scenarios of language generation.

At this stage, the LaMDA model launched by Google showed impressive dialogue generation capabilities, but Google did not productize or commercialize the model. The InstructGPT family of models developed by OpenAI strengthened generalization, reasoning, and instruction-following abilities through different fine-tuning methods, and OpenAI accelerated commercial exploration of language generation applications in different fields by providing GPT-3.5 model services.

At the end of 2022, OpenAI released ChatGPT to the public, marking the entry of language generation applications into the era of large-scale commercialization. ChatGPT reshaped the public's understanding of language generation. Its large amount of general domain knowledge and its ability to complete complex tasks make it possible for language generation applications to be commercialized in multiple fields.

At the beginning of 2023, enthusiasm for the productization and commercialization of language generation applications rose sharply. The GPT-4 model proposed by OpenAI set new marks for model performance on tests designed for humans, such as the SAT and IELTS, and its ability to accept both language and image input also expanded the commercial dimensions of language generation applications. All sectors of society quickly recognized the commercial value of language generation applications. Industries such as finance, education, media, government, and healthcare are actively exploring application scenarios, and language generation applications have achieved good results in areas such as marketing, training, recruitment, and entertainment.

Mainstream applications

● Overseas market conditions

At present, the overseas market for language generation applications is led by OpenAI, while technology giants such as Google and start-ups such as Anthropic are racing to catch up, forming an upstream competitive landscape of "one superpower and many strong players".

OpenAI is a leading research-oriented start-up, and its product ChatGPT is currently the most representative general language generation application. Drawing on its large amount of general domain knowledge, ChatGPT can complete many language generation tasks, such as text writing, factual question answering, virtual character interaction, and code generation. ChatGPT's powerful language generation capabilities attracted a large number of users in a short period of time and built considerable brand recognition. Cooperation with enterprises and institutions in different fields, such as Microsoft, Morgan Stanley, Duolingo, and the Icelandic government, not only demonstrates the versatility of language generation and enhances OpenAI's service capabilities, but also quickly builds an industrial application ecosystem with OpenAI at its core. The resulting data closed loop and application expansion also help OpenAI build long-term market competitiveness. At present, ChatGPT uses a freemium subscription model for individual users and, for enterprise users, a usage-based pricing model charged by the amount of input and generated content.
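The usage-based charging model mentioned above can be sketched as simple arithmetic. The example assumes that usage is metered in tokens and priced per 1,000 tokens, with separate rates for input and generated content; the function name and the prices shown are hypothetical placeholders, not actual vendor prices.

```python
from decimal import Decimal


def usage_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_1k: Decimal,
    output_price_per_1k: Decimal,
) -> Decimal:
    """Cost of one request under per-1,000-token pricing (illustrative)."""
    input_cost = Decimal(input_tokens) / Decimal(1000) * input_price_per_1k
    output_cost = Decimal(output_tokens) / Decimal(1000) * output_price_per_1k
    return input_cost + output_cost


# Hypothetical prices, chosen only to make the arithmetic easy to follow.
cost = usage_cost(
    input_tokens=1200,
    output_tokens=800,
    input_price_per_1k=Decimal("0.002"),
    output_price_per_1k=Decimal("0.004"),
)
print(cost)
```

Here the input contributes 1.2 x 0.002 = 0.0024 and the generated content contributes 0.8 x 0.004 = 0.0032, for a total of 0.0056 in whatever currency the prices are quoted. Decimal is used instead of float so that small per-request amounts add up without rounding drift.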

Google, one of the newer generation of technology giants, did not seize the early lead in language generation applications. Its language generation application Bard, positioned against ChatGPT, has not yet formed a business model; Google's goal is to integrate Bard into its product system and strengthen its product ecosystem in competition with Microsoft's. Bard can complete the same types of tasks as ChatGPT and can call the Google search engine, which lets it compete with New Bing. Bard is currently free and open to the public, but its interface is still in beta. At present, Bard's output can be exported to Google Docs and Gmail and can also be used in Google Workspace.

Anthropic is a research-oriented start-up focused on artificial intelligence safety. Its product Claude is very similar to ChatGPT in product capabilities, product positioning, and business model. However, Claude places more emphasis on language generation applications being helpful to humans and on application safety, and Anthropic provides enterprises with AI application safety services that span data, models, and systems. Claude also charges enterprise users by the amount of input and generated content, at a price that is 30%-50% of ChatGPT's. Anthropic does not independently develop language generation applications for individual users; instead, it explores application directions and application modes for language generation together with its partners.

In the downstream market, Microsoft leads, while many smaller vendors and start-up teams build language generation applications around specific scenarios, forming a competitive landscape in which Microsoft towers over all other players.

As a veteran technology giant, Microsoft cooperates with OpenAI to combine language generation capabilities with its own product ecosystem, exploring and expanding the application potential of language generation and greatly enhancing the commercial competitiveness of that ecosystem. New Bing, Microsoft's representative general language generation application, combines search engine functions to give users a better search experience and also provides text generation directly; when connected to the Edge browser, it can summarize web page content. New Bing's search capability and its integration with the Edge browser attracted a large number of users in a short period of time, which directly threatens Google's search business and the user activity of the Google Chrome browser. Microsoft launched the code generation application Copilot X on GitHub, the world's largest code hosting platform. It also connected language generation capabilities to its office software and launched Microsoft 365 Copilot. OpenAI's language generation capabilities can also be called directly in Microsoft's cloud service Azure. Because of the breadth of Microsoft's product and business matrix and its high penetration in office applications and code hosting, its language generation applications are commercially very competitive in both general and vertical fields.

As one of the first companies to attempt to commercialize language generation applications, Jasper began building language generation applications for marketing copywriting on top of GPT-3 in 2020. Currently, Jasper can generate many types of commercial text, including advertising copy, product descriptions, blog posts, marketing plans, commercial emails, and video creative briefs, for target users who want their text to reach a wider audience. The templates Jasper provides lower the difficulty of writing commercial text and fit the way such text is actually written. Jasper can also match language generation models to different user requirements to provide better results. In addition, Jasper improves integration with user scenarios through open APIs and browser plug-ins. Because the commercial text it generates spreads more effectively, Jasper has stronger pricing power. Jasper currently uses a subscription model with a free trial, and its subscription price is about 50% higher than ChatGPT's.

Poe is a chatbot application developed by Quora. It integrates language generation capabilities from multiple providers such as OpenAI, Anthropic, and Neeva, provides customized chatbot services, and brings user-customized chatbots together into a community. In the community, users can use chatbots with different functions, such as programming bots for different languages, image generation prompt bots, recipe bots, and virtual character dialogue bots. Poe currently uses a freemium subscription model, and its subscription price is basically the same as ChatGPT's.

In addition to Jasper and Poe, there are many other representative language generation applications, such as Duolingo for foreign language learning, Khan Academy for education, and BloombergGPT for financial analysis. Language generation applications are emerging rapidly in overseas markets. Many companies develop language generation applications on top of their existing products, services, and user bases. For example, Snapchat, WhatsApp, and Discord are all developing chatbot products; Tripadvisor and Getaiway are developing travel advice products; and Salesforce is developing products such as email writing and automatic reply based on its CRM platform. Such applications are usually used to enrich product capabilities in competition with similar products. Many products offer language generation as a paid or premium feature: paid features for individual users usually use a subscription model, while products and services for enterprise users gain pricing power by including language generation applications. There are also many new language generation applications that optimize language generation capabilities for a particular niche scenario or usage pattern, such as an A/B testing application, AYOA for generating mind maps, and ArxivGPT for summarizing papers. These mostly take the form of web pages, APIs, and browser plug-ins, with varied charging models. Most one-time-payment applications additionally require purchasing access to the language model APIs of OpenAI or Anthropic.

● Chinese market conditions

Like the overseas market, the Chinese language generation application market can be divided into an upstream market and a downstream market. The main players in the upstream market fall into four groups: cloud vendors represented by Baidu Smart Cloud and Alibaba Cloud; established artificial intelligence solution providers represented by SenseTime, iFLYTEK, and 4Paradigm; start-ups represented by Yuanyu Intelligence and MiniMax; and academic enterprises and teams represented by Zhipu AI and Professor Qiu Xipeng's team at Fudan University.

Among cloud vendors, Baidu's Wenxin Yiyan is fully benchmarked against ChatGPT in product capabilities. Currently, individual users can try Wenxin Yiyan for free, and it can also be combined with the Baidu search engine to improve the search experience. Baidu has announced that it will connect Wenxin Yiyan to Baidu applications such as the intelligent voice assistant Xiaodu and Baidu Wenku, and will also open Wenxin Yiyan's language generation capabilities to enterprises to explore application scenarios. Alibaba's Tongyi Qianwen is also benchmarked against ChatGPT; it is currently in invitation-only internal beta testing and has not yet been launched to the public.

Among established artificial intelligence solution providers, SenseTime's product is positioned as a general-purpose language generation application. It is currently in invitation-only internal testing, and SenseTime plans to launch the medical consultation application "Big Doctor" and the programming application "AI Code Assistant". iFLYTEK has opened its Xunfei Xinghuo product to the public for trial; its product capabilities are also benchmarked against ChatGPT, and iFLYTEK will develop vertical language generation applications based on its businesses in fields such as education, office, and automobiles. 4Paradigm's offering is a vertical language generation application aimed at enterprise software development scenarios and is not open to individual users.

Among start-ups, the Mencius dialogue robot developed by Lanzhou Technology is still not open to the public, and its writing assistant application is still relatively simple in product capabilities. Yuanyu Intelligence has opened its ChatYuan product to the public for trial. MiniMax can already provide language generation capabilities to enterprises and has developed the AI chat software Glow for individual users.

Among academic enterprises and teams, Zhipu AI's ChatGLM and MOSS, from Professor Qiu Xipeng's team at Fudan University, are both benchmarked against ChatGPT in product capabilities and are currently in internal testing. ChatGLM has been open sourced, and MOSS will also be open sourced. Open source Chinese language generation models will greatly promote the development of the Chinese language generation application market.

The main participants in the downstream market can be divided into industry pioneers represented by the Agricultural Bank of China; scenario application suppliers represented by WPS, Daguan Data, and Unisound; and application developers represented by Xiaobing Company, Lingxin Smart, and Caiyun.

The Agricultural Bank of China developed its own ChatABC language generation model on the basis of an open source model and created banking language generation applications such as Decimal, Diting, and Tianshu to support various financial services. WPS integrates MiniMax's language generation capabilities to create a language generation application for office software, which has not yet been officially launched. Daguan Data developed the Cao Zhi language generation model and integrated it into its product matrix to create vertical language generation applications for text generation and processing. Unisound is building a language generation application for scenarios such as medical consultation and guidance on the basis of an open source model; it is still in internal testing. Xiaobing Company, Lingxin Smart, and Caiyun have all developed entertainment and companion chatbots for individual users, all of which have been launched.

Like overseas markets, the Chinese language generation application market has many participants, is highly active, and has a relatively complete market structure. However, its commercialization maturity is still relatively low. General-purpose language generation applications represented by Wenxin Yiyan, Tongyi Qianwen, and Xunfei Xinghuo are still in the testing stage and do not yet have a basis for profitability, and their generation capabilities still lag behind representative overseas products such as ChatGPT and Claude. Vertical language generation applications for enterprise users are in a similar position: most are still in beta and lack representative use cases. Language generation applications for individual users also have considerable room for improvement in intelligence. Nevertheless, the capabilities of Chinese language generation models have improved greatly in a short period of time, and the Chinese market has a wide range of application scenarios and strong demand, so the pace of commercialization is expected to catch up with overseas markets quickly.

Key capabilities for commercialization

● Generation quality

Generation quality is the first critical capability for the commercialization of language generation applications. On the one hand, high-quality language generation requires the application to understand the user's intent and the purpose of the task accurately. On the other hand, it requires generating text that expresses meaning accurately in language that is appropriate and fluent, so as to deliver high-quality text content and interaction. For language generation applications, the key to improving generation quality is to form a data closed loop. First, feedback from users' application data can be used to increase the scale and quality of training data and to expand, from the bottom layer, the model's knowledge domains and the existing rules in the application. Second, the same feedback can reveal potential pain points and requirements, on the basis of which prompt engineering can be designed to improve generation quality. For vertical language generation applications, it is even more necessary to understand deeply the knowledge structure and application requirements of the specific industry or scenario, so as to generate high-quality text that meets the requirements of vertical applications.

● Product Operations and Customer Support

Language generation applications need to establish long-term cooperation with customers or build user stickiness, so product operation and customer support capabilities are required. When customers encounter problems or need help, providers must offer timely and professional technical support, as well as training and education courses that help customers make better use of language generation applications. Language generation applications also need to increase user stickiness through product iteration, discounts, and product communities. Commercialization also requires turning language technology capabilities into easy-to-use products and services, which calls for good product design and user experience design. An easy-to-use product is more conducive to user conversion and retention.

● Marketing ability

Language generation applications driven by large language models are becoming the mainstream, and the high training costs of large language models and their high inference costs in use are factors that language generation applications must consider. Acquiring users and customers quickly through marketing can effectively spread operating costs, and for downstream applications a larger user base also means stronger bargaining power with upstream providers. In addition, for any one type of language generation application, the potential user base is relatively fixed, and applications of the same type will all be raising user stickiness through product operation and customer support while increasing users' hidden migration costs. Language generation applications therefore need to capture a certain market share through marketing to ensure that commercialization leaves room for profit.

In addition, through product operation, customer support, and marketing capabilities, language generation applications can gradually form user network effects, generate user clusters, and form a double closed loop between users and the product ecosystem, bringing organic user growth and creating long-term business advantages for the application.

● Customization and innovation capabilities

Customers in different industries may apply language generation to various scenarios, and because industries, scenarios, and ways of working differ, customers may require customized applications. Language generation applications therefore need customization capabilities to meet customer demand.

On the one hand, because applications can be decoupled and reconstructed at the semantic level, the requirements of market segments change rapidly, and every language generation application must face challenges from new applications. On the other hand, customization capabilities encourage users to give feedback on pain points and needs, and they can be effectively transformed into the ability to innovate in products and services. This allows applications to respond better to changes in market demand and to the challenges brought by new applications, and so to maintain long-term market competitiveness.

Commercial risks

● Misinformation and Harmful Information

Because the underlying models of language generation applications can produce misinformation and harmful information when generating language, it is difficult for the applications themselves to avoid this problem. Generating misinformation or harmful information may have a huge impact on brand reputation and product image, so it is a major risk in the commercialization of language generation applications.

● Information security risk

When language generation applications are used, there may be a risk of information leakage, because many products and services are based on public cloud services or require information to be uploaded to the supplier's servers. For example, Samsung Group stated that its personnel leaked confidential chip information several times while using the ChatGPT service, and some users said that ChatGPT exposed the input of other users. Italy previously announced a nationwide ban on ChatGPT services because ChatGPT could not prove that it met the requirements of the GDPR, and other EU countries stated that they would pay attention to the data security risks brought by language generation applications.

Since information leakage can lead to legal proceedings, stricter government supervision, and possible negative social events, language generation applications need to pay attention to information security and avoid the risks it creates.

● Technology and application substitution risks

Because semantics is so fundamental, all kinds of applications can be decoupled and deconstructed at the semantic level. As technology progresses and application designs iterate, many language generation applications may therefore find it difficult to maintain their commercial competitive advantages and may be quickly replaced or substituted.

Frontier exploration and development trends

● Rapid improvement of language generation ability

From the perspective of language model development, in recent years the language generation ability of models has improved rapidly as model parameter scale, training data volume, and computation have increased, and new capabilities related to language generation have emerged, such as multi-step reasoning, problem judgment, and instruction following. The rapid improvement of language generation capabilities in the past two years has enabled language generation applications to create greater value for their customers and users, and these new capabilities must now be taken into account when developing language generation applications. Future improvements in language generation capabilities will also accelerate the penetration of language generation applications across industries and scenarios and open up more application paradigms.

● Highly customized language generation application

As the cost of language model training and inference falls rapidly, the difficulty of developing language generation applications is also falling rapidly, so the overall cost of language generation applications may drop quickly. This makes highly customized language generation applications, for example with customized generation methods, possible, and non-standardized language generation applications will become one of the mainstream business forms in future industrial applications.

The AIGC industry research report series is divided into six parts, covering language generation, image generation, audio generation, video generation, 3D generation, and molecular discovery and circuit design, and will be released successively this month. You are welcome to follow the series and discuss the development of the AIGC industry with us.

Disclaimer: The third-party data and other information quoted by Analysys in this article all come from public channels, and Analysys does not assume any responsibility for them. In any case, this article is for reference only and does not serve as a basis for any decision. The copyright of this article belongs to the publisher. Without the authorization of Analysys, it is strictly forbidden to reproduce, quote, or use any content released by Analysys. Any media, website, or individual using the content after authorization should quote the original text and indicate the source. The analysis viewpoints are subject to the official content released by Analysys, and any form of deletion, addition, splicing, deduction, or distortion is prohibited. Analysys does not assume any responsibility for disputes arising from improper use and reserves the right to pursue the relevant responsible parties.

Origin blog.csdn.net/qianfan_analysys/article/details/130698713