Jiuzhang Yuanshi large model accelerates innovation and development of the AI industry

Multimodality is the next important technical step for large artificial intelligence models. At the Jiuzhang Yunji DataCanvas new product launch conference, Miao Xu, chief AI scientist of Jiuzhang Yunji DataCanvas, presented the technical route of the Jiuzhang Yuanshi large model in the multimodal direction.

Miao Xu, chief AI scientist of Jiuzhang Yunji DataCanvas Company

Speech Transcript

Hello everyone! My name is Miao Xu, and I am very happy to communicate with you online. Even here in the United States, I can feel the domestic enthusiasm for artificial intelligence and the expectations for its future. The wonderful speeches just given by Fang Bo and Yu Bo have opened a new chapter in artificial intelligence for us, and I am very excited as well. I am honored to introduce Jiuzhang Yunji's work and efforts on multimodal large models here.

Multimodality is an important technical link in large artificial intelligence models. Every industry faces different problems, and solving them requires drawing answers from different forms of data. For example, in the financial field, an AI needs to understand a company's financial reports, but reading the reports alone is not enough: if it does not also follow market news and track various macroeconomic and microeconomic indicators, it cannot fully judge the health of the company or make a sound risk assessment. Likewise, in manufacturing, if the AI cannot combine the video streams and sensor data from multiple production lines and look at them together, it cannot offer a complete answer to the safety hazards and quality issues that arise on site.

Take the healthcare industry as another example. If the AI cannot compare a patient's diagnostic images while reading the patient's pathology, and cannot consult a very large medical knowledge base, it will not be able to make a reasonable diagnosis or give doctors sufficient guidance and recommendations. All of this requires the Yuanshi large model to handle multiple modalities. There are two important steps in this processing: one is called fusion and the other is called alignment. Fusion means integrating different data streams, such as video, images, and text, into a single stream. But simply putting different languages into one room so they can talk is not enough; they still belong to different domains, a bit like a chicken talking to a duck. Alignment means translating those different languages into one common language so they genuinely speak the same tongue. Once language and the other modalities can communicate with each other, the fully interoperable information can finally be output by the large model as generated content: it can generate conversations, generate new images, serve as a front end or a back end, and even form a new application.
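
As a purely illustrative sketch (the speech does not disclose the Yuanshi architecture, and every class, name, and dimension below is hypothetical), this is what the align-then-fuse pattern can look like in code: each modality is projected into a shared embedding space, and the aligned streams are concatenated into one sequence for a language model to consume.

```python
import torch
import torch.nn as nn

class AlignAndFuse(nn.Module):
    """Toy illustration: project per-modality features into a shared
    embedding space (alignment) and concatenate them into one token
    sequence (fusion). Dimensions and names are invented for this sketch."""

    def __init__(self, text_dim=768, image_dim=1024, shared_dim=512):
        super().__init__()
        # One projection per modality maps it into the shared space.
        self.text_proj = nn.Linear(text_dim, shared_dim)
        self.image_proj = nn.Linear(image_dim, shared_dim)

    def forward(self, text_tokens, image_patches):
        # Alignment: both modalities now "speak the same language".
        text_emb = self.text_proj(text_tokens)      # (batch, n_text, shared_dim)
        image_emb = self.image_proj(image_patches)  # (batch, n_patch, shared_dim)
        # Fusion: merge the aligned streams into a single sequence
        # that a downstream language model can attend over.
        return torch.cat([text_emb, image_emb], dim=1)

fused = AlignAndFuse()(torch.randn(2, 16, 768), torch.randn(2, 49, 1024))
print(fused.shape)  # torch.Size([2, 65, 512])
```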

This kind of generative architecture built on fusion and alignment is a popular way to handle multimodality today, and the Yuanshi large model adopts such an architecture as well. But the architecture alone is still not enough: to achieve a truly general-purpose effect, further refinement is needed, mainly because the process of fusing and aligning different information flows is very complex across different applications and scenarios. To tame this complexity, we introduced a complete instruction set into the prompt template to define how alignment should be done in each situation. For example, we can define fusion in spatial terms, define fusion in temporal terms, and even define fusion and alignment logic for specific scenarios. This flexible instruction set lets the Yuanshi large model be applied to different scenarios and greatly improves its versatility.
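
The speech does not show the actual instruction syntax, so the following is only a hypothetical sketch of how a prompt template might carry fusion and alignment instructions, choosing spatial, temporal, or scenario-specific alignment per request; all field names here are invented for illustration.

```python
# Hypothetical prompt template carrying fusion/alignment instructions;
# the real Yuanshi instruction set is not published in this speech,
# so the fields and values below are illustrative only.
FUSION_PROMPT_TEMPLATE = """\
[fusion_mode]: {fusion_mode}   (e.g. spatial, temporal, scenario)
[align_on]: {align_key}        (key shared by the data streams)
[streams]: {stream_descriptions}
[task]: {task}
"""

def build_prompt(fusion_mode, align_key, streams, task):
    """Render one multimodal request as a single instruction prompt."""
    descriptions = "; ".join(f"{name} ({modality})" for name, modality in streams)
    return FUSION_PROMPT_TEMPLATE.format(
        fusion_mode=fusion_mode,
        align_key=align_key,
        stream_descriptions=descriptions,
        task=task,
    )

# Example: align two production-line video feeds and a sensor log by timestamp.
print(build_prompt(
    fusion_mode="temporal",
    align_key="timestamp",
    streams=[("line_1_camera", "video"), ("line_2_camera", "video"), ("plc_log", "table")],
    task="Flag safety hazards and quality issues observed in the last hour.",
))
```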

Jiuzhang Yunji's customers are mostly experts in their industries, and every industry actually holds a very large amount of domain data: you may have knowledge graphs with hundreds of billions of entries and hundreds of big data tables that have not yet been turned into big data intelligence; this is very common. Without a fused, aligned, and flexible instruction set, it is hard to imagine integrating such complex structured and unstructured data. With this architecture and this instruction set, the Yuanshi large model can support multimodal data, both structured and unstructured, very well. Finally, through fine-tuning we can adapt the simple, universal multimodal instructions to each industry and achieve better results there.
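
As a hedged illustration of the idea (not Jiuzhang Yunji's actual pipeline), structured sources such as a table row and knowledge-graph triples can be serialized into text so the same instruction set can fuse them with unstructured inputs; the helper functions and sample values below are hypothetical.

```python
# Illustrative sketch: serialize structured data so it can join
# unstructured text in one multimodal prompt.
def serialize_table_row(row: dict) -> str:
    return "; ".join(f"{col} = {val}" for col, val in row.items())

def serialize_triples(triples: list) -> str:
    return ". ".join(f"{s} {p} {o}" for s, p, o in triples)

row = {"company": "ACME", "quarter": "2023Q2", "net_profit_cny": 1.2e8}
triples = [("ACME", "operates_in", "manufacturing"),
           ("ACME", "credit_rating", "AA-")]

context = "\n".join([
    "[table] " + serialize_table_row(row),
    "[graph] " + serialize_triples(triples),
    "[news] Raw material prices rose 8% this quarter.",
])
print(context)
```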

If a customer wants to go a step further, has a lot of private data, and wants to build their own large model from it, we can provide the same support. As we all know, fine-tuning is a bit like dark cuisine: in the course of solving a new task, if it is not handled properly, the model may gain new knowledge but forget a lot of its previous knowledge and abilities. This problem also hinders the adoption of large models in professional fields and places a very important threshold in front of us. To cross it, we propose a new fine-tuning scheme called composable fine-tuning. The core idea is to touch only part of the parameters during fine-tuning, so that previous memories suffer relatively little damage. We can decompose a complex problem into many small problems, solve each one individually, and finally combine the fine-tuned parameters into a customized large model. Through this new composable fine-tuning solution, we hope to give you an easy path to customization. Ultimately, much of the work we at Jiuzhang Yunji have done is so that customer experts from all walks of life can keep improving in the future and make a greater contribution to their business.
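
The speech names the approach composable fine-tuning but does not specify its mechanism. One plausible reading, sketched below with invented names, is to train a small low-rank adapter (LoRA-style) per subtask while the base weights stay frozen, then sum the per-subtask deltas into a customized weight; this is only an interpretation, not the confirmed Yuanshi method.

```python
import torch
import torch.nn as nn

class LoRAAdapter(nn.Module):
    """Low-rank delta trained for one subtask while the base weight stays
    frozen: one plausible way to 'touch only part of the parameters'."""

    def __init__(self, dim, rank=8):
        super().__init__()
        self.A = nn.Parameter(torch.zeros(dim, rank))
        self.B = nn.Parameter(torch.randn(rank, dim) * 0.01)

    def delta(self):
        return self.A @ self.B  # (dim, dim) weight update

def combine_adapters(base_weight, adapters, weights=None):
    """Hypothetical combination step: add the per-subtask deltas to the
    frozen base weight to form one customized model weight."""
    weights = weights or [1.0] * len(adapters)
    with torch.no_grad():
        merged = base_weight.clone()
        for adapter, w in zip(adapters, weights):
            merged += w * adapter.delta()
    return merged

base = torch.randn(512, 512)                             # frozen base weight (toy size)
subtask_adapters = [LoRAAdapter(512) for _ in range(3)]  # one adapter per small problem
customized = combine_adapters(base, subtask_adapters)
print(customized.shape)  # torch.Size([512, 512])
```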

Origin: blog.csdn.net/weixin_46880696/article/details/131837942