Compete not only on model technology, but also on engineering execution!

Repeated breakthroughs in large-model technologies such as GPT and Llama have set off a new wave of disruption in the global AI industry. Hundreds of models have emerged in China, and top research talent is focused on scaling model parameters and improving model performance.

Jia Yangqing, an expert in the field of artificial intelligence, once raised the concept of a model's "shelf life". He believes that ever since the release of AlexNet in 2012, each time a high-performing large model is released, a model with comparable performance appears within six months to a year.

As more high-quality general-purpose large models are open-sourced, the technical barriers between models are expected to erode further. How to reduce the cost of AI infrastructure and model deployment will become a pressing concern for enterprises, teams, and individual developers.

This requires enterprises to build a comprehensive understanding of AI technology and to adjust, optimize, or even restructure their own infrastructure and R&D processes. When building AI infrastructure, attention must be paid to the full pipeline: compute clusters, data storage, model training, and inference deployment. At the infrastructure layer, teams can choose existing cloud services or deploy open-source infrastructure products on-premises.

In a market where high-performance compute is severely scarce, a robust and efficient AI R&D foundation can greatly improve team productivity. Beyond AI research capability, strong engineering capability has become the key for R&D teams to compete and leapfrog in the AI era.

On the afternoon of August 12, 2023 (this Saturday), the NPCon: AI Model Technology and Application Summit, themed "Building a Full-Link AI R&D Base", will be held at the Royal Grand Skylight Hotel in Chaoyang District, Beijing. Join us to discuss how enterprises of different sizes and stages can choose a suitable AI infrastructure plan, and how to efficiently improve the full-link AI R&D process.


Full Agenda


Speakers and Talk Introductions


Keynote Speech

Large Models Emerge: How to Deploy Training Architectures and Computing Chips

A professional member of the Association for Computing Machinery (ACM) and the China Computer Federation (CCF), holder of 70+ Chinese and US invention patents, and author of "GPT-4 Large Model Hard-Core Interpretation", "GPT-4 Core Technology Analysis Report", and "GPGPU Chip Design".

Talk abstract:

With the rapid growth and popularization of large-model technologies such as GPT, the open-source resources, deployment and training architectures, and compute costs of LLMs/MLMs have become the key watershed for putting large-model applications into production and for enterprise ROI, no less important than model accuracy. This talk covers open-source foundation models and fine-tuning for GPT-class models, open-source application platforms, deployment and training architectures, and the significant impact of GPU and DSA chips on total deployment cost. It offers key reference points for the product and business design and large-model deployment of MaaS enterprises.

Keynote Speech

A Panoramic Analysis of the LLM Application Technology Stack and Agents

With more than 11 years of experience in the Internet industry, he has a deep understanding of product design, agile project management, DevOps, learning-organization culture, and web application development. In recent years he has gained extensive product, operations, and technical management experience with SaaS in enterprise services and tool software.


Talk abstract:

This talk will map the panorama of the current large-model application technology stack, review the capabilities the stack has already realized and the problems still to be solved, examine the position and capability building of AI application tool platforms such as Dify within the large-model ecosystem, and analyze current trends in large-model technology and application development.


Keynote Speech

Optimizing the Deployment Architecture of Meituan's Visual GPU Inference Services in Practice

A software development engineer in Meituan's Visual Intelligence Department, mainly responsible for service platform development, visual model deployment design, and GPU service performance optimization. He closely follows cutting-edge progress and applications in AI and actively embraces the changes new technologies bring. In his spare time he loves cycling, and welcomes everyone to connect, explore nature, and enjoy the fun of riding.

Talk abstract:

Meituan Vision is committed to applying visual AI technology to the many scenarios of local life services. However, as the GPU resources consumed by online inference services kept growing, low GPU utilization became an increasingly prominent problem, wasting a large amount of computing resources. Experimental analysis showed that inference services with low GPU utilization share a common trait: CPU operators and GPU operators are coupled within the model structure, which seriously hurts operating efficiency. To address this, we propose a general and efficient deployment architecture that resolves this common performance bottleneck through model-structure splitting and microservices. The solution has been successfully applied to Meituan's core visual services; after optimization, GPU utilization approaches 100% and service performance has doubled.
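The splitting idea in this abstract can be sketched in miniature. The snippet below is a hypothetical illustration, not Meituan's actual implementation: `cpu_op` and `gpu_op` are pure-Python stand-ins for a CPU-bound preprocessing operator and a GPU-bound compute graph, and the two deployments differ only in scheduling, not in output.

```python
import queue
import threading

def cpu_op(x):
    # stand-in for a CPU-bound operator (e.g. image decoding/resizing)
    return x * 2

def gpu_op(x):
    # stand-in for the GPU-bound compute graph
    return x + 1

def coupled(requests):
    # original deployment: CPU and GPU operators run serially per request,
    # so the GPU idles while the CPU works (and vice versa)
    return [gpu_op(cpu_op(r)) for r in requests]

def decoupled(requests):
    # split deployment: the CPU stage and GPU stage run as separate workers
    # connected by a queue, so CPU work on request N can overlap GPU work
    # on request N-1 and the GPU stays busy
    q = queue.Queue(maxsize=8)
    results = []

    def cpu_stage():
        for r in requests:
            q.put(cpu_op(r))
        q.put(None)  # sentinel: no more work

    def gpu_stage():
        while True:
            item = q.get()
            if item is None:
                break
            results.append(gpu_op(item))

    workers = [threading.Thread(target=cpu_stage),
               threading.Thread(target=gpu_stage)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return results

# both deployments produce identical outputs; only the scheduling differs
assert coupled([1, 2, 3]) == decoupled([1, 2, 3]) == [3, 5, 7]
```

In a real service the two stages would be separate microservices with a network queue between them, letting the CPU tier and GPU tier scale independently.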

Keynote Speech

An Analysis of Computing Power Systems for Large AI Models

Dr. Zhu works in the AI and High-Performance Application Software Department of Inspur Information, where he is responsible for R&D on cutting-edge AI algorithms such as large models and AIGC, and for landing AI applications.


Talk abstract:

Drawing on hands-on work with Inspur Information's Yuan large model and experience building the computing platform of its Intelligent Computing Center, this talk shares the technical trends, compute requirements, and computational characteristics of today's foundation models, along with the main technical challenges the industry currently faces.


Lightning Talk

Bring Your AI Application to Production Quickly

Full-stack developer and creator of the well-known open-source project ViewDesign (iView).

Author of "Vue.js Combat" and "Vue.js Components Intensive", and organizer of many Vue.js events.

Talk abstract:

InsCode (inscode.net) is a one-stop application development service platform. With AI assistance, it covers the whole chain of development, deployment, operations and maintenance, and product operation.

"Let's Talk" Roundtable

A New Paradigm for R&D in the AI Era and the Evolution of Developer Capabilities

Graduated from Beihang University (Beijing University of Aeronautics and Astronautics); went from front-line software and algorithm engineer to CTO of startups.

After moving into the capital market, he has focused on Data & AI infrastructure for many years, providing consulting services to many well-known startups.


"Let's Talk" Roundtable

A New Paradigm for R&D in the AI Era and the Evolution of Developer Capabilities

Holds a doctorate jointly trained by Nankai University and the University of Minnesota. The research team he leads focuses on frontier exploration in recommendation systems, information retrieval, and causal inference, and has landed its pre-research results in more than 30 of the company's products and scenarios. He has published more than 50 papers, applied for more than 40 patents, served as (senior) program committee member and reviewer for academic venues such as ACM SIGIR and SIGKDD, and translated the Chinese edition of "The Singularity Is Near".

"Let's Talk" Roundtable

A New Paradigm for R&D in the AI Era and the Evolution of Developer Capabilities

Graduated from the School of Computer Science at Beihang University and has long worked on NLP algorithms.

He is currently an algorithm expert on the CodeGeeX team at Zhipu AI, specializing in large-model training and application.


"Let's Talk" Roundtable

A New Paradigm for R&D in the AI Era and the Evolution of Developer Capabilities

Bachelor's and master's degrees in computer science from Tsinghua University; more than 10 years of experience in new-technology R&D and innovation management. Former chief architect at the research institute of StarTimes Media Group, responsible for designing and building a triple-play video system covering more than ten African countries; a serial entrepreneur holding dozens of national invention patents.



Origin blog.csdn.net/dQCFKyQDXYm3F8rB0/article/details/132241755