The DeepSpeed team presents DeepSpeed-Chat, a free, open-source framework for training high-quality ChatGPT-style models with RLHF. It is simple (training launches with a single command), fast, and extremely low-cost, making it suitable for a wide range of users, from academic researchers and start-ups to large-scale cloud deployments. It is more than 15x faster than existing state-of-the-art RLHF systems and can train models with 10B+ parameters on a single GPU and 100B+ parameters on multi-GPU systems.
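As a sketch of what the one-click workflow looks like, the commands below follow the example documented in the DeepSpeed-Chat repository; the model names and flags shown are illustrative and should be checked against the repository's README:

```shell
# Install DeepSpeed and fetch the DeepSpeed-Chat application
# (paths and flags follow the repository's documented example).
pip install "deepspeed>=0.9.0"
git clone https://github.com/microsoft/DeepSpeedExamples.git
cd DeepSpeedExamples/applications/DeepSpeed-Chat
pip install -r requirements.txt

# One command runs all three RLHF stages (SFT, reward model, PPO)
# end-to-end on a single node.
python train.py \
  --actor-model facebook/opt-13b \
  --reward-model facebook/opt-350m \
  --deployment-type single_node
```

The `--deployment-type` flag selects the hardware scale (e.g. single GPU, single node, or multi-node), which is how the same script covers both the 10B+ single-GPU and 100B+ multi-GPU cases mentioned above.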
DeepSpeed, a stable and efficient PyTorch-based system for accelerating large-scale deep learning, is one of the earliest and most widely adopted open-source deep learning frameworks in the industry. Many recently released ChatGPT-style models and platforms use DeepSpeed as their acceleration backend, including Databricks Dolly, Hugging Face PEFT, and LMFlow.
The DeepSpeed team has authorized Kaiyuanshe to publish this announcement in the Chinese community. It will be released at 9 a.m. Beijing time on April 13, so stay tuned!