DLite V2: lightweight, open, easy-to-customize LLM

Introduction

AI Squared works to democratize artificial intelligence and make it available to everyone. However, two key forces oppose the democratization of AI: high-performance models tend to have very large parameter counts, making them prohibitively expensive to train, tune, and deploy at scale, and restrictive licensing prevents many open-source models from being used for commercial purposes.

Gaining high performance from smaller models will significantly reduce the startup and operational costs of building with large language models.

To address these scale and cost issues, we released the DLite V1 series of models in April 2023: lightweight LLMs ranging from 124 million to 1.5 billion parameters that exhibit ChatGPT-like interactivity. Their small size means they can run on almost any device, including laptop CPUs, rather than being limited to dedicated, expensive cloud resources. However, those models were tuned on the Alpaca dataset, which prevents any DLite V1 model from being used for commercial purposes.

We have since updated the series with DLite V2, which likewise comprises four models ranging from 124 million to 1.5 billion parameters. The highlight of this release is our use of the "databricks-dolly-15k" dataset released by Databricks; we have also uploaded this dataset to our HuggingFace page so anyone can use it easily. Because this training dataset is licensed for commercial use, we are pleased to announce that all models in the DLite V2 series are commercially usable, enabling organizations to build on top of them without any licensing restrictions.
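To make the dataset's role concrete, the sketch below shows one way a databricks-dolly-15k record could be rendered into an instruction-tuning prompt. The field names (instruction, context, response) match the dataset's schema; the surrounding prompt template is an illustrative assumption, not the actual DLite training code.

```python
# Hypothetical sketch: formatting a databricks-dolly-15k record as an
# instruction-tuning prompt. Field names match the dataset schema; the
# "### ..." template text is an assumption for illustration only.

def format_dolly_record(record: dict) -> str:
    """Render one dolly-15k record as a single prompt string."""
    parts = [f"### Instruction:\n{record['instruction']}"]
    if record.get("context"):  # context is empty for many records
        parts.append(f"### Context:\n{record['context']}")
    parts.append(f"### Response:\n{record['response']}")
    return "\n\n".join(parts)

example = {
    "instruction": "Summarize what DLite V2 is.",
    "context": "",
    "response": "DLite V2 is a family of small, commercially usable LLMs.",
}
prompt = format_dolly_record(example)
print(prompt)
```

Records with an empty context field simply omit that section, so the same function handles both open-ended and context-grounded examples.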


Origin blog.csdn.net/iCloudEnd/article/details/132709816