A 7-billion-parameter large model running on an iPhone: the latest achievement from Chen Tianqi's team

Source | QbitAI (WeChat public account QbitAI)

The threshold for running a large language model has been lowered to an iPhone.

Of course, there is also an Android version: as long as the phone has 6 GB of RAM, it can run. Does that sound even more striking than running one on an RTX 2060?

And this time it's an out-of-the-box version!

This project is called MLC LLM, and it comes from the same team as the earlier WebLLM, led by the well-known scholar Chen Tianqi.

So far, it has gained more than 6,800 stars on GitHub.

The models currently available for testing are RedPajama and the LLaMA-based Vicuna (one of the "alpaca family" of models).

The available parameter counts are 3 billion and 7 billion respectively, comparable in scale to many online demos.

The RedPajama team said that this achievement opens up the possibility of privatized deployment of large models.

Without further ado, let's see how the experience is.

Writing skills pass muster, but science is a weak spot

We chose the Vicuna model among them for testing.

To test its literary prowess first, we had it write a poem for each of the four seasons.

As for rhythm, it rhymes roughly every two or three lines.

Beyond poetry, it can also create stories, even writing the characters' inner monologues.

And it isn't limited to romance; other genres of story don't stump it either.

In addition to literary applications, we might as well try its practical functions.

We had it generate a recipe for Hawaiian pizza, and it looked pretty good.

Travel plans can also be arranged by it.

Let's take a look at how it performs in science and engineering.

Code came first: we asked it to write a Python snippet to find the maximum of several numbers.

Unexpectedly, it solved this by enumerating pairwise comparisons...but the result met the requirements.

def max_of_three(num1, num2, num3):
    # Compare each candidate against the other two in turn;
    # >= keeps ties between equal values from falling through.
    if num1 >= num2 and num1 >= num3:
        return num1
    elif num2 >= num1 and num2 >= num3:
        return num2
    else:
        return num3

num1 = 11
num2 = 45
num3 = 14
max_value = max_of_three(num1, num2, num3)
print("The maximum value is: ", max_value)
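For comparison, the same task is a one-liner with Python's built-in max(), which is the idiomatic way to do it:

```python
# Idiomatic version: max() accepts any number of arguments
# and returns the largest.
num1, num2, num3 = 11, 45, 14
max_value = max(num1, num2, num3)
print("The maximum value is:", max_value)  # → The maximum value is: 45
```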

However, when encountering slightly difficult problems, its programming ability is somewhat stretched.

Mathematics and logical reasoning are harder to praise, but with the parameter count capped to fit on a phone, that is understandable.

We also tried asking questions in Chinese, but found that Chinese support still has some issues.

In addition, the mobile app does not yet save chat history, so be careful when switching away from it.

Although large models that run on a phone are still limited in capability, the team has also laid out further directions for development.

For example: per-user model customization, interaction with public foundation models in the cloud, offline support, app embedding, and decentralization.

How to install

The model supports iOS and Android mobile devices, as well as Windows and Mac.

iOS users can install TestFlight first, and then apply for testing from the following portal:

Portal: https://testflight.apple.com/join/57zd7oxa

If the beta slots are full, you can also build and install it yourself from the code on GitHub:

Portal: https://github.com/mlc-ai/mlc-llm

Android users can directly download and install the apk. On first run, it needs to download the model data package over the network.

Portal: https://github.com/mlc-ai/binary-mlc-llm-libs/raw/main/mlc-chat.apk

For desktop users, please refer to the official tutorial:

Portal: https://mlc.ai/mlc-llm/


Origin: blog.csdn.net/lqfarmer/article/details/131131555