Llama 2 had barely been open-sourced when I set out to get the 70B model running.

1 Apply for permission to download models

https://ai.meta.com/resources/models-and-libraries/llama-downloads/

Fill in the short access request form; this time my application was approved in about 10 minutes.

The approval email contains the long authorization URL you will need in step 3 below.

2 Download llama source code

git clone [email protected]:facebookresearch/llama.git
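If you don't have SSH keys set up with GitHub, cloning the same repository over HTTPS works too:

git clone https://github.com/facebookresearch/llama.git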

3 Download the model

Use the download.sh script from the repository to download the weights. It walks you through a couple of prompts:

The first prompt asks for the authorization URL from the email; it is very long and starts with https://download.llamameta.net.

The second prompt asks which model you want to download; here I entered 70B-chat.

The script then downloads the LICENSE, tokenizer.model, and a few other small files.

After those come the model weight files themselves, which are what we actually need. A sketch of the whole session follows.
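Putting it together, this is roughly what the session looks like (the prompt wording is from my reading of download.sh and may differ slightly between versions):

cd llama
bash download.sh
# Prompt 1: paste the long signed URL from the approval email
#   Enter the URL from email: https://download.llamameta.net/...
# Prompt 2: choose the models to download (comma-separated, or Enter for all)
#   Enter the list of models to download: 70B-chat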

4 Download interlude: recovering from an error

2023-7-22 11:20:30: the download had been running since 2023-7-21 17:30, and quite a few shards had already come down, but I just discovered that the script had errored out....

I didn't know whether it could resume from where it stopped.

Then I re-ran the download.sh script, only to find that it re-downloads the shards that are already finished. ε=(´ο`*))) Alas!

The only fix is to edit the script so it skips the shards that are already done.

I had already downloaded shards 00 through 07. Since 07 was the last one in flight, I couldn't be sure it had finished, so I treated it as not downloaded; 00 had also been partially overwritten when I re-ran download.sh, so it was incomplete too. That left 01 through 06 as safe to skip, so I changed download.sh as follows:

# inside download.sh's shard loop: skip shards 01-06, which are already fully downloaded
if [[ $s != "01" && $s != "02" && $s != "03" && $s != "04" && $s != "05" && $s != "06" ]]; then
    wget xxxx    # the original wget line from download.sh, elided in the post
fi
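download.sh also fetches a checklist.chk file with MD5 sums next to the weights, so rather than guessing which shards survived, you can verify them (assuming the checklist was downloaded before the error hit):

# check every downloaded file against its expected MD5;
# anything marked FAILED needs to be re-fetched
cd llama-2-70b-chat
md5sum -c checklist.chk

Another option, untested here, would be adding wget's --continue flag to the script so that complete files are skipped and partial ones resume, assuming the download server honors range requests.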

2023-7-22 14:50:21: the download finally finished.

The full model is about 129 GB.
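You can confirm the on-disk size once everything is down (directory name as created by download.sh):

du -sh llama-2-70b-chat/    # expect roughly 129G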

5 Run the official demo

2023-7-24 22:10

The official docs say the 70B model requires MP=8 (8-way model parallelism), so I specified 8 GPUs when running it:

CUDA_VISIBLE_DEVICES=1,2,3,4,6,7,8,9 torchrun --nproc_per_node 8 --master_port=29501 example_chat_completion.py --ckpt_dir llama-2-70b-chat/ --tokenizer_path tokenizer.model --max_seq_len 512 --max_batch_size 4
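The --nproc_per_node value must match the number of checkpoint shards, since each rank loads one consolidated.*.pth file; a quick sanity check using the paths from the command above:

# count the weight shards; 70B-chat ships 8 (consolidated.00.pth ... consolidated.07.pth)
ls llama-2-70b-chat/consolidated.*.pth | wc -l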

After starting the command, check the GPU status.
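I like to watch it refresh continuously (my choice of command; the original post only shows a screenshot):

watch -n 1 nvidia-smi    # refresh GPU memory and utilization once per second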

Then check the terminal output.

It ran smoothly! 

6 Fine-tuning

To be covered in a follow-up post.

Origin: blog.csdn.net/wade1010/article/details/131857538