Large-model training in practice: models, compute, and data in real training runs [LLaMA series (zhixi-13b), mt5 series (mt5-xxl-13b)]

I. LLaMA series

1. zhixi-13b-sft (includes the official LoRA module)

(base) root@container-be6711b100-146dc186:~/tmp/zhixi-13b-sft# ls -l --block-size=m
total 49653M
-rwxr-xr-x 1 root root    1M Jul  1 12:07 config.json
-rwxr-xr-x 1 root root    1M Jul  1 12:07 generation_config.json
-rwxr-xr-x 1 root root 9496M Jul  1 12:15 pytorch_model-00001-of-00006.bin
-rwxr-xr-x 1 root root 9481M Jul  1 12:22 pytorch_model-00002-of-00006.bin
-rwxr-xr-x 1 root root 9481M Jul  1 12:29 pytorch_model-00003-of-00006.bin
-rwxr-xr-x 1 root root 9411M Jul  1 12:36 pytorch_model-00004-of-00006.bin
-rwxr-xr-x 1 root root 9411M Jul  1 12:43 pytorch_model-00005-of-00006.bin
-rwxr-xr-x 1 root root 2376M Jul  1 12:45 pytorch_model-00006-of-00006.bin
-rwxr-xr-x 1 root root    1M Jul  1 12:45 pytorch_model.bin.index.json
-rwxr-xr-x 1 root root    1M Jul  1 12:45 special_tokens_map.json
-rwxr-xr-x 1 root root    1M Jul  1 12:45 tokenizer_config.json
-rwxr-xr-x 1 root root   
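
The shard sizes in the listing above can be used for a rough sanity check of the checkpoint's storage precision. The sketch below sums the six pytorch_model-*.bin sizes and divides by an assumed parameter count. Two things here are assumptions, not facts from the listing: that "13b" in the model name means roughly 13e9 parameters, and that the "M" unit printed by ls means MiB (1024*1024 bytes). The names shard_sizes_mib, ASSUMED_PARAM_COUNT, total_mib and bytes_per_param are illustrative only.

```python
# Shard sizes in MiB, copied from the `ls -l --block-size=m` listing above.
shard_sizes_mib = [9496, 9481, 9481, 9411, 9411, 2376]

# Assumption: "13b" in the model name means roughly 13e9 parameters.
ASSUMED_PARAM_COUNT = 13e9


def total_mib(sizes_mib):
    """Sum the per-shard sizes (each already rounded to whole MiB by ls)."""
    return sum(sizes_mib)


def bytes_per_param(sizes_mib, param_count):
    """Estimate average storage per parameter, assuming 1 MiB = 1024*1024 bytes."""
    total_bytes = total_mib(sizes_mib) * 1024 * 1024
    return total_bytes / param_count


total = total_mib(shard_sizes_mib)
ratio = bytes_per_param(shard_sizes_mib, ASSUMED_PARAM_COUNT)

print(f"total shard size: {total} MiB ({total / 1024:.1f} GiB)")
print(f"approx bytes per parameter: {ratio:.2f}")
```

Under these assumptions the estimate comes out at about 4 bytes per parameter, which would be consistent with weights saved as 32-bit floats rather than fp16. The shard sum is slightly larger than the "total" line that ls prints, probably because each per-file size is rounded to a whole unit, so treat the result as an estimate only.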

Reposted from blog.csdn.net/u013250861/article/details/131489398