[AI Combat] Compiling llama.cpp with cuBLAS for quantization; nvcc fatal: Value 'native' is not defined for option 'gpu-architecture'
Introduction to llama.cpp Quantization
When running LLaMA models, quantization is an indispensable step, both for keeping costs down and for a usable experience.
For deploying quantized LLaMA models with llama.cpp, refer to this article: [AI Combat] llama.cpp Quantized deployment of llama-33B
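To make the cost argument concrete, here is a rough back-of-the-envelope sketch of how much memory the weights alone occupy at different precisions. The helper name `weight_memory_gib` and the nominal 33e9 parameter count are assumptions chosen for illustration; the estimate ignores the KV cache, activations, and the per-block scale factors that real 4-bit formats store, so actual quantized files use somewhat more than 4 bits per weight.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


# Nominal size of a "33B" model (assumption; the exact parameter count differs slightly).
N_PARAMS = 33e9

for label, bits in [("fp16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label:>5}: {weight_memory_gib(N_PARAMS, bits):6.1f} GiB")
```

Under these assumptions, going from 16-bit to 4-bit weights cuts the weight footprint to a quarter, which is often the difference between needing several GPUs and fitting on one.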
Compiling the GPU version of llama.cpp
1. Error description
Compile with cuBLAS