Huawei Ascend Server on Ubuntu 20.04: Installation Record for the NPU Driver, Firmware, and CANN (Atlas 300V pro, Ascend 310P), Following the Atlas Center Inference Card 23.0.RC3 NPU Driver and Firmware Installation Guide 02

Reference article: Atlas Center Inference Card 23.0.RC3 NPU Driver and Firmware Installation Guide 02

Reference article: https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/700alpha002/softwareinstall/instg/instg_0028.html

Version matching table

First, check the version matching table to see whether your operating system supports your inference card (ours is the Atlas 300V pro), and pick a system version that matches. The kernel version must match as well.

The table does not list the Atlas 300V pro, but Huawei technical staff told us that Ubuntu 20.04 with the 5.4.0-100-generic kernel supports the installation. For a combination that is missing from the table, test it yourself or confirm it with Huawei technical support.

(We spent a lot of time installing Ubuntu 20.04. When we first hit a kernel version mismatch, we tried installing another suitable kernel version, but after switching kernels the network card no longer showed up and the network stopped working. We then installed a system image that ships the required kernel version directly, but because the machine was connected to the external network, the kernel was upgraded automatically during installation. In the end we installed with the network disconnected and disabled system upgrades before reconnecting to the external network, which solved the problem.)

Must read for users

Basic information

Server configuration information

Our card is the Atlas 300V pro video analysis card.

Precautions

You cannot mix installation methods

https://support.huawei.com/enterprise/zh/doc/EDOC1100332527/289e2d2d

Installation scenario description

Our scenario is to install the driver and firmware on a physical machine, then pull the official image for development and inference.

Physical machine installation

Actual operation

Installation process

https://support.huawei.com/enterprise/zh/doc/EDOC1100332527/c6904c01

We always install right after reinstalling the operating system, so for us the installation order is driver --> firmware.

Confirm operating system

https://support.huawei.com/enterprise/zh/doc/EDOC1100332527?section=j005

uname -m && cat /etc/*release

root@ky:/home/HwHiAiUser# uname -m && cat /etc/*release
aarch64
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04.4 LTS"
NAME="Ubuntu"
VERSION="20.04.4 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.4 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
root@ky:/home/HwHiAiUser#

Operating system kernel version:

We use the ubuntu-20.04.4-live-server-arm64.iso image and install it offline, which gives kernel version 5.4.0-100-generic by default. System upgrades must be disabled, otherwise the kernel version will be upgraded.
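
The article does not record how upgrades were disabled. One possible way, shown here as a sketch under assumptions, is to hold the kernel packages with `apt-mark` and to disable the periodic apt timers. The helper names `run` and `pin_kernel` and the exact package list are illustrative; check `dpkg -l 'linux-*'` on your machine before applying anything.

```shell
#!/bin/sh
# Assumption: holding kernel packages and disabling the apt timers is enough
# to keep the kernel fixed. The article only says upgrades were disabled.

# Print the command by default; run it only when APPLY=1 (needs root).
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

pin_kernel() {
    kernel=${1:-5.4.0-100-generic}
    # Hold the running kernel image and the meta-packages that pull in newer kernels.
    run apt-mark hold "linux-image-$kernel" linux-image-generic linux-headers-generic
    # Stop the periodic jobs that download and apply upgrades automatically.
    run systemctl disable --now apt-daily.timer apt-daily-upgrade.timer
}

# Dry run (prints the commands):  pin_kernel
# Apply as root:                  APPLY=1 pin_kernel
```

Running it without `APPLY=1` first makes it easy to review the commands before changing the system.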

Get the software packages and version mapping table

https://support.huawei.com/enterprise/zh/doc/EDOC1100332527?section=j006

  • Get the NPU card chip model of the device
    https://support.huawei.com/enterprise/zh/doc/EDOC1100332527?section=j006

    lspci -n -D | grep d500
    

    If device information is returned, the chip model of the NPU card is Ascend 310P.

  • Getting the software package
    Downloading the packages requires account permissions that I did not have, so in the end our operations engineers downloaded them for us.
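
The chip model check from the first bullet can be wrapped in a small helper. This is a minimal sketch: `count_d500` is a hypothetical name, and the sample line in the comment is made up for illustration, not copied from our server.

```shell
#!/bin/sh
# Hypothetical helper: read `lspci -n -D` output on standard input and print
# how many lines contain the PCI device ID d500.
count_d500() {
    grep -ci ':d500'
}

# Usage on the server:
#   lspci -n -D | count_d500
# An illustrative (invented) matching line looks like:
#   0000:01:00.0 1200: 19e5:d500 (rev 71)
```

A result of 0 means no device with ID d500 was found, so the card seating and PCIe slot should be checked before going further.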

Create running user

https://support.huawei.com/enterprise/zh/doc/EDOC1100332527?section=j007

Run as the root user:

groupadd HwHiAiUser
useradd -g HwHiAiUser -d /home/HwHiAiUser -m HwHiAiUser -s /bin/bash
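
If the script may be run more than once, the two commands can be guarded so that they are skipped when the group or user already exists. The sketch below is an illustration: `ensure_run_user` and the `DRY_RUN` switch are hypothetical names, not part of the official guide.

```shell
#!/bin/sh
# Create the run user and its group only if they do not exist yet.
# With DRY_RUN=1 the commands are printed instead of executed.
ensure_run_user() {
    user=${1:-HwHiAiUser}
    run=""
    [ "${DRY_RUN:-0}" = 1 ] && run=echo
    if ! getent group "$user" >/dev/null 2>&1; then
        $run groupadd "$user"
    fi
    if ! id -u "$user" >/dev/null 2>&1; then
        $run useradd -g "$user" -d "/home/$user" -m "$user" -s /bin/bash
    fi
}

# Preview:        DRY_RUN=1 ensure_run_user
# Apply as root:  ensure_run_user
```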

Confirm installation

Confirm whether to install the driver or firmware first

https://support.huawei.com/enterprise/zh/doc/EDOC1100332527?section=j008

Install driver
Confirm installation method

https://support.huawei.com/enterprise/zh/doc/EDOC1100332527/51429589

Basics

There are three installation methods:

  • Direct installation from binaries
  • Compilation and installation from source code
  • Installation from a rebuilt driver package

Because the operating system we installed already has a matching kernel version, we can install directly from the .run package (direct installation from binaries).

This is the route we should take:

Insert image description here

System compatibility requirements

Environment check
  • Operating system kernel version

    uname -r
    

    root@ky:/home/HwHiAiUser# uname -r
    5.4.0-100-generic
    root@ky:/home/HwHiAiUser#
    
    
  • Whether a driver package is already installed on the system

    lsmod | grep drv_pcie_host
    

    No output means the package has not been installed, so it can be installed directly.

  • Check whether the card is detected properly

    lspci | grep d500
    

  • Linux tools required during driver installation (omitted)

  • Related configuration files
    Insert image description here
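
The kernel and driver-module checks from the list above can be combined into one pre-install check. This is a sketch under assumptions: `preflight` is a hypothetical helper, and the return codes 1 and 2 are an arbitrary choice made for this example.

```shell
#!/bin/sh
# Hypothetical pre-install check.
#   $1: expected kernel version, $2: actual kernel version
#   standard input: output of `lsmod`
preflight() {
    want=$1 have=$2
    if [ "$want" != "$have" ]; then
        echo "kernel mismatch: want $want, have $have"
        return 1
    fi
    # Same test as `lsmod | grep drv_pcie_host` in the list above.
    if grep -q 'drv_pcie_host'; then
        echo "drv_pcie_host is loaded: a driver package is already installed"
        return 2
    fi
    echo "ok: kernel $have, no existing driver module"
}

# Usage on the server:
#   lsmod | preflight 5.4.0-100-generic "$(uname -r)"
```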

Of the three methods, we chose the first: installation from binaries (the .run package).
Install related basic dependencies

Reference article: https://www.hiascend.com/document/detail/zh/quick-installation/23.0.RC3/quickinstg/800_3000/quickinstg_800_3000_0013.html

Reference article: https://support.huawei.com/enterprise/zh/doc/EDOC1100332527/2645a51f

apt update

apt-get install -y gcc g++ make cmake zlib1g zlib1g-dev openssl libsqlite3-dev libssl-dev libffi-dev unzip pciutils net-tools libblas-dev gfortran libblas3 libopenblas-dev
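
After the dependencies are installed, it can be worth confirming that the tools are actually on the PATH. The helper below is a minimal sketch: `missing_commands` is a hypothetical name, and the command list in the usage line covers only some of the packages above.

```shell
#!/bin/sh
# Print every given command name that cannot be found on PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Usage after the apt-get step (lspci comes from pciutils, ifconfig from net-tools):
#   missing_commands gcc g++ make cmake unzip lspci ifconfig gfortran
```

No output means every listed command was found.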
Install driver
./Ascend-hdk-310p-npu-driver_23.0.rc3_linux-aarch64.run --full
Install firmware
./Ascend-hdk-310p-npu-firmware_7.0.0.5.242.run --full
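
The two steps above can be run from one script that keeps the driver-then-firmware order and stops at the first failure. This is a sketch, not the official procedure: `install_run_packages` is a hypothetical helper, and it assumes the .run files only need the execute bit and the `--full` option used above.

```shell
#!/bin/sh
# Run each .run package with --full, in the order given, and stop on the
# first missing file or failed installer.
install_run_packages() {
    for pkg in "$@"; do
        [ -f "$pkg" ] || { echo "missing package: $pkg" >&2; return 1; }
        chmod +x "$pkg" || return 1
        echo "installing $pkg"
        "$pkg" --full || { echo "installation failed: $pkg" >&2; return 1; }
    done
}

# Usage as root, driver first and firmware second:
#   install_run_packages \
#       ./Ascend-hdk-310p-npu-driver_23.0.rc3_linux-aarch64.run \
#       ./Ascend-hdk-310p-npu-firmware_7.0.0.5.242.run
```

Passing the paths with a leading `./` matters, because a bare file name would be looked up on the PATH instead of in the current directory.
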
Installation related to cann

Reference article: https://www.hiascend.com/document/detail/zh/quick-installation/23.0.RC3/quickinstg/800_3000/quickinstg_800_3000_0013.html

Only the steps that apply to our system version are recorded here.

Installation related to python

Please refer here: https://www.hiascend.com/document/detail/zh/quick-installation/23.0.RC3/quickinstg/800_3000/quickinstg_800_3000_0013.html

Install cann

In the /home/HwHiAiUser directory:

./Ascend-cann-toolkit_7.0.0.alpha002_linux-aarch64.run --install
Modify ~/.bashrc

After installing cann, you will be prompted:

Please make sure that the environment variables have been configured.
-  To take effect for all users, you can add "source /usr/local/Ascend/ascend-toolkit/set_env.sh" to /etc/profile.
-  To take effect for current user, you can exec command below: source /usr/local/Ascend/ascend-toolkit/set_env.sh or add "source /usr/local/Ascend/ascend-toolkit/set_env.sh" to ~/.bashrc.

As the root user:

vi ~/.bashrc

Add at the end:

source /usr/local/Ascend/ascend-toolkit/set_env.sh
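
Instead of editing the file by hand, the line can be appended from a script in a way that is safe to repeat. The sketch below is illustrative: `add_line_once` is a hypothetical helper that appends the line only when that exact line is not already in the file.

```shell
#!/bin/sh
# Append a line to a file only if that exact line is not already present.
add_line_once() {
    file=$1 line=$2
    touch "$file" || return 1
    # -x matches whole lines, -F treats the text literally (no regex).
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage:
#   add_line_once ~/.bashrc 'source /usr/local/Ascend/ascend-toolkit/set_env.sh'
```

Running it twice leaves a single copy of the line, which avoids sourcing set_env.sh repeatedly on every login.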

Check if cann is installed successfully
npu-smi info

If you can print the following information, it means there is no problem:

Insert image description here

Restart the system, then repeat the previous step to confirm it still works
reboot

Switch to root and run the command again; the returned information is normal:

npu-smi info

I don't know why, but running the above command as an ordinary user fails:

Insert image description here


At this point, the driver, firmware, and cann are successfully installed.

Origin blog.csdn.net/Dontla/article/details/135009119