Porting the open-source speech recognition library FastASR to the GEC6818 development board

FastASR porting

Porting and installing fftw3

1. Download source code 
wget -c http://www.fftw.org/fftw-3.3.10.tar.gz
2. Extract
tar -xzvf fftw-3.3.10.tar.gz
cd fftw-3.3.10
3. Configure the cross-compilation environment
export CC=arm-linux-gcc
export CXX=arm-linux-g++
sudo mkdir -p /usr/local/opt/fftw/
sudo chmod 777 /usr/local/opt/fftw/
./configure --host=arm-linux --enable-shared --enable-float \
            --prefix=/usr/local/opt/fftw/
4. Compile and install
make
make install
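
A quick sanity check after installing is to confirm that the library really is a 32-bit ARM binary rather than a host build. The following is a minimal sketch and not part of the original steps: elf_kind is a hypothetical helper that reads the ELF header with od, and it assumes a little-endian target. With --enable-float the library is expected to be named libfftw3f, so the path in the usage comment is an assumption about the install layout.

```shell
#!/bin/sh
# elf_kind FILE
# Hypothetical helper: prints "arm32" if FILE is a 32-bit little-endian
# ELF file whose machine type is ARM, otherwise prints "other".
elf_kind() {
    # Bytes 0-3: ELF magic (7f 45 4c 46).
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    # Byte 4: class, 01 means 32-bit.
    class=$(od -An -tx1 -j4 -N1 "$1" | tr -d ' \n')
    # Bytes 18-19: machine type; 28 00 is EM_ARM (40) in little-endian order.
    machine=$(od -An -tx1 -j18 -N2 "$1" | tr -d ' \n')
    if [ "$magic" = "7f454c46" ] && [ "$class" = "01" ] && [ "$machine" = "2800" ]; then
        echo "arm32"
    else
        echo "other"
    fi
}

# Usage (path assumed from the --prefix used above):
#   elf_kind /usr/local/opt/fftw/lib/libfftw3f.so
```

If the helper prints "other" for the installed library, the most likely cause is that configure did not pick up the cross compiler set in CC.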

Porting OpenBLAS

1. Download source code 
wget -c https://github.com/xianyi/OpenBLAS/releases/download/v0.3.20/OpenBLAS-0.3.20.tar.gz
2. Extract
tar -xzvf OpenBLAS-0.3.20.tar.gz
cd OpenBLAS-0.3.20
    
3. Compile 
make TARGET=ARMV7 HOSTCC=gcc BINARY=32 CC=arm-linux-gcc FC=arm-linux-gfortran 
   
4. Install 
sudo mkdir -p /usr/local/opt/openblas/
sudo make PREFIX=/usr/local/opt/openblas/ install
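
Before moving on, it may help to confirm that the install step actually populated the prefix. The sketch below is only an illustration: check_prefix is a hypothetical helper, and it assumes the usual layout of an include directory plus a lib directory containing at least one shared library.

```shell
#!/bin/sh
# check_prefix PREFIX
# Hypothetical helper: verifies that PREFIX contains include/ and lib/,
# and that lib/ holds at least one shared library (*.so*).
check_prefix() {
    prefix=$1
    for path in "$prefix/include" "$prefix/lib"; do
        [ -d "$path" ] || { echo "missing directory: $path"; return 1; }
    done
    # Expand the glob into the positional parameters; if nothing matches,
    # $1 stays as the literal pattern and the -e test fails.
    set -- "$prefix"/lib/*.so*
    [ -e "$1" ] || { echo "no shared libraries in $prefix/lib"; return 1; }
    echo "ok: $prefix"
}

# Usage (prefixes taken from the steps above):
#   check_prefix /usr/local/opt/openblas
#   check_prefix /usr/local/opt/fftw
```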

Porting FastASR

1. Download the latest version of the source code 
git clone https://github.com/chenkui164/FastASR.git
2. Create a build directory inside the source tree
cd FastASR/ 
mkdir build 
cd build

3. Write a cmake script for cross compilation

vi arm_linux_setup.cmake 
    
# Fill in the following content
set(CMAKE_SYSTEM_NAME Linux)
set(CMAKE_SYSTEM_PROCESSOR arm)
set(CMAKE_C_COMPILER /usr/local/arm/5.4.0/usr/bin/arm-linux-gcc)
set(CMAKE_CXX_COMPILER /usr/local/arm/5.4.0/usr/bin/arm-linux-g++)

Parameter description: CMAKE_C_COMPILER sets the path of the cross C compiler
         CMAKE_CXX_COMPILER sets the path of the cross C++ compiler

4. Generate the Makefile

cmake -DCMAKE_TOOLCHAIN_FILE=./arm_linux_setup.cmake ..

5. Compile and install

make 
make install 

6. Enter the examples directory and check that the executables were generated successfully
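
The toolchain file from step 3 can also be generated from the shell instead of being typed in vi, which makes it easier to adapt when the cross compiler is installed somewhere else. This is a minimal sketch: write_toolchain is a hypothetical helper, and it assumes the compilers are named arm-linux-gcc and arm-linux-g++ inside the directory passed to it.

```shell
#!/bin/sh
# write_toolchain TOOLCHAIN_BIN_DIR OUTPUT_FILE
# Hypothetical helper: writes a CMake toolchain file with the same four
# settings as in step 3, using the compilers found in TOOLCHAIN_BIN_DIR.
write_toolchain() {
    cat > "$2" <<EOF
set(CMAKE_SYSTEM_NAME Linux)
set(CMAKE_SYSTEM_PROCESSOR arm)
set(CMAKE_C_COMPILER $1/arm-linux-gcc)
set(CMAKE_CXX_COMPILER $1/arm-linux-g++)
EOF
}

# Usage (run inside the build directory; path taken from step 3):
#   write_toolchain /usr/local/arm/5.4.0/usr/bin arm_linux_setup.cmake
```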

 

Porting to the GEC6818 development board

1. Copy the generated k2_rnnt2_cli to the /bin directory of the development board

2. Copy all of the library files to the /lib directory of the development board

 

 

3. Copy the speech recognition model to the development board (see the original author's GitHub for model conversion)

 

4. Test use

[root@GEC6818 /]#k2_rnnt2_cli /yyy my.wav    
Audio time is 5.029750 s. len is 80476 
Model initialization takes 9.790232s. 
Result: "Have you eaten yet?" 
Model inference takes 18.692995s. 
[root@GEC6818 /]# 
    
// Command description
k2_rnnt2_cli /yyy my.wav
k2_rnnt2_cli: speech recognition program
/yyy: directory where the model files vocab.txt and wenet_params.bin are stored
my.wav: audio file to be recognized
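
For step 2 of this section, the shared libraries can first be gathered into one staging directory on the host, so that a single directory is transferred to the board. The following is a rough sketch under assumptions: stage_libs is a hypothetical helper, the source directories in the usage comment are the install prefixes used earlier, and how the staged files reach the board (SD card, network, serial) is left open.

```shell
#!/bin/sh
# stage_libs DEST SRC_DIR...
# Hypothetical helper: copies every *.so* entry from each SRC_DIR into DEST,
# keeping symbolic links as links (e.g. libfoo.so -> libfoo.so.3).
stage_libs() {
    dest=$1
    shift
    mkdir -p "$dest"
    for dir in "$@"; do
        for lib in "$dir"/*.so*; do
            # Skip the literal pattern when a directory has no matches.
            [ -e "$lib" ] || [ -L "$lib" ] || continue
            cp -RP "$lib" "$dest/"
        done
    done
}

# Usage (prefixes taken from the earlier install steps):
#   stage_libs ./board_lib /usr/local/opt/fftw/lib /usr/local/opt/openblas/lib
```

Keeping the symbolic links matters because the program is usually linked against a versioned library name, while the unversioned name is only a link to it.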

PS: Because the development board has no GPU and the program is built with a 32-bit compiler, recognition is slow (about 18.7 s of inference for roughly 5 s of audio in the run above).
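
The slowdown can be expressed as a real-time factor, that is, inference time divided by audio duration. The sketch below only redoes the arithmetic on the numbers printed in the test run; rtf is a hypothetical helper name. Dividing the reported len by the duration gives 16000, which suggests (as an inference, not something the log states) that len counts samples at 16 kHz.

```shell
#!/bin/sh
# rtf INFERENCE_SECONDS AUDIO_SECONDS
# Hypothetical helper: prints the real-time factor with two decimals.
rtf() {
    awk -v infer="$1" -v audio="$2" 'BEGIN { printf "%.2f\n", infer / audio }'
}

# Values copied from the test run: 18.692995 s inference, 5.029750 s audio.
rtf 18.692995 5.029750    # prints 3.72

# Samples per second implied by "len is 80476" over 5.029750 s.
awk 'BEGIN { printf "%.0f\n", 80476 / 5.029750 }'    # prints 16000
```

A factor of about 3.7 means the board needs nearly four seconds of computation per second of speech, not counting the 9.8 s model initialization.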

Attachment: the files that have already been ported

Link: Baidu Netdisk, extraction code: 2333


Origin blog.csdn.net/qq_34548424/article/details/127843064