iFlytek Offline Voice Command Recognition

Preparation

  • Register an iFlytek account and complete the real-name verification; only verified accounts can download the free resources. Console address: https://console.xfyun.cn/

  • After creating an application in the console, enable the offline command recognition capability for it


1. Copy the required SDK packages into your own project

1. Place these packages in the libs directory

Note: re-sync Gradle after adding the packages.


2. Place the assets folder in the app directory

The iflytek/recognize.xml file inside this folder is the UI layout for the recognition dialog. If it is missing, the app crashes as soon as speech recognition is started.

(Image: iFlytek Voice Recognition.assets/image-20230331101912961.png)

3. Add the following configuration to the module's build.gradle

The sourceSets block must be added; otherwise the SDK reports the error "object creation failed, please confirm that libmsc.so is placed correctly, and that createUtility is called for initialization". In essence, the native library cannot be found, so creating the SpeechRecognizer object fails. (A major pitfall!)

android {
    sourceSets {
        main {
            // Look for the native .so libraries in libs/
            jniLibs.srcDirs = ['libs']
        }
    }
}

dependencies {
    implementation files('libs/Msc.jar')
}
4. Copy the tool classes into the project

These are utility classes provided by iFlytek; the one used most here is the JSON parsing tool (JsonParser).

The settings classes hold the preferences for voice command recognition and for speech synthesis (the latter is not used for now).

(Image: iFlytek Voice Recognition.assets/image-20230331102924007.png)
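
The `JsonParser.parseIatResult()` call used throughout the listeners below consumes result JSON with a fixed shape: a `ws` array of words, each holding a `cw` array of candidates whose `w` field is the recognized text. As a rough illustration of that step (a regex simplification for this post only, not the SDK tool class's actual implementation, which walks the JSON structure with a real parser):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class IatResultDemo {

    // Simplified stand-in for JsonParser.parseIatResult(): concatenate every
    // "w" field found in the result JSON. For illustration only.
    static String parseIatResult(String json) {
        StringBuilder sb = new StringBuilder();
        Matcher m = Pattern.compile("\"w\":\"(.*?)\"").matcher(json);
        while (m.find()) {
            sb.append(m.group(1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A typical dictation result chunk returned by the SDK
        String json = "{\"sn\":1,\"ls\":true,\"bg\":0,\"ed\":0,"
                + "\"ws\":[{\"bg\":0,\"cw\":[{\"sc\":0,\"w\":\"打开\"}]},"
                + "{\"bg\":0,\"cw\":[{\"sc\":0,\"w\":\"灯\"}]}]}";
        System.out.println(parseIatResult(json)); // prints 打开灯
    }
}
```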

5. Add permissions in AndroidManifest.xml
  <!-- Internet permission, required for cloud speech capabilities -->
    <uses-permission android:name="android.permission.INTERNET"/>
    <!-- Microphone permission, required for dictation, recognition, and semantic understanding -->
    <uses-permission android:name="android.permission.RECORD_AUDIO"/>
    <!-- Read network state -->
    <uses-permission android:name="android.permission.ACCESS_NETWORK_STATE"/>
    <!-- External storage write permission, required for building grammars -->
    <uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE"/>

2. Create a new IatActivity and declare the necessary class members

private static final String TAG = "IatActivity";
// Speech dictation object
private SpeechRecognizer mIat;
// Dictation UI dialog
private RecognizerDialog mIatDialog;
// Dictation result display
private EditText mResultText;
// Store dictation results in a HashMap
private HashMap<String, String> mIatResults = new LinkedHashMap<>();
private SharedPreferences mSharedPreferences;
private Toast mToast;
private String mEngineType = "cloud";

3. Initialization listener

  private InitListener mInitListener = new InitListener() {

        @Override
        public void onInit(int code) {
            Log.d(TAG, "SpeechRecognizer init() code = " + code);
            if (code != ErrorCode.SUCCESS) {
                // "Initialization failed, error code: ..., see https://www.xfyun.cn/document/error-code for solutions"
                showTip("初始化失败,错误码:" + code + ",请点击网址https://www.xfyun.cn/document/error-code查询解决方案");
            }
        }
    };

4. Recognition listener (no-UI mode)

    /**
     * Dictation listener.
     */
    private RecognizerListener mRecognizerListener = new RecognizerListener() {

        @Override
        public void onBeginOfSpeech() {
            // The SDK's internal recorder is ready; the user can start speaking
            showTip("开始说话"); // "Start speaking"
        }

        @Override
        public void onError(SpeechError error) {
            // Tip: error code 10118 (no speech detected) may mean the recording
            // permission is denied; prompt the user to enable it for the app
            showTip(error.getPlainDescription(true));
        }

        @Override
        public void onEndOfSpeech() {
            // The trailing end point of speech was detected; recognition has
            // started and no further audio input is accepted
            showTip("结束说话"); // "Stopped speaking"
        }

        @Override
        public void onResult(RecognizerResult results, boolean isLast) {
            String text = JsonParser.parseIatResult(results.getResultString());
            mResultText.append(text);
            mResultText.setSelection(mResultText.length());
            if (isLast) {
                // TODO: handle the final result
            }
        }

        @Override
        public void onVolumeChanged(int volume, byte[] data) {
            showTip("当前正在说话,音量大小:" + volume); // "Speaking, volume: ..."
            Log.d(TAG, "返回音频数据:" + data.length); // "Audio data returned: ..."
        }

        @Override
        public void onEvent(int eventType, int arg1, int arg2, Bundle obj) {
            // Obtain the cloud session id; when something goes wrong, hand the id
            // to technical support so they can look up the session log and locate
            // the cause. With local (offline) capabilities the session id is null.
            if (SpeechEvent.EVENT_SESSION_ID == eventType) {
                String sid = obj.getString(SpeechEvent.KEY_EVENT_SESSION_ID);
                Log.d(TAG, "session id =" + sid);
            }
        }
    };

5. Dialog (UI) recognition listener

 private RecognizerDialogListener mRecognizerDialogListener = new RecognizerDialogListener() {

        @Override
        public void onResult(RecognizerResult results, boolean isLast) {
            Log.d(TAG, "recognizer result:" + results.getResultString());

            // The recognized content is the parsed JSON text; Bluetooth data can be sent from here
            String text = JsonParser.parseIatResult(results.getResultString());
            System.out.println("语音听写结果为:" + text); // "Dictation result: ..."
            if (!text.equals("")) {
                System.out.println("蓝牙发送了"); // "Bluetooth sent"
                bluetoothUtils.write("1");
            }
            mResultText.append(text);
            mResultText.setSelection(mResultText.length());
        }

        /**
         * Recognition error callback.
         */
        @Override
        public void onError(SpeechError error) {
            showTip(error.getPlainDescription(true));
        }

    };

6. Obtain offline resources

Strictly speaking, offline command recognition is not exercised here; the .jet files are the free offline resource packages provided by iFlytek.

private String getResourcePath() {
    StringBuffer tempBuffer = new StringBuffer();
    // Common recognition resource
    tempBuffer.append(ResourceUtil.generateResourcePath(this, ResourceUtil.RESOURCE_TYPE.assets, "iat/common.jet"));
    tempBuffer.append(";");
    tempBuffer.append(ResourceUtil.generateResourcePath(this, ResourceUtil.RESOURCE_TYPE.assets, "iat/sms_16k.jet"));
    // 8k recognition resource - uncomment when using 8k audio
    return tempBuffer.toString();
}

7. Parameter settings

Parameter setting configures the recognition engine, much like configuring a Spring Boot app in a yml file; the code below is just one example. Note that iFlytek has separate parameter sets for command recognition and for speech synthesis, so keep them apart.

   /**
     * Parameter settings.
     */
    public void setParam() {
        // Clear previous parameters
        mIat.setParameter(SpeechConstant.PARAMS, null);
        String lag = mSharedPreferences.getString("iat_language_preference", "mandarin");
        // Set the engine type (cloud or local)
        mIat.setParameter(SpeechConstant.ENGINE_TYPE, mEngineType);
        // Return results in JSON format
        mIat.setParameter(SpeechConstant.RESULT_TYPE, "json");

        if (mEngineType.equals(SpeechConstant.TYPE_LOCAL)) {
            // Set the local recognition resources
            mIat.setParameter(ResourceUtil.ASR_RES_PATH, getResourcePath());
        }
        // Online dictation supports several minority languages; see the online dictation speechDemo for details
        if (lag.equals("en_us")) {
            // Set the language to English
            mIat.setParameter(SpeechConstant.LANGUAGE, "en_us");
            mIat.setParameter(SpeechConstant.ACCENT, null);
        } else {
            // Set the language to Chinese
            mIat.setParameter(SpeechConstant.LANGUAGE, "zh_cn");
            // Set the accent / dialect
            mIat.setParameter(SpeechConstant.ACCENT, lag);
        }

        // Voice front end point (VAD_BOS): how long the user may stay silent before a timeout
        mIat.setParameter(SpeechConstant.VAD_BOS, mSharedPreferences.getString("iat_vadbos_preference", "4000"));

        // Voice back end point (VAD_EOS): how long a trailing silence is taken as end of input, stopping recording automatically
        mIat.setParameter(SpeechConstant.VAD_EOS, mSharedPreferences.getString("iat_vadeos_preference", "1000"));

        // Punctuation: "0" returns results without punctuation, "1" with punctuation
        mIat.setParameter(SpeechConstant.ASR_PTT, mSharedPreferences.getString("iat_punc_preference", "1"));

        // Audio save path; pcm and wav formats are supported.
        // Saving to the SD card requires the WRITE_EXTERNAL_STORAGE permission
        mIat.setParameter(SpeechConstant.AUDIO_FORMAT, "wav");
        mIat.setParameter(SpeechConstant.ASR_AUDIO_PATH,
                getExternalFilesDir("msc").getAbsolutePath() + "/iat.wav");
    }

8. Toast prompts

/**
 * Toast prompt.
 */
private void showTip(final String str) {
    if (mToast != null) {
        mToast.cancel();
    }
    mToast = Toast.makeText(getApplicationContext(), str, Toast.LENGTH_SHORT);
    mToast.show();
}

9. Privacy consent dialog

Apps normally ask for the user's consent before using phone capabilities; this dialog implements that consent step for the iFlytek privacy policy.

/**
 * Privacy consent dialog.
 */
private void showPrivacyDialog() {
    AppCompatTextView textView = new AppCompatTextView(this);
    textView.setPadding(100, 50, 100, 50);
    // "We take the protection of your personal information seriously and promise to
    // handle it strictly according to the iFlytek open platform Privacy Policy. Do you agree?"
    textView.setText(
            HtmlCompat.fromHtml("我们非常重视对您个人信息的保护,承诺严格按照讯飞开放平台<font color='#3B5FF5'>《隐私政策》</font>保护及处理您的信息,是否确定同意?",
                    HtmlCompat.FROM_HTML_MODE_LEGACY));
    textView.setOnClickListener(new View.OnClickListener() {
        @Override
        public void onClick(View v) {
            Intent intent = new Intent(Intent.ACTION_VIEW);
            intent.setData(Uri.parse("https://www.xfyun.cn/doc/policy/sdk_privacy.html"));
            startActivity(intent);
        }
    });
    AlertDialog dialog = new AlertDialog.Builder(this)
            .setTitle("温馨提示") // "Friendly reminder"
            .setView(textView)
            .setPositiveButton("同意", new DialogInterface.OnClickListener() { // "Agree"
                @Override
                public void onClick(DialogInterface dialog, int which) {
                    mSharedPreferences.edit().putBoolean(SpeechApp.PRIVACY_KEY, true).apply();
                    dialog.dismiss();
                }
            })
            .setNegativeButton("不同意", new DialogInterface.OnClickListener() { // "Disagree"
                @Override
                public void onClick(DialogInterface dialog, int which) {
                    mSharedPreferences.edit().putBoolean(SpeechApp.PRIVACY_KEY, false).apply();
                    finish();
                    System.exit(0);
                }
            })
            .create();
    dialog.setCanceledOnTouchOutside(false);
    dialog.show();
}

10. Dynamically apply for permissions

On newer versions of Android, declaring permissions in AndroidManifest.xml is not enough; you also need to request them at runtime, for example in onCreate().

Note: for Android 12 and above (SDK 33), write the permission array like this

 String permissions[] = {
         android.Manifest.permission.RECORD_AUDIO,
         android.Manifest.permission.ACCESS_NETWORK_STATE,
         android.Manifest.permission.INTERNET,
         android.Manifest.permission.WRITE_EXTERNAL_STORAGE
 };

/**
 * Android 6.0 and above require requesting permissions at runtime.
 */
private void initPermission() {
    String permissions[] = {
            Manifest.permission.RECORD_AUDIO,
            Manifest.permission.ACCESS_NETWORK_STATE,
            Manifest.permission.INTERNET,
            Manifest.permission.WRITE_EXTERNAL_STORAGE
    };

    // Collect the permissions that have not been granted yet
    ArrayList<String> toApplyList = new ArrayList<String>();
    for (String perm : permissions) {
        if (PackageManager.PERMISSION_GRANTED != ContextCompat.checkSelfPermission(this, perm)) {
            toApplyList.add(perm);
        }
    }
    if (!toApplyList.isEmpty()) {
        ActivityCompat.requestPermissions(this, toApplyList.toArray(new String[0]), 123);
    }
}

11. Initialize in onCreate() method

 public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        this.requestWindowFeature(Window.FEATURE_NO_TITLE);
        setContentView(R.layout.activity_iat);

        // Initialize the no-UI recognizer object; with SpeechRecognizer
        // you can build a custom interface from the callback messages
        mIat = SpeechRecognizer.createRecognizer(this, mInitListener);

        // Initialize the dictation dialog; if you only use the UI dictation, no SpeechRecognizer is needed.
        // When using the UI dictation, place the layout files and image resources
        // as described in notice.txt in the SDK directory
        mIatDialog = new RecognizerDialog(this, mInitListener);
        mSharedPreferences = getSharedPreferences(IatSettings.PREFER_NAME, Activity.MODE_PRIVATE);
        mResultText = findViewById(R.id.tv_result);
        Button_Test();
    }

12. Button click handler to test recognition

    private void Button_Test() {
        Button button = findViewById(R.id.button);
        button.setOnClickListener(new View.OnClickListener() {
            @Override
            public void onClick(View v) {
                if (null == mIat) {
                    // Creating the singleton failed; same cause as error 21001,
                    // see http://bbs.xfyun.cn/forum.php?mod=viewthread&tid=9688
                    showTip("创建对象失败,请确认 libmsc.so 放置正确,\n 且有调用 createUtility 进行初始化");
                    return;
                }

                mResultText.setText(null); // Clear the displayed content
                mIatResults.clear();
                // Set parameters
                setParam();
                boolean isShowDialog = mSharedPreferences.getBoolean(getString(R.string.pref_key_iat_show), true);
                if (isShowDialog) {
                    try {
                        // Show the dictation dialog
                        mIatDialog.setListener(mRecognizerDialogListener);
                        mIatDialog.show(); // TODO: the dialog fails to show and keeps throwing errors
                        showTip(getString(R.string.text_begin));
                    } catch (Exception e) {
                        throw new RuntimeException(e);
                    }
                } else {
                    // Start listening without the dictation dialog
                    int ret = mIat.startListening(mRecognizerListener);
                    if (ret != ErrorCode.SUCCESS) {
                        showTip("听写失败,错误码:" + ret + ",请点击网址https://www.xfyun.cn/document/error-code查询解决方案");
                    } else {
                        showTip(getString(R.string.text_begin));
                    }
                }
            }
        });
    }

13. Interface activity_iat.xml

<?xml version="1.0" encoding="utf-8"?>
<androidx.constraintlayout.widget.ConstraintLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:app="http://schemas.android.com/apk/res-auto"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent">

    <Button
        android:id="@+id/button"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_marginTop="93dp"
        android:layout_marginBottom="46dp"
        android:text="开始听写"
        app:layout_constraintBottom_toTopOf="@+id/tv_result"
        app:layout_constraintEnd_toEndOf="parent"
        app:layout_constraintHorizontal_bias="0.498"
        app:layout_constraintStart_toStartOf="parent"
        app:layout_constraintTop_toTopOf="parent" />

    <EditText
        android:id="@+id/tv_result"
        android:layout_width="0dp"
        android:layout_height="0dp"
        android:layout_marginBottom="493dp"
        android:layout_weight="1"
        android:gravity="center"
        android:text="语音识别到的内容"
        android:textColor="#000"
        app:layout_constraintBottom_toBottomOf="parent"
        app:layout_constraintEnd_toEndOf="parent"
        app:layout_constraintStart_toStartOf="parent"
        app:layout_constraintTop_toBottomOf="@+id/button" />

</androidx.constraintlayout.widget.ConstraintLayout>

14. strings.xml

This file comes with the demo downloaded from iFlytek's official website. It contains shared strings; most new projects won't use all of them, but it is safest to include the file anyway, since missing entries cause resource-not-found errors (another pitfall). Delete the entries you don't need.

<?xml version="1.0" encoding="utf-8"?>
<resources>

    <string name="app_name">讯飞蓝牙集成app</string>
    <!-- Replace with the appid applied for on the iFlytek open platform -->
    <string name="app_id">23810fb7</string>
    <string name="example_explain">本示例为讯飞语音Android平台开发者提供语音听写、语法识别、语义理解和语音合成等代码样例,旨在让用户能够依据该示例快速开发出基于语音接口的应用程序。</string>
    <string name="text_tts_source">
        科大讯飞作为智能语音技术提供商,在智能语音技术领域有着长期的研究积累,并在中文语音合成、语音识别、口语评测等多项技术上拥有技术成果。科大讯飞是我国以语音技术为产业化方向的“国家863计划产业化基地”.....
    </string>
    <string name="text_tts_source_en">iFLYTEK is a national key software enterprise dedicated to the research of intelligent speech and language technologies, development of software and chip products, provision of speech information services, and integration of E-government systems. The intelligent speech technology of iFLYTEK, the core technology of the company, represents the top level in the world.
    </string>
    <string name="text_isr_abnf_hint">\t上传内容为:\n\t#ABNF 1.0 gb2312;\n\tlanguage zh-CN;\n\tmode voice;\n\troot $main;\n\t$main = $place1 到$place2 ;\n\t$place1 = 北京 | 武汉 | 南京 | 天津 | 东京;\n\t$place2 = 上海 | 合肥;</string>
    <string name="text_understand_hint">\t您可以说:\n\t今天的天气怎么样?\n\t北京到上海的火车?\n\t来首歌吧?\n\n\t更多语义请登录:\n\thttp://aiui.xfyun.cn/ \n\t配置您的专属语义吧!</string>

    <!-- Dictation -->
    <string name="text_begin">请开始说话&#8230;</string>
    <string name="text_begin_recognizer">开始音频流识别</string>
    <string name="text_upload_contacts">上传联系人</string>
    <string name="text_upload_userwords">上传用户词表</string>
    <string name="text_upload_success">上传成功</string>
    <string name="text_userword_empty">词表下载失败或内容为空</string>
    <string name="text_download_success">下载成功</string>
    <string name="pref_key_iat_show">iat_show</string>
    <string name="pref_title_iat_show">显示听写界面</string>

    <string name="pref_key_translate">translate</string>
    <string name="pref_title_translate">翻译</string>

    <!-- Synthesis -->
    <string-array name="voicer_cloud_entries">
        <item>小燕</item>
        <item>小宇</item>
        <item>凯瑟琳</item>
        <item>亨利</item>
        <item>玛丽</item>
        <item>小研</item>
        <item>小琪</item>
        <item>小峰</item>
        <item>小梅</item>
        <item>小莉</item>
        <item>小蓉</item>
        <item>小芸</item>
        <item>小坤</item>
        <item>小强 </item>
        <item>小莹</item>
        <item>小新</item>
        <item>楠楠</item>
        <item>老孙</item>
    </string-array>
    <string-array name="voicer_cloud_values">
        <item>xiaoyan</item>
        <item>xiaoyu</item>
        <item>catherine</item>
        <item>henry</item>
        <item>vimary</item>
        <item>vixy</item>
        <item>xiaoqi</item>
        <item>vixf</item>
        <item>xiaomei</item>
        <item>xiaolin</item>
        <item>xiaorong</item>
        <item>xiaoqian</item>
        <item>xiaokun</item>
        <item>xiaoqiang</item>
        <item>vixying</item>
        <item>xiaoxin</item>
        <item>nannan</item>
        <item>vils</item>
    </string-array>
    <string-array name="voicer_xtts_entries">
        <item>nannan</item>
        <item>小关</item>
        <item>宜丰</item>
        <item>小曦</item>
        <item>小燕</item>
        <!--  <item>小峰</item>-->
    </string-array>
    <string-array name="voicer_xtts_values">
        <item>nannan</item>
        <item>xiaoguan</item>
        <item>yifeng</item>
        <item>xiaoxi</item>
        <item>xiaoyan</item>
        <!--   <item>xiaofeng</item>-->
    </string-array>
    <string-array name="voicer_local_entries">
        <item>小燕</item>
        <item>小峰</item>

    </string-array>
    <string-array name="voicer_local_values">
        <item>xiaoyan</item>
        <item>xiaofeng</item>
    </string-array>
    <string-array name="stream_entries">
        <item>通话</item>
        <item>系统</item>
        <item>铃声</item>
        <item>音乐</item>
        <item>闹铃</item>
        <item>通知</item>
    </string-array>
    <string-array name="stream_values">
        <item>0</item>
        <item>1</item>
        <item>2</item>
        <item>3</item>
        <item>4</item>
        <item>5</item>
    </string-array>
    <string name="tts_toast_format" formatted="false">缓冲进度为%d%%,播放进度为%d%%</string>
    <!-- Language -->
    <string-array name="language_entries">
        <item>普通话</item>
        <item>粤语</item>
        <item>英语</item>
    </string-array>
    <string-array name="language_values">
        <item>mandarin</item>
        <item>cantonese</item>
        <item>en_us</item>
    </string-array>

    <!-- Punctuation -->
    <string-array name="punc_entries">
        <item>有标点</item>
        <item>无标点</item>
    </string-array>
    <string-array name="punc_values">
        <item>1</item>
        <item>0</item>
    </string-array>

    <!-- Wake-up -->
    <string name="example_explain_wake">唤醒是指通过说出特定的唤醒词来唤醒处于休眠状态下的设备,又分为唤醒和唤醒+识别。</string>
    <string name="wake_demo_hint">请点击“开始唤醒”后读出您在开放平台购买的唤醒词,当引擎计算得分大于您设置的门限值时,即可进行唤醒。</string>
    <string name="oneshot_demo_hint">请点击“唤醒+识别”后读出您在开放平台购买的唤醒词+语法中指定的识别对象,当引擎计算得分大于您设置的门限值时,即可进行唤醒+识别。</string>
    <string name="oneshot_resource_hint">本示例识别词语为“张三|李四|张海洋”,请开发者将语法文件中的唤醒词字段替换,本地语法定义详见demo工程的asset/wake.bnf文件。</string>
</resources>

Test result


The blogger plans to integrate voice and Bluetooth, sending Bluetooth commands through voice commands. To learn more about Android Bluetooth, see the links below.

Complete code repository: Chafan_Matrix/Bluetooth_Send - Code Cloud - Open Source China (gitee.com)

Bluetooth development blog: Integrating a Bluetooth app for Android 12 (SDK 33) and above - Chafan_Matrix's blog - CSDN blog
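
Rather than sending a fixed "1" for any non-empty result, as the dialog listener above does, the recognized text could be mapped to distinct Bluetooth commands. A minimal sketch of that idea (the phrases and command characters here are made-up assumptions for illustration):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class VoiceCommandMapper {

    // Hypothetical phrase-to-command table; both the phrases and the
    // single-character commands are assumptions for illustration only.
    private static final Map<String, String> COMMANDS = new LinkedHashMap<>();
    static {
        COMMANDS.put("开灯", "1"); // "turn the light on"
        COMMANDS.put("关灯", "0"); // "turn the light off"
    }

    // Return the command for the first known phrase contained in the
    // recognized text, or null when nothing matches.
    public static String mapToCommand(String recognizedText) {
        for (Map.Entry<String, String> e : COMMANDS.entrySet()) {
            if (recognizedText.contains(e.getKey())) {
                return e.getValue();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(mapToCommand("请帮我开灯")); // prints 1
    }
}
```

In the dialog listener, `bluetoothUtils.write(mapToCommand(text))` (after a null check) would then send a different command for each recognized phrase.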

Reprinted from: blog.csdn.net/weixin_45833112/article/details/129885232