IBM Speech Recognition (IBM Speech to Text: converting speech to text)

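1. Go to https://www.ibm.com/watson/developercloud/speech-to-text.html and register an account.

2. Open https://console.ng.bluemix.net/catalog/?category=watson, click Watson in the left-hand menu, and select Speech to Text.

3. Click the Create button at the bottom of the page.

4. Click New Credential.

5. Enter a name and click Add.

6. Click View Credentials to see the username, password, and request URL. (Newly created credentials may not work right away; there is sometimes a delay of unpredictable length. In my case I could not connect at all on the day I created them, but the same test succeeded two days later.)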

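7. Create a project, import speech-android-wrapper as a library, and add it to the project. Keep the SDK version in build.gradle below 23; otherwise the build fails because org.apache.http cannot be found, since the Apache HTTP client was removed from the Android SDK starting with API level 23.
(Sample code on GitHub: https://github.com/watson-developer-cloud/speech-android-sdk)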

8. Add the dependency to the new project

9. Main screen layout

10. Authorization

    if (!initSTT()) {
        displayResult("Error: no authentication credentials/token available, please enter your authentication information");
        return;
    }

    private boolean initSTT() {
        // DISCLAIMER: please enter your credentials or token factory in the lines below
        String username = "173e235d-5d3d-453e-9d97-0b0f77bdac19"; // username from the credentials page
        String password = "vjpqIUx0Ss8x"; // password from the credentials page
        String serviceURL = "wss://stream.watsonplatform.net/speech-to-text/api"; // service URL

        SpeechConfiguration sConfig = new SpeechConfiguration(SpeechConfiguration.AUDIO_FORMAT_OGGOPUS);
        sConfig.learningOptOut = false; // Change to true to opt-out

        SpeechToText.sharedInstance().initWithContext(this.getHost(serviceURL), context, sConfig); // set the service URL
        SpeechToText.sharedInstance().setCredentials(username, password); // set the username and password

        SpeechToText.sharedInstance().setModel("en-US_BroadbandModel"); // set the language model to recognize
        SpeechToText.sharedInstance().setDelegate(this); // register this class as the listener
        return true;
    }
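The initSTT() listing calls this.getHost(serviceURL), which is not shown in this post. Below is a minimal sketch of such a helper, assuming it only needs to turn the URL string into a java.net.URI and return null when the string is malformed. The class name HostParser is hypothetical, and the real helper in the sample project may differ.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper class; only the method name getHost comes from the listing above.
final class HostParser {
    private HostParser() {}

    // Parses the service URL into a URI; returns null if the string is not a valid URI.
    static URI getHost(String url) {
        try {
            return new URI(url);
        } catch (URISyntaxException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        URI uri = getHost("wss://stream.watsonplatform.net/speech-to-text/api");
        // Print the parts of the URI that a WebSocket client would connect to.
        System.out.println(uri.getScheme() + " " + uri.getHost() + " " + uri.getPath());
    }
}
```

Returning null keeps the call site simple, but the caller should then check the result before passing it on to initWithContext.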

11. Check whether the service is reachable

    if (jsonModels == null) {
        jsonModels = new STTCommands().doInBackground();
        if (jsonModels == null) {
            displayResult("Please, check internet connection."); // the service could not be reached
            return;
        }
    }

12. Override the listener callbacks

    // The activity implements the ISpeechDelegate listener and overrides its callbacks.
    public class MainActivity extends Activity implements ISpeechDelegate {

        @Override
        public void onOpen() { // authentication succeeded and the connection to the server is open
            Log.d(TAG, "onOpen");
            mState = ConnectionState.CONNECTED;
        }

        @Override
        public void onError(String error) { // the connection failed
            Log.e(TAG, "onError...................." + error);
            mState = ConnectionState.IDLE;
        }

        @Override
        public void onClose(int code, String reason, boolean remote) { // the connection was closed
            Log.d(TAG, "onClose, code: " + code + " reason: " + reason);
            mState = ConnectionState.IDLE;
        }

        @Override
        public void onMessage(String message) { // receive the recognized text and show it on screen
            Log.d(TAG, "onMessage, message: " + message);
            try {
                JSONObject jObj = new JSONObject(message);
                if (jObj.has("state")) {
                    // state message
                    Log.d(TAG, "Status message: " + jObj.getString("state"));
                } else if (jObj.has("results")) {
                    // results message
                    Log.d(TAG, "Results message: ");
                    JSONArray jArr = jObj.getJSONArray("results");
                    for (int i = 0; i < jArr.length(); i++) {
                        JSONObject obj = jArr.getJSONObject(i);
                        JSONArray jArr1 = obj.getJSONArray("alternatives");
                        String str = jArr1.getJSONObject(0).getString("transcript");
                        // remove whitespaces if the language requires it
                        // String model = "en-US_MichaelVoice";
                        String model = "en-US_BroadbandModel";
                        if (model.startsWith("ja-JP") || model.startsWith("zh-CN")) {
                            str = str.replaceAll("\\s+", "");
                        }
                        String strFormatted = Character.toUpperCase(str.charAt(0)) + str.substring(1);
                        if (obj.getString("final").equals("true")) {
                            String stopMarker = (model.startsWith("ja-JP") || model.startsWith("zh-CN")) ? "。" : ". ";
                            // drop the last character (assumed to be a trailing space) and append the stop marker
                            mRecognitionResults += strFormatted.substring(0, strFormatted.length() - 1) + stopMarker;

                            displayResult(mRecognitionResults);
                        } else {
                            displayResult(mRecognitionResults + strFormatted);
                        }
                        break; // only the first result is used
                    }
                } else {
                    displayResult("unexpected data coming from stt server: \n" + message);
                }

            } catch (JSONException e) {
                Log.e(TAG, "Error parsing JSON");
                e.printStackTrace();
            }
        }
    }
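The string handling inside onMessage can be pulled out into a small pure function, which makes it easier to test without a device. The sketch below is an illustration rather than part of the SDK: the class and method names are hypothetical. It also differs from the listing in two ways: it trims surrounding whitespace instead of always dropping the last character, and it returns an empty string for an empty transcript instead of failing on charAt(0).

```java
// Hypothetical helper; mirrors the transcript formatting done inside onMessage.
final class TranscriptFormatter {
    private TranscriptFormatter() {}

    // True for models whose languages are written without spaces between words.
    static boolean isUnspacedLanguage(String model) {
        return model.startsWith("ja-JP") || model.startsWith("zh-CN");
    }

    // Capitalizes the first character and, for final results, appends a stop marker.
    static String format(String transcript, String model, boolean isFinal) {
        String text = isUnspacedLanguage(model)
                ? transcript.replaceAll("\\s+", "")   // remove all whitespace
                : transcript.trim();                  // remove leading/trailing whitespace only
        if (text.isEmpty()) {
            return "";
        }
        String formatted = Character.toUpperCase(text.charAt(0)) + text.substring(1);
        if (!isFinal) {
            return formatted;                         // interim result: no stop marker yet
        }
        // "\u3002" is the ideographic full stop used in Japanese and Chinese text.
        return formatted + (isUnspacedLanguage(model) ? "\u3002" : ". ");
    }
}
```

With a helper like this, onMessage would only parse the JSON and then call TranscriptFormatter.format(str, model, isFinal) before handing the result to displayResult.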

Tap Record, say "Test", then tap Stop; the recognized text is returned and set on the text view.

13. What happens when the Record button is tapped

    btRecord.setOnClickListener(new View.OnClickListener() {
        @Override
        public void onClick(View v) {
            if (mState == ConnectionState.IDLE) {
                mState = ConnectionState.CONNECTING;
                Log.d(TAG, "onClickRecord: IDLE -> CONNECTING");
                mRecognitionResults = "";
                displayResult(mRecognitionResults);
                SpeechToText.sharedInstance().setModel("en-US_BroadbandModel"); // recognize English
                // start recognition
                new AsyncTask<Void, Void, Void>() {
                    @Override
                    protected Void doInBackground(Void... none) {
                        SpeechToText.sharedInstance().recognize(); // start recognition
                        return null;
                    }
                }.execute();
                btRecord.setText("Connecting...");
            }
        }
    });


 

14. Stop recognition

    btStop.setOnClickListener(new View.OnClickListener() {
        @Override
        public void onClick(View v) {
            if (mState == ConnectionState.CONNECTED) {
                mState = ConnectionState.IDLE;
                Log.d(TAG, "onClickRecord: CONNECTED -> IDLE");
                SpeechToText.sharedInstance().stopRecognition(); // stop recognition
                btRecord.setText("Record");
            }
        }
    });
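The listings in steps 12 to 14 use ConnectionState.IDLE, CONNECTING and CONNECTED, but the type itself is not shown. A minimal sketch, assuming it is a plain enum with those three constants; the canMoveTo method is a hypothetical addition that simply records which transitions the callbacks and click handlers perform:

```java
// Assumed definition: the listings only show the three constant names.
enum ConnectionState {
    IDLE, CONNECTING, CONNECTED;

    // Returns true if the listings in this post ever move from this state to `next`.
    boolean canMoveTo(ConnectionState next) {
        switch (this) {
            case IDLE:
                return next == CONNECTING;                 // Record tapped
            case CONNECTING:
                return next == CONNECTED || next == IDLE;  // onOpen, or onError/onClose
            case CONNECTED:
                return next == IDLE;                       // Stop tapped, onError, or onClose
            default:
                return false;
        }
    }
}
```

One consequence of these guards is that tapping Record or Stop while the state is CONNECTING does nothing: Record requires IDLE and Stop requires CONNECTED.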

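Notes:

    1. Keep the compile and build versions in build.gradle at 21 or lower.

    2. The server is hosted outside China, so a VPN connection is required.

    3. The callback methods run on a worker thread, so do not modify the UI directly inside them; if the UI needs updating, switch to the main thread first.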
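Note 3 can be illustrated with a plain-Java sketch. On Android the hand-off would normally go through Activity.runOnUiThread(Runnable) or a Handler bound to Looper.getMainLooper(). So that the example runs on an ordinary JVM, a single-threaded executor named "ui" stands in for the main thread here. All class, field and method names below are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Stand-in for the UI thread: a single-threaded executor runs every posted task
// on one dedicated thread, in submission order.
final class UiThreadHandOff {
    static volatile String shownText = "";   // stand-in for the text view's content
    static volatile String updatedOn = "";   // name of the thread that performed the update

    private UiThreadHandOff() {}

    // Called from a worker thread; posts the actual update to the "ui" thread.
    static void displayResult(ExecutorService uiThread, String result) {
        uiThread.execute(() -> {
            shownText = result;
            updatedOn = Thread.currentThread().getName();
        });
    }

    // Simulates a delegate callback arriving on a worker thread.
    static void demo() {
        ExecutorService uiThread = Executors.newSingleThreadExecutor(r -> new Thread(r, "ui"));
        Thread callback = new Thread(() -> displayResult(uiThread, "Test. "), "callback");
        callback.start();
        try {
            callback.join();                                 // the update has been posted by now
            uiThread.shutdown();                             // already-posted tasks still run
            uiThread.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The worker thread never touches shownText itself; it only posts a task, which is the same division of labor that runOnUiThread gives an Android activity.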


Reposted from blog.csdn.net/weixin_42600182/article/details/81104379