IBM speech recognition (IBM Speech to Text: converting speech into text)

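1. Go to https://www.ibm.com/watson/developercloud/speech-to-text.html and register an account.

2. Open the catalog at https://console.ng.bluemix.net/catalog/?category=watson and click Watson in the menu on the left side of the page, then select Speech to Text.

3. Click the Create button at the bottom of the page.

4. Click New Credential.

5. Enter a name and click Add.

6. Click View Credentials to see the username, password, and request URL. (Newly created credentials may not work right away. Sometimes there is a delay, and I do not know how long it is. In my case, connections kept failing on the day I applied, and the same test worked two days later.)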

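7. Create a project, import speech-android-wrapper as a library, and add it to the project. Keep the SDK version in build.gradle below 23; otherwise the build fails because org.apache.http cannot be found, since that API was removed from the Android SDK in API level 23.
(Sample code on GitHub: https://github.com/watson-developer-cloud/speech-android-sdk)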

8. Add the dependency to the new project

9. Main screen layout

10. Authorization

if (!initSTT()) {
    displayResult("Error: no authentication credentials/token available, please enter your authentication information");
    return;
}

private boolean initSTT() {
    // DISCLAIMER: please enter your credentials or token factory in the lines below
    String username = "173e235d-5d3d-453e-9d97-0b0f77bdac19"; // username from the service credentials
    String password = "vjpqIUx0Ss8x"; // password from the service credentials
    String serviceURL = "wss://stream.watsonplatform.net/speech-to-text/api"; // server address
    SpeechConfiguration sConfig = new SpeechConfiguration(SpeechConfiguration.AUDIO_FORMAT_OGGOPUS);
    sConfig.learningOptOut = false; // Change to true to opt-out
    SpeechToText.sharedInstance().initWithContext(this.getHost(serviceURL), context, sConfig); // set the server address
    SpeechToText.sharedInstance().setCredentials(username, password); // set the username and password
    SpeechToText.sharedInstance().setModel("en-US_BroadbandModel"); // set the language model to recognize
    SpeechToText.sharedInstance().setDelegate(this); // register the callback delegate
    return true;
}
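
The getHost(serviceURL) helper called in initSTT() is not shown in this post. Below is a minimal sketch of what such a helper might look like, assuming it only parses the URL string into a java.net.URI. The class name ServiceHost and the null-on-error behavior are assumptions made for illustration, not confirmed SDK code.

```java
import java.net.URI;
import java.net.URISyntaxException;

class ServiceHost {
    /**
     * Hypothetical helper: parses the service URL into a URI.
     * Returns null when the string is not a valid URI (assumed behavior).
     */
    static URI getHost(String url) {
        try {
            return new URI(url);
        } catch (URISyntaxException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        URI host = getHost("wss://stream.watsonplatform.net/speech-to-text/api");
        // Print the parts the WebSocket client would connect to.
        System.out.println(host.getScheme() + " " + host.getHost() + " " + host.getPath());
    }
}
```

If the helper can return null as sketched here, the caller should check the result before passing it to initWithContext.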

11. Check whether the service is reachable

if (jsonModels == null) {
    jsonModels = new STTCommands().doInBackground();
    if (jsonModels == null) {
        displayResult("Please, check internet connection."); // the service could not be reached
        return;
    }
}

12. Override the delegate methods

// The activity implements the delegate interface and overrides its methods
public class MainActivity extends Activity implements ISpeechDelegate

@Override
public void onOpen() { // authenticated and connected to the server
    Log.d(TAG, "onOpen");
    mState = ConnectionState.CONNECTED;
}

@Override
public void onError(String error) { // connection error
    Log.e(TAG, "onError...................." + error);
    mState = ConnectionState.IDLE;
}

@Override
public void onClose(int code, String reason, boolean remote) { // connection closed
    Log.d(TAG, "onClose, code: " + code + " reason: " + reason);
    mState = ConnectionState.IDLE;
}

@Override
public void onMessage(String message) { // receive the recognized text and show it on screen
    Log.d(TAG, "onMessage, message: " + message);
    try {
        JSONObject jObj = new JSONObject(message);
        if (jObj.has("state")) {
            // state message
            Log.d(TAG, "Status message: " + jObj.getString("state"));
        } else if (jObj.has("results")) {
            // results message
            Log.d(TAG, "Results message: ");
            JSONArray jArr = jObj.getJSONArray("results");
            for (int i = 0; i < jArr.length(); i++) {
                JSONObject obj = jArr.getJSONObject(i);
                JSONArray jArr1 = obj.getJSONArray("alternatives");
                String str = jArr1.getJSONObject(0).getString("transcript");
                // remove whitespaces if the language requires it
                String model = "en-US_BroadbandModel";
                boolean unspaced = model.startsWith("ja-JP") || model.startsWith("zh-CN");
                if (unspaced) {
                    str = str.replaceAll("\\s+", "");
                }
                if (str.isEmpty()) {
                    break; // nothing to show; also avoids charAt(0) on an empty string
                }
                String strFormatted = Character.toUpperCase(str.charAt(0)) + str.substring(1);
                if (obj.getBoolean("final")) {
                    String stopMarker = unspaced ? "。" : ". ";
                    // trim() drops the trailing space without cutting a real character
                    mRecognitionResults += strFormatted.trim() + stopMarker;
                    displayResult(mRecognitionResults);
                } else {
                    displayResult(mRecognitionResults + strFormatted);
                }
                break; // only the first result is used
            }
        } else {
            displayResult("unexpected data coming from stt server: \n" + message);
        }
    } catch (JSONException e) {
        Log.e(TAG, "Error parsing JSON");
        e.printStackTrace();
    }
}

Click Record, say "Test", then click Stop. The recognized text is returned and set on the text view.
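
The transcript handling inside onMessage can be tried out without Android or a network connection. The sketch below isolates that formatting step as a pure function, under these assumptions: TranscriptFormatter, format and isUnspacedLanguage are hypothetical names, and trailing whitespace is trimmed before the stop marker is appended.

```java
class TranscriptFormatter {
    /** Japanese and Chinese models are treated as languages written without spaces. */
    static boolean isUnspacedLanguage(String model) {
        return model.startsWith("ja-JP") || model.startsWith("zh-CN");
    }

    /**
     * Capitalizes the first character of a transcript and, for final results,
     * trims trailing whitespace and appends a sentence-ending marker.
     */
    static String format(String transcript, String model, boolean isFinal) {
        boolean unspaced = isUnspacedLanguage(model);
        String str = unspaced ? transcript.replaceAll("\\s+", "") : transcript;
        if (str.isEmpty()) {
            return ""; // nothing recognized
        }
        String formatted = Character.toUpperCase(str.charAt(0)) + str.substring(1);
        if (!isFinal) {
            return formatted; // interim result: shown as-is
        }
        String stopMarker = unspaced ? "。" : ". ";
        return formatted.replaceAll("\\s+$", "") + stopMarker;
    }

    public static void main(String[] args) {
        System.out.println(format("hello world ", "en-US_BroadbandModel", true));
    }
}
```

Keeping this logic in a plain function makes it easy to unit-test separately from the WebSocket callback.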

13. What happens when the Record button is clicked

btRecord.setOnClickListener(new View.OnClickListener() {
    @Override
    public void onClick(View v) {
        if (mState == ConnectionState.IDLE) {
            mState = ConnectionState.CONNECTING;
            Log.d(TAG, "onClickRecord: IDLE -> CONNECTING");
            mRecognitionResults = "";
            displayResult(mRecognitionResults);
            SpeechToText.sharedInstance().setModel("en-US_BroadbandModel"); // recognize English
            // start recognition on a background thread
            new AsyncTask<Void, Void, Void>() {
                @Override
                protected Void doInBackground(Void... none) {
                    SpeechToText.sharedInstance().recognize(); // start recognition
                    return null;
                }
            }.execute();
            btRecord.setText("Connecting...");
        }
    }
});

14. Stop recognition

btStop.setOnClickListener(new View.OnClickListener() {
    @Override
    public void onClick(View v) {
        if (mState == ConnectionState.CONNECTED) {
            mState = ConnectionState.IDLE;
            Log.d(TAG, "onClickStop: CONNECTED -> IDLE");
            SpeechToText.sharedInstance().stopRecognition(); // stop recognition
            btRecord.setText("Record");
        }
    }
});
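
The Record and Stop handlers, together with the delegate callbacks, form a small state machine over ConnectionState (IDLE, CONNECTING, CONNECTED). The enum definition is not shown in this post, so the sketch below assumes those three values. RecognitionSession and its method names are hypothetical and only model the transitions, without any SDK calls.

```java
class RecognitionSession {
    enum ConnectionState { IDLE, CONNECTING, CONNECTED }

    private ConnectionState state = ConnectionState.IDLE;

    ConnectionState state() {
        return state;
    }

    /** Record pressed: only an idle session may start connecting. */
    boolean onRecordClicked() {
        if (state != ConnectionState.IDLE) {
            return false; // already connecting or connected, ignore the click
        }
        state = ConnectionState.CONNECTING;
        return true;
    }

    /** Mirrors the onOpen callback: the server connection is established. */
    void onOpen() {
        state = ConnectionState.CONNECTED;
    }

    /** Stop pressed: only a connected session can be stopped. */
    boolean onStopClicked() {
        if (state != ConnectionState.CONNECTED) {
            return false; // nothing to stop yet
        }
        state = ConnectionState.IDLE;
        return true;
    }

    /** Mirrors onError and onClose: both return the session to IDLE. */
    void onErrorOrClose() {
        state = ConnectionState.IDLE;
    }
}
```

One consequence visible in this model: pressing Stop while the state is still CONNECTING does nothing, so the user has to wait for onOpen before recognition can be stopped.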
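
Notes: 1. The compile and build versions in build.gradle should not exceed 21.

2. The server is hosted overseas, so you need to connect through a VPN.

3. The callback methods run on a worker thread, so do not modify the UI directly inside them. If the UI needs updating, switch to the main thread first.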
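
The thread hand-off in note 3 can be illustrated in plain Java. In the sketch below a single-thread executor named "ui-thread" stands in for Android's main thread. On Android itself the same role would be played by Activity.runOnUiThread(Runnable) or a Handler created with Looper.getMainLooper(). UiDispatchSketch and demo are hypothetical names used only for this illustration.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class UiDispatchSketch {
    private final Executor uiExecutor;
    // Stand-in for the text view content; records which thread wrote it.
    volatile String shown = "";

    UiDispatchSketch(Executor uiExecutor) {
        this.uiExecutor = uiExecutor;
    }

    /** Called on a worker thread, like the delegate callbacks. */
    void onMessage(String transcript) {
        // Do not touch the "UI" here; hand the update to the UI executor instead.
        uiExecutor.execute(() ->
                shown = Thread.currentThread().getName() + ":" + transcript);
    }

    /** Runs one callback on a worker thread and returns what the UI ended up showing. */
    static String demo() {
        ExecutorService ui = Executors.newSingleThreadExecutor(r -> new Thread(r, "ui-thread"));
        UiDispatchSketch sketch = new UiDispatchSketch(ui);
        Thread worker = new Thread(() -> sketch.onMessage("Hello"), "worker");
        worker.start();
        try {
            worker.join();                              // callback has submitted its update
            ui.shutdown();                              // let the queued update run, then stop
            ui.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sketch.shown;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "ui-thread:Hello"
    }
}
```

Passing the executor in through the constructor keeps the callback code identical whether it runs in a test or inside an activity.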
