Intelligent sign language digital real-time translation based on Android+OpenCV+CNN+Keras - deep learning algorithm application (including Python, ipynb engineering source code) + data set (4)



Preface

This project relies on the Keras deep learning model and is designed to classify and recognize sign language in real time. To achieve this goal, the project incorporates relevant algorithms from the OpenCV library to capture the position of the hands, enabling real-time recognition of sign language in video streams and images.

First, the project uses algorithms from the OpenCV library to capture hand positions in video streams or images. This can involve technologies such as skin color detection, motion detection, or gesture detection to pinpoint sign language gestures.

Next, the project uses a CNN deep learning model to classify the captured sign language. After training, it can recognize different sign language gestures as specific categories or characters.

During real-time recognition, the sign language gestures in the video stream or image are passed to the CNN deep learning model, which performs inference and classifies each gesture into the corresponding category. This enables the system to recognize sign language gestures in real time and convert them into text or other forms of output.

Overall, this project combines computer vision and deep learning to provide a real-time solution for sign language recognition. It is a useful tool that helps hearing-impaired people and sign language users communicate and be understood more easily.

Overall design

This part includes the overall system structure diagram and system flow chart.

Overall system structure diagram

The overall structure of the system is shown in the figure.


System flow chart

The system flow is shown in the figure.


Operating environment

This part includes the Python environment, TensorFlow environment, Keras environment, and Android environment.

Module implementation

This project includes six modules: data preprocessing, data augmentation, model construction, model training and saving, model evaluation, and model testing. The function and related code of each module are introduced below.

1. Data preprocessing

Download the corresponding data set from Kaggle. The download address is https://www.kaggle.com/ardamavi/sign-language-digits-dataset.

See the blog for details.

2. Data augmentation

To make it easier to inspect the generated images and fine-tune the augmentation parameters, this project does not feed a Keras data generator directly into training. Instead, an augmented data set is generated and saved first, and then used for model training.

See the blog for details.

3. Model construction

After the data is loaded, the model structure needs to be defined and the loss function and optimizer configured.

See the blog for details.

4. Model training and saving

This section includes code related to model training and model saving.

See the blog for details.

5. Model evaluation

Because there are few existing sign language recognition models available online, the model is first run and tested with Python scripts on a PC before it is integrated into the Android project. This makes it easier to compare candidate models, verify that the sign language recognition strategy of this solution is feasible, and select the optimal classification model.

See the blog for details.

6. Model testing

After the feasibility of the overall model has been evaluated, the sign language recognition model is integrated into an Android Studio project to complete the app. The specific steps are as follows.

1) Permission registration

First, register two activities, one for recognizing sign language in video and one for recognizing it in photos. Second, apply for camera and storage permissions in AndroidManifest.xml so the app can use the camera and access the SD card, and register a FileProvider. Finally, specify the program entry activity and launch mode.

The relevant code is as follows:

<?xml version="1.0" encoding="utf-8"?>
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.cameraapp">
    <uses-permission android:name="android.permission.CAMERA" />
  	<uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE"/>
    <uses-feature android:name="android.hardware.camera" />
    <application
        android:allowBackup="true"
        android:icon="@mipmap/ic_launcher"
        android:label="@string/app_name"
        android:roundIcon="@mipmap/ic_launcher_round"
        android:supportsRtl="true"
        android:theme="@style/AppTheme">
        <activity android:name=".MainActivity"
            android:screenOrientation="landscape"
            android:label="手语实时识别">
            <intent-filter>
                <action android:name="android.intent.action.MAIN" />
                <category android:name="android.intent.category.LAUNCHER" />
            </intent-filter>
        </activity>
        <activity
            android:name="com.example.cameraapp.Second"
            android:label="图片识别">
            <intent-filter>
                <action android:name="com.litreily.SecondActivity"/>
                <category android:name="android.intent.category.DEFAULT"/>
            </intent-filter>
        </activity>
        <provider
            android:name="androidx.core.content.FileProvider"
            android:authorities="com.example.cameraapp.fileprovider"
            android:exported="false"
            android:grantUriPermissions="true">
            <meta-data
                android:name="android.support.FILE_PROVIDER_PATHS"
                android:resource="@xml/file_paths" />
        </provider>
    </application>
</manifest>

2) Model import

The operations related to model import are as follows.
(1) Place the trained .pb file in app/src/main/assets. If the assets directory does not exist, right-click main, choose New→Directory, and enter assets to create it.

(2) Create a new class PredictTF.java; load the .so library in this class and call the TensorFlow model to obtain prediction results.

(3) In MainActivity.java, declare the model storage path, create a PredictTF object, and call the PredictTF class where it is needed. The relevant code is as follows:

//Load the model
String MODEL_FILE = "file:///android_asset/trained_model_imageDataGenerator.pb";  //model path
PredictTF tf = new PredictTF(getAssets(), MODEL_FILE);

(4) Load the OpenCV library in the onCreate() method of MainActivity.java. The relevant code is as follows:

//Load the OpenCV library
private void staticLoadCVLibraries(){
    boolean load = OpenCVLoader.initDebug();
    if (load) {
        Log.i("CV", "Open CV Libraries loaded...");
    }
}

3) Overall model construction

Construct the corresponding functions required for prediction within PredictTF. The specific steps are as follows.

(1) Load the so library and declare the required attributes. The relevant code is as follows:

private static final String TAG = "PredictTF";
TensorFlowInferenceInterface tf;
static {
    //Load the libtensorflow_inference.so library
    System.loadLibrary("tensorflow_inference");
    Log.e(TAG, "libtensorflow_inference.so loaded successfully");
}
//Path to the model file and names of the input and output nodes
private String INPUT_NAME = "conv2d_1_input";
private String OUTPUT_NAME = "output_1";
float[] PREDICTIONS = new float[10];   //output of the sign language classification model
private int[] INPUT_SIZE = {64, 64, 1};   //input shape of the model
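
The constructor of PredictTF is not shown in this post. A minimal sketch, under the assumption that it simply stores the AssetManager and the model path declared above for later use by the prediction method, could look like this:

private AssetManager assetManager;
private String MODEL_PATH;

//Constructor sketch (assumption): keep the AssetManager and model path so that
//the TensorFlowInferenceInterface can be created when a prediction is requested
PredictTF(AssetManager assets, String modelPath) {
    this.assetManager = assets;
    this.MODEL_PATH = modelPath;
}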

(2) Use OpenCV to capture hand position

Blacken the areas that do not meet the skin color detection threshold. The relevant code is as follows:

//Skin color detection
private Mat skin(Mat frame) {
    int iLowH = 0;
    int iHighH = 20;
    int iLowS = 40;
    int iHighS = 255;
    int iLowV = 80;
    int iHighV = 255;
    Mat hsv = new Mat();
    Imgproc.cvtColor(frame, hsv, Imgproc.COLOR_RGBA2BGR);  //convert the input image from RGBA to BGR
    Imgproc.cvtColor(hsv, hsv, Imgproc.COLOR_BGR2HSV);     //then convert BGR to HSV
    Mat skinMask = new Mat();
    Core.inRange(hsv, new Scalar(iLowH, iLowS, iLowV), new Scalar(iHighH, iHighS, iHighV), skinMask);
    Imgproc.GaussianBlur(skinMask, skinMask, new Size(5, 5), 0);  //Gaussian blur
    Mat skin = new Mat();
    bitwise_and(frame, frame, skin, skinMask);  //black out regions outside the skin color threshold
    return skin;
}

In the prediction function, the extracted region is Gaussian filtered and binarized, and the findContours function is used for contour extraction. The contour areas are compared, small connected regions are ignored, and the largest region is kept. Finally, the boundingRect function is used to extract the corresponding region from the original image.

The relevant code is as follows:

//Load the bitmap and convert it to an OpenCV Mat object
Bitmap bitmap = BitmapFactory.decodeFile(imagePath);
Mat frame = new Mat();
if (imagePath == null) {
    System.out.print("imagePath is null");
}
Utils.bitmapToMat(bitmap, frame);
Core.flip(frame, frame, 1);   //horizontal mirror flip
//Image preprocessing
frame = skin(frame);          //skin color detection
Imgproc.cvtColor(frame, frame, Imgproc.COLOR_BGR2GRAY);       //convert to grayscale
Mat frame1 = new Mat();
Imgproc.GaussianBlur(frame, frame1, new Size(5, 5), 0);      //Gaussian blur
double res = Imgproc.threshold(frame1, frame1, 50, 255, Imgproc.THRESH_BINARY);      //binarization
//Extract all contours
List<MatOfPoint> contours = new ArrayList<MatOfPoint>();
Mat hierarchy = new Mat();
Imgproc.findContours(frame1, contours, hierarchy,
        Imgproc.RETR_EXTERNAL,
        Imgproc.CHAIN_APPROX_SIMPLE);       //contour extraction
//Find the largest region
int max_idx = 0;
List<Double> area = new ArrayList<>();
for (int i = 0; i < contours.size(); i++) {
    area.add(Imgproc.contourArea(contours.get(i)));
}
max_idx = area.indexOf(Collections.max(area));
//Get the bounding rectangle
Rect rect = Imgproc.boundingRect(contours.get(max_idx));
//Extract the relevant region
String mess;
mess = String.valueOf(frame.channels());
Log.i("CV", "the type of frame is:" + mess);
Mat chepai_raw = new Mat(frame, rect);    //crop the relevant region
Mat cheapi = new Mat();
Core.flip(chepai_raw, cheapi, 1);       //horizontal mirror flip

(3) Input the area extracted by OpenCV into the sign language classification model

Convert the extracted region, as a Bitmap, into a one-dimensional float array for model input. The relevant code is as follows:

Bitmap input = Bitmap.createBitmap(cheapi.cols(), cheapi.rows(), Bitmap.Config.ARGB_8888);
Utils.matToBitmap(cheapi, input);
float[] input_data = bitmapToFloatArray(input, 64, 64);

The bitmapToFloatArray() function is defined as follows:

//Convert a bitmap into a one-dimensional float array
//Adapted from https://blog.csdn.net/chaofeili/article/details/89374324
public static float[] bitmapToFloatArray(Bitmap bitmap, int rx, int ry){
    int height = bitmap.getHeight();
    int width = bitmap.getWidth();
    //Compute the scaling ratio
    float scaleWidth = ((float) rx) / width;
    float scaleHeight = ((float) ry) / height;
    Matrix matrix = new Matrix();
    matrix.postScale(scaleWidth, scaleHeight);
    bitmap = Bitmap.createBitmap(bitmap, 0, 0, width, height, matrix, true);
    height = bitmap.getHeight();
    width = bitmap.getWidth();
    float[] result = new float[height * width];
    int k = 0;
    //Row-major order
    for (int j = 0; j < height; j++){
        for (int i = 0; i < width; i++){
            int argb = bitmap.getPixel(i, j);
            int r = Color.red(argb);
            int g = Color.green(argb);
            int b = Color.blue(argb);
            int a = Color.alpha(argb);
            //Since the image is grayscale, the r, g, b components are equal
            assert(r == g && g == b);
            //Log.i(TAG, i + "," + j + ":argb = " + argb + ", a=" + a + ", r=" + r + ", g=" + g + ", b=" + b);
            result[k++] = r / 255.0f;
        }
    }
    return result;
}

Since the original data set has label errors, it is necessary to provide a correct mapping dictionary for label error correction. The relevant code is as follows:

Map<Integer, Integer> label_map = new HashMap<Integer, Integer>();
label_map.put(0,9); label_map.put(1,0); label_map.put(2,7);
label_map.put(3,6); label_map.put(4,1);
label_map.put(5,8); label_map.put(6,4);
label_map.put(7,3); label_map.put(8,2); label_map.put(9,5);     //label correction

Start applying the model and get the predicted results. The relevant code is as follows:

//Apply the model
tf = new TensorFlowInferenceInterface(getAssets(), MODEL_PATH);   //load the model
tf.feed(INPUT_NAME, input_data, 1, 64, 64, 1);
tf.run(new String[]{OUTPUT_NAME});
//Copy the output into the PREDICTIONS array
tf.fetch(OUTPUT_NAME, PREDICTIONS);
//Obtain the highest prediction
Object[] results = argmax(PREDICTIONS);  //take the class with the highest confidence as the result
int class_index = (Integer) results[0];
float confidence = (Float) results[1];
/*try{
    final String conf = String.valueOf(confidence * 100).substring(0,5);
    //Convert the predicted class index into the actual label name
    final String label = ImageUtils.getLabel(getAssets().open("labels.json"), class_index);
    //Display the result on the UI
    runOnUiThread(new Runnable() {
        @Override
        public void run() {
            resultView.setText(label + " : " + conf + "%");
        }
    });
} catch (Exception e){
}//*/
int label = label_map.get(class_index);         //label correction
results[0] = label;
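
The argmax() helper called above is not defined anywhere in this post. A minimal sketch consistent with how it is used (returning the index of the highest confidence together with that confidence) could be:

//Return {index of the largest element, its value}; PREDICTIONS holds the ten
//class confidences written by tf.fetch()
private Object[] argmax(float[] array) {
    int best = 0;
    float bestConf = array[0];
    for (int i = 1; i < array.length; i++) {
        if (array[i] > bestConf) {
            bestConf = array[i];
            best = i;
        }
    }
    return new Object[]{best, bestConf};
}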

The relevant code for the prediction function is as follows:

private Object[] perdict(Bitmap bitmap){
    //Load the bitmap and convert it to an OpenCV Mat object
    //Bitmap bitmap = BitmapFactory.decodeFile(imagePath);
    Mat frame = new Mat();
    //if (imagePath == null) {System.out.print("imagePath is null");}
    Utils.bitmapToMat(bitmap, frame);
    //Core.flip(frame, frame, 1);   //horizontal mirror flip
    //Image preprocessing
    Mat frame2 = new Mat();
    frame2 = skin(frame);          //skin color detection
    Imgproc.cvtColor(frame2, frame2, Imgproc.COLOR_BGR2GRAY);  //convert to grayscale
    Mat frame1 = new Mat();
    Imgproc.GaussianBlur(frame2, frame1, new Size(5, 5), 0);      //Gaussian blur
    double res = Imgproc.threshold(frame1, frame1, 50, 255, Imgproc.THRESH_BINARY);      //binarization
    //Extract all contours
    List<MatOfPoint> contours = new ArrayList<MatOfPoint>();
    Mat hierarchy = new Mat();
    Imgproc.findContours(frame1, contours, hierarchy, Imgproc.RETR_EXTERNAL, Imgproc.CHAIN_APPROX_SIMPLE);       //contour extraction
    //Find the largest region
    int max_idx = 0;
    List<Double> area = new ArrayList<>();
    for (int i = 0; i < contours.size(); i++) {
        area.add(Imgproc.contourArea(contours.get(i)));
    }
    max_idx = area.indexOf(Collections.max(area));
    //Get the bounding rectangle
    Rect rect = Imgproc.boundingRect(contours.get(max_idx));
    //Extract the relevant region
    String mess;
    Imgproc.cvtColor(frame, frame, Imgproc.COLOR_BGR2GRAY);  //convert the original image to grayscale
    mess = String.valueOf(frame.channels());
    Log.i("CV", "the type of frame is:" + mess);
    Mat chepai_raw = new Mat(frame, rect);
    mess = String.valueOf(chepai_raw.channels());
    Log.i("CV", "the type of chepai_raw is:" + mess);
    Mat cheapi = new Mat();
    Core.flip(chepai_raw, cheapi, 1);       //horizontal mirror flip
    //Convert the extracted region to a one-dimensional float array for model input
    Bitmap input = Bitmap.createBitmap(cheapi.cols(), cheapi.rows(), Bitmap.Config.ARGB_8888);
    Utils.matToBitmap(cheapi, input);
    float[] input_data = bitmapToFloatArray(input, 64, 64);
    Map<Integer, Integer> label_map = new HashMap<Integer, Integer>();
    //{0:9, 1:0, 2:7, 3:6, 4:1, 5:8, 6:4, 7:3, 8:2, 9:5}
    label_map.put(0,9); label_map.put(1,0); label_map.put(2,7); label_map.put(3,6); label_map.put(4,1);
    label_map.put(5,8); label_map.put(6,4); label_map.put(7,3); label_map.put(8,2); label_map.put(9,5);     //label correction
    //Apply the model
    tf = new TensorFlowInferenceInterface(getAssets(), MODEL_PATH);  //load the model
    tf.feed(INPUT_NAME, input_data, 1, 64, 64, 1);
    tf.run(new String[]{OUTPUT_NAME});
    //Copy the output into the PREDICTIONS array
    tf.fetch(OUTPUT_NAME, PREDICTIONS);
    //Select the class with the highest confidence
    Object[] results = argmax(PREDICTIONS);
    int class_index = (Integer) results[0];
    float confidence = (Float) results[1];
    /*try{
        final String conf = String.valueOf(confidence * 100).substring(0,5);
        //Convert the predicted index into a 0~9 label
        final String label = ImageUtils.getLabel(getAssets().open("labels.json"), class_index);
        //Display the result
        runOnUiThread(new Runnable() {
            @Override
            public void run() {
                resultView.setText(label + " : " + conf + "%");
            }
        });
    } catch (Exception e){
    }*/
    int label = label_map.get(class_index);         //label correction
    results[0] = label;
    return results;
}

4) Process the preview frame data in the video

Processing preview frames means handling each frame of data during camera preview and recognizing the sign language gesture in a given frame. The specific steps are as follows:

(1) Create the CameraPreview and ProcessWithThreadPool classes required to process preview frames

(2) Bind the event when starting to preview the camera video

Modify MainActivity: add the code that initializes the camera preview to a new startPreview() method, and stop the camera preview in a stopPreview() method. When removeAllViews() removes the preview view, the corresponding teardown method in the CameraPreview class is triggered and the camera preview is closed. The relevant code is as follows:

public void startPreview() {
    //Load the model
    String MODEL_FILE = "file:///android_asset/trained_model_imageDataGenerator.pb";   //model path
    PredictTF tf = new PredictTF(getAssets(), MODEL_FILE);
    //Create the camera preview object
    final CameraPreview mPreview = new CameraPreview(this, tf);
    //Create the visible layout
    FrameLayout preview = (FrameLayout) findViewById(R.id.camera_preview);
    preview.addView(mPreview);
    //Obtain and initialize the camera
    SettingsFragment.passCamera(mPreview.getCameraInstance());
    PreferenceManager.setDefaultValues(this, R.xml.preferences, false);
    SettingsFragment.setDefault(PreferenceManager.getDefaultSharedPreferences(this));
    SettingsFragment.init(PreferenceManager.getDefaultSharedPreferences(this));
    //Set the settings button listener
    Button buttonSettings = (Button) findViewById(R.id.button_settings);
    buttonSettings.setOnClickListener(new View.OnClickListener() {
        @Override
        public void onClick(View v) {
            getFragmentManager().beginTransaction().replace(R.id.camera_preview, new SettingsFragment()).addToBackStack(null).commit();
        }
    });
}

public void stopPreview() {
    //Remove the camera preview view
    FrameLayout preview = (FrameLayout) findViewById(R.id.camera_preview);
    preview.removeAllViews();
}

In the onCreate() method of MainActivity, two buttons are bound to the methods above. The relevant code is as follows:

protected void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    setContentView(R.layout.activity_main);
    //Load the OpenCV library
    staticLoadCVLibraries();
    //Bind the start-preview button
    Button buttonStartPreview = (Button) findViewById(R.id.button_start_preview);
    buttonStartPreview.setOnClickListener(new View.OnClickListener() {
        @Override
        public void onClick(View v) {
            startPreview();
        }
    });
    //Bind the stop-preview button
    Button buttonStopPreview = (Button) findViewById(R.id.button_stop_preview);
    buttonStopPreview.setOnClickListener(new View.OnClickListener() {
        @Override
        public void onClick(View v) {
            stopPreview();
        }
    });
}

(3) Real-time processing of frame data

Add the Camera.PreviewCallback interface to the CameraPreview class declaration and implement its onPreviewFrame() method, which receives the data of each preview frame. The relevant code is as follows:

public class CameraPreview extends SurfaceView implements SurfaceHolder.Callback, Camera.PreviewCallback {
    // ...
}
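
The body of CameraPreview is not reproduced in this post. The getCameraInstance() method that startPreview() calls is assumed to be a safe wrapper around Camera.open(), roughly along the following lines (a sketch, not the project's exact code):

private Camera mCamera;

//Sketch (assumption): open the camera once and register this view as the
//preview callback so that onPreviewFrame() receives frame data
public Camera getCameraInstance() {
    if (mCamera == null) {
        try {
            mCamera = Camera.open();
            mCamera.setPreviewCallback(this);
        } catch (Exception e) {
            Log.e("CameraPreview", "Camera is not available", e);
        }
    }
    return mCamera;
}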

Implement the onPreviewFrame() method and call the PredictTF class for sign language recognition. The relevant code is as follows:

public void onPreviewFrame(byte[] data, Camera camera) {
    switch (processType) {
        case PROCESS_WITH_HANDLER_THREAD:
            processFrameHandler.obtainMessage(ProcessWithHandlerThread.WHAT_PROCESS_FRAME, data).sendToTarget();
            break;
        case PROCESS_WITH_QUEUE:
            try {
                frameQueue.put(data);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            break;
        case PROCESS_WITH_ASYNC_TASK:
            new ProcessWithAsyncTask().execute(data);
            break;
        case PROCESS_WITH_THREAD_POOL:
            processFrameThreadPool.post(data);
            try {
                Thread.sleep(500);  //delay for hand position capture
                NV21ToBitmap transform = new NV21ToBitmap(getContext().getApplicationContext());
                Bitmap bitmap = transform.nv21ToBitmap(data, 1920, 1080);
                String num;
                Object[] results = tf.perdict(bitmap);
                if (results != null) {
                    num = "result:" + results[0].toString() + "  confidence:" + results[1].toString();
                    Toast.makeText(getContext().getApplicationContext(), num, Toast.LENGTH_SHORT).show();
                    Thread.sleep(3000);  //if a valid gesture is detected, pause before the next frame
                }
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            break;
        default:
            throw new IllegalStateException("Unexpected value: " + processType);
    }
}
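
The NV21ToBitmap helper used above is not shown in the post. A simple (if not the fastest) way to perform the same NV21-to-Bitmap conversion is to route the frame through a JPEG buffer with android.graphics.YuvImage; the class below is an assumption about the helper's behaviour, not the project's actual implementation:

public class NV21ToBitmap {
    private final Context context;  //kept to mirror the constructor call used above

    public NV21ToBitmap(Context context) {
        this.context = context;
    }

    //Convert an NV21 preview frame of the given size into a Bitmap by
    //compressing it to JPEG and decoding the result
    public Bitmap nv21ToBitmap(byte[] nv21, int width, int height) {
        YuvImage yuv = new YuvImage(nv21, ImageFormat.NV21, width, height, null);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        yuv.compressToJpeg(new Rect(0, 0, width, height), 90, out);
        byte[] jpeg = out.toByteArray();
        return BitmapFactory.decodeByteArray(jpeg, 0, jpeg.length);
    }
}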

5) Process image data

This part includes calling the camera to obtain pictures, selecting pictures from the album, predicting and displaying pictures.

(1) Call the camera to obtain pictures.
A File object is used to store the photo taken by the camera. It is stored in the application's cache directory, which is obtained by calling getExternalCacheDir().

For compatibility with lower versions, a version check is made: if the system version is lower than Android 7.0, Uri.fromFile() is called to convert the File object into a Uri object; otherwise, FileProvider.getUriForFile() is called to do the conversion.

Construct the Intent object, specify the action android.media.action.IMAGE_CAPTURE, call putExtra() to specify the output address of the image, and call startActivityForResult() to open the camera program. On success, control returns with the TAKE_PHOTO request code for the next step. The relevant code is as follows:

//Create a File object to store the photo taken by the camera
File outputImage = new File(getExternalCacheDir(), "output_image.jpg");
try {
    if (outputImage.exists()) {
        outputImage.delete();
    }
    outputImage.createNewFile();
} catch (IOException e) {
    e.printStackTrace();
}
if (Build.VERSION.SDK_INT < 24) {
    imageUri = Uri.fromFile(outputImage);
} else {
    imageUri = FileProvider.getUriForFile(Second.this, "com.example.cameraapp.fileprovider", outputImage);
}
//Start the camera program
Intent intent = new Intent("android.media.action.IMAGE_CAPTURE");
intent.putExtra(MediaStore.EXTRA_OUTPUT, imageUri);
startActivityForResult(intent, TAKE_PHOTO);

Add a new file_paths.xml file; the relevant code is as follows:

<?xml version="1.0" encoding="utf-8"?>
<paths xmlns:android="http://schemas.android.com/apk/res/android">
    <external-path name="my_images"  path="/" />
</paths>

Register the FileProvider in the AndroidManifest.xml file; the relevant code is as follows:

<provider
    android:name="androidx.core.content.FileProvider"
    android:authorities="com.example.cameraapp.fileprovider"
    android:exported="false"
    android:grantUriPermissions="true">
    <meta-data
        android:name="android.support.FILE_PROVIDER_PATHS"
        android:resource="@xml/file_paths" />
</provider>

(2) Select pictures from the album

Apply for the WRITE_EXTERNAL_STORAGE permission in the AndroidManifest.xml file, because selecting pictures from the album requires access to the phone's SD card.

Create an Intent object, specify the action android.intent.action.GET_CONTENT, and call startActivityForResult() to open the phone's album. On success, control returns with the CHOOSE_PHOTO request code for the next step.

The relevant code is as follows:

private void openAlbum() {
    Intent intent = new Intent("android.intent.action.GET_CONTENT");
    intent.setType("image/*");
    startActivityForResult(intent, CHOOSE_PHOTO); //open the album
}

For compatibility with lower versions, the handleImageOnKitKat() and handleImageBeforeKitKat() functions are used to process images on phones running Android 4.4 and above, and below 4.4, respectively.

handleImageOnKitKat() first parses the document-encapsulated Uri; in handleImageBeforeKitKat() the Uri is not encapsulated, so it can be passed directly to getImagePath(). The relevant code is as follows:

private void handleImageOnKitKat(Intent data) {
    String imagePath = null;
    Uri uri = data.getData();
    Log.d("TAG", "handleImageOnKitKat: uri is " + uri);
    if (DocumentsContract.isDocumentUri(this, uri)) {
        //If the Uri is a document type, handle it by document ID
        String docId = DocumentsContract.getDocumentId(uri);
        if ("com.android.providers.media.documents".equals(uri.getAuthority())) {
            String id = docId.split(":")[1]; //parse the numeric ID
            String selection = MediaStore.Images.Media._ID + "=" + id;
            imagePath = getImagePath(MediaStore.Images.Media.EXTERNAL_CONTENT_URI, selection);
        } else if ("com.android.providers.downloads.documents".equals(uri.getAuthority())) {
            Uri contentUri = ContentUris.withAppendedId(Uri.parse("content://downloads/public_downloads"), Long.valueOf(docId));
            imagePath = getImagePath(contentUri, null);
        }
    } else if ("content".equalsIgnoreCase(uri.getScheme())) {
        //If the Uri is a content type, handle it the normal way
        imagePath = getImagePath(uri, null);
    } else if ("file".equalsIgnoreCase(uri.getScheme())) {
        //If the Uri is a file type, just take the image path directly
        imagePath = uri.getPath();
    }
    displayImage(imagePath); //display the image according to its path
}

private void handleImageBeforeKitKat(Intent data) {
    Uri uri = data.getData();
    String imagePath = getImagePath(uri, null);
    displayImage(imagePath);
}
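
The getImagePath() and displayImage() helpers called above are not reproduced in the post. Minimal sketches, assuming getImagePath() resolves a content Uri through the ContentResolver and displayImage() runs the same recognition flow as the TAKE_PHOTO branch, could look like this:

//Query the content provider for the file path behind the given Uri (assumption)
private String getImagePath(Uri uri, String selection) {
    String path = null;
    Cursor cursor = getContentResolver().query(uri, null, selection, null, null);
    if (cursor != null) {
        if (cursor.moveToFirst()) {
            path = cursor.getString(cursor.getColumnIndex(MediaStore.Images.Media.DATA));
        }
        cursor.close();
    }
    return path;
}

//Decode the image, recognize the gesture, and show the picture (assumption)
private void displayImage(String imagePath) {
    if (imagePath != null) {
        Bitmap bitmap = BitmapFactory.decodeFile(imagePath);
        Object[] results = tf.perdict(bitmap);
        if (results != null) {
            String num = "result:" + results[0].toString() + "  confidence:" + results[1].toString();
            Toast.makeText(Second.this, num, Toast.LENGTH_SHORT).show();
        }
        picture.setImageBitmap(bitmap);
    } else {
        Toast.makeText(Second.this, "failed to get image", Toast.LENGTH_SHORT).show();
    }
}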

(3) Predict and display images

Use the PredictTF object's prediction method to recognize the gesture type in a picture Bitmap obtained from the camera or the album, and display the picture using setImageBitmap().

The relevant code is as follows:

protected void onActivityResult(int requestCode, int resultCode, Intent data) {
    super.onActivityResult(requestCode, resultCode, data);
    switch (requestCode) {
        case TAKE_PHOTO:
            if (resultCode == RESULT_OK) {
                try {
                    //Display and recognize the photo that was taken
                    Bitmap bitmap = BitmapFactory.decodeStream(getContentResolver().openInputStream(imageUri));
                    //Recognize the picture
                    Object[] results = tf.perdict(bitmap);
                    String num;
                    num = "result:" + results[0].toString() + "  confidence:" + results[1].toString();
                    Toast.makeText(Second.this, num, Toast.LENGTH_SHORT).show();
                    //Show the picture
                    picture.setImageBitmap(bitmap);
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
            break;
        case CHOOSE_PHOTO:
            if (resultCode == RESULT_OK) {
                //Check the system version
                if (Build.VERSION.SDK_INT >= 19) {
                    //Android 4.4 and above use this method to handle the image
                    handleImageOnKitKat(data);
                } else {
                    //Below 4.4, use this method to handle the image
                    handleImageBeforeKitKat(data);
                }
            }
            break;
        default:
            break;
    }
}

6) Multiple page settings

After the classes for video prediction and picture prediction have been created, register the two activities in the AndroidManifest.xml file: MainActivity.java performs prediction on video and Second.java performs prediction on pictures.

The relevant code is as follows:

<activity android:name=".MainActivity"
    android:screenOrientation="landscape"
    android:label="手语实时识别">
    <intent-filter>
        <action android:name="android.intent.action.MAIN" />
        <category android:name="android.intent.category.LAUNCHER" />
    </intent-filter>
</activity>
<activity
    android:name="com.example.cameraapp.Second"
    android:label="图片识别">
    <intent-filter>
        <action android:name="com.litreily.SecondActivity"/>
        <category android:name="android.intent.category.DEFAULT"/>
    </intent-filter>
</activity>

Set a jump button and its click listener on each page; setClass() specifies the target page and startActivityForResult() starts the jump.

The relevant code is as follows:

//Jump button
Button buttonSettings = (Button) findViewById(R.id.button_settings);
buttonSettings.setOnClickListener(new View.OnClickListener() {
    @Override
    public void onClick(View v){
        Intent intent2 = new Intent(); //create an Intent object
        intent2.setClass(MainActivity.this, Second.class);
        startActivityForResult(intent2, 0);   //jump to the picture recognition activity
    }
});

7) Layout file code

The relevant code for the layout file is as follows:

activity.xml:

<?xml version="1.0" encoding="utf-8"?>
<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
    android:layout_width="match_parent"
    android:layout_height="match_parent">
    <FrameLayout
        android:id="@+id/camera_preview"
        android:layout_width="0px"
        android:layout_height="fill_parent"
        android:layout_weight="1" />
    <LinearLayout
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_gravity="right"
        android:orientation="vertical">
        <Button
            android:id="@+id/button_settings"
            android:layout_width="wrap_content"
            android:layout_height="wrap_content"
            android:text="转到图片识别" />
        <Button
            android:id="@+id/button_start_preview"
            android:layout_width="wrap_content"
            android:layout_height="wrap_content"
            android:text="开始翻译" />
        <Button
            android:id="@+id/button_stop_preview"
            android:layout_width="wrap_content"
            android:layout_height="wrap_content"
            android:text="停止翻译" />
        <TextView
            android:id="@+id/info1"
            android:layout_width="wrap_content"
            android:layout_height="wrap_content"
            android:text="本次识别结果:" />
        <TextView
            android:id="@+id/textView"
            android:layout_width="wrap_content"
            android:layout_height="wrap_content" />
        <TextView
            android:id="@+id/info2"
            android:layout_width="wrap_content"
            android:layout_height="wrap_content"
            android:text="手部区域:" />
        <ImageView
            android:id="@+id/picture"
            android:layout_width="100dp"
            android:layout_height="80dp"
            android:layout_gravity="center_horizontal" />
        <TextView
            android:id="@+id/info3"
            android:layout_width="wrap_content"
            android:layout_height="wrap_content"
            android:text="句子翻译:" />
        <TextView
            android:id="@+id/translated_statement"
            android:layout_width="200dp"
            android:layout_height="wrap_content"
            android:singleLine="true"
            android:ellipsize="start"/>
    </LinearLayout>
</LinearLayout>

second.xml:

<?xml version="1.0" encoding="utf-8"?>
<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
    android:orientation="vertical"
    android:layout_width="match_parent"
    android:layout_height="match_parent" >
    <Button
        android:id="@+id/take_photo"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:text="拍照识别" />
    <Button
        android:id="@+id/choose_from_album"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:text="从相册中选择" />
    <Button
        android:id="@+id/button_settings"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:text="转到实时翻译" />
    <EditText
        android:layout_width="match_parent"
        android:layout_height="wrap_content" />
    <ImageView
        android:id="@+id/picture"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_gravity="center_horizontal" />
</LinearLayout>

Other related blogs

Intelligent sign language digital real-time translation based on Android+OpenCV+CNN+Keras - deep learning algorithm application (including Python, ipynb engineering source code) + data set (1)

Intelligent sign language digital real-time translation based on Android+OpenCV+CNN+Keras - deep learning algorithm application (including Python, ipynb engineering source code) + data set (2)

Intelligent sign language digital real-time translation based on Android+OpenCV+CNN+Keras - deep learning algorithm application (including Python, ipynb engineering source code) + data set (3)

Intelligent sign language digital real-time translation based on Android+OpenCV+CNN+Keras - deep learning algorithm application (including Python, ipynb engineering source code) + data set (5)

Project source code download

For details, please see the resource download page of my blog.


Download other information

If you want to learn more about artificial intelligence learning routes and knowledge systems, you are welcome to read my other blog post "Heavyweight | Complete Artificial Intelligence AI Learning - Basic Knowledge Learning Route"; all materials can be downloaded directly from the network disk, no strings attached.
This blog post draws on well-known open source platforms on GitHub, AI technology platforms, and experts in related fields, including Datawhale, ApacheCN, AI Youdao, and Dr. Huang Haiguang, and covers nearly 100 GB of related materials. I hope it helps all my friends.
