General
The head pose is obtained by rotating and translating the feature points of a standard 3D model to fit the feature points detected by Dlib: the difference between the transformed model feature points and the Dlib feature points is computed, and iteratively minimized with Ceres, finally yielding the optimal rotation and translation parameters.
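To make the idea concrete, here is a simplified, self-contained Java sketch of this cost (my own illustration, not the project's code): it assumes the parameter vector x holds three Euler angles plus a translation, and uses an orthographic projection that simply drops Z.

```java
// Simplified sketch of the fitting objective. Assumptions (not from the project):
// x = {yaw, pitch, roll, tx, ty, tz}; orthographic projection (drop Z).
public class PoseCost {
    // Rotate a 3D point by roll (Z), pitch (X), yaw (Y), then translate.
    static double[] transform(double[] p, double[] x) {
        double cy = Math.cos(x[0]), sy = Math.sin(x[0]);
        double cp = Math.cos(x[1]), sp = Math.sin(x[1]);
        double cr = Math.cos(x[2]), sr = Math.sin(x[2]);
        double X = p[0], Y = p[1], Z = p[2];
        // roll about Z
        double x1 = cr * X - sr * Y, y1 = sr * X + cr * Y, z1 = Z;
        // pitch about X
        double x2 = x1, y2 = cp * y1 - sp * z1, z2 = sp * y1 + cp * z1;
        // yaw about Y
        double x3 = cy * x2 + sy * z2, y3 = y2, z3 = -sy * x2 + cy * z2;
        return new double[]{x3 + x[3], y3 + x[4], z3 + x[5]};
    }

    // Total cost: sum of Euclidean distances between the projected model
    // points and the detected 2D landmarks.
    static double cost(double[][] model3d, double[][] detected2d, double[] x) {
        double sum = 0;
        for (int i = 0; i < model3d.length; i++) {
            double[] t = transform(model3d[i], x);
            double dx = t[0] - detected2d[i][0];
            double dy = t[1] - detected2d[i][1];
            sum += Math.sqrt(dx * dx + dy * dy);
        }
        return sum;
    }

    public static void main(String[] args) {
        double[][] model = {{0, 0, 0}, {1, 0, 0}};
        double[][] detected = {{2, 3}, {3, 3}};
        // With zero rotation and translation (2, 3, 0) the cost is exactly 0.
        System.out.println(cost(model, detected, new double[]{0, 0, 0, 2, 3, 0}));
    }
}
```

Ceres' role in the real project is to search over x so that this cost is minimized.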
The Android version follows the same principle as the C++ version: Head pose estimation - OpenCV/Dlib/Ceres. This article describes the problems encountered during the migration.
Environment
System environment: Ubuntu 18.04
Java environment: JRE 1.8.0
Languages: C++ (Clang), Java
Build tools: Android Studio 3.4.1
- CMake 3.10.2
- LLDB
- NDK 20.0
These tools can be downloaded from the SDK Manager in Android Studio.
Third-party tools
Dlib: used to obtain facial landmarks
Ceres: used for non-linear optimization
Source
https://github.com/Great-Keith/head-pose-estimation/tree/master/android/landmark-fitting
Preparation
Android interfaces for the third-party libraries
Dlib
A ready-made interface is available on GitHub: https://github.com/tzutalin/dlib-android
The project also provides a concrete example app: https://github.com/tzutalin/dlib-android-app/
Our app is built on top of this sample app.
Ceres
For specific usage, see my earlier post: Ceres Solver on the Android platform.
Finally, integrating Dlib and Ceres gives us the basic framework of our app: https://github.com/Great-Keith/dlib-android-app
Adding a front-camera switch
Adding a switch button
The original dlib-android-app sample uses only the rear camera, which is inconvenient when testing alone, so we modify the code to add a button that switches between the front and rear cameras.
First, find the camera layout res/layout/camera_connection_fragment.xml and add a Switch button in the upper right corner.
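A minimal sketch of such a Switch (the id cameraSwitch matches the findViewById call used later in this article; the other attribute values are illustrative and assume a FrameLayout root):

```xml
<Switch
    android:id="@+id/cameraSwitch"
    android:layout_width="wrap_content"
    android:layout_height="wrap_content"
    android:layout_gravity="top|end"
    android:text="Front" />
```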
Next, following the app's implementation, we create our own CameraConnectionFragment class to replace the original Fragment. Its setUpCameraOutputs method selects the camera: it traverses all cameras available on the device and prefers the rear one. We add a boolean b parameter to this method for selecting the camera:
if (b) {
    // Use only the rear camera.
    // If a back-facing or external camera exists, we won't use the front-facing camera
    if (num_facing_back_camera != null && num_facing_back_camera > 0) {
        // Skip the front camera (if a rear camera exists)
        if (facing != null && facing == CameraCharacteristics.LENS_FACING_FRONT) {
            continue;
        }
    }
} else {
    // Use only the front camera.
    if (num_facing_front_camera != null && num_facing_front_camera > 0) {
        // Skip the rear camera (if a front camera exists)
        if (facing != null && facing == CameraCharacteristics.LENS_FACING_BACK) {
            continue;
        }
    }
}
Then, during initialization, we wire up our Switch button:
switchBtn = view.findViewById(R.id.cameraSwitch);
switchBtn.setOnCheckedChangeListener(new CompoundButton.OnCheckedChangeListener() {
    @Override
    public void onCheckedChanged(CompoundButton compoundButton, boolean b) {
        closeCamera();
        openCamera(textureView.getWidth(), textureView.getHeight(), b);
    }
});
[NOTE] The parameter b is passed through openCamera down to setUpCameraOutputs.
Fixing the inverted front-camera preview
Running the modified program, you will find that the front-camera preview is displayed upside down, so we need to flip the preview window.
Find OnGetImageListener, the listener class that processes captured camera frames. In its drawResizedBitmap method, which handles the captured frame, add a matrix flip before the final drawing:
/* If using front camera, matrix should rotate 180 */
if (!switchBtn.isChecked()) {
    matrix.postTranslate(-dst.getWidth() / 2.0f, -dst.getHeight() / 2.0f);
    matrix.postRotate(180);
    matrix.postTranslate(dst.getWidth() / 2.0f, dst.getHeight() / 2.0f);
}
final Canvas canvas = new Canvas(dst);
canvas.drawBitmap(src, matrix, null);
[NOTE] Inside this class there is no direct way to tell from the CameraId whether the current camera is front or rear, so we rely on the state of the Switch button instead. It might be possible to check this with the Camera class, which was deprecated after API 21.
Main procedure
Still inside the camera listener, we can obtain the landmark data from Dlib and draw it:
mInferenceHandler.post(
        new Runnable() {
            @Override
            public void run() {
                // ...
                long startTime = System.currentTimeMillis();
                List<VisionDetRet> results;
                synchronized (OnGetImageListener.this) {
                    results = mFaceDet.detect(mCroppedBitmap);
                }
                long endTime = System.currentTimeMillis();
                mTransparentTitleView.setText("Time cost: " + String.valueOf((endTime - startTime) / 1000f) + " sec");
                // Draw on bitmap
                if (results != null) {
                    for (final VisionDetRet ret : results) {
                        // Draw the face box and landmarks
                        // ...
                    }
                }
                mWindow.setRGBBitmap(mCroppedBitmap);
                mIsComputing = false;
            }
        });
We choose to add the optimization step inside the for loop that draws the face box and landmarks.
First, copy the landmarks into a Point array to pass in as a parameter:
/* Transform landmarks to array, which is needed by JNI */
Point[] tmp = landmarks.toArray(new Point[0]);
With double x[] initialized, we can then call our CeresSolver class to do the processing; the optimal solution is returned through the array x.
CeresSolver.solve(x, tmp);
Finally, we call two more methods to map the 3D landmarks to 2D.
Point3f[] points3f = CeresSolver.transform(x);
Point[] points2d = CeresSolver.transformTo2d(points3f);
[NOTE] The 2D points in the project use android.graphics.Point (corresponding to dlib::point in C++), and the 3D points use a class of our own, Point3f (corresponding to dlib::vector<double, 3> in C++).
In summary, what we actually need to implement is a utility class CeresSolver that provides Ceres support, described in detail below.
The CeresSolver class and its JNI interface
Initialization
We need to read the 3D coordinates of the standard model's landmarks, which are stored in the file landmarks.txt. In the Android project we place this file under the assets directory, and initialize it during CameraActivity's onCreate:
CeresSolver.init(getResources().getAssets().open("landmarks.txt"));
The initialization itself proceeds as follows:
public static void init(InputStream in) {
    try {
        InputStreamReader inputReader = new InputStreamReader(in);
        BufferedReader bufReader = new BufferedReader(inputReader);
        String line;
        int i = 0;
        while ((line = bufReader.readLine()) != null) {
            String[] nums = line.split(" ");
            modelLandmarks[i] = new Point3f(Double.valueOf(nums[0]),
                    Double.valueOf(nums[1]),
                    Double.valueOf(nums[2]));
            i++;
        }
        Log.i(TAG, "Loading model landmarks from file succeeded.");
    } catch (Exception e) {
        Log.e(TAG, "Loading model landmarks from file failed.");
        e.printStackTrace();
    }
    init_();
}
init_ is a JNI function that copies the modelLandmarks data read by the CeresSolver class into the native variable model_landmarks, and caches some jmethodID and jfieldID values in advance.
[NOTE] It would also be possible to fetch modelLandmarks from the Java class via jmethodID or jfieldID calls at solve time, but I am not yet sure about the efficiency difference between the two approaches.
[NOTE] Reading this data ahead of time in the cpp file and saving it in static variables causes some problems: because of Java's garbage collection, some JNI references lose their association (possibly dangling pointers?). For example, calls through a jfieldID usually work fine, but a cached jclass becomes invalid, so jclass values cannot be initialized in advance. (A jclass is a local reference that is only valid for the duration of the JNI call that obtained it; jfieldID and jmethodID are not object references and remain valid.)
Solving the least-squares problem
As in the C++ version, we define the CostFunctor in advance:
struct CostFunctor {
public:
    explicit CostFunctor(JNIEnv *_env, jobjectArray _shape) {
        env = _env;
        shape = _shape;
    }

    bool operator()(const double* const x, double* residual) const {
        /* Init landmarks to be transformed */
        fitting_landmarks.clear();
        for (auto &model_landmark : model_landmarks)
            fitting_landmarks.push_back(model_landmark);
        transform(fitting_landmarks, x);
        std::vector<Point2d> model_landmarks_2d;
        landmarks_3d_to_2d(fitting_landmarks, model_landmarks_2d);
        /* Calculate the energy (Euclidean distance between two points) */
        for (unsigned long i = 0; i < LANDMARK_NUM; i++) {
            jobject point = env->GetObjectArrayElement(shape, static_cast<jsize>(i));
            long tmp1 = env->GetIntField(point, getX2d) - model_landmarks_2d.at(i).x;
            long tmp2 = env->GetIntField(point, getY2d) - model_landmarks_2d.at(i).y;
            residual[i] = sqrt(tmp1 * tmp1 + tmp2 * tmp2);
        }
        return true;
    }

private:
    JNIEnv *env;
    jobjectArray shape; /* 2D landmark coordinates obtained from Dlib */
};
This is basically the same as the C++ version; the only differences are that shape directly uses the JNI type jobjectArray, and that JNI calls are needed inside the functor, so the JNIEnv must be passed in at construction time.
The calling code is otherwise almost identical to the C++ version; note that all JNI functions must convert types when passing parameters in and out.
Coordinate conversion
This involves the rotation and translation of 3D points and the projection of 3D points to 2D, just as in the C++ version.
We also need to expose JNI interfaces for the Java classes to use, mainly involving jobject method calls, field access, and so on. Of course, these methods could also be implemented in Java, which would probably be more efficient. See the source code for details; it is thoroughly commented.
Printing messages (debugging)
An Android project produces a lot of output, which makes debugging relatively hard, so print statements need to be used flexibly to extract the information you need. In Java code you can use android.util.Log and view the output in logcat or the Run window. For example:
Log.i(TAG, String.format("After Solve x: %f %f %f %f %f %f",
x[0], x[1], x[2], x[3], x[4], x[5]));
In the JNI cpp files, define the following macros for output:
#define TAG "CERES-CPP"
#define LOGD(...) __android_log_print(ANDROID_LOG_DEBUG, TAG,__VA_ARGS__)
#define LOGI(...) __android_log_print(ANDROID_LOG_INFO, TAG,__VA_ARGS__)
#define LOGW(...) __android_log_print(ANDROID_LOG_WARN, TAG,__VA_ARGS__)
#define LOGE(...) __android_log_print(ANDROID_LOG_ERROR, TAG,__VA_ARGS__)
#define LOGF(...) __android_log_print(ANDROID_LOG_FATAL, TAG,__VA_ARGS__)
To use these Log macros, the log library must be linked in CMakeLists.txt.
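For reference, the linking step typically looks like this in CMakeLists.txt (the target name native-lib is a placeholder for the project's own library target):

```cmake
# Locate the NDK log library and link it into the native target.
find_library(log-lib log)
target_link_libraries(native-lib ${log-lib})
```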
Testing the result
Enter the camera screen and switch between the cameras.
You can see that right after the app opens, the solved points are very chaotic because the initial values are not well chosen; after a while the solution settles into a normal state.
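One simple remedy (my own suggestion, not part of the original project) is warm-starting: keep the previous frame's solution as the next frame's initial guess instead of resetting x, since the pose changes little between consecutive frames.

```java
// Sketch of warm-starting the solver. Assumption: consecutive frames have
// similar poses, so the previous solution is a good starting point.
public class WarmStart {
    static double[] lastX = new double[6]; // previous frame's solution, persists across frames

    static double[] initialGuess(boolean firstFrame) {
        if (firstFrame) {
            return new double[]{0, 0, 0, 0, 0, 0}; // neutral pose for the very first frame
        }
        return lastX.clone(); // otherwise start from the previous solution
    }

    public static void main(String[] args) {
        lastX = new double[]{0.1, 0, 0, 5, 5, 100};
        double[] x = initialGuess(false);
        System.out.println(x[5]); // reuses the previous tz
        // after CeresSolver.solve(x, ...) one would store x back into lastX
    }
}
```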
Real-time result
Summary
Since the overall logic had already been implemented in C++, replicating it was not difficult. The hard part was using JNI: having never touched the NDK, I spent a great deal of time porting Ceres to Android, and afterwards summarized that process in "Ceres Solver on the Android platform". Once that part was done, the rest went quickly, but many JNI features are closely tied to Java and still need more exploration.
Further possible optimizations
- Choice of initial values;
- Remove the pedestrian-recognition module from the app;
- Optimize the least-squares solving with Ceres;
- Handle display differences between the front and rear cameras;
- Improve the interfaces to make them more extensible.