Control the Kinect camera with Python

Kinect cameras are usually controlled from C++, and doing it from Python is still relatively uncommon, but the need exists: Python is simpler, after all, and further lowers the barrier to entry for development.

Development environment

Windows system

Python 3.8 or lower (the Python version must not be higher than 3.8)

Open3D 0.13.0

Open3D – A Modern Library for 3D Data Processing

Azure Kinect SDK 1.4.1

Azure-Kinect-Sensor-SDK/usage.md at develop · microsoft/Azure-Kinect-Sensor-SDK · GitHub

Connect the camera

Creating a recorder object requires two arguments, a sensor configuration and a device number; creating a sensor object requires one argument, the configuration.

import open3d as o3d

config = o3d.io.AzureKinectSensorConfig()              # create the configuration object
sensor = o3d.io.AzureKinectSensor(config)              # create the sensor object
device = 0                                             # device number
recorder = o3d.io.AzureKinectRecorder(config, device)  # create the recorder object

Camera parameters are configured through the open3d.io.AzureKinectSensorConfig object; the example above uses the default parameters, and the details of configuring them are covered below. The device number is generally 0; if multiple Kinects are connected to the PC, give each one a different number as needed.
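For instance, with two Kinects attached, one recorder can be created per device index (a minimal sketch; both recorders share the same default configuration here):

import open3d as o3d

config = o3d.io.AzureKinectSensorConfig()
recorder0 = o3d.io.AzureKinectRecorder(config, 0)  # first device
recorder1 = o3d.io.AzureKinectRecorder(config, 1)  # second device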

Modify the camera configuration

Read a configuration file and generate a configuration object from it.

import open3d as o3d

filename = "./myconfig.json"                                 # configuration file
myconfig = o3d.io.read_azure_kinect_sensor_config(filename)  # generate the configuration object
device = 0                                                   # device number
recorder = o3d.io.AzureKinectRecorder(myconfig, device)      # generate the recorder object

Read the configuration file through the open3d.io.read_azure_kinect_sensor_config() method, which returns an open3d.io.AzureKinectSensorConfig object.

The configuration file is a JSON file; the default configuration is given below (all examples in this article use this default camera configuration):

{
    "camera_fps" : "K4A_FRAMES_PER_SECOND_30",
    "color_format" : "K4A_IMAGE_FORMAT_COLOR_MJPG",
    "color_resolution" : "K4A_COLOR_RESOLUTION_720P",
    "depth_delay_off_color_usec" : "0",
    "depth_mode" : "K4A_DEPTH_MODE_WFOV_2X2BINNED",
    "disable_streaming_indicator" : "false",
    "subordinate_delay_off_master_usec" : "0",
    "synchronized_images_only" : "false",
    "wired_sync_mode" : "K4A_WIRED_SYNC_MODE_STANDALONE"
}

For the meaning of these settings and the values they accept, refer to the Microsoft Azure Kinect Sensor SDK documentation: Azure Kinect Sensor SDK: Enumerations (microsoft.github.io)
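As an illustration, a configuration that records 1080p color and uses the narrow field-of-view unbinned depth mode could look like this (a sketch; the enum values are taken from the SDK's enumeration list, so adjust them to your own needs):

{
    "camera_fps" : "K4A_FRAMES_PER_SECOND_30",
    "color_format" : "K4A_IMAGE_FORMAT_COLOR_MJPG",
    "color_resolution" : "K4A_COLOR_RESOLUTION_1080P",
    "depth_delay_off_color_usec" : "0",
    "depth_mode" : "K4A_DEPTH_MODE_NFOV_UNBINNED",
    "disable_streaming_indicator" : "false",
    "subordinate_delay_off_master_usec" : "0",
    "synchronized_images_only" : "false",
    "wired_sync_mode" : "K4A_WIRED_SYNC_MODE_STANDALONE"
}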

Record video

To record video with the recorder object, first initialize the sensor, then open the recorder with an output path. Each call to the recorder's record_frame() method records the current frame, so calling it repeatedly records frame after frame to form a video. When recording is finished, close the recorder.

import open3d as o3d

config = o3d.io.AzureKinectSensorConfig()              # create the default configuration object
sensor = o3d.io.AzureKinectSensor(config)              # create the sensor object
device = 0                                             # device number
recorder = o3d.io.AzureKinectRecorder(config, device)  # create the recorder object

recorder.init_sensor()                                 # initialize the sensor
video_path = "./video.mkv"                             # output file
recorder.open_record(video_path)                       # open the recorder
frames = 300
for i in range(frames):                                # call record_frame() 300 times
    rgbd = recorder.record_frame(enable_record=True,
                                 enable_align_depth_to_color=True)
recorder.close_record()                                # close the recorder

The example above uses the default configuration, in which the camera records at 30 fps, so 300 frames correspond to 10 seconds of video.

The record_frame() method of the open3d.io.AzureKinectRecorder object takes two parameters: enable_record controls whether the current frame is written into the video (generally True), and enable_align_depth_to_color controls whether the depth image is aligned to the color image, which matters when generating point clouds. The method returns an open3d.geometry.RGBDImage object, and that return value is what the point-cloud generation step operates on.
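Note that record_frame() can return None when no capture is available; the sketch below (reusing the recorder set up above, with the None check modeled on Open3D's bundled recorder example) keeps going until a target number of valid frames has been written:

recorded = 0
while recorded < 300:
    rgbd = recorder.record_frame(enable_record=True,
                                 enable_align_depth_to_color=True)
    if rgbd is None:  # capture failed or not ready yet; try again
        continue
    recorded += 1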

Read video

A recorded video can be read back with a reader object, and each of its frames processed in turn.

import open3d as o3d

reader = o3d.io.AzureKinectMKVReader()  # create the reader
path = "./video.mkv"
reader.open(path)                       # open the video file

while not reader.is_eof():              # loop until the whole video has been read
    rgbd = reader.next_frame()          # get the next frame
    if rgbd is None:                    # next_frame() can return None; skip such frames
        continue
    # process this frame here

The next_frame() method of the open3d.io.AzureKinectMKVReader object returns an open3d.geometry.RGBDImage object; in other words, reading the video frame by frame with open3d.io.AzureKinectMKVReader really means reading one open3d.geometry.RGBDImage object after another, and that return value is what the point-cloud generation step operates on.
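For instance, here is a sketch that dumps every frame's color and depth images to disk (the output paths are illustrative; o3d.io.write_image is Open3D's generic image writer, and RGBDImage exposes its two halves as .color and .depth):

import open3d as o3d

reader = o3d.io.AzureKinectMKVReader()
reader.open("./video.mkv")

idx = 0
while not reader.is_eof():
    rgbd = reader.next_frame()
    if rgbd is None:
        continue
    o3d.io.write_image("./color_%05d.jpg" % idx, rgbd.color)  # color image
    o3d.io.write_image("./depth_%05d.png" % idx, rgbd.depth)  # 16-bit depth image
    idx += 1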

Generate a point cloud

Recording with the recorder produces an open3d.geometry.RGBDImage object for every frame recorded, and reading a video with the reader produces an open3d.geometry.RGBDImage object for every frame read. Point clouds are generated from these open3d.geometry.RGBDImage objects.

import open3d as o3d

# rgbd is an open3d.geometry.RGBDImage obtained from the recorder or the reader
ply = o3d.geometry.PointCloud.create_from_rgbd_image(
    rgbd,
    o3d.camera.PinholeCameraIntrinsic(1280, 720,
                                      601.1693115234375, 600.85931396484375,
                                      637.83624267578125, 363.8018798828125))
plypath = "./pointcloud.ply"
o3d.io.write_point_cloud(plypath, ply)

The open3d.geometry.PointCloud.create_from_rgbd_image() method builds an open3d.geometry.PointCloud object, i.e. a point cloud, from an open3d.geometry.RGBDImage object. It takes four parameters: the open3d.geometry.RGBDImage object, an open3d.camera.PinholeCameraIntrinsic object, extrinsic, and project_valid_depth_only; the first two are the RGBD image and the camera intrinsics, and the last two have default values that generally do not need to be changed. Camera intrinsics are covered later in this article, so ignore them for now.
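For reference, under the standard pinhole model the intrinsics act as follows: a pixel (u, v) with depth d back-projects to x = (u - cx) * d / fx, y = (v - cy) * d / fy, z = d, where fx and fy are the focal lengths and (cx, cy) is the principal point. This back-projection is, in essence, what the point-cloud generation performs for every valid depth pixel.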

The return value of open3d.geometry.PointCloud.create_from_rgbd_image() is an open3d.geometry.PointCloud object, i.e. a point cloud object. Use the open3d.io.write_point_cloud() method to save that object to a point cloud file on disk. write_point_cloud() has five parameters, only two of which matter here: the output file name and the open3d.geometry.PointCloud object. The method infers the format from the extension of the output file and saves the point cloud accordingly; six formats are supported: xyz, xyzn, xyzrgb, pts, ply, and pcd.
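For example, the same object can be written in a different format just by changing the extension (file names here are illustrative):

o3d.io.write_point_cloud("./pointcloud.pcd", ply)  # saved in PCD format
o3d.io.write_point_cloud("./pointcloud.xyz", ply)  # saved as plain ASCII xyz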

Camera intrinsics

You have to determine these yourself: each camera's intrinsic parameters are different, so you cannot simply copy someone else's values.

This step requires writing C++ code, and there are a few details to watch out for; the steps for setting up an Azure Kinect development environment in Visual Studio are not covered here.

#include <iostream>
#include <fstream>
#include <string>
#include <iomanip>
#include <vector>
#include <k4a/k4a.h>
using namespace std;

static string get_serial(k4a_device_t device)
{
    size_t serial_number_length = 0;

    if (K4A_BUFFER_RESULT_TOO_SMALL != k4a_device_get_serialnum(device, NULL, &serial_number_length))
    {
        cout << "Failed to get serial number length" << endl;
        k4a_device_close(device);
        exit(-1);
    }

    char* serial_number = new (std::nothrow) char[serial_number_length];
    if (serial_number == NULL)
    {
        cout << "Failed to allocate memory for serial number (" << serial_number_length << " bytes)" << endl;
        k4a_device_close(device);
        exit(-1);
    }

    if (K4A_BUFFER_RESULT_SUCCEEDED != k4a_device_get_serialnum(device, serial_number, &serial_number_length))
    {
        cout << "Failed to get serial number" << endl;
        delete[] serial_number;
        serial_number = NULL;
        k4a_device_close(device);
        exit(-1);
    }

    string s(serial_number);
    delete[] serial_number;
    serial_number = NULL;
    return s;
}

static void print_calibration()
{
    uint32_t device_count = k4a_device_get_installed_count();
    cout << "Found " << device_count << " connected devices:" << endl;
    cout << fixed << setprecision(6);

    for (uint8_t deviceIndex = 0; deviceIndex < device_count; deviceIndex++)
    {
        k4a_device_t device = NULL;
        if (K4A_RESULT_SUCCEEDED != k4a_device_open(deviceIndex, &device))
        {
            cout << deviceIndex << ": Failed to open device" << endl;
            exit(-1);
        }

        k4a_calibration_t calibration;

        k4a_device_configuration_t deviceConfig = K4A_DEVICE_CONFIG_INIT_DISABLE_ALL;
        deviceConfig.color_format = K4A_IMAGE_FORMAT_COLOR_MJPG;
        deviceConfig.color_resolution = K4A_COLOR_RESOLUTION_720P;
        deviceConfig.depth_mode = K4A_DEPTH_MODE_WFOV_2X2BINNED;
        deviceConfig.camera_fps = K4A_FRAMES_PER_SECOND_30;
        deviceConfig.wired_sync_mode = K4A_WIRED_SYNC_MODE_STANDALONE;
        deviceConfig.synchronized_images_only = true;

        // get calibration
        if (K4A_RESULT_SUCCEEDED !=
            k4a_device_get_calibration(device, deviceConfig.depth_mode, deviceConfig.color_resolution, &calibration))
        {
            cout << "Failed to get calibration" << endl;
            exit(-1);
        }

        // auto calib = calibration.depth_camera_calibration; // depth camera calibration
        auto calib = calibration.color_camera_calibration; // color camera calibration

        cout << "\n===== Device " << (int)deviceIndex << ": " << get_serial(device) << " =====\n";
        cout << "resolution width: " << calib.resolution_width << endl;
        cout << "resolution height: " << calib.resolution_height << endl;
        cout << "principal point x: " << calib.intrinsics.parameters.param.cx << endl;
        cout << "principal point y: " << calib.intrinsics.parameters.param.cy << endl;
        cout << "focal length x: " << calib.intrinsics.parameters.param.fx << endl;
        cout << "focal length y: " << calib.intrinsics.parameters.param.fy << endl;
        cout << "radial distortion coefficients:" << endl;
        cout << "k1: " << calib.intrinsics.parameters.param.k1 << endl;
        cout << "k2: " << calib.intrinsics.parameters.param.k2 << endl;
        cout << "k3: " << calib.intrinsics.parameters.param.k3 << endl;
        cout << "k4: " << calib.intrinsics.parameters.param.k4 << endl;
        cout << "k5: " << calib.intrinsics.parameters.param.k5 << endl;
        cout << "k6: " << calib.intrinsics.parameters.param.k6 << endl;
        cout << "center of distortion in Z=1 plane, x: " << calib.intrinsics.parameters.param.codx << endl;
        cout << "center of distortion in Z=1 plane, y: " << calib.intrinsics.parameters.param.cody << endl;
        cout << "tangential distortion coefficient x: " << calib.intrinsics.parameters.param.p1 << endl;
        cout << "tangential distortion coefficient y: " << calib.intrinsics.parameters.param.p2 << endl;
        cout << "metric radius: " << calib.intrinsics.parameters.param.metric_radius << endl;

        k4a_device_close(device);
    }
}

static void calibration_blob(uint8_t deviceIndex = 0, string filename = "calibration.json")
{
    k4a_device_t device = NULL;

    if (K4A_RESULT_SUCCEEDED != k4a_device_open(deviceIndex, &device))
    {
        cout << deviceIndex << ": Failed to open device" << endl;
        exit(-1);
    }

    size_t calibration_size = 0;
    k4a_buffer_result_t buffer_result = k4a_device_get_raw_calibration(device, NULL, &calibration_size);
    if (buffer_result == K4A_BUFFER_RESULT_TOO_SMALL)
    {
        vector<uint8_t> calibration_buffer = vector<uint8_t>(calibration_size);
        buffer_result = k4a_device_get_raw_calibration(device, calibration_buffer.data(), &calibration_size);
        if (buffer_result == K4A_BUFFER_RESULT_SUCCEEDED)
        {
            ofstream file(filename, ofstream::binary);
            file.write(reinterpret_cast<const char*>(&calibration_buffer[0]), (long)calibration_size);
            file.close();
            cout << "Calibration blob for device " << (int)deviceIndex << " (serial no. " << get_serial(device)
                << ") is saved to " << filename << endl;
        }
        else
        {
            cout << "Failed to get calibration blob" << endl;
            exit(-1);
        }
    }
    else
    {
        cout << "Failed to get calibration blob size" << endl;
        exit(-1);
    }
}

static void print_usage()
{
    cout << "Usage: calibration_info [device_id] [output_file]" << endl;
    cout << "Using calibration_info.exe without any command line arguments will display" << endl
        << "calibration info of all connected devices in stdout. If a device_id is given" << endl
        << "(0 for default device), the calibration.json file of that device will be" << endl
        << "saved to the current directory." << endl;
}

int main(int argc, char** argv)
{
    if (argc == 1)
    {
        print_calibration();
    }
    else if (argc == 2)
    {
        calibration_blob((uint8_t)atoi(argv[1]), "calibration.json");
    }
    else if (argc == 3)
    {
        calibration_blob((uint8_t)atoi(argv[1]), argv[2]);
    }
    else
    {
        print_usage();
    }
    cout << "按任意键退出..." << endl;
    cin;
    return 0;
}

The complete code is given above. Pay attention to these two lines; they determine whether the depth camera's or the color camera's calibration is printed.

// auto calib = calibration.depth_camera_calibration; // depth camera calibration
auto calib = calibration.color_camera_calibration; // color camera calibration
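If you already have a recorded MKV file, Open3D can also export the intrinsics without any C++: the reader's stream metadata, which includes the intrinsics, can be written out as a JSON file (a sketch, assuming get_metadata() and write_azure_kinect_mkv_metadata() behave as in Open3D's bundled Azure Kinect examples):

import open3d as o3d

reader = o3d.io.AzureKinectMKVReader()
reader.open("./video.mkv")
metadata = reader.get_metadata()  # stream metadata, including intrinsics
o3d.io.write_azure_kinect_mkv_metadata("./intrinsic.json", metadata)
reader.close()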

There are two folders in the Azure Kinect SDK v1.4.1\sdk\windows-desktop directory, amd64 and x86. On a 64-bit system, copy the depthengine_2_0.dll file from amd64\release\bin into the Debug directory of your VS project; on a 32-bit system, copy depthengine_2_0.dll from x86\release\bin instead.

Also note that depthengine_2_0.dll must be placed in the same directory as the exe file generated by VS.
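In other words, the output directory should end up looking something like this (the executable name is just an example):

Debug/
├── calibration_info.exe
└── depthengine_2_0.dll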

