Using the infrared-visible image dataset OTCBVS to chain together image fusion, target detection, and target tracking


Preface

This article records how we installed CUDA and cuDNN on an autodl cloud server, deployed the OTCBVS infrared-visible image dataset (captured from the same viewpoint, at the same time, and at the same place), and used the open-source image fusion algorithm PIAFusion, the target detection algorithm Yolo-v4, and the target tracking algorithm DeepSort from GitHub to run a single dataset end to end.

This article only does the following:
1. Lists common infrared-visible image data sets and their characteristics.
2. Provides only hands-on deployment and implementation experience on a cloud server.

This article does not provide the following points:
1. It does not optimize the algorithms in any way, nor does it discuss the final image results. If necessary, you can add your own innovations on top.
2. It does not discuss evaluation metrics.
3. It does not survey other open-source image fusion, target detection, or target tracking algorithms.

This article is fairly basic. I wrote it because complete walkthroughs of this process are rarely shared on CSDN, and Bilibili videos usually show only the final results; many uploaders withhold their source code to attract likes, favorites, and shares. Here the author provides the entire technical implementation as a reference, built on publicly available source code.


1. Task Overview

UAVs often shoot with multiple cameras to capture target features, because infrared and visible-light cameras each have their own strengths. An infrared camera exploits the temperature difference between target and background, so targets stand out clearly; it also preserves some texture and can observe targets in daytime shadows and at night. A visible-light camera produces images with prominent edge information and richer texture, but it represents people in shadow or darkness poorly.

Multi-modal image fusion extracts the salient features of the infrared image and the texture features of the visible image, combines them according to flexible feature fusion rules, and finally reconstructs the fused features into a single image. The UAV then runs target detection on the information-rich fused image to obtain each target's position and category. As needed, a target of interest is marked for target tracking (target tracking locates and identifies the target across adjacent frames and can predict its next movement; it is an intermediate computer vision task). The tracked position of the target of interest is fed back to the UAV's visual servo module, which adjusts the attitude and speed of the airframe to follow the target.

The drone shoots with one infrared camera and one visible-light camera, which cannot share exactly the same viewpoint; this would normally call for image registration. Since no suitable public dataset was found, image registration is not considered for now. The visual servo module is a control module and falls outside the scope of what we want to solve, so it is not considered either.


2. Common infrared-visible light image data sets

2.1 OTCBVS

OTCBVS contains 16 sub-datasets of infrared and visible images (this article uses part of the 2nd sub-dataset). Information on the 2nd sub-dataset: total size 1.83 GB; image size 320 × 240 pixels (visible and thermal); 4228 pairs of thermal and visible images.
Link: http://vcipl-okstate.org/pbvs/bench/

2.2 TNO image fusion dataset

Contains multi-spectral images of military-relevant night scenes, such as near-infrared, far-infrared, and visible light. The image size is 768 × 576 pixels.
Link: https://figshare.com/articles/dataset/TNO_Image_Fusion_Dataset/1008029

2.3 INO image fusion dataset

Contains color and infrared images under various weather and night-time conditions. The image size is 328 × 254 pixels.
Link: http://www.ino.ca/en/video-analytics-dataset/

2.4 Eden Project Multi-Sensor Dataset

The image size is 560 × 468 pixels.
Link: http://www.cis.rit.edu/pelz/scanpaths/data/bristol-eden.htm


3. Image fusion

3.1 Open source image fusion algorithm PIAFusion

PIAFusion is an image fusion algorithm published in 2022 in Information Fusion, the top journal in the fusion field.
Link: https://github.com/Linfeng-Tang/PIAFusion
Article: "PIAFusion: A progressive infrared and visible image fusion network based on illumination aware"

3.2 Installation environment and related package versions

For ease of configuration, we use autodl, an affordable cloud-server provider, to set up the image fusion environment. Later, you can consider saving the configured environment as an image to make testing on NVIDIA GPUs of different compute capabilities easier.
The versions of CUDA and cuDNN are as follows:

CUDA 10.0
cuDNN 7.4

Installed versions of the other libraries:

tensorflow-gpu==1.14.0
opencv-python==3.4.2.17
scipy==1.2.0
numpy==1.19.2
pandas==1.1.5
openpyxl==3.0.10
protobuf==3.19.0

The basis for selecting the versions of CUDA, cuDNN, and the other libraries is TensorFlow's table of tested build configurations: tensorflow-gpu 1.14.0 is validated against CUDA 10.0 and cuDNN 7.4.
[Screenshots: TensorFlow / CUDA / cuDNN version compatibility tables]
However, autodl does not directly provide an image with these CUDA and cuDNN versions, so we need to install them ourselves. We start by selecting an RTX 2080 instance with the TensorFlow 1.15.5 / Python 3.8 / CUDA 11.4 image.

3.2.1 Check (install) NVIDIA driver

nvidia-smi

The output is as follows:

Sun Jul 31 15:04:41 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 495.44       Driver Version: 495.44       CUDA Version: 11.5     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:1A:00.0 Off |                  N/A |
| 54%   47C    P8     5W / 250W |      0MiB / 11019MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

We only need to focus on the driver version, 495.44, which supports CUDA versions up to 11.5. If this information is printed, the NVIDIA driver is already installed. If you need to install a driver on your own machine, please look up a driver installation tutorial.

3.2.2 Check (install) cuda and cuDNN versions

Check the CUDA version with

nvcc -V

output

Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130

This confirms that the CUDA version is 10.0.

Check the cuDNN version with

cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A2

output

#define CUDNN_MAJOR 7
#define CUDNN_MINOR 4
#define CUDNN_PATCHLEVEL 1
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)

#include "driver_types.h"

This confirms that the cuDNN version is 7.4.1.
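Note: on cuDNN 8 and later, the version macros moved out of cudnn.h into cudnn_version.h; if the command above prints nothing useful, grep that file instead:

cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A2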
These are the results after my installation. If you still need to install CUDA and cuDNN, refer to the following steps.

3.2.2.1 Download cuda from the official website

Link: https://developer.nvidia.com/cuda-toolkit-archive

After downloading the .run installer:

chmod 777 cuda_10.0.130_410.48_linux.run   # make the installer executable
./cuda_10.0.130_410.48_linux.run --override   # run the installer

Add CUDA to the environment variables (note the single quotes, so that $PATH and $LD_LIBRARY_PATH are expanded when .bashrc is sourced rather than when the lines are written):

echo 'export PATH=/usr/local/cuda-10.0/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc

Use the following to make environment variables effective

source ~/.bashrc

3.2.2.2 Confirm whether CUDA is installed successfully

After restarting the terminal, run

nvcc -V

If the CUDA version number is printed, the installation succeeded.
If you additionally want to test CUDA functionality, you can run

cd /usr/local/cuda-10.0/samples/1_Utilities/deviceQuery
make
./deviceQuery

output

......
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.5, CUDA Runtime Version = 10.0, NumDevs = 1
Result = PASS

3.2.2.3 Download cuDNN from the official website

Link: https://developer.nvidia.com/cudnn

After downloading the cuDNN archive for CUDA 10.0 and extracting it (the archive unpacks into a cuda/ directory), copy the headers and libraries into the CUDA installation:

tar -xzvf cudnn-10.0-linux-x64-v7.4.1.5.tgz   # exact archive name may differ
mv cuda/include/* /usr/local/cuda/include/
chmod +x cuda/lib64/* && mv cuda/lib64/* /usr/local/cuda/lib64/

Use the following to make the libraries take effect:

ldconfig

3.2.2.4 Verify whether the GPU can be utilized

Run the following in Python:

import tensorflow as tf
print(tf.test.is_gpu_available())

output

......
2022-07-31 15:26:53.092539: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/device:GPU:0 with 10322 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:1a:00.0, compute capability: 7.5)
True

This indicates that TensorFlow can use the GPU.
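If you want more detail than a boolean, TF 1.x can also list every device it sees; a quick optional check in the same environment:

# List every device TensorFlow can see, with its type and hardware description.
from tensorflow.python.client import device_lib

for dev in device_lib.list_local_devices():
    print(dev.name, dev.device_type, dev.physical_device_desc)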

3.2.3 Installation environment

It is recommended to use the versions listed above to update the requirements.txt provided with the source code, then run:

pip install -r requirements.txt
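For reference, a requirements.txt matching the versions from section 3.2 looks like this (it replaces the one shipped with the repo):

tensorflow-gpu==1.14.0
opencv-python==3.4.2.17
scipy==1.2.0
numpy==1.19.2
pandas==1.1.5
openpyxl==3.0.10
protobuf==3.19.0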

3.3 Download the data set and modify the image names in batches

The data set is the second OTCBVS sub-dataset. Below is the first infrared-visible image pair of this dataset, which shows the difference between infrared and visible images at a glance.
[Images: the first infrared and visible image pair from OTCBVS sub-dataset 2]

The PIAFusion source code has requirements on image file names. You can use the following script to rename the files in batches.

import os

dirpath_top = os.path.dirname(os.path.abspath(__file__))
filelist_name = "OCTBVS0312"


def rename_images(dir_path):
    """Rename every .bmp in dir_path to a zero-padded index: 0000.bmp, 0001.bmp, ..."""
    i = 0
    # Sort the listing first: os.listdir() returns files in arbitrary order,
    # and the frames must keep their original temporal order.
    for item in sorted(os.listdir(dir_path)):
        if not item.endswith('.bmp'):
            continue
        src = os.path.join(dir_path, item)                       # original name
        dst = os.path.join(dir_path, str(i).zfill(4) + '.bmp')   # new zero-padded name
        try:
            os.rename(src, dst)
            i += 1
            print('rename from %s to %s' % (src, dst))
        except OSError:
            continue
    print('ending...')


# Rename the infrared ("ir") and visible ("vi") sub-directories in turn.
for sub in ("ir", "vi"):
    rename_images(os.path.join(dirpath_top, filelist_name, sub))

Note that the listing is sorted before renaming so the frames keep their temporal order, which matters when the fused images are later stitched into a video.

3.4 Run PIAFusion

For convenience, you can rename the OTCBVS folder to TNO, place it under the test data directory, and run

python main.py --is_train=False --model_type=PIAFusion --DataSet=TNO
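The "rename to TNO and place it under the test data" step might look like the following minimal sketch (the PIAFusion/test_data/TNO/ir and .../vi target paths are my assumption about the repo layout, and the source paths assume the directories from section 3.3; adjust to your checkout):

# Hypothetical staging script: copy the renamed OTCBVS pairs to where the
# repo expects its TNO test set. All paths below are assumptions.
import os
import shutil

repo = "PIAFusion"
for sub in ("ir", "vi"):
    src = os.path.join("OCTBVS0312", sub)
    dst = os.path.join(repo, "test_data", "TNO", sub)
    os.makedirs(dst, exist_ok=True)
    for name in sorted(os.listdir(src)):
        if name.endswith(".bmp"):
            shutil.copy(os.path.join(src, name), os.path.join(dst, name))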

You can also add your own argparse options in the relevant files; the lazy approach above works just as well.
This fuses 600 pairs of infrared-visible images. Below is one of the fused images.
[Image: example fused result]

3.5 Fusion of pictures into video

import os
import cv2

size = (320, 240)
print("Each frame is ({}, {})".format(size[0], size[1]))

dirpath_top = os.path.dirname(os.path.abspath(__file__))
src_path = os.path.join(dirpath_top, "OTCBVS")
sav_path = os.path.join(dirpath_top, "OTCBVS.mp4")

index = len(os.listdir(src_path))
print("Total number of images: " + str(index))

fourcc = cv2.VideoWriter_fourcc(*'mp4v')                  # MP4 format
videowrite = cv2.VideoWriter(sav_path, fourcc, 20, size)  # 20 is the frame rate, size is the frame size

# The images were renamed to zero-padded indices, so build the file names directly.
for i in range(index):
    filename = os.path.join(src_path, '{}.bmp'.format(str(i).zfill(4)))
    img = cv2.imread(filename)
    if img is None:
        print(filename + " failed to load!")
        continue
    videowrite.write(cv2.resize(img, size))               # every frame must match `size`
    print('Frame {} written'.format(i))

videowrite.release()                                      # finalize the video file
print('------done!!!-------')
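As an optional sanity check, you can read the video back and confirm its frame count and frame rate:

import cv2

# Re-open the generated video and report its frame count and frame rate.
cap = cv2.VideoCapture("OTCBVS.mp4")
print("frames:", int(cap.get(cv2.CAP_PROP_FRAME_COUNT)))
print("fps:", cap.get(cv2.CAP_PROP_FPS))
cap.release()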

Video effect:

[Video: OTCBVS-fusedresults]


4. Target detection and target tracking

4.1 Open-source target detection and tracking algorithms Yolo-v4 & DeepSort

Yolo-v4
Link: https://github.com/AlexeyAB/darknet
DeepSort
Link: https://github.com/ZQPei/deep_sort_pytorch
We use a repository that packages Yolo-v4 and DeepSort together:
https://github.com/TsMask/deep-sort-yolov4

4.2 Use the video obtained after image fusion as input for target detection and tracking

run

python detect_video_tracker.py --video OTCBVS.mp4 --min_score 0.3 --model_yolo model_data/yolov4.h5 --model_feature model_data/mars-small128.pb

Here model_data/yolov4.h5 is the converted Yolo-v4 detection model, and model_data/mars-small128.pb is DeepSort's appearance-feature (re-identification) network.

Video effect:

[Video: OTCBVS-D+TVideo]


5. Summary

This tutorial roughly covers the technical implementation of image fusion, target detection, and target tracking on an autodl cloud server. Next, the author plans to optimize the algorithms to meet the research group's requirements and deploy them on a development board as a lightweight implementation. Fellow practitioners are welcome to get in touch and exchange ideas!
