Tensor Distance based Multilinear Multidimensional Scaling for Image and Video Analysis

Summary

We propose a tensor distance based multilinear multidimensional scaling (TD-MMDS) technique for dimensionality reduction. First, we introduce a new distance metric, called tensor distance (TD), for building relational graphs of high-order data points. We then employ an iterative strategy to sequentially learn the transformation matrices that best preserve pairwise TDs of high-order data in the low-dimensional embedding space. By combining tensor distance and tensor embedding, TD-MMDS provides a unified tensor-based dimensionality reduction framework that preserves the inherent structure of high-order data throughout the learning process. Experiments on standard image and video datasets verify the effectiveness of TD-MMDS.

Introduction

Image and video analysis often suffers from the huge dimensionality of feature spaces. Dimensionality reduction techniques address this problem by generating a low-dimensional representation of the original feature space. However, traditional dimensionality reduction algorithms usually unfold the input data into vectors, even though the data are naturally high-order tensors. This vectorization greatly increases the computational load of data analysis and severely disrupts the inherent tensor structure of high-order data.
To alleviate the problems caused by vectorization, tensor-based dimensionality reduction techniques such as tensor principal component analysis (TPCA) and tensor linear discriminant analysis (TLDA) have been proposed. These algorithms aim to preserve the relationships between high-order data points, usually measured by the distance between them, by sequentially learning transformation matrices. The performance of tensor-based dimensionality reduction therefore depends not only on the embedding strategy but also on the distance metric. Currently, tensor-based techniques simply use the traditional Euclidean distance. In Euclidean space, a high-order datum $\mathcal{X}$ in $\mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ can be represented by coordinates $x_1, x_2, \ldots, x_{I_1 I_2 \cdots I_N}$ with corresponding basis vectors $e_1, e_2, \ldots, e_{I_1 I_2 \cdots I_N}$, where $\langle e_i, e_j \rangle = 0$ ($i \neq j$). That is, any two basis vectors $e_i$ and $e_j$ are assumed to be perpendicular to each other, so the coordinates are mutually independent. However, this orthogonality assumption ignores the relationships between different coordinates of high-order data, such as the spatial relationships of pixels in an image, and thus limits the performance of subsequent tensor embeddings.
To relax the orthogonality assumption of current distance metrics, we propose a new tensor distance (TD) to measure the relationships between high-order data, especially images and videos, whose coordinates are strongly correlated. We then extend a typical distance-preserving dimensionality reduction algorithm, multidimensional scaling (MDS) [9], to its multilinear version, multilinear multidimensional scaling (MMDS). Combining TD and MMDS, we obtain a novel tensor distance based multilinear multidimensional scaling algorithm (TD-MMDS), which preserves the inherent structure of high-order data throughout the learning process.

Tensor distance

For some types of high-order data, the traditional Euclidean distance may not reflect the actual distance between two data points because of the orthogonality assumption discussed above. In this section, we propose a new distance metric, called TD, that models the correlation between different coordinates of arbitrary-order data.
Given a data point $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$, we denote by $x$ its vector-form representation. An element $\mathcal{X}_{i_1 i_2 \cdots i_N}$ of $\mathcal{X}$ then corresponds to $x_l$, the $l$-th element of $x$, where

$$l = i_1 + \sum_{k=2}^{N} (i_k - 1) \prod_{j=1}^{k-1} I_j .$$
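This mapping is the usual 1-based column-major (Fortran-order) linear index. Below is a small sanity check in numpy; the tensor sizes and the multi-index are arbitrary illustrative values.

```python
import numpy as np

I = (3, 4, 5)                       # illustrative tensor sizes I1, I2, I3
i = (2, 3, 4)                       # a 1-based multi-index (i1, i2, i3)
l = i[0] + (i[1] - 1) * I[0] + (i[2] - 1) * I[0] * I[1]
# numpy's Fortran-order raveling gives the same position, 0-based
l0 = np.ravel_multi_index(tuple(k - 1 for k in i), I, order='F')
print(l, l0 + 1)                    # 44 44
```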
The TD between two tensors $\mathcal{X}$ and $\mathcal{Y}$ can then be expressed as

$$d_{TD}(\mathcal{X}, \mathcal{Y}) = \sqrt{\sum_{l,m} g_{lm} (x_l - y_l)(x_m - y_m)} ,$$
where $g_{lm}$ is the metric coefficient and $G = (g_{lm})$ is the metric matrix. To reflect the intrinsic relationships between different coordinates of high-order data, a natural choice is to make the metric coefficients depend on the distance between element positions. Wang et al. have demonstrated that, for image data, i.e., second-order tensors, the resulting distance metric can effectively reflect the spatial relationships between pixels if the metric coefficients properly depend on the distance between pixel locations. Inspired by this work, we design the following metric matrix $G$:
$$g_{lm} = \frac{1}{\sqrt{2\pi}\,\sigma_1} \exp\!\left( -\frac{\|p_l - p_m\|_2^2}{2\sigma_1^2} \right),$$
where $\sigma_1$ is a regularization parameter and $\|p_l - p_m\|_2$ is the positional distance between $\mathcal{X}_{i_1 i_2 \cdots i_N}$ (corresponding to $x_l$) and $\mathcal{X}_{i_1' i_2' \cdots i_N'}$ (corresponding to $x_m$), defined as
$$\|p_l - p_m\|_2 = \sqrt{(i_1 - i_1')^2 + (i_2 - i_2')^2 + \cdots + (i_N - i_N')^2} .$$
$d_{TD}$ can then be rewritten in matrix form as

$$d_{TD}(\mathcal{X}, \mathcal{Y}) = \sqrt{(x - y)^T G (x - y)} .$$
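As a rough illustration, the following numpy sketch computes the TD between two small grayscale images (second-order tensors). The image size and $\sigma_1$ are arbitrary illustrative choices, and the Gaussian metric follows the form given above; any coordinate ordering works as long as the positions $p_l$ and the vectorized entries $x_l$ are ordered consistently.

```python
import numpy as np

def tensor_distance(X, Y, sigma1=1.0):
    """d_TD(X, Y) = sqrt((x - y)^T G (x - y)) for two same-shape arrays."""
    shape = X.shape
    # positions p_l of all coordinates, enumerated in the same (C) order as ravel()
    pos = np.indices(shape).reshape(len(shape), -1).T
    sq = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(-1)   # ||p_l - p_m||^2
    G = np.exp(-sq / (2 * sigma1 ** 2)) / (np.sqrt(2 * np.pi) * sigma1)
    d = (X - Y).ravel()
    return float(np.sqrt(d @ G @ d))

rng = np.random.default_rng(0)
A, B = rng.random((8, 8)), rng.random((8, 8))
print(tensor_distance(A, B))        # TD between two random 8x8 "images"
```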
In fact, the Euclidean distance can be seen as a special case of the proposed TD. If we set the metric matrix to the identity matrix, i.e., $G = I$, which means we only consider distances between corresponding coordinates of two high-order data points in tensor space, then TD reduces to the Euclidean distance.
Since $G$ is a real symmetric positive definite matrix, we can easily decompose it as

$$G = G^{1/2} G^{1/2},$$

where $G^{1/2}$ is also a real symmetric matrix, defined as

$$G^{1/2} = U_G \Lambda_G^{1/2} U_G^T .$$

Here, $\Lambda_G$ is a diagonal matrix whose elements are the eigenvalues of $G$, and $U_G$ is an orthogonal matrix whose column vectors are the eigenvectors of $G$. Applying the transformation $G^{1/2}$ to the vector representations $x$ and $y$, i.e., $x' = G^{1/2} x$ and $y' = G^{1/2} y$, reduces the TD between $x$ and $y$ to the traditional Euclidean distance between $x'$ and $y'$:

$$d_{TD}(\mathcal{X}, \mathcal{Y}) = \sqrt{(x - y)^T G (x - y)} = \sqrt{(x' - y')^T (x' - y')} = \|x' - y'\|_2 .$$
Therefore, it is easy to embed TD into a general learning process: we only need to apply the transformation $G^{1/2}$ to the original data and then use the transformed data in any Euclidean distance based learning algorithm.
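A short sketch of this reduction, using a toy symmetric positive definite matrix in place of the Gaussian $G$ built above (sizes and seeds are illustrative):

```python
import numpy as np

def metric_sqrt(G):
    """G^{1/2} = U_G Lambda_G^{1/2} U_G^T for a symmetric positive definite G."""
    w, U = np.linalg.eigh(G)
    return U @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ U.T

rng = np.random.default_rng(1)
M = rng.random((16, 16))
G = M @ M.T + 16 * np.eye(16)       # toy SPD matrix standing in for the Gaussian G
Gh = metric_sqrt(G)
x, y = rng.random(16), rng.random(16)
d_td = np.sqrt((x - y) @ G @ (x - y))
d_euclid = np.linalg.norm(Gh @ x - Gh @ y)
print(np.isclose(d_td, d_euclid))   # True: TD reduces to Euclidean distance
```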

Multilinear multidimensional scaling based on tensor distance

Given $n$ data points $\mathcal{X}_1, \mathcal{X}_2, \ldots, \mathcal{X}_n$ in the tensor space $\mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$, tensor distance based multilinear multidimensional scaling (TD-MMDS) aims to find, without unfolding the input data into vectors, $N$ transformation matrices $V_k \in \mathbb{R}^{I_k \times I_k'}$ ($I_k' \ll I_k$, $k = 1, \ldots, N$) such that the multilinear transformation $\mathcal{Y}_j = \mathcal{X}_j \times_1 V_1^T \times_2 \cdots \times_N V_N^T$ ($j = 1, \ldots, n$) yields $n$ low-dimensional data points $\mathcal{Y}_1, \mathcal{Y}_2, \ldots, \mathcal{Y}_n$ in $\mathbb{R}^{I_1' \times I_2' \times \cdots \times I_N'}$. According to the graph embedding framework, the objective function of TD-MMDS can be expressed as:
$$\min_{V_1, \ldots, V_N} J(V_1, \ldots, V_N) = \sum_{i,j=1}^{n} \left( \langle \mathcal{Y}_i, \mathcal{Y}_j \rangle - \tau(D_G)_{ij} \right)^2 , \tag{8}$$
where $\tau(D_G) = -HSH/2$, with $S_{ij} = d_{TD}^2(\mathcal{X}_i, \mathcal{X}_j)$, where $d_{TD}(\mathcal{X}_i, \mathcal{X}_j)$ is the tensor distance between the data points $\mathcal{X}_i$ and $\mathcal{X}_j$, and $H = I - ee^T/n$, where $I$ is the identity matrix and $e$ is the $n$-dimensional all-ones vector. The objective function (8) can be rewritten as:
$$J(V_1, \ldots, V_N) = \sum_{i,j=1}^{n} \left( \langle \mathcal{X}_i \times_1 V_1^T \times_2 \cdots \times_N V_N^T ,\; \mathcal{X}_j \times_1 V_1^T \times_2 \cdots \times_N V_N^T \rangle - \tau(D_G)_{ij} \right)^2 .$$
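For reference, here is a minimal numpy sketch of the double-centering step that produces $\tau(D_G)$ from a pairwise TD matrix; the function name is our own.

```python
# tau(D_G) = -H S H / 2, with S_ij = d_TD(X_i, X_j)^2 and H = I - ee^T/n.
import numpy as np

def gram_from_distances(D):
    """Classical-MDS-style Gram matrix tau(D_G) from a pairwise distance matrix D."""
    n = D.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix, e = all-ones vector
    return -H @ (D ** 2) @ H / 2
```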
As far as we know, this high-order nonlinear programming problem has no closed-form solution. Instead of solving it directly, we employ an iterative strategy [2], [5], [6] to find a locally optimal solution. Before discussing the iterative strategy, we give the following theorem:
**Theorem 1:** Assume $V_1, V_2, \ldots, V_{k-1}, V_{k+1}, \ldots, V_N$ are fixed. Then the $V_k$ minimizing the objective function $J(V_1, \ldots, V_N)$ consists of the first $I_k'$ eigenvectors of the matrix

$$M_k = \sum_{i,j=1}^{n} \tau(D_G)_{ij}\, X_{(k)}^{i}\, \tilde{V}_k \tilde{V}_k^T \left( X_{(k)}^{j} \right)^T$$

corresponding to the first $I_k'$ largest eigenvalues, where $X_{(k)}^{i}$ is the mode-$k$ unfolding of the tensor $\mathcal{X}_i$ and

$$\tilde{V}_k = V_N \otimes \cdots \otimes V_{k+1} \otimes V_{k-1} \otimes \cdots \otimes V_1$$

is the Kronecker product of the remaining transformation matrices.
The above theorem can be proved straightforwardly using the properties of multilinear algebra and the matrix trace; similar proofs can be found in [5] and [6]. According to Theorem 1, if $V_1, V_2, \ldots, V_{k-1}, V_{k+1}, \ldots, V_N$ are fixed, the optimal $V_k$ can be obtained by a simple eigendecomposition. The iterative strategy is therefore as follows. First, we fix $V_2, \ldots, V_N$ and obtain $V_1$ by minimizing the objective function $J(V_1)$. Then we fix $V_1, V_3, \ldots, V_N$ and obtain the optimal $V_2$ by minimizing $J(V_2)$, and so on. Finally, $V_1, V_2, \ldots, V_{N-1}$ are fixed and the optimal $V_N$ is obtained by minimizing $J(V_N)$. These steps are repeated until a termination condition is met, yielding a locally optimal solution. The detailed procedure of TD-MMDS is described in Algorithm 1.
[Algorithm 1: TD-MMDS (figure not preserved)]
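Since Algorithm 1 is not reproduced here, the following self-contained numpy sketch illustrates one plausible implementation of the alternating procedure described above. All names, the identity-based initialization of the $V_k$, the fixed iteration count, and the toy data are our own illustrative choices, not the authors' reference implementation.

```python
import numpy as np

def tensor_distance_matrix(X, sigma1=1.0):
    """Pairwise TDs for n tensors stacked along axis 0 of X (shape: n, I1, ..., IN)."""
    n, shape = X.shape[0], X.shape[1:]
    pos = np.indices(shape).reshape(len(shape), -1).T            # coordinate positions p_l
    sq = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(-1)      # ||p_l - p_m||^2
    G = np.exp(-sq / (2 * sigma1 ** 2)) / (np.sqrt(2 * np.pi) * sigma1)
    V = X.reshape(n, -1)                                         # vectorized tensors
    diff = V[:, None, :] - V[None, :, :]
    return np.sqrt(np.einsum('ijl,lm,ijm->ij', diff, G, diff))

def td_mmds(X, dims, sigma1=1.0, n_iter=5):
    """X: (n, I1, ..., IN) stacked tensors; dims: target sizes (I1', ..., IN')."""
    n, shape = X.shape[0], X.shape[1:]
    N = len(shape)
    D = tensor_distance_matrix(X, sigma1)
    H = np.eye(n) - np.ones((n, n)) / n
    tau = -H @ (D ** 2) @ H / 2                                  # tau(D_G) = -HSH/2
    V = [np.eye(Ik)[:, :dk] for Ik, dk in zip(shape, dims)]      # simple initialization
    for _ in range(n_iter):
        for k in range(N):
            # project each tensor by every fixed V_m (m != k), then unfold along mode k
            P = []
            for Xi in X:
                Y = Xi
                for m in range(N):
                    if m != k:
                        Y = np.moveaxis(np.tensordot(V[m].T, Y, axes=(1, m)), 0, m)
                P.append(np.moveaxis(Y, k, 0).reshape(shape[k], -1))
            P = np.stack(P)                                      # (n, Ik, prod of other Im')
            Q = np.einsum('ij,jab->iab', tau, P)                 # Q_i = sum_j tau_ij P_j
            Mk = np.einsum('iab,icb->ac', P, Q)                  # sum_ij tau_ij P_i P_j^T
            w, U = np.linalg.eigh(Mk)                            # eigenvalues ascending
            V[k] = U[:, -dims[k]:]                               # top Ik' eigenvectors
    return V

# toy usage: 20 random 8x8 "images" embedded into 3x3
rng = np.random.default_rng(0)
Vs = td_mmds(rng.random((20, 8, 8)), dims=(3, 3))
print([v.shape for v in Vs])                                     # [(8, 3), (8, 3)]
```

Note that projecting by the fixed $V_m$ before unfolding is equivalent to multiplying the mode-$k$ unfolding by the Kronecker product $\tilde{V}_k$, which avoids ever forming $\tilde{V}_k$ explicitly.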
