[Image Evaluation Metrics] PSNR, LPIPS, LMD, SSIM, FID

Evaluation Metrics

1. PSNR

PSNR (Peak Signal-to-Noise Ratio) is a metric used to measure image or signal quality. It is usually used to evaluate the similarity between a processed image and the original image, especially in the fields of image compression and reconstruction. The higher the PSNR value, the more similar the two images and the better the quality.

1.1 Calculation formula

$$\text{PSNR} = 10 \cdot \log_{10}\left(\frac{\text{MAX}^2}{\text{MSE}}\right)$$
where:

  • MAX is the maximum possible pixel value of the image (usually 255 for an 8-bit image).
  • MSE is the mean squared error (Mean Squared Error): the pixel-wise differences between the two images, squared and averaged.

1.2 Code for calculating PSNR

import cv2
import numpy as np

# Read the original image and the reconstructed image
original_image = cv2.imread('original_image.jpg')
reconstructed_image = cv2.imread('reconstructed_image.jpg')

# Compute the mean squared error (MSE); cast to float first so the
# uint8 subtraction does not overflow
mse = np.mean((original_image.astype(np.float64) - reconstructed_image.astype(np.float64)) ** 2)

# Compute PSNR; for an 8-bit image the maximum pixel value is 255
# (if MSE is 0, the images are identical and PSNR is infinite)
max_pixel_value = 255
psnr = 10 * np.log10((max_pixel_value ** 2) / mse)

print(f"PSNR: {psnr} dB")

In the example above, an original image and a reconstructed image are read with the OpenCV library. The key to computing PSNR is computing the MSE and then applying the PSNR formula. PSNR is expressed in decibels (dB), with higher values indicating better image quality.
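
As a quick sanity check of the formula: for an 8-bit image ($\text{MAX} = 255$) and a hypothetical MSE of 100,

$$\text{PSNR} = 10 \cdot \log_{10}\left(\frac{255^2}{100}\right) = 10 \cdot \log_{10}(650.25) \approx 28.13 \text{ dB}$$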

2. LPIPS

LPIPS (Learned Perceptual Image Patch Similarity) is a metric used to measure the perceptual similarity between two images. Unlike the traditional PSNR and SSIM, LPIPS is learned with deep learning methods and better matches human visual perception. Lower LPIPS values indicate that the two images are perceptually more similar.

2.1 LPIPS calculation formula

LPIPS is not given by a simple closed-form formula; it is computed with a deep neural network. An LPIPS model takes two images as input, compares their deep features, and outputs a perceptual similarity score between them. The architecture and weights of the LPIPS model are obtained through large-scale training on human perceptual judgments, which lets it capture perceptual information about the images.
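
To make that concrete, the following is a minimal sketch of the computation LPIPS performs, not the official implementation: features from several layers of a pretrained backbone (e.g., AlexNet) are unit-normalized along the channel dimension, compared with a learned per-channel weighting, and averaged over spatial positions. The function and argument names here are hypothetical.

import torch

def lpips_sketch(feats1, feats2, linear_weights):
    """Hypothetical sketch of the LPIPS distance.

    feats1, feats2: lists of feature maps of shape (N, C, H, W), taken from
    several layers of a pretrained backbone for each of the two images.
    linear_weights: learned per-channel weights, one (1, C, 1, 1) tensor per layer.
    """
    distance = 0.0
    for f1, f2, w in zip(feats1, feats2, linear_weights):
        # Unit-normalize each feature vector along the channel dimension
        f1 = f1 / (f1.norm(dim=1, keepdim=True) + 1e-10)
        f2 = f2 / (f2.norm(dim=1, keepdim=True) + 1e-10)
        # Learned per-channel weighting of the squared difference,
        # summed over channels and averaged over spatial positions
        distance = distance + (w * (f1 - f2) ** 2).sum(dim=1).mean(dim=(1, 2))
    return distance  # shape (N,): one perceptual distance per image pair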

2.2 Code for calculating LPIPS

import torch
import torchvision.transforms as transforms
from PIL import Image
import lpips  # pip install lpips

# Load a pretrained LPIPS model (AlexNet backbone)
lpips_model = lpips.LPIPS(net='alex')

# Load the two images
image1 = Image.open('image1.jpg').convert('RGB')
image2 = Image.open('image2.jpg').convert('RGB')

# Preprocess the images; LPIPS expects tensors scaled to [-1, 1]
preprocess = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.ToTensor(),                                # [0, 1]
    transforms.Normalize(mean=[0.5] * 3, std=[0.5] * 3),  # -> [-1, 1]
])

image1 = preprocess(image1).unsqueeze(0)
image2 = preprocess(image2).unsqueeze(0)

# Compute the perceptual distance with the LPIPS model
with torch.no_grad():
    similarity_score = lpips_model(image1, image2)

print(f"LPIPS distance: {similarity_score.item()}")

First, a pretrained LPIPS model (here with the alex backbone) is loaded via the lpips package. The two images are then loaded and preprocessed to match the model's expected input format, including scaling to the [-1, 1] range. Finally, the LPIPS model computes the perceptual distance between the two images; lower scores indicate the images are perceptually closer.

3. LMD

LMD (Landmark Distance) is a metric used to evaluate the quality of generated facial images. It measures the distance between the facial landmarks of a generated face image and those of the corresponding real face image, assessing how accurately the generated image reproduces the facial features. LMD is commonly used in face synthesis and generation tasks.

The general steps for calculating LMD are as follows:

  1. Detect the locations of facial landmarks (e.g., eyes, nose, mouth) in both the generated image and the real image (a detection sketch follows this list).
  2. Compute the Euclidean distance between each pair of corresponding landmarks.
  3. Sum or average these distances to obtain the final LMD value.
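
As a sketch of the detection step, one option is dlib's 68-point shape predictor; the shape_predictor_68_face_landmarks.dat model file must be downloaded separately, and any other landmark detector would work just as well (the helper name here is illustrative):

import cv2
import dlib

# Face detector and dlib's 68-point landmark predictor
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor('shape_predictor_68_face_landmarks.dat')

def detect_landmarks(image_path):
    """Return (x, y) landmark coordinates for the first detected face, or []."""
    gray = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2GRAY)
    faces = detector(gray)
    if not faces:
        return []
    shape = predictor(gray, faces[0])
    return [(shape.part(i).x, shape.part(i).y) for i in range(shape.num_parts)]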

3.1 Calculation method of LMD

The following is the calculation formula of LMD (assuming there are N feature points):

$$LMD = \frac{1}{N} \sum_{i=1}^{N} \sqrt{(x_{g_i} - x_{r_i})^2 + (y_{g_i} - y_{r_i})^2}$$

where:

  • $LMD$ is the value of the Landmark Distance.
  • $N$ is the number of landmark points.
  • $x_{g_i}$ and $y_{g_i}$ are the x and y coordinates of the $i$-th landmark on the generated image.
  • $x_{r_i}$ and $y_{r_i}$ are the x and y coordinates of the $i$-th landmark on the real image.

3.2 Code for calculating LMD

The following is a simple example of calculating LMD in Python, assuming the landmark coordinates of the two images are already available as lists:

import numpy as np

def calculate_lmd(landmarks_generated, landmarks_real):
    num_landmarks = len(landmarks_generated)
    lmd = 0.0
    
    for i in range(num_landmarks):
        x_g, y_g = landmarks_generated[i]
        x_r, y_r = landmarks_real[i]
        # Euclidean distance between corresponding landmarks
        distance = np.sqrt((x_g - x_r)**2 + (y_g - y_r)**2)
        lmd += distance
    
    lmd /= num_landmarks  # average distance
    return lmd

# Example usage
landmarks_generated = [(10, 20), (30, 40), (50, 60)]  # landmarks on the generated image
landmarks_real = [(15, 25), (35, 45), (55, 65)]  # landmarks on the real image

lmd_score = calculate_lmd(landmarks_generated, landmarks_real)
print("LMD Score:", lmd_score)

4. SSIM

SSIM (Structural Similarity Index) is a metric used to measure the similarity between two images. It considers not only brightness and contrast but also structural information. SSIM is commonly used in areas such as image quality assessment, image compression, and image enhancement.

4.1 Calculation formula of SSIM

$$\text{SSIM}(x, y) = \frac{(2\mu_x\mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}$$

where:

  • $x$ and $y$ are the two images being compared.
  • $\mu_x$ and $\mu_y$ are the mean brightness of images $x$ and $y$.
  • $\sigma_x^2$ and $\sigma_y^2$ are the brightness variances of $x$ and $y$.
  • $\sigma_{xy}$ is the brightness covariance between $x$ and $y$.
  • $C_1$ and $C_2$ are small positive constants that stabilize the division.

The value range of SSIM is usually between -1 and 1. The closer the value is to 1, the more similar the two images are and the higher the quality.
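
To connect the formula to code, here is a minimal sketch that applies it globally to two whole images. Practical implementations (including the scikit-image function used below) compute SSIM over local sliding windows and average the results, so values will differ somewhat; the common choices C1 = (0.01·L)² and C2 = (0.03·L)², where L is the data range, are assumed.

import numpy as np

def ssim_global(x, y, data_range=255.0):
    """Global SSIM per the formula above (sketch; libraries use local windows)."""
    C1 = (0.01 * data_range) ** 2
    C2 = (0.03 * data_range) ** 2
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + C1) * (2 * cov_xy + C2)) / (
        (mu_x ** 2 + mu_y ** 2 + C1) * (var_x + var_y + C2)
    )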

4.2 Code for calculating SSIM

from skimage.metrics import structural_similarity as ssim
from skimage import io

# Read the two images as grayscale (floats in the range [0, 1])
image1 = io.imread('image1.jpg', as_gray=True)
image2 = io.imread('image2.jpg', as_gray=True)

# Compute SSIM; for float images the data range must be given explicitly
ssim_score = ssim(image1, image2, data_range=1.0)

print(f"SSIM: {ssim_score}")

Here two grayscale images are compared (color images can also be compared by passing the channel_axis argument), and the structural_similarity function from the Scikit-Image library computes the SSIM between them. The result is a floating-point number between -1 and 1, with higher values indicating that the two images are more similar.

5. FID

FID (Fréchet Inception Distance) is a metric used to evaluate the performance of generative models, and is especially widely used with Generative Adversarial Networks (GANs). It measures the difference between the distributions of generated and real images, capturing both the quality and the diversity of the generated images. A lower FID value indicates that the generated images are closer to the real image distribution, so the generative model is generally considered better.

5.1 FID calculation formula

The calculation of FID is based on the Fréchet distance in the feature vector space between two image distributions. The following is the calculation formula of FID:
$$\text{FID}(P, G) = \lVert \mu_P - \mu_G \rVert^2 + \mathrm{Tr}\left(\Sigma_P + \Sigma_G - 2\sqrt{\Sigma_P \Sigma_G}\right)$$
where:

  • $P$ is the set of feature vectors of the real images, usually taken from an intermediate layer of an Inception network.
  • $G$ is the set of feature vectors of the generated images, represented in the same way.
  • $\mu_P$ and $\mu_G$ are the means of the feature-vector sets $P$ and $G$.
  • $\Sigma_P$ and $\Sigma_G$ are the covariance matrices of the feature-vector sets $P$ and $G$.
  • $\mathrm{Tr}(\cdot)$ denotes the trace of a matrix, and $\sqrt{\Sigma_P \Sigma_G}$ is the matrix square root of the product of the two covariance matrices.

5.2 Code for calculating FID

To calculate FID, you first extract feature vectors from the real and generated images and then evaluate the formula above. The feature vectors are typically obtained by a forward pass through an Inception network. Here is example code that computes FID from given feature statistics using Python, NumPy, and SciPy:

import numpy as np
from scipy.linalg import sqrtm

# Means and covariance matrices of two example feature-vector sets
mu_P = np.array([0.5, 0.5])
sigma_P = np.array([[1.0, 0.5], [0.5, 1.0]])

mu_G = np.array([0.8, 0.7])
sigma_G = np.array([[1.2, 0.4], [0.4, 1.3]])

# Matrix square root of the covariance product; sqrtm can return a
# complex result due to numerical error, so keep only the real part
covmean = sqrtm(sigma_P @ sigma_G)
if np.iscomplexobj(covmean):
    covmean = covmean.real

# Compute FID
diff = mu_P - mu_G
fid = np.dot(diff, diff) + np.trace(sigma_P + sigma_G - 2.0 * covmean)

print(f"FID Value: {fid}")

Here the means and covariance matrices of two example feature-vector sets are given directly, and the FID between them is computed. Note that in practice these statistics are estimated from the feature vectors of the real and generated images, which are usually extracted with a pretrained model in a deep learning framework such as PyTorch or TensorFlow.
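
As a sketch of that extraction step, assuming torchvision's pretrained Inception-v3 as the backbone (the weights argument requires a recent torchvision; dedicated packages such as pytorch-fid handle resizing and normalization more carefully):

import numpy as np
import torch
from PIL import Image
from torchvision import models, transforms

# Load Inception-v3 and drop the classifier to obtain 2048-d pooled features
model = models.inception_v3(weights=models.Inception_V3_Weights.DEFAULT)
model.fc = torch.nn.Identity()
model.eval()

preprocess = transforms.Compose([
    transforms.Resize((299, 299)),  # Inception-v3 input size
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def feature_statistics(image_paths):
    """Return the mean vector and covariance matrix of Inception features."""
    feats = []
    for path in image_paths:
        img = preprocess(Image.open(path).convert('RGB')).unsqueeze(0)
        feats.append(model(img).squeeze(0).numpy())
    feats = np.stack(feats)
    return feats.mean(axis=0), np.cov(feats, rowvar=False)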

First published at: https://zhuanlan.zhihu.com/p/658827245
