PyTorch Machine Learning (7): YOLOv5 Image Augmentation with Affine and Perspective Transforms


Preface

There is a lot worth learning from YOLOv5's image augmentation techniques; this post documents its affine and perspective transforms.


1. Translation transform

The translation matrix is

    [ 1   0   tx ]
    [ 0   1   ty ]
    [ 0   0   1  ]

The code is as follows:

T = np.eye(3)
# Note: the image origin (0, 0) is the top-left corner; a positive x translation shifts the content to the right, and a positive y translation shifts it down
T[0, 2] = random.uniform(0.5 - translate, 0.5 + translate) * width  # x translation (pixels)
T[1, 2] = random.uniform(0.5 - translate, 0.5 + translate) * height  # y translation (pixels)
# Apply the affine transform and fill the border with gray (114, 114, 114)
im = cv2.warpAffine(im, T[:2], dsize=(width, height), borderValue=(114, 114, 114))
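
To make the sign convention concrete, here is a minimal sketch (with assumed translation values, not the YOLOv5 defaults) that applies the matrix to the top-left corner in homogeneous coordinates; a positive T[0, 2] moves content to the right and a positive T[1, 2] moves it down:

import numpy as np

T = np.eye(3)
T[0, 2], T[1, 2] = 50, -30            # assumed values: +50 px in x, -30 px in y
corner = np.array([0, 0, 1.0])        # top-left corner in homogeneous coordinates
print(T @ corner)                     # [ 50. -30.   1.] -> 50 px right, 30 px up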

2. Rotation transform

With rotation angle θ and scale factor s, the rotation-and-scale matrix about the origin is

    [  s·cosθ   s·sinθ   0 ]
    [ -s·sinθ   s·cosθ   0 ]
    [  0        0        1 ]

The code is as follows:

R = np.eye(3)
# Draw a rotation angle uniformly from [-degrees, degrees]
a = random.uniform(-degrees, degrees)
# s is the image scale factor
s = random.uniform(1 - scale, 1 + scale)
# getRotationMatrix2D builds the combined rotation-and-scale matrix
R[:2] = cv2.getRotationMatrix2D(angle=a, center=(0, 0), scale=s)
# Apply the affine transform and fill the border with gray
im = cv2.warpAffine(im, R[:2], dsize=(width, height), borderValue=(114, 114, 114))
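
As a quick sanity check (with assumed angle and scale values), the 2x3 matrix returned by getRotationMatrix2D with center=(0, 0) matches the rotation-and-scale matrix written above:

import math
import cv2
import numpy as np

a, s = 30, 1.2                                   # assumed angle (degrees) and scale
R = cv2.getRotationMatrix2D(angle=a, center=(0, 0), scale=s)
manual = np.array([[ s * math.cos(math.radians(a)), s * math.sin(math.radians(a)), 0],
                   [-s * math.sin(math.radians(a)), s * math.cos(math.radians(a)), 0]])
print(np.allclose(R, manual))                    # True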

3. Shear transform

With shear angles γx and γy, the shear matrix is

    [ 1         tan(γx)   0 ]
    [ tan(γy)   1         0 ]
    [ 0         0         1 ]

The code is as follows:

# Shear
S = np.eye(3)
S[0, 1] = math.tan(random.uniform(-shear, shear) * math.pi / 180)  # x shear (deg)
S[1, 0] = math.tan(random.uniform(-shear, shear) * math.pi / 180)  # y shear (deg)
# Apply the affine transform and fill the border with gray
im = cv2.warpAffine(im, S[:2], dsize=(width, height), borderValue=(114, 114, 114))
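
A small sketch (with an assumed 10-degree x shear) of what this matrix does to a single point: the x coordinate is offset in proportion to the y coordinate, which is what slants the image:

import math
import numpy as np

S = np.eye(3)
S[0, 1] = math.tan(math.radians(10))   # x shear only, assumed 10 degrees
point = np.array([100, 200, 1.0])      # homogeneous (x, y)
print(S @ point)                       # x becomes 100 + 200*tan(10°) ≈ 135.3, y is unchanged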

 

4. Perspective transform

An affine transformation is just a special case of a perspective transformation, which is why YOLOv5 generally applies an affine warp directly instead of a full perspective warp.

# Perspective
P = np.eye(3)
P[2, 0] = random.uniform(-perspective, perspective)  # x perspective (about y)
P[2, 1] = random.uniform(-perspective, perspective)  # y perspective (about x)
# Apply the perspective transform
im = cv2.warpPerspective(im, P, dsize=(width, height), borderValue=(114, 114, 114))
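
To illustrate that claim, here is a minimal sketch (with a dummy image and a pure translation, both assumed for the example): when the perspective terms P[2, 0] and P[2, 1] are zero, warpPerspective produces the same result as warpAffine:

import cv2
import numpy as np

im = np.full((480, 640, 3), 200, dtype=np.uint8)        # dummy image
M = np.eye(3)
M[0, 2], M[1, 2] = 20, 10                               # pure translation, no perspective terms

out_p = cv2.warpPerspective(im, M, dsize=(640, 480), borderValue=(114, 114, 114))
out_a = cv2.warpAffine(im, M[:2], dsize=(640, 480), borderValue=(114, 114, 114))
print(np.array_equal(out_p, out_a))                     # expected True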

You can see that the perspective transform builds on the previous transforms: in YOLOv5 all of these matrices are chained together by matrix multiplication into a single transform, as the combined code below shows.

 

5. Complete code

import math
import random

import cv2
import numpy as np


def random_perspective(im, targets=(), segments=(), degrees=10, translate=.1, scale=.1, shear=10,
                       border=(0, 0)):
    # torchvision.transforms.RandomAffine(degrees=(-10, 10), translate=(.1, .1), scale=(.9, 1.1), shear=(-10, 10))
    # targets = [cls, xyxy]

    height = im.shape[0] + border[0] * 2  # shape(h,w,c)
    width = im.shape[1] + border[1] * 2

    # Center
    # YOLOv5 first translates the image so that its center sits at the origin (top-left corner)
    C = np.eye(3)
    C[0, 2] = -im.shape[1] / 2  # x translation (pixels)
    C[1, 2] = -im.shape[0] / 2  # y translation (pixels)

    # Rotation and Scale
    R = np.eye(3)
    a = random.uniform(-degrees, degrees)
    # s is the image scale factor
    s = random.uniform(1 - scale, 1 + scale)
    R[:2] = cv2.getRotationMatrix2D(angle=a, center=(0, 0), scale=s)

    # Shear
    S = np.eye(3)
    S[0, 1] = math.tan(random.uniform(-shear, shear) * math.pi / 180)  # x shear (deg)
    S[1, 0] = math.tan(random.uniform(-shear, shear) * math.pi / 180)  # y shear (deg)

    # Translation
    T = np.eye(3)
    T[0, 2] = random.uniform(0.5 - translate, 0.5 + translate) * width  # x translation (pixels)
    T[1, 2] = random.uniform(0.5 - translate, 0.5 + translate) * height  # y translation (pixels)

    # Combined rotation matrix
    M = T @ S @ R @ C  # order of operations (right to left) is IMPORTANT
    if (border[0] != 0) or (border[1] != 0) or (M != np.eye(3)).any():  # image changed
        # The affine path is used by default
        im = cv2.warpAffine(im, M[:2], dsize=(width, height), borderValue=(114, 114, 114))

    return im
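
The snippet above only warps the image. In practice the same matrix M must also be applied to the labels, otherwise the boxes no longer line up with the warped image. Below is a simplified sketch of that step, not the full YOLOv5 implementation (boxes are assumed to be [cls, x1, y1, x2, y2]; the real code additionally handles segments, the perspective division, and filters out degenerate boxes):

import numpy as np

def warp_boxes(boxes, M, width, height):
    # boxes: (n, 5) array of [cls, x1, y1, x2, y2]; M: 3x3 affine matrix
    boxes = np.asarray(boxes, dtype=np.float32).copy()
    n = len(boxes)
    # use all four corners of each box so that rotation and shear are handled correctly
    xy = np.ones((n * 4, 3), dtype=np.float32)
    xy[:, :2] = boxes[:, [1, 2, 3, 4, 1, 4, 3, 2]].reshape(n * 4, 2)  # x1y1, x2y2, x1y2, x2y1
    xy = (xy @ M.T)[:, :2].reshape(n, 8)          # transform corners, drop homogeneous coord
    # new axis-aligned box = min/max over the transformed corners
    x, y = xy[:, [0, 2, 4, 6]], xy[:, [1, 3, 5, 7]]
    new = np.stack((x.min(1), y.min(1), x.max(1), y.max(1)), axis=1)
    # clip to the output image size
    new[:, [0, 2]] = new[:, [0, 2]].clip(0, width)
    new[:, [1, 3]] = new[:, [1, 3]].clip(0, height)
    boxes[:, 1:5] = new
    return boxes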


Reprinted from: blog.csdn.net/lzzzzzzm/article/details/120081966