AI Image Basics 02

Table of Contents

Image coordinate system
Digital image

Image coordinate system

In the earlier article on data labeling, each label kept four coordinate points. How do these coordinate points represent a position in the picture? To describe the position of a point or a shape, we need the concept of a coordinate system, so today we look at the coordinate system of an image. The first coordinate system most of us meet is the Cartesian coordinate system, as shown below:

[Figure 01: Cartesian coordinate system]

As shown below, the pixel coordinate system u-v is a Cartesian coordinate system whose origin is the top-left corner of the image. The abscissa u and the ordinate v of a pixel are its column number and row number in the image array, respectively.

[Figure 02: image coordinate system]

In OpenCV, u corresponds to x and v corresponds to y.

Because (u, v) only gives the column and row of a pixel, it does not express the pixel's position in physical units. We therefore also define an image coordinate system x-y in physical units (e.g. mm). The intersection of the camera's optical axis with the image plane (usually at the center of the image plane, and called the principal point of the image) is taken as the origin O1 of this system, with the x-axis parallel to the u-axis and the y-axis parallel to the v-axis. Let (u0, v0) be the coordinates of O1 in the u-v system, and let dx and dy be the physical size of one pixel along the x-axis and the y-axis. The coordinates of each pixel in the u-v system and in the x-y system are then related as follows:

[Figure 03: coordinate system conversion]

u = x/dx + u0
v = y/dy + v0

In the formula above, if the physical coordinate system is in mm, then dx is in mm/px and x/dx is in px.
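As a rough illustration of this conversion, the sketch below assumes the relation is the standard one, u = x/dx + u0 and v = y/dy + v0, and inverts it for the opposite direction. The function names and the numeric values (pixel size, principal point) are made up for the example.

```python
def physical_to_pixel(x, y, dx, dy, u0, v0):
    """Map physical image-plane coordinates (mm) to pixel coordinates (px)."""
    u = x / dx + u0   # mm / (mm/px) = px, then shift by the principal point
    v = y / dy + v0
    return u, v


def pixel_to_physical(u, v, dx, dy, u0, v0):
    """Inverse mapping: pixel coordinates (px) back to physical coordinates (mm)."""
    x = (u - u0) * dx
    y = (v - v0) * dy
    return x, y


# Hypothetical camera: 0.01 mm per pixel, principal point at pixel (320, 240).
dx, dy, u0, v0 = 0.01, 0.01, 320.0, 240.0

u, v = physical_to_pixel(0.5, -0.2, dx, dy, u0, v0)
print(u, v)
print(pixel_to_physical(u, v, dx, dy, u0, v0))
```

A point with positive x lands to the right of the principal point, and a point with negative y lands above it, because the v-axis points downward.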

For convenience, this relation is commonly written in matrix form using homogeneous coordinates:

[Figure 04: matrix form]
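A minimal NumPy sketch of the matrix form follows. It assumes the standard homogeneous matrix with 1/dx and 1/dy on the diagonal and (u0, v0) in the last column, applied to the vector (x, y, 1). The values of dx, dy, u0 and v0 are illustrative only.

```python
import numpy as np

# Hypothetical values: pixel size in mm/px and principal point in px.
dx, dy, u0, v0 = 0.01, 0.01, 320.0, 240.0

# Homogeneous transform from physical (x, y, 1) to pixel (u, v, 1).
K = np.array([
    [1.0 / dx, 0.0,      u0],
    [0.0,      1.0 / dy, v0],
    [0.0,      0.0,      1.0],
])

xy1 = np.array([0.5, -0.2, 1.0])   # physical point in mm, homogeneous form
uv1 = K @ xy1                      # pixel coordinates, homogeneous form
print(uv1)

# The inverse matrix maps pixel coordinates back to physical coordinates.
print(np.linalg.inv(K) @ uv1)
```

One practical benefit of the matrix form is that several transforms can be chained by multiplying their matrices.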

The material above may be somewhat difficult. What is most useful for us in practice is the following:

[Figure 05: image coordinate example]

In this coordinate system, rows and columns correspond to height and width as follows:

row=height=y
col=width=x
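A small NumPy sketch of this correspondence: array indexing takes (row, col), that is (y, x), while the shape is reported as (height, width, channels). The array size and the chosen pixel are arbitrary.

```python
import numpy as np

height, width = 4, 6
img = np.zeros((height, width, 3), dtype=np.uint8)  # rows = height, cols = width

x, y = 5, 1                  # a point given as (x, y)
img[y, x] = (255, 255, 255)  # NumPy indexing is [row, col], i.e. [y, x]

rows, cols = img.shape[:2]
print(rows, cols)            # height first, then width
print(np.argwhere(img[:, :, 0] == 255))  # [row, col] of the pixel that was set
```

Swapping the two indices (img[x, y]) would raise an IndexError here, because x = 5 is outside the 4 rows.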

Getting the image size with OpenCV

Taking the picture from the example above, the size of the original image is:

[Figure 06: sample image size]

The sample code is as follows:

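import cv2
import numpy as np
import os

def GetImgFile(path,imgExtName=(".png",".bmp",".jpg",".jpeg",".gif")):
    # Walk `path` (the original walked the global imgPath by mistake) and
    # return image paths relative to it, so files in subfolders resolve too
    imgFileList=[ os.path.relpath(os.path.join(r,imgFile),path)
                  for r,s,fs in os.walk(path) for imgFile in fs
                  if os.path.isfile(os.path.join(r,imgFile)) and
        os.path.splitext(imgFile)[-1].lower() in imgExtName ]
    return imgFileList

def GetImgHeightAndWidth(path,imgList):
    tempDict={}
    for item in imgList:
        imgFullPath=os.path.join(path,item)
        # np.fromfile + cv2.imdecode also works for paths with non-ASCII characters
        img=cv2.imdecode(np.fromfile(imgFullPath,dtype=np.uint8),cv2.IMREAD_COLOR)
        if img is None:
            # Skip files that OpenCV could not decode
            continue
        # img.shape is (rows, cols, channels), i.e. (height, width, channels)
        imgHeight,imgWidth,_=img.shape
        tempDict[item]={"Height":imgHeight,"Width":imgWidth}
    return tempDict

if __name__ == "__main__":
    imgPath=r"F:\编程资料\编程工程\AI学习笔记\03图像知识\测试图片"
    imgShape=GetImgHeightAndWidth(imgPath,GetImgFile(imgPath))
    print(imgShape)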

The output is shown below:

{'TestImage.jpg': {'Height': 604, 'Width': 403}}

Digital image

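Put simply, image digitization is about how to store an image as an object that a computer can recognize and reconstruct. A digitized image is essentially a multi-dimensional matrix. For example, a common RGB image can be understood as three two-dimensional matrices stacked together, where each value in a matrix is the value of the corresponding color channel (0 to 255). A grayscale image is a single two-dimensional matrix. As shown below:

[Figure 07: BGR channel information]

As shown above, the image size is 604*403 (height * width), so there are three 604*403 matrices.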
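To make the three-matrix picture concrete, the sketch below builds a blank array with the same height and width as the sample image and takes out its three channel planes with NumPy slicing. No image file is read; the array is only a stand-in.

```python
import numpy as np

# Stand-in for the 604 (height) x 403 (width) sample image, all zeros.
img = np.zeros((604, 403, 3), dtype=np.uint8)

# Each channel is one 2-D matrix with the same height and width.
planes = [img[:, :, c] for c in range(img.shape[2])]
for plane in planes:
    print(plane.shape, plane.dtype)

# Total number of stored values = height * width * channels.
print(img.size == 604 * 403 * 3)
```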

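Matrices are used very widely in computer vision. The related concepts are introduced briefly below, without going into depth.

Matrix

In mathematics, a matrix is a collection of real or complex numbers arranged in a rectangular array, as shown below:

[Figure 08: matrix concept]

This is an (m+1)*(n+1) matrix, with row and column indices starting from 0.

Creating a matrix in code

In Python, the numpy module is commonly used to create and process matrices. Sample code: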

import numpy as np

mat=np.array(range(10,35)).reshape(5,5)
print(mat)

The resulting two-dimensional matrix is shown below. From a programming point of view, it is simply a two-dimensional array.

[[10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]
 [25 26 27 28 29]
 [30 31 32 33 34]]
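Continuing with the same matrix, here is a short sketch of zero-based indexing: the first index selects the row and the second selects the column.

```python
import numpy as np

mat = np.array(range(10, 35)).reshape(5, 5)

print(mat.shape)            # (5, 5): rows, columns
print(mat[0, 0])            # first row, first column -> 10
print(mat[1, 2])            # second row, third column -> 17
print(mat[4, 4])            # last row, last column -> 34
print(mat[:, 0].tolist())   # the whole first column -> [10, 15, 20, 25, 30]
```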

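Matrices and images

From the above you can probably already guess the relationship between matrices and images. Since an image can be represented by several matrices, it follows that we can create an image ourselves in code, for example: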

import cv2
import numpy as np

mat=np.array([
  [[255,0,0],[0,255,0],[0,0,255]],
  [[123,145,239],[10,100,134],[0,235,252]],
  [[23,45,12],[56,12,78],[128,150,12]]
],dtype=np.uint8)

cv2.namedWindow("Create Img",cv2.WINDOW_NORMAL)
cv2.imshow("Create Img",mat)
cv2.waitKey()

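The generated image looks like this:

In the 3*3 matrix above, each element mat[m][n] holds the B, G and R values of one pixel, as shown below:

[Figure 10: schematic view of the color matrix]

From the example above, you should now understand how an image is represented and stored as a matrix. The color images you see every day are all built this way, and the larger the image, the larger the matrix.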
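The same 3*3 image can be inspected pixel by pixel. The sketch below uses only NumPy and assumes OpenCV's B, G, R channel order when naming the values.

```python
import numpy as np

mat = np.array([
  [[255,0,0],[0,255,0],[0,0,255]],
  [[123,145,239],[10,100,134],[0,235,252]],
  [[23,45,12],[56,12,78],[128,150,12]]
], dtype=np.uint8)

print(mat.shape)           # (3, 3, 3): 3 rows, 3 columns, 3 channels

b, g, r = mat[0, 0]        # top-left pixel, values in B, G, R order
print(b, g, r)             # 255 0 0 -> pure blue when shown by OpenCV

b, g, r = mat[1, 2]        # second row, third column
print(b, g, r)             # 0 235 252
```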

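Channels

A channel is one of the color dimensions that each pixel has.

1. A grayscale image has only one color dimension, so it is single-channel.
2. An RGB color image has three color dimensions (R, G and B), so it has 3 channels.
3. An RGBA color image has four color dimensions (R, G, B and A, where A is alpha, the transparency), so it has 4 channels.

Most images can be represented by a 3-dimensional matrix.

Purely from a coding point of view, a 1-dimensional matrix is an ordinary array, a 3-dimensional matrix is a 3-dimensional array, and a multi-dimensional matrix is a multi-dimensional array.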
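A rough way to tell these cases apart in code is to look at the shape of the array. The helper name channel_count below is hypothetical, and the arrays are blank stand-ins rather than real images.

```python
import numpy as np

def channel_count(img):
    """Return the number of channels of an image stored as a NumPy array."""
    # A 2-D array (height, width) is single-channel; a 3-D array carries
    # the channel count in its last dimension.
    return 1 if img.ndim == 2 else img.shape[2]

gray = np.zeros((4, 6), dtype=np.uint8)       # grayscale
bgr = np.zeros((4, 6, 3), dtype=np.uint8)     # 3-channel color
bgra = np.zeros((4, 6, 4), dtype=np.uint8)    # color plus alpha

for name, img in (("gray", gray), ("bgr", bgr), ("bgra", bgra)):
    print(name, img.shape, channel_count(img))
```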

Reading the RGB channels of an image with OpenCV

The sample code is as follows:


import cv2
import numpy as np
import os

def GetImgFile(path,imgExtName=(".png",".bmp",".jpg",".jpeg",".gif")):
    # Walk `path` (the original walked the global imgPath by mistake) and
    # return image paths relative to it, so files in subfolders resolve too
    imgFileList=[ os.path.relpath(os.path.join(r,imgFile),path)
                  for r,s,fs in os.walk(path) for imgFile in fs
                  if os.path.isfile(os.path.join(r,imgFile)) and
        os.path.splitext(imgFile)[-1].lower() in imgExtName ]
    return imgFileList

def GetBGRInfo(path,imgList):
    for item in imgList:
        imgFullPath=os.path.join(path,item)
        img=cv2.imdecode(np.fromfile(imgFullPath,dtype=np.uint8),cv2.IMREAD_COLOR)
        if img is None:
            # Skip files that OpenCV could not decode
            continue
        ShowImg(img,winName="Source IMG")
        # Split the image into its B, G and R channels
        b,g,r=cv2.split(img)
        # Create a zero matrix with the same height and width as img
        zerosArray=np.zeros(img.shape[:2],dtype="uint8")
        # Show the (B,0,0) image
        ShowImg(cv2.merge([b,zerosArray,zerosArray]),"Blue Channel")
        # Show the (0,G,0) image
        ShowImg(cv2.merge([zerosArray,g,zerosArray]),"Green Channel")
        # Show the (0,0,R) image
        ShowImg(cv2.merge([zerosArray,zerosArray,r]),"Red Channel")
        # Show the image merged in B, G, R order
        ShowImg(cv2.merge([b,g,r]),"Merge Img-BGR")
        # Show the image merged in R, G, B order
        ShowImg(cv2.merge([r, g, b]), "Merge Img-RGB")

def ShowImg(obj,winName="ImgShow"):
    cv2.imshow(winName,obj)
    # Call waitKey only once and reuse the result; q or Q closes all windows
    key=cv2.waitKey(0) & 0xFF
    if key in (ord("q"),ord("Q")):
        cv2.destroyAllWindows()

if __name__ == "__main__":
    imgPath=r"F:\测试图片"
    imgFileList=GetImgFile(imgPath)
    GetBGRInfo(imgPath,imgFileList)

The individual B, G and R channel images look like this:

The original image, the BGR-merged image and the RGB-merged image look like this:

Because the B, G and R channels can be separated, merging them again in B, G, R order restores the original picture. What happens if they are merged in a different order? The answer is in the last picture above.
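The effect of the merge order can also be checked without opening a window. In the sketch below, np.dstack plays the role of cv2.merge and slicing plays the role of cv2.split (an assumption made so that the example needs only NumPy). The single test pixel is pure blue in B, G, R order.

```python
import numpy as np

# One pure-blue pixel in OpenCV's B, G, R order.
img = np.array([[[255, 0, 0]]], dtype=np.uint8)

# Split into single-channel matrices (the role cv2.split plays).
b, g, r = img[:, :, 0], img[:, :, 1], img[:, :, 2]

same_order = np.dstack([b, g, r])   # merged back as B, G, R
swapped = np.dstack([r, g, b])      # merged as R, G, B

print(np.array_equal(same_order, img))   # True: the original is restored
print(swapped[0, 0].tolist())            # [0, 0, 255]
```

Interpreted as B, G, R by imshow, the swapped pixel (0, 0, 255) is pure red, so blue and red areas trade places in the RGB-merged picture.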

Notes

After separating the channels with cv2.split(img), you might try to display them directly with the following code:

b,g,r=cv2.split(img)
ShowImg(b,"Blue Channel")
ShowImg(g,"Green Channel")
ShowImg(r,"Red Channel")

Displayed this way, however, you get three different grayscale images:

[Figure 13: BGR grayscale images]

The B, G and R channels have been separated, so why are the results not blue, green and red images? The reason is as follows:

When imshow(b) is called, b is a single-channel matrix, and it is displayed as if all three B, G and R channels had the value of b, that is (b, b, b). When the three channels of a pixel have the same value, the pixel is a shade of gray, so the result is a grayscale image. When a channel is instead combined with zero matrices using cv2.merge, it forms (b, 0, 0), which displays only the color information of that one channel.
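The difference between the two displays can be sketched with NumPy alone. The example assumes, as described above, that showing a single-channel matrix amounts to showing (b, b, b), and it uses np.dstack in place of cv2.merge. The 2*2 channel values are made up.

```python
import numpy as np

# A made-up 2*2 blue channel.
b = np.array([[0, 64], [128, 255]], dtype=np.uint8)
zeros = np.zeros_like(b)

gray_like = np.dstack([b, b, b])          # single channel shown as (b, b, b)
blue_only = np.dstack([b, zeros, zeros])  # (b, 0, 0): blue with no green or red

# Equal B, G and R values at every pixel mean a shade of gray.
print(bool(np.all(gray_like[:, :, 0] == gray_like[:, :, 1])))
print(bool(np.all(gray_like[:, :, 1] == gray_like[:, :, 2])))

# In the (b, 0, 0) image only the first channel carries any information.
print(blue_only[1, 1].tolist())
```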
