Chapter 1 - Digital Image Basics

Preface

According to figures found online, the resolution of the human eye is roughly equivalent to 576 million pixels, and its central vision to about 7 million pixels. The eyes pass what they capture to the brain, which assembles it into a complete picture. In a computer, a digital image is likewise made up of individual dots, and these dots are called pixels.

Representation of black and white images

A black and white image is made up of only 0s and 1s: black is 0 and white is 1. The contrast between black and white pixels is what carries the picture information. Such an image is called a binary image.

Normally one byte (8 bits) represents one pixel. Its value ranges over 00000000~11111111, that is [0,255], so 8 bits can represent 256 levels: pure black, 254 shades of gray that mix black and white in different proportions, and pure white. An image that uses these 256 levels is called a grayscale image.

The byte is the basic unit of storage. For convenience and consistency of processing, a binary image is usually stored with only the values 0 and 255, while a grayscale image may take any value from 0 to 255.
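As a minimal sketch of the storage convention above, the following pure-Python snippet maps a 0/1 binary image to the 0/255 form. The helper name `to_byte_image` and the 3x3 sample are made up for illustration.

```python
# A tiny 3x3 binary image written with 0 (black) and 1 (white).
bits = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
]


def to_byte_image(bit_rows):
    """Map 0 -> 0 (black) and 1 -> 255 (white) so each pixel fills one byte."""
    return [[255 if b else 0 for b in row] for row in bit_rows]


binary_image = to_byte_image(bits)

# One byte has 8 bits, so it can hold 2**8 distinct gray levels.
gray_levels = 2 ** 8

print(binary_image)  # [[0, 255, 0], [255, 255, 255], [0, 255, 0]]
print(gray_levels)   # 256
```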

During image processing, the result of an operation on a pixel value may exceed 255. There are two ways to handle this:

  1. Modulo processing: take the result modulo 256
  2. Saturation processing: if the result exceeds 255, take 255; otherwise keep the result itself

These two methods show up later in how numpy and opencv handle pixel values that exceed 255: arithmetic on numpy uint8 arrays wraps around modulo 256, while opencv's cv2.add saturates at 255.
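The two overflow rules can be sketched in plain Python without any library. The function names `add_modulo` and `add_saturate` are illustrative only and are not part of numpy or opencv.

```python
def add_modulo(a, b):
    """Modulo processing: keep the sum modulo 256."""
    return (a + b) % 256


def add_saturate(a, b):
    """Saturation processing: clamp the sum to at most 255."""
    return min(a + b, 255)


# 200 + 100 = 300, which does not fit in one byte.
print(add_modulo(200, 100))    # 44
print(add_saturate(200, 100))  # 255

# When the sum stays within [0, 255], both rules agree.
print(add_modulo(100, 50), add_saturate(100, 50))  # 150 150
```

The same bright pixel therefore ends up dark (44) under modulo processing but pure white (255) under saturation, which is why the choice matters when brightening an image.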

A digital image is stored in the computer as a matrix (array), and each element has a position given by its row number and column number. In opencv, the origin of the image coordinates is the upper left corner; the positive x-axis points right from the origin and the positive y-axis points down.

In image processing, the number of rows of an image is its height, and the number of columns is its width.
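To make the row/column and x/y conventions concrete, here is a small sketch that stores a grayscale image as a list of rows. The helper `pixel_at` is a hypothetical name; the point is that the row index (y) comes first when indexing, even though coordinates are usually written as (x, y).

```python
# A grayscale image with 2 rows and 3 columns, stored row by row.
image = [
    [10, 20, 30],
    [40, 50, 60],
]

height = len(image)     # number of rows
width = len(image[0])   # number of columns


def pixel_at(img, x, y):
    """Return the pixel at column x and row y, with the origin at the top left."""
    return img[y][x]


print(height, width)          # 2 3
print(pixel_at(image, 2, 0))  # 30 (top right corner)
print(pixel_at(image, 0, 1))  # 40 (bottom left corner)
```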

Representation of color images

Images are not limited to black, white and gray; they can also show a wide variety of vivid colors.

The optical primary colors are red, green and blue. Mixing the three in different proportions produces the colors that can be shown on a display, so this representation is called the RGB color space.

R, G, and B give the sizes of the three color components. Each component takes a value in [0,255], so RGB can represent 256*256*256=16777216 different colors in total, far beyond the range that the naked eye can perceive.

Normally, when a computer stores or processes pixels in RGB mode, it stores the value of each color component separately; that is, the RGB color space has an R channel, a G channel and a B channel.

A 512*512 color picture can be thought of as three 512*512 sheets of thin paper that store the R component, the G component and the B component respectively. The three sheets are stacked in a specific order (RGB) to form the color image, and they are called the R channel, G channel and B channel.
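The color count and the idea of mixing primaries can be checked with a few lines of Python. The `mix` helper is a simplified illustration that adds components and saturates at 255; it is an assumption made for this sketch, not a standard color-mixing API.

```python
# Each of the three components has 256 possible values.
total_colors = 256 ** 3
print(total_colors)  # 16777216

RED = (255, 0, 0)
GREEN = (0, 255, 0)
BLUE = (0, 0, 255)


def mix(c1, c2):
    """Add two RGB colors component by component, saturating at 255."""
    return tuple(min(a + b, 255) for a, b in zip(c1, c2))


yellow = mix(RED, GREEN)
white = mix(yellow, BLUE)
print(yellow)  # (255, 255, 0)
print(white)   # (255, 255, 255)
```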
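The "three sheets of thin paper" picture can be sketched as splitting a color image into three single-channel images and stacking them again. The helpers `split_channels` and `merge_channels` are hypothetical names written in plain Python for illustration.

```python
# A 2x2 color image; each pixel is an (R, G, B) tuple.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]


def split_channels(img):
    """Return three grayscale sheets holding the R, G and B components."""
    r = [[p[0] for p in row] for row in img]
    g = [[p[1] for p in row] for row in img]
    b = [[p[2] for p in row] for row in img]
    return r, g, b


def merge_channels(r, g, b):
    """Stack the three sheets back into one color image in R, G, B order."""
    return [
        [(r[y][x], g[y][x], b[y][x]) for x in range(len(r[0]))]
        for y in range(len(r))
    ]


r, g, b = split_channels(image)
print(r)  # [[255, 0], [0, 255]]
print(merge_channels(r, g, b) == image)  # True
```

The stacking order matters: OpenCV's imread, for example, returns the channels in B, G, R order, so the same three sheets would be indexed differently there.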

Some other concepts

  1. Quantization: converting a picture into numerical values that a computer can understand and process is called quantization, so every pixel has a specific value in each of the RGB channels.
  2. Features: in face recognition, the face must first be located, and the region of the face is the feature. To tell faces apart, the key features of each face are extracted and compared.
  3. Distance: a distance measures the difference between images, which makes them easier to distinguish and identify.

Manhattan distance: the sum of the absolute values of the differences between the features of two points: |x1-x2| + |y1-y2|

Euclidean distance: the square root of the sum of the squared differences between the features of two points: √((x1-x2)² + (y1-y2)²)
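Both distances are short enough to write out directly. The sketch below generalizes the two-component formulas to feature vectors of any length; the function names are chosen here for illustration.

```python
import math


def manhattan(p, q):
    """Sum of the absolute differences between matching components."""
    return sum(abs(a - b) for a, b in zip(p, q))


def euclidean(p, q):
    """Square root of the sum of the squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


p = (1, 2)
q = (4, 6)
print(manhattan(p, q))  # 7   (|1-4| + |2-6| = 3 + 4)
print(euclidean(p, q))  # 5.0 (sqrt(3**2 + 4**2))
```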

Image recognition

[Figure: face recognition]

[Figure: searching for pictures by picture]

[Figure: digit recognition]

[Figure: the general process of image recognition]

Information hiding

Information hiding embeds digital information according to a specific algorithm.
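The text does not say which algorithm is meant. As one commonly used illustration (an assumption of this sketch, not necessarily the author's method), least significant bit (LSB) embedding replaces the lowest bit of each pixel value with one bit of the hidden message. The function names are hypothetical.

```python
def embed_bits(pixels, bits):
    """Replace the lowest bit of the first len(bits) pixels with the message bits."""
    out = list(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # clear the lowest bit, then set it
    return out


def extract_bits(pixels, count):
    """Read back the lowest bit of the first `count` pixels."""
    return [p & 1 for p in pixels[:count]]


cover = [200, 13, 255, 64]   # grayscale pixel values
message = [1, 0, 1, 1]       # bits to hide

stego = embed_bits(cover, message)
print(stego)                   # [201, 12, 255, 65]
print(extract_bits(stego, 4))  # [1, 0, 1, 1]
```

Each pixel changes by at most 1 out of 255, so the modified image looks the same to the eye while still carrying the message.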

Intelligent Image Processing Fundamentals

Select appropriate features: the features should summarize the characteristics of an image well and reflect the differences between images

Appropriate quantization method: quantize the features into reasonable values

Distance calculation: choose an appropriate distance measure and compute the distance
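The three steps can be strung together in a toy example. Everything here is an assumption made for illustration: the feature (mean brightness of each row), the tiny 2x2 images, and the helper names. Real systems use far richer features.

```python
def row_means(image):
    """Feature extraction and quantization: one mean brightness value per row."""
    return [sum(row) / len(row) for row in image]


def manhattan(p, q):
    """Distance between two feature vectors."""
    return sum(abs(a - b) for a, b in zip(p, q))


def nearest(query, references):
    """Return the label of the reference image whose features are closest."""
    q = row_means(query)
    return min(references, key=lambda name: manhattan(q, row_means(references[name])))


references = {
    "dark": [[0, 10], [20, 30]],
    "bright": [[200, 210], [220, 230]],
}
query = [[5, 15], [25, 20]]

print(row_means(query))            # [10.0, 22.5]
print(nearest(query, references))  # dark
```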

Traditional method

Features are extracted by hand and processed by hand

Machine learning method

Features are extracted by hand and processed automatically

Deep learning method

High-level features are extracted automatically and processed automatically

Origin blog.csdn.net/sunguanyong/article/details/129134089