Introduction to Image Processing

Image processing is the technique of analyzing and manipulating images with a computer to achieve a desired result. The term generally refers to digital image processing. A digital image is a large two-dimensional array, obtained from devices such as industrial cameras, video cameras, and scanners, whose elements are called pixels and whose values are called grayscale values. Image processing technology generally comprises three parts: image compression; enhancement and restoration; and matching, description, and recognition.

Overview

The 21st century is an era full of information. As the visual basis for how humans perceive the world, images are an important means of obtaining, expressing, and transmitting information. Digital image processing, that is, processing images with computers, has a relatively short history. It originated in the 1920s, when a photograph was transmitted from London to New York over a submarine cable using digital compression technology. Digital image processing can help people understand the world more objectively and accurately. The human visual system supplies more than three quarters of the information humans obtain from the outside world, and images and graphics are the carriers of all visual information. Although the resolving power of the human eye is high and it can distinguish thousands of colors, in many situations an image is blurred or even invisible to the eye; image enhancement technology can make such an image clear and bright.

In a computer, images can be divided into four basic types: binary images, grayscale images, indexed images, and true-color RGB images according to the number of colors and grayscales. Most image processing software supports these four types of images.

The China Internet of Things School-Enterprise Alliance believes that image processing will be one of the important pillars of the development of the Internet of Things industry, with fingerprint recognition as one concrete application.

Common Methods

1) Image transformation: Because image arrays are large, processing them directly in the spatial domain involves a huge amount of computation. Various image transforms, such as the Fourier transform, Walsh transform, and discrete cosine transform, are therefore often used to convert spatial-domain processing into transform-domain processing. This not only reduces the amount of computation but also enables more effective processing (for example, the Fourier transform allows digital filtering in the frequency domain). The wavelet transform, a more recent development, has good localization properties in both the time and frequency domains and is also widely and effectively applied in image processing.
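
As an illustration of frequency-domain processing, the sketch below (Python with NumPy; the helper name `lowpass_filter` and the ideal circular cutoff are illustrative choices, not a standard API) applies a 2-D Fourier transform, zeroes out frequencies beyond a cutoff, and transforms back:

```python
import numpy as np

def lowpass_filter(img, cutoff):
    """Illustrative ideal low-pass filter: zero all frequencies whose
    distance from the spectrum centre exceeds `cutoff`."""
    F = np.fft.fftshift(np.fft.fft2(img))            # 2-D Fourier transform, DC at centre
    rows, cols = img.shape
    r, c = np.ogrid[:rows, :cols]
    dist = np.sqrt((r - rows / 2) ** 2 + (c - cols / 2) ** 2)
    F[dist > cutoff] = 0                             # ideal circular low-pass mask
    return np.real(np.fft.ifft2(np.fft.ifftshift(F)))

# A constant image contains only the zero frequency, so it passes unchanged.
flat = np.full((8, 8), 10.0)
out = lowpass_filter(flat, cutoff=2)
```

In practice a smooth (e.g. Gaussian) mask is preferred over this ideal cutoff, which causes ringing on real images.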

2) Image coding and compression: Image coding and compression technology reduces the amount of data (the number of bits) needed to describe an image, saving image transmission and processing time and reducing the memory occupied. Compression can be lossless, or lossy where some distortion is acceptable. Coding is the most important technique in compression and is the earliest and most mature area of image processing technology.

3) Image enhancement and restoration: The purpose of image enhancement and restoration is to improve image quality, for example by removing noise and improving clarity. Image enhancement does not consider the cause of image degradation; it simply highlights the parts of the image of interest. For example, strengthening the high-frequency components of an image makes object outlines clear and details distinct, while strengthening the low-frequency components reduces the influence of noise. Image restoration, by contrast, requires some understanding of why the image degraded: generally a "degradation model" is built from the degradation process, and a filtering method is then applied to restore or reconstruct the original image.

4) Image segmentation: Image segmentation is one of the key technologies in digital image processing. It extracts the meaningful features of an image, such as edges and regions, and is the basis for further image recognition, analysis, and understanding. Although many methods for edge extraction and region segmentation have been studied, no effective method yet applies universally to all kinds of images, so image segmentation remains an active research topic and one of the hot spots in image processing.

5) Image description: Image description is a necessary prerequisite for image recognition and understanding. For the simplest case, the binary image, geometric characteristics can be used to describe object features. General image description methods use two-dimensional shape descriptions, of which there are two types: boundary descriptions and region descriptions. For special texture images, two-dimensional texture features can be used. As image processing research has deepened, work on describing three-dimensional objects has begun, with methods such as volume descriptions, surface descriptions, and generalized-cylinder descriptions.

6) Image classification (recognition): Image classification (recognition) belongs to the field of pattern recognition. Its main content is to preprocess the image (enhancement, restoration, compression), then perform image segmentation and feature extraction, and finally judge and classify. Image classification often uses classical pattern recognition methods, including statistical and syntactic (structural) pattern classification. In recent years, the newer fuzzy pattern recognition and artificial-neural-network classification have also received increasing attention in image recognition.

Image Types

Binary image

The two-dimensional matrix of a binary image contains only the two values 0 and 1, where "0" represents black and "1" represents white. Since each pixel (each matrix element) has only two possible values, a binary image can be stored with a single binary bit per pixel. Binary images are typically used for scanned text and line drawings (OCR) and for storing mask images.
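
For instance, a grayscale picture can be reduced to a binary image by thresholding; this small NumPy sketch (the cutoff of 128 is an arbitrary illustrative choice) maps bright pixels to 1 (white) and dark pixels to 0 (black):

```python
import numpy as np

# A tiny grayscale array standing in for a real image.
gray = np.array([[ 12, 200],
                 [130,  40]], dtype=np.uint8)

# Thresholding: values >= 128 become white (1), the rest black (0).
binary = (gray >= 128).astype(np.uint8)
# binary is [[0, 1], [1, 0]]
```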

Grayscale image

The elements of a grayscale image matrix usually take values in [0, 255], so the data type is generally an 8-bit unsigned integer (uint8); this is the 256-level grayscale image people commonly refer to. "0" means pure black, "255" means pure white, and the numbers in between are transitional shades from black to white. In some software, grayscale images can also be represented with the double-precision type (double), where pixel values lie in [0, 1]: 0 represents black, 1 represents white, and decimals between 0 and 1 represent intermediate gray levels. A binary image can be regarded as a special case of a grayscale image.
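
The two representations convert into each other by a simple scaling, as this NumPy sketch shows:

```python
import numpy as np

# 8-bit unsigned representation: integers in [0, 255].
g8 = np.array([0, 64, 255], dtype=np.uint8)

# Double representation: scale into [0.0, 1.0] (0.0 = black, 1.0 = white).
g_double = g8.astype(np.float64) / 255.0

# Converting back: scale up and round to the nearest integer level.
g_back = np.round(g_double * 255).astype(np.uint8)
```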

Indexed image

The file structure of an indexed image is relatively complicated. In addition to the two-dimensional matrix storing the image, it also includes a two-dimensional array called the color index matrix MAP. The size of MAP is determined by the value range of the image matrix elements: if that range is [0, 255], MAP is a 256×3 matrix, MAP = [RGB]. The three elements of each row of MAP specify the red, green, and blue components of that row's color, and each row of MAP corresponds to one possible gray value of a pixel in the image matrix. For example, if a pixel's gray value is 64, the pixel maps to row 64 of MAP, and its actual on-screen color is determined by the [RGB] combination in that row. In other words, when the image is displayed, the color of each pixel is obtained by using the pixel's stored gray value as an index into the color index matrix MAP. The data type of an indexed image is generally an 8-bit unsigned integer (uint8), with a corresponding 256×3 MAP, so an indexed image can generally display only 256 colors at a time, although the color set can be adjusted by changing the index matrix. The data type may also be double-precision floating point (double). Indexed images are generally used to store images with relatively simple color requirements; for example, Windows wallpapers with simple color compositions are often stored as indexed images. If an image's colors are more complex, an RGB true-color image must be used.
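
The lookup described above can be sketched in a few lines of NumPy (the colour-map contents here are made up purely for illustration):

```python
import numpy as np

# A 256x3 colour map: each row holds the R, G, B values for one index.
MAP = np.zeros((256, 3), dtype=np.float64)
MAP[64] = [1.0, 0.0, 0.0]                 # entry 64 is pure red (illustrative)

# The image matrix stores indices, not colours.
index_img = np.array([[64, 0],
                      [0, 64]], dtype=np.uint8)

# Display step: look each pixel's index up in MAP, yielding an RGB array.
rgb = MAP[index_img]                      # shape (2, 2, 3) true-colour result
```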

RGB color image

RGB images, like indexed images, can represent color images, using combinations of the red (R), green (G), and blue (B) primaries to represent the color of each pixel. Unlike an indexed image, however, an RGB image stores the color value of each pixel (expressed in the three primaries) directly in the image matrix. Since each pixel's color needs three components, an image with M rows and N columns is represented by three M×N two-dimensional matrices holding the R, G, and B components of every pixel. The data type of an RGB image is generally an 8-bit unsigned integer. RGB images are usually used to represent and store true-color images, although they can of course also store grayscale images.
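
A small NumPy sketch of this layout; the grayscale conversion at the end is an illustrative extra step using the common ITU-R BT.601 luma weights:

```python
import numpy as np

# Three MxN component matrices, one per primary.
M, N = 2, 2
R = np.full((M, N), 255, dtype=np.uint8)
G = np.zeros((M, N), dtype=np.uint8)
B = np.zeros((M, N), dtype=np.uint8)

# Stacking along a third axis gives the usual (M, N, 3) true-colour array.
rgb = np.dstack([R, G, B])                       # every pixel is pure red

# One common derived quantity: a weighted-sum grayscale image.
gray = 0.299 * R + 0.587 * G + 0.114 * B         # float result
```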

There are two ways to store digitized image data: bitmap storage (Bitmap) and vector storage (Vector).

We usually describe a digital image by its resolution (in pixels) and its number of colors. For example, a digital picture with a resolution of 640×480 and 16-bit color consists of 307,200 (= 640 × 480) pixels, each of which can take one of 2^16 = 65,536 colors.

Bitmap image: The bitmap method converts each pixel of the image into a data value. For a monochrome image (black and white only), eight pixels occupy a single byte (one byte holds 8 binary digits, and one binary digit stores one pixel value); a 16-color image (not to be confused with the "16-bit color" above) stores two pixels per byte; a 256-color image stores each pixel in one byte. This allows image surfaces to be described accurately in various color modes. Bitmap images make up for the shortcomings of vector images: they can reproduce rich variations in color and tone and realistically represent natural scenes, and files can easily be exchanged between different software packages. Their disadvantages are that they cannot describe true 3D objects, that images become distorted when zoomed or rotated, and that the files are large, demanding more memory and disk space. In general, if a pixel is recorded with 1 bit of data, it can represent only 2 colors (2^1 = 2); with 8 bits it can express 256 colors or tones (2^8 = 256), so the more bits used, the more colors can be expressed. Commonly used depths are 16-color, 256-color, enhanced 16-bit, and true-color 24-bit. "True color" generally refers to the 24-bit (2^24 colors) bitmap storage mode, suitable for complex images and real photographs. However, as resolution and color depth increase, the disk space occupied by an image becomes considerable; moreover, enlarging a bitmap inevitably blurs and distorts it, because each pixel simply becomes a larger "square". Images captured with digital cameras and scanners are bitmaps.
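
The storage arithmetic above can be checked directly (the helper name `bitmap_bytes` is illustrative, not a standard function):

```python
# Uncompressed bitmap size = width x height x bits-per-pixel, in bits;
# divide by 8 to get bytes.
def bitmap_bytes(width, height, bits_per_pixel):
    return width * height * bits_per_pixel // 8

colors_16bit = 2 ** 16                     # 16-bit colour: 65,536 colours
size = bitmap_bytes(640, 480, 16)          # 614,400 bytes = 600 KB
```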

Vector image: A vector image stores the outline information of an image rather than every pixel value. For example, a circular pattern needs only the coordinates of its center, its radius, and its edge and fill colors. The disadvantages of this storage method are that complex analysis and computation can take considerable time and that display is slower; on the other hand, scaling does not distort the image, and the storage space required is much smaller. Vector graphics are therefore better suited to storing diagrams and engineering drawings.

Data

Image processing depends on large and rich collections of basic data, in video, still-image, and other formats, such as the Berkeley Segmentation Data Set and Benchmarks 500 (BSDS500), the Simon Fraser University database of objects under different illumination, neural-network face recognition datasets, and CBCL-MIT StreetScenes (the MIT street-view database).

Digitization

Digitization transforms an image from its natural form into a digital form suitable for computer processing through sampling and quantization. Inside a computer, an image is represented as a matrix of numbers, each element of which is called a pixel. Image digitization requires specialized equipment; commonly used are various electronic and optical scanning devices, as well as electromechanical scanners and manually operated digitizers.

Image Coding

Image coding encodes image information to meet the requirements of transmission and storage. Coding compresses the amount of information in an image while keeping image quality almost unchanged. Analog processing followed by analog-to-digital conversion can be used, but digital coding techniques predominate. Coding methods include processing the image point by point, applying a transform to the image, and coding based on regions and features. Pulse-code modulation, differential pulse-code modulation, predictive coding, and various transforms are commonly used coding techniques.

Image Compression

The amount of data in a digitized image is very large: a typical digital image consists of 500×500 to 1000×1000 pixels, and moving images contain far more. Image compression is therefore essential for image storage and transmission.

There are two classes of image compression algorithms: lossless and lossy. The most common lossless algorithms take differences between adjacent pixel values in space or time and then encode them; run-length coding is one example of such a code. Most lossy algorithms are based on image transforms, such as the fast Fourier transform or the discrete cosine transform. JPEG and MPEG, both international standards for image compression, are lossy algorithms; the former is used for still images and the latter for moving images, and both have dedicated chip implementations.

Enhancement and Restoration

The goal of image enhancement is to improve the quality of an image, for example by increasing contrast, removing blur and noise, or correcting geometric distortion. Image restoration is a technique that attempts to estimate the original image given a known model of the blur or noise.

Image enhancement methods can be divided into frequency-domain and spatial-domain methods. The former treats the image as a two-dimensional signal and enhances it via the two-dimensional Fourier transform: low-pass filtering (passing only low-frequency signals) can remove noise from a picture, while high-pass filtering can enhance high-frequency signals such as edges and sharpen a blurred picture. Representative spatial-domain algorithms include local averaging and median filtering (taking the middle pixel value within a local neighborhood), which can be used to remove or weaken noise [3].
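
As a spatial-domain example, this naive 3×3 median-filter sketch (pure NumPy loops, with edge pixels left untouched for simplicity) removes an isolated impulse-noise pixel from a flat region:

```python
import numpy as np

def median_filter3(img):
    """3x3 median filter sketch: replace each interior pixel with the
    median of its 3x3 neighbourhood; border pixels are left unchanged."""
    out = img.astype(float).copy()
    for i in range(1, img.shape[0] - 1):
        for j in range(1, img.shape[1] - 1):
            out[i, j] = np.median(img[i-1:i+2, j-1:j+2])
    return out

# A single "salt" pixel in a flat 5x5 region is removed entirely:
img = np.full((5, 5), 10.0)
img[2, 2] = 255.0
clean = median_filter3(img)
```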

Early digital image restoration also grew out of frequency-domain concepts. The modern approach is algebraic: the ideal picture is restored by solving a large system of equations.

Image enhancement and restoration for the purpose of improving image quality are widely applied to pictures that are hard to obtain or were captured under very poor conditions, such as photographs of the Earth or other planets taken from space and biomedical images produced by electron microscopes or X-rays.

Image enhancement sharpens an image or transforms it into a form more suitable for analysis by humans or machines. Unlike image restoration, image enhancement does not require faithfulness to the original image; on the contrary, an image containing a certain kind of distortion, such as emphasized contour lines, may be clearer than the undistorted original. Commonly used image enhancement methods are: ① gray-level histogram processing, which gives the processed image better contrast within a chosen gray range; ② interference suppression, which suppresses random interference superimposed on the image through low-pass filtering, multi-image averaging, the application of certain spatial-domain operators, and similar processing; ③ edge sharpening, which enhances the outlines of shapes through high-pass filtering, differential operations, or certain transforms; ④ pseudo-color processing, which converts black-and-white images into color images so that people can more easily analyze and detect the information they contain.
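
Method ① can be sketched as a simple histogram-equalization routine (a minimal uint8 version, assuming the standard CDF-remapping formulation):

```python
import numpy as np

def equalize(img):
    """Gray-level histogram equalization sketch for a uint8 image: remap
    levels so the cumulative distribution becomes roughly linear."""
    hist = np.bincount(img.ravel(), minlength=256)   # per-level counts
    cdf = hist.cumsum()                              # cumulative distribution
    cdf_min = cdf[cdf > 0].min()
    lut = np.clip(np.round((cdf - cdf_min) / (img.size - cdf_min) * 255),
                  0, 255).astype(np.uint8)           # level lookup table
    return lut[img]

# A low-contrast image confined to levels [100, 102] is stretched to [0, 255].
img = np.array([[100, 101],
                [101, 102]], dtype=np.uint8)
out = equalize(img)
```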

Image restoration removes or reduces degradations that occur during image acquisition. Such degradations may come from aberration or defocus of the optical system, relative motion between the camera and the object, noise in the electronic or optical systems, or atmospheric turbulence between the camera and the object. There are two commonly used approaches. When the nature of the image itself is unknown, a mathematical model of the degradation source can be built and a restoration algorithm applied to remove or reduce its influence. When prior knowledge about the image is available, a model of the original image can be built and the original image recovered by detecting it within the observed degraded image.

Image segmentation divides an image into non-overlapping regions, each of which is a connected set of pixels. The usual approaches are region methods, which assign pixels to particular regions, and boundary methods, which seek the boundaries between regions. A region method performs thresholding based on the contrast between the segmented object and the background to separate the object from the background. When a fixed threshold cannot produce a satisfactory segmentation, the threshold can be adjusted according to local contrast; this is called adaptive thresholding. Boundary methods use edge detection techniques, detecting edges from the large gradient values that occur there. Both approaches can also exploit the texture characteristics of the image.
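
Both the fixed and the adaptive threshold ideas can be sketched briefly (the row-mean "adaptive" rule below is a deliberately simplified stand-in for real neighbourhood-based thresholds):

```python
import numpy as np

def global_threshold(img, t):
    """Region-method sketch: object = pixels brighter than a fixed threshold."""
    return (img > t).astype(np.uint8)

def adaptive_threshold(img):
    """Simplified adaptive sketch: compare each pixel with the mean of its
    own row, so the threshold follows local contrast."""
    return (img > img.mean(axis=1, keepdims=True)).astype(np.uint8)

img = np.array([[10, 200],
                [20, 220]], dtype=np.uint8)
seg = global_threshold(img, 128)        # [[0, 1], [0, 1]]
seg_adaptive = adaptive_threshold(img)  # same result on this example
```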

Morphology

The term morphology generally refers to the branch of biology that deals with the shape and structure of animals and plants. The term is also used in mathematical morphology, a tool for extracting image components, such as boundaries, skeletons, and convex hulls, that are useful in representing and describing region shape. We also focus on morphological techniques for pre- and post-processing, such as morphological filtering, thinning, and pruning.

Basic Operations of Mathematical Morphology

There are four basic operations in mathematical morphology: erosion, dilation, opening, and closing. Mathematical morphology uses a "probe" called a structuring element to collect image information: as the probe moves through the image, the relationships between its parts can be examined to understand the structural characteristics of the image. The erosion, dilation, opening, and closing operations are defined in the following sections.

Erosion

Erosion "shrinks" or "thins" objects in a binary image. The manner and degree of shrinking is controlled by a structuring element. Mathematically, the erosion of A by B, denoted AΘB, is defined as:

AΘB = { z | (B)z ∩ Aᶜ = ∅ }, where (B)z is B translated by z and Aᶜ is the complement (background) of A.

In other words, the erosion of A by B is the set of all structuring-element origin positions at which the translated B does not overlap the background of A.

Dilation


Dilation is the operation of "lengthening" or "thickening" objects in a binary image. The manner and degree of thickening is controlled by a set called the structuring element, usually represented as a matrix of 0s and 1s. Mathematically, dilation is defined as a set operation: the dilation of A by B, denoted A⊕B, is defined as A⊕B = { z | (B̂)z ∩ A ≠ ∅ }, where ∅ is the empty set, B is the structuring element, and B̂ is the reflection of B. In short, the dilation of A by B is the set of all structuring-element origin positions at which the reflected and translated B overlaps at least part of A. This translation of the structuring element during dilation is similar to spatial convolution.

Dilation is commutative: A⊕B = B⊕A. In image processing, the convention is to let the first operand of A⊕B be the image and the second the structuring element; structuring elements are usually much smaller than the image.

Dilation is also associative: A⊕(B⊕C) = (A⊕B)⊕C. Suppose a structuring element B can be expressed as the dilation of two structuring elements B1 and B2, that is, B = B1⊕B2. Then A⊕B = A⊕(B1⊕B2) = (A⊕B1)⊕B2; in other words, dilating A with B is equivalent to first dilating A with B1 and then dilating that result with B2. We say B can be decomposed into the structuring elements B1 and B2. Associativity matters because the time required to compute a dilation is proportional to the number of non-zero pixels in the structuring element: decomposing the structuring element and dilating with the sub-elements in turn often yields a considerable speed-up.
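
The decomposition property can be verified on binary arrays; the `dilate` helper below is an illustrative shift-and-OR implementation (it wraps at the borders via `np.roll`, which is harmless for objects away from the edge):

```python
import numpy as np

def dilate(A, B):
    """Binary dilation sketch: OR together copies of A shifted by every
    non-zero offset in the structuring element B (origin at B's centre)."""
    out = np.zeros_like(A)
    ci, cj = B.shape[0] // 2, B.shape[1] // 2
    for di in range(B.shape[0]):
        for dj in range(B.shape[1]):
            if B[di, dj]:
                out |= np.roll(np.roll(A, di - ci, axis=0), dj - cj, axis=1)
    return out

A = np.zeros((7, 7), dtype=np.uint8)
A[3, 3] = 1                                    # single foreground pixel
B  = np.ones((3, 3), dtype=np.uint8)           # B = B1 (+) B2
B1 = np.ones((3, 1), dtype=np.uint8)
B2 = np.ones((1, 3), dtype=np.uint8)

direct = dilate(A, B)                          # A (+) B in one step
staged = dilate(dilate(A, B1), B2)             # decomposed: (A (+) B1) (+) B2
```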

Opening


The morphological opening of A by B, written A∘B, is the erosion of A by B followed by dilation of the result by B: A∘B = (AΘB)⊕B.

The opening operation can also be expressed as:

A∘B = ∪{ (B)z | (B)z ⊆ A }

where ∪{·} denotes the union of all the sets inside the braces. A simple geometric interpretation of this formula is that A∘B is the union of all translations of B that fit entirely within A. The morphological opening completely removes object regions that cannot contain the structuring element, smooths object contours, breaks narrow connections, and removes small protrusions.

Closing

The morphological closing of A by B, written A•B, is the result of dilation followed by erosion:

A•B = (A⊕B)ΘB

Geometrically, A•B is the complement of the union of all translations of B that do not overlap A. Like opening, the morphological closing smooths object contours. Unlike opening, however, it generally joins narrow breaks, fills long thin gulfs, and fills holes smaller than the structuring element.

Based on these basic operations, various practical mathematical-morphology algorithms can be derived and combined to analyze and process image shape and structure, including image segmentation, feature extraction, boundary detection, noise reduction, and image enhancement and restoration.
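
The four basic operations can be sketched for binary images as shift-based set operations (an illustrative implementation; borders wrap via `np.roll`, so objects should stay away from the edges). Opening removes a speck smaller than the structuring element, and closing fills a small hole:

```python
import numpy as np

def shift(A, di, dj):
    return np.roll(np.roll(A, di, axis=0), dj, axis=1)

def dilate(A, B):
    """OR of copies of A shifted by each offset in B (origin at B's centre)."""
    ci, cj = B.shape[0] // 2, B.shape[1] // 2
    out = np.zeros_like(A)
    for di in range(B.shape[0]):
        for dj in range(B.shape[1]):
            if B[di, dj]:
                out |= shift(A, di - ci, dj - cj)
    return out

def erode(A, B):
    """AND of copies of A shifted oppositely: 1 where B fits entirely in A."""
    ci, cj = B.shape[0] // 2, B.shape[1] // 2
    out = np.ones_like(A)
    for di in range(B.shape[0]):
        for dj in range(B.shape[1]):
            if B[di, dj]:
                out &= shift(A, -(di - ci), -(dj - cj))
    return out

def opening(A, B):          # erosion followed by dilation
    return dilate(erode(A, B), B)

def closing(A, B):          # dilation followed by erosion
    return erode(dilate(A, B), B)

B = np.ones((3, 3), dtype=np.uint8)

speck = np.zeros((7, 7), dtype=np.uint8)
speck[3, 3] = 1             # one-pixel object, smaller than B
opened = opening(speck, B)  # the speck is removed entirely

square = np.ones((7, 7), dtype=np.uint8)
square[3, 3] = 0            # one-pixel hole
closed = closing(square, B) # the hole is filled
```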

Image Analysis

Image analysis extracts useful measurements, data, or information from an image. Its purpose is to obtain some kind of numerical result rather than to produce another image. The content of image analysis overlaps with the research fields of pattern recognition and artificial intelligence, but image analysis differs from typical pattern recognition: it is not limited to classifying specific regions of an image into a fixed number of categories, but mainly provides a description of the image being analyzed. To this end, it uses both pattern recognition techniques and a knowledge base about the image content, that is, the knowledge-representation methods of artificial intelligence. Image analysis uses image segmentation to extract image features and then describes the image symbolically. Such a description can not only answer whether a certain object is present in the image, but also describe the image content in detail.

The various contents of image processing are related to each other. A practical image processing system often combines and applies several image processing techniques to obtain the required results. Image digitization is the first step in transforming an image into a form suitable for computer processing. Image coding technology can be used to transmit and store images. Image enhancement and restoration can be the final purpose of image processing, or it can be a preparation for further processing. The image features obtained through image segmentation can be used as the final result, and can also be used as the basis for the next image analysis.

Image matching, description, and recognition compares and registers images, extracts image features and their interrelationships, obtains a symbolic description of the image, and then compares it with models to determine its classification. Image matching attempts to establish a geometric correspondence between two images and measures how similar or different they are. Matching is used for registration between pictures, or between a picture and a map, for example to detect changes in a scene between pictures taken at different times or to track the trajectory of a moving object.

Extracting useful measurements, data, or information from images is called image analysis. Its basic steps are to divide the image into non-overlapping regions, each a connected set of pixels; to measure their properties and relationships; and finally to compare the resulting relational structure of the image with a model describing the scene's classes in order to determine its type. The basis of recognition or classification is image similarity. A simple similarity can be defined by distance in a region feature space. Another similarity measure, based on pixel values, is the correlation of the image functions. A third kind of similarity, defined on relational structures, is called structural similarity.
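
The pixel-value (correlation) similarity mentioned above can be sketched as normalized cross-correlation of two equal-size patches:

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equal-size patches:
    1.0 = identical up to brightness offset, -1.0 = inverted."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / np.sqrt((a ** 2).sum() * (b ** 2).sum()))

patch = np.array([[1.0, 2.0],
                  [3.0, 4.0]])
same     = ncc(patch, patch)          # identical patches
brighter = ncc(patch, patch + 50.0)   # correlation ignores a constant offset
inverted = ncc(patch, -patch)         # contrast-inverted patch
```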

Segmentation, description, and recognition for the purpose of image analysis and understanding are used in many automated systems, such as character and pattern recognition, robotic assembly and inspection of products, automatic military target recognition and tracking, fingerprint recognition, and the automatic processing of X-ray photographs and blood samples. In such applications, pattern recognition and computer vision technologies are usually applied together, and image processing appears more as a pre-processing step.

The rise of multimedia applications has greatly promoted the application of image compression technology. Images, including dynamic images such as video tapes, will be converted into digital images, stored in the computer together with text, sound, and graphics, and displayed on the computer screen. Its application will expand to new fields such as education, training and entertainment.

Applications

Photography and printing

Satellite image processing

Medical image processing

Face recognition, feature recognition (Face detection, feature detection, face identification)

Microscope image processing

Vehicle obstacle detection

Common Software

Adobe Photoshop

Software Features: The image processing software with the highest popularity and utilization rate.

Software Advantages: Get better results faster with industry-standard Adobe Photoshop CS software, while providing essential new features for graphic and web design, photography, and video.

Comparison with peer software: Adobe has brought designers great surprises with Photoshop CS, which adds many powerful functions, especially for photographers. It goes well beyond earlier Photoshop releases, moving past a pure graphic-design focus and greatly strengthening its digital-darkroom support.

Recent version: On November 2, 2016, Adobe released the latest version, Photoshop CC 2017.

Adobe Illustrator

Software features: a professional vector drawing tool with powerful functions and a friendly interface.

Software advantages: Whether you are a designer or professional illustrator producing line art for print and publishing, an artist producing multimedia images, or a producer of Internet pages or online content, you will find that Illustrator is not only an art-production tool but is also suited to most projects, from small designs to large, complex work.

Comparison with peer software: extremely powerful, with quite professional operation. It integrates well with Adobe's other software, such as Photoshop, Premiere, and InDesign, and has clear advantages in the professional field.

CorelDRAW

Software features: a friendly interface, a spacious workspace, fine-grained operation, and good compatibility.

Software advantages: its outstanding design capabilities are widely used in trademark design, logo making, model drawing, illustration, typesetting, and color-separation output. Market-leading file compatibility and high-quality content help you turn ideas into professional productions, from distinctive logos and emblems to eye-catching marketing materials and pleasing web graphics.

Comparison with peer software: powerful functions and excellent compatibility. It can generate many formats compatible with other software, is easier to operate than Illustrator, and is widely used by small and medium-sized advertising design companies in China.

Keniu Image

Software features: Keniu Image is a new-generation image processing software with unique functions such as whitening and blemish removal, face slimming, star scenes, and multi-photo overlay, plus more than 50 photo effects, letting users create studio-quality photos in seconds.

Software advantages: photo editing, portrait beautification, scene calendars, watermarks and accessories, artistic fonts, animated flash pictures, bobblehead dolls, multi-picture stitching: virtually every function one can think of, and easy to use.

Comparison with peer software: scene calendars, dynamic flash pictures, and bobblehead dolls are not available in traditional image processing software. With Keniu Image, no professional Photoshop-style skills are needed to process photos.

Light and Shadow Magic Hand (nEO iMAGING)

Software features: nEO iMAGING (Light and Shadow Magic Hand) is software for improving the quality and processing effects of digital photos. It is simple and easy to use; professional film-like color effects can be produced without any specialist imaging skills.

Software advantages: it simulates the effect of reversal film, making photo contrast more vivid and colors brighter; it simulates the cross-processed reversal-negative look, with strange and novel colors; and it simulates many types of black-and-white film, with tonality and contrast quite different from ordinary digital photos.

Comparison with peer software: software for photo-quality improvement and personalized processing. Simple and easy to use, anyone can create beautiful photo frames, artistic photos, and professional film effects, and it is completely free.

ACDSee

Software features: No matter what type of photos you take - family and friends, or artistic photos taken as a hobby - you need photo management software to organize and view, correct and share them quickly and easily.

Software advantages: ACDSee 9 can quickly "get photos" from any storage device, and use the new feature of password-protected "private folders" to store confidential information.

Comparison with peers: powerful e-mail options, slideshows, CD/DVD burning, and web album tools that make sharing photos a breeze. Improve photos with quick fixes like red-eye removal, color cast removal, exposure adjustments, and the Photo Fix tool.

Macromedia Flash

Software Features: A visual web page design and website management tool that supports the latest web technology, including HTML inspection, HTML format control, HTML formatting options, etc.

Software advantages: in addition to new video and animation features, it provides new drawing effects and better script support, integrates popular video editing and encoding tools, and adds new functions such as letting users test Flash content on mobile phones.

Comparison with peer software: In editing, you can choose the visualization method or the source code editing method you like.

Ulead GIF Animator

Software features: animated-GIF production software published by Ulead, with many built-in plugin effects that can be applied immediately. It can convert AVI files into animated GIFs and optimize animated GIF images, shrinking the GIF animations placed on web pages so that pages load faster.

Software advantages: a very handy GIF animation tool created by Ulead Systems, Inc. Ulead GIF Animator can not only save a series of pictures in animated GIF format but also generate more than 20 kinds of 2D or 3D dynamic effects, enough to meet most web-animation needs.

Comparison with peer software: unlike other graphics file formats, a single GIF file can store multiple pictures, which are displayed in turn like a slide show, forming an animation.

Reposted from: Baidu Encyclopedia

Origin blog.csdn.net/fuhanghang/article/details/132575220