The Principle of Similar Image Search (Part 2)

Two years ago, I wrote "The Principle of Similar Image Search", which introduced one of the simplest implementation methods.

Yesterday, I saw on isnowfy's website two other methods that are also very simple. Here are some notes.

1. Color distribution method

For each image, a histogram of its color distribution can be generated. Two images can be considered similar if their histograms are close.

Any color is composed of the three primary colors red, green, and blue (RGB), so four histograms can be drawn for a picture (one histogram per primary color, plus the final composite histogram).

If each primary color can take 256 values, the entire color space contains about 16.7 million colors (256 to the power of 3). Comparing histograms over 16.7 million colors is too computationally intensive, so a simplified method is needed. Divide 0-255 into four zones: 0-63 is zone 0, 64-127 is zone 1, 128-191 is zone 2, and 192-255 is zone 3. This gives 4 zones each for red, green, and blue, and a total of 64 combinations (4 to the power of 3).

Any color falls into exactly one of these 64 combinations, so we can count the number of pixels belonging to each combination.
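The counting step described above can be sketched as follows. This is a minimal illustration, not the author's code: it assumes the image has already been decoded into an iterable of `(r, g, b)` tuples (for example via a library such as Pillow).

```python
def simplified_histogram(pixels):
    """Return a 64-dimensional histogram for an iterable of (r, g, b) pixels.

    Each channel is quantized into the four zones described above
    (0-63, 64-127, 128-191, 192-255), giving 4 * 4 * 4 = 64 combinations.
    """
    hist = [0] * 64
    for r, g, b in pixels:
        # Integer division by 64 maps 0-255 onto zone numbers 0-3.
        index = (r // 64) * 16 + (g // 64) * 4 + (b // 64)
        hist[index] += 1
    return hist
```

For example, a pure-black pixel `(0, 0, 0)` lands in bin 0 and a pure-white pixel `(255, 255, 255)` in bin 63.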

Tabulating these counts for an image and extracting the last column of the table produces a 64-dimensional vector, e.g. (7414, 230, 0, 0, 8, ..., 109, 0, 0, 3415, 53929). This vector is the feature, or "fingerprint", of the image.

Thus, finding similar images becomes finding the most similar vectors, which can be measured with the Pearson correlation coefficient or cosine similarity.
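As a sketch of the second option, cosine similarity compares the angle between two vectors; a value of 1.0 means the two histograms have identical proportions:

```python
import math

def cosine_similarity(v1, v2):
    """Cosine of the angle between two equal-length vectors.

    Returns a value in [-1, 1]; for histograms (non-negative entries)
    the range is [0, 1], with 1.0 meaning identical direction.
    """
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(b * b for b in v2))
    return dot / (norm1 * norm2)
```

Two 64-dimensional fingerprints produced by the histogram step can be fed directly to this function; images whose similarity exceeds some chosen cutoff would be reported as matches.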

2. Content feature method

In addition to color composition, you can also compare the similarity of the images' content.

First, convert the original image to a smaller grayscale image, say 50x50 pixels. Then determine a threshold and convert the grayscale image to black and white.
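These two steps can be sketched as below. The post does not specify a grayscale conversion formula, so the common luminance weights 0.299/0.587/0.114 are assumed here; the resizing step is omitted (a library such as Pillow would normally handle it).

```python
def to_grayscale(rgb_pixels):
    """Map (r, g, b) pixels to gray levels 0-255 using the common
    luminance weights (an assumption; the post gives no formula)."""
    return [round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in rgb_pixels]

def to_black_and_white(gray_pixels, threshold):
    """Binarize: 1 (white) if the gray value is >= threshold, else 0 (black)."""
    return [1 if g >= threshold else 0 for g in gray_pixels]
```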


If two images are similar, their black-and-white outlines should be similar. So the question becomes: how do we determine a reasonable threshold that correctly presents the contours in the photo?

Obviously, the greater the contrast between the foreground and background colors, the more distinct the outline. This means the ideal threshold is the value that minimizes the variance within the foreground and background classes ("intra-class variance"), or equivalently maximizes the variance between them ("inter-class variance").

In 1979, the Japanese scholar Nobuyuki Otsu proved that "minimum intra-class variance" and "maximum inter-class variance" are the same thing: they correspond to the same threshold. He proposed a simple algorithm to find this threshold, now known as "Otsu's method". His calculation goes as follows.

Suppose a picture has n pixels in total, of which n1 pixels have gray values less than the threshold and n2 pixels have gray values greater than or equal to the threshold (n1 + n2 = n). Let w1 and w2 be the weights of these two classes of pixels:

  w1 = n1 / n

  w2 = n2 / n

Assume further that the mean and standard deviation of all pixels with gray values less than the threshold are μ1 and σ1, and the mean and standard deviation of all pixels with gray values greater than or equal to the threshold are μ2 and σ2. Then:

  intra-class variance = w1σ1² + w2σ2²

  inter-class variance = w1w2(μ1 − μ2)²

It can be shown that the two formulas are equivalent: minimizing the intra-class variance is the same as maximizing the inter-class variance. Computationally, however, the latter is easier.

The next step is an exhaustive search: try every threshold from the lowest gray value to the highest, substitute each into the formula above, and take the value that minimizes the intra-class variance (equivalently, maximizes the inter-class variance) as the final threshold. For specific examples and a Java implementation, see here.
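The exhaustive search over the inter-class variance can be sketched as follows; this is an illustrative implementation, not the Java code the post links to. It takes a flat list of gray values (0-255) and returns the threshold t such that values below t are one class and values at or above t are the other.

```python
def otsu_threshold(gray_values):
    """Exhaustively test every threshold and return the one that
    maximizes the inter-class variance w1 * w2 * (mu1 - mu2)**2."""
    hist = [0] * 256
    for v in gray_values:
        hist[v] += 1
    n = len(gray_values)
    best_threshold, best_variance = 0, -1.0
    for t in range(1, 256):
        n1 = sum(hist[:t])           # pixels with gray value < t
        n2 = n - n1                  # pixels with gray value >= t
        if n1 == 0 or n2 == 0:
            continue                 # both classes must be non-empty
        w1, w2 = n1 / n, n2 / n
        mu1 = sum(i * hist[i] for i in range(t)) / n1
        mu2 = sum(i * hist[i] for i in range(t, 256)) / n2
        variance = w1 * w2 * (mu1 - mu2) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, t
    return best_threshold
```

On a clearly bimodal input (say, half the pixels at gray level 10 and half at 200), any threshold between the two modes achieves the same maximum variance, and this sketch returns the first such value.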

Having a 50x50 black-and-white thumbnail is equivalent to having a 50x50 0-1 matrix. Each entry of the matrix corresponds to a pixel of the original image: 0 for black, 1 for white. This matrix is the feature matrix of the image.

The smaller the difference between two feature matrices, the more similar the two images. The comparison can be done with "exclusive OR" (XOR: the result is 1 if exactly one of the two bits is 1, otherwise 0). XOR the feature matrices of two pictures; the fewer 1s in the result, the more similar the pictures.
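The XOR comparison can be sketched like this: since `a ^ b` is 1 exactly when two bits differ, counting the 1s counts the differing pixels, and the complement of that fraction is a similarity score.

```python
def xor_similarity(matrix1, matrix2):
    """Fraction of matching entries between two equal-sized 0-1 matrices.

    XOR of two bits is 1 exactly when they differ, so summing the XORs
    counts the differing pixels.
    """
    total = 0
    differing = 0
    for row1, row2 in zip(matrix1, matrix2):
        for a, b in zip(row1, row2):
            total += 1
            differing += a ^ b
    return 1 - differing / total
```

A score of 1.0 means the two thumbnails are identical; 0.0 means every pixel differs.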
