Analysis of traditional machine learning algorithms (OpenCV implementation)

Foreword

This article mainly analyzes a few small algorithms and ideas in traditional machine learning, which are only a small part of traditional machine learning algorithms. For more traditional machine learning algorithms, please refer to my other blogs:
Link 1: PCA principal component analysis
Link 2: Canny edge detection algorithm
Link 3: K-Means clustering algorithm
Link 4: SIFT algorithm analysis

1. OpenCV

  • OpenCV is an open-source computer vision library, available from http://opencv.org.
  • The OpenCV library is written in C and C++ and runs on Windows, Linux, Mac OS X and other systems. Interfaces for Python, Java, Matlab and other languages are actively developed, and the library can be imported on Android and iOS to develop applications for mobile devices.
  • OpenCV is designed for computational efficiency, with a strong emphasis on real-time applications. It is written in C++ and deeply optimized to take advantage of multithreading.
  • One of the goals of OpenCV is to provide an easy-to-use computer vision interface that helps people quickly build sophisticated vision applications.
  • The OpenCV library contains more than 500 functions covering many areas of computer vision, including industrial product quality inspection, medical image processing, security, interactive operation, camera calibration, binocular (stereo) vision, and robotics.

Because the underlying implementation of OpenCV is written in C++, it inherits C++'s main advantage: high efficiency. The same kind of function can run at very different speeds in different libraries.

A common OpenCV pitfall: BGR
OpenCV stores the channels of the images it reads in BGR order, not the mainstream RGB order. Remember this!

# The matrix read by OpenCV is in BGR order. To convert it to RGB:
import cv2
img4 = cv2.imread('1.jpg')
img4 = cv2.cvtColor(img4, cv2.COLOR_BGR2RGB)

Important points

  1. Except for OpenCV, which stores the color images it reads in BGR order, all other image libraries read color images in RGB order.
  2. Except for PIL, which reads pictures as its own image class, the other libraries return images as numpy arrays (see the small sketch after this list).
  3. Among the major image libraries, OpenCV performs best, in both speed and the comprehensiveness of its image operations; after all, it is a huge library dedicated to computer vision.
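A minimal sketch illustrating points 1 and 2 (the file name '1.jpg' is just a placeholder):

import cv2
import numpy as np
from PIL import Image

img_cv = cv2.imread('1.jpg')    # numpy array, channels in BGR order
img_pil = Image.open('1.jpg')   # PIL image object, channels in RGB order
print(type(img_cv), img_cv.shape)   # <class 'numpy.ndarray'> (H, W, 3)
print(type(img_pil))                # a PIL Image subclass
rgb_from_cv = cv2.cvtColor(img_cv, cv2.COLOR_BGR2RGB)
print(np.array_equal(np.array(img_pil), rgb_from_cv))  # usually True; JPEG decoders may differ slightly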

The following is a set of experimental data: the time each library takes to process a 3120*4160 image 100 times.
(benchmark figure omitted)

2. Linear regression

What is linear regression?
For example, when the selling price of a commodity is 2 yuan, 5 yuan, and 10 yuan, its profit is 4 yuan, 10 yuan, and 20 yuan respectively. We can easily conclude that the relationship between the commodity's profit and its selling price follows the line y = 2x. In this simple linear regression equation, we call 2 the regression coefficient; that is, the slope is the regression coefficient. The regression coefficient indicates how much the profit (y) changes when the selling price (x) changes by one unit.


Linear regression finds the straight line that is, overall, "closest" to a set of discrete points, similar in spirit to taking a mean.

Least squares method

  • The key point is to minimize the sum of squared errors.
  • Suppose we have a series of data points (xi, yi) (i = 1, ..., m); then the estimate produced by our fitting function h(x) is h(xi).
  • Residual: ri = h(xi) − yi
  • The goal is to minimize the sum of squared residuals.

From this, we can write the definition of the least squares method:
    min Σ_{i=1}^{m} r_i² = min Σ_{i=1}^{m} (h(x_i) − y_i)²,  and for the straight line h(x) = kx + b this becomes
    J(k, b) = Σ_{i=1}^{m} (k·x_i + b − y_i)²

This is an unconstrained optimization problem: take the partial derivatives of J with respect to k and b, set them to 0, and solve for the extreme point:
    ∂J/∂k = 0, ∂J/∂b = 0, which gives
    k = (m·Σx_i·y_i − Σx_i·Σy_i) / (m·Σx_i² − (Σx_i)²)
    b = (Σy_i − k·Σx_i) / m
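As a quick check of these formulas, a minimal sketch in Python (my own example, using the price/profit data from above; numpy assumed):

import numpy as np

def fit_line_least_squares(x, y):
    # closed-form least-squares solution for y ≈ k*x + b
    m = len(x)
    k = (m * np.sum(x * y) - np.sum(x) * np.sum(y)) / (m * np.sum(x * x) - np.sum(x) ** 2)
    b = (np.sum(y) - k * np.sum(x)) / m
    return k, b

x = np.array([2.0, 5.0, 10.0])   # selling prices
y = np.array([4.0, 10.0, 20.0])  # profits
print(fit_line_least_squares(x, y))  # approximately (2.0, 0.0), i.e. y = 2x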

3. RANSAC

  • Random Sample Consensus (RANSAC)
  • Note: RANSAC is an idea, a framework for solving for the parameters of a known model. It is not limited to a specific problem: it can be a computer vision problem, a statistical problem, or even a model-parameter-estimation problem in economics.
  • It is an iterative method for estimating the parameters of a mathematical model from a set of observed data that contains outliers. RANSAC is a non-deterministic algorithm in the sense that it produces a reasonable result only with a certain probability, and this probability can be increased by allowing more iterations.
  • The basic assumption of RANSAC is that the "inlier" data can be described by some set of model parameters, while the "outlier" data cannot be fitted by the model. Outliers can arise, for example, from extreme noise, from misreadings of measurements, or from incorrect assumptions about the data. RANSAC also assumes that, given a (usually small) set of inliers, there exists a procedure that can estimate the model parameters that best explain or fit this data.

Note again that RANSAC is just an idea, a framework for finding the parameters of a known model. There is no restriction on what the model is; it can be any model, but a model has parameters. For y = ax + b, for example, what a and b equal is exactly the problem RANSAC solves. Whether the model is y = ax + b, y = kx + c, or z = ax + by + c, RANSAC does not care; as long as you follow its procedure, you can find the parameters. By analogy, putting an elephant into a refrigerator takes three steps, and we only care about those three steps, regardless of whether the elephant actually fits, because the refrigerator here is a black box. Whether we put an elephant or a plane into the refrigerator, RANSAC only cares about the three steps, and the steps do not change with the object.

RANSAC and Least Squares

  • Data in production practice often contain a certain amount of error.
  • For example, we know that two variables X and Y have a linear relationship, Y = aX + b, and we want to determine the concrete values of the parameters a and b. Through experiments we can obtain a set of test values of X and Y. In theory, an equation with two unknowns only needs two pairs of values to determine it, but because of systematic error, the values of a and b computed from two randomly chosen points differ each time. What we want is a final theoretical model whose error against the test values is the smallest.
  • Least squares method: compute a and b by setting the partial derivatives of the mean squared error with respect to a and b to zero. In fact, in many cases the least squares method is used as a synonym for linear regression.
  • Unfortunately, the least squares method is only suitable when the errors are small.
  • When the model is determined and the maximum number of iterations allows, RANSAC can always find the optimal solution. (For a dataset containing 80% erroneous points, RANSAC far outperforms plain least squares.)
  • Because a picture contains a large number of pixels, the least squares method involves a large amount of computation and is slow.

(figure: scatter data containing outliers; the red least-squares fit deviates clearly from the expected line)
Looking at the figure above, fitting with the least squares method gives the red line, which obviously deviates from the expected result. This is because the least squares method is strongly affected by noise points; that is why least squares is only suitable when errors are small, and why we can use the RANSAC method instead to get a reasonable solution.

Steps of RANSAC
Input of the RANSAC algorithm:

  1. A set of observed data (often containing large noise or invalid points)
  2. A parametric model to explain the observed data, e.g. y = ax + b (i.e. the model is known)
  3. Some credible parameters

Steps:
  1. Randomly select a few points from the data and assume they are inliers
  2. Fit the model to these inliers, e.g. y = ax + b -> y = 2x + 3 or y = 4x + 5
  3. Plug every other point (not selected in step 1) into the model just built and decide whether it is also an inlier, e.g. h_i = 2x_i + 3 -> r_i
  4. Record the number of inliers
  5. Repeat the above steps
  6. Compare the iterations: the one with the largest number of inliers gives the model we are looking for

Note: different problems correspond to different mathematical models, so the way the model parameters are computed also differs; computing the model parameters is not the job of RANSAC itself. (This also leads to one of RANSAC's drawbacks: the mathematical model must be known.)
There are two questions here:

  1. How many points should we randomly select at the beginning (n)?
  2. How many times should we repeat the process (k)?

Parameter determination of RANSAC

  • Assume the probability that any single selected point is a true inlier is w:
    w = number of inliers / (number of inliers + number of outliers)
  • Usually we do not know w in advance. w^n is the probability that all n selected points are inliers, and 1 − w^n is the probability that at least one of the n selected points is not an inlier. (1 − w^n)^k is then the probability that in k repetitions we never draw n points that are all inliers. If p is the probability that the algorithm succeeds within k runs, then:
    1 − p = (1 − w^n)^k
    p = 1 − (1 − w^n)^k
  • Solving for the number of iterations gives k = log(1 − p) / log(1 − w^n) (a small numerical example follows this list).
  • So, to obtain a high probability of success: with n fixed, the larger k is, the larger p becomes; with w fixed, the larger n is, the larger the required k.
  • Since w is usually unknown, it is better to choose a small value for n.
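For concreteness, a small sketch (hypothetical values for p, w and n) that evaluates the formula above:

import math

def ransac_iterations(p, w, n):
    # number of iterations k such that, with probability p, at least one
    # random sample of n points consists only of inliers (inlier ratio w)
    return math.ceil(math.log(1 - p) / math.log(1 - w ** n))

# e.g. 99% success probability, 60% inliers, 2 points per sample
print(ransac_iterations(p=0.99, w=0.6, n=2))  # 11 iterations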

Advantages and disadvantages of RANSAC

Advantages:

  1. It can robustly estimate model parameters. For example, it can estimate parameters with high accuracy from datasets containing a large number of outliers.

Disadvantages:

  1. The number of iterations needed to compute the parameters has no upper bound; if an upper bound on the iterations is set, the result obtained may not be the optimal one, and may even be wrong.
  2. RANSAC produces a credible model only with a certain probability, and this probability increases with the number of iterations.
  3. It requires setting thresholds that are specific to the problem.
  4. RANSAC can only estimate one model from a given data set; if two (or more) models are present, RANSAC cannot find the other one.
  5. It requires the mathematical model to be known.

Code

import numpy as np
import scipy as sp
import scipy.linalg as sl
 
def ransac(data, model, n, k, t, d, debug = False, return_all = False):
    """
    Inputs:
        data - sample points
        model - hypothesised model: decided in advance
        n - minimum number of sample points needed to generate the model
        k - maximum number of iterations
        t - threshold: the condition for a point to be counted as fitting the model
        d - minimum number of additional sample points needed for a fit to be considered good (treated as a threshold)
    Output:
        bestfit - the best fitting solution (returns nil if none is found)

    Pseudocode:
    iterations = 0
    bestfit = nil                     # updated below
    besterr = something really large  # later updated with besterr = thiserr
    while iterations < k
    {
        maybeinliers = n points drawn at random from the samples; not necessarily all inliers, possibly even all outliers
        maybemodel = possibly acceptable model fitted from the n maybeinliers
        alsoinliers = emptyset        # sample points meeting the error requirement, initially empty
        for (every sample point that is not in maybeinliers)
        {
            if it satisfies maybemodel, i.e. error < t
                add the point to alsoinliers
        }
        if (number of points in alsoinliers > d)
        {
            % we have a reasonably good model; test how well it fits
            bettermodel = better model regenerated from all of maybeinliers and alsoinliers
            thiserr = error measure over all of maybeinliers and alsoinliers
            if thiserr < besterr
            {
                bestfit = bettermodel
                besterr = thiserr
            }
        }
        iterations++
    }
    return bestfit
    """
    iterations = 0
    bestfit = None
    besterr = np.inf  # default value
    best_inlier_idxs = None
    while iterations < k:
        maybe_idxs, test_idxs = random_partition(n, data.shape[0])
        print('test_idxs = ', test_idxs)
        maybe_inliers = data[maybe_idxs, :]  # take size(maybe_idxs) rows of (Xi, Yi) data
        test_points = data[test_idxs]  # the remaining rows of (Xi, Yi) data points
        maybemodel = model.fit(maybe_inliers)  # fit the model
        test_err = model.get_error(test_points, maybemodel)  # compute the error: sum of squares
        print('test_err = ', test_err < t)
        also_idxs = test_idxs[test_err < t]
        print('also_idxs = ', also_idxs)
        also_inliers = data[also_idxs, :]
        if debug:
            print('test_err.min()', test_err.min())
            print('test_err.max()', test_err.max())
            print('numpy.mean(test_err)', np.mean(test_err))
            print('iteration %d:len(alsoinliers) = %d' % (iterations, len(also_inliers)))
        print('d = ', d)
        if (len(also_inliers) > d):
            betterdata = np.concatenate( (maybe_inliers, also_inliers) )  # stack the samples
            bettermodel = model.fit(betterdata)
            better_errs = model.get_error(betterdata, bettermodel)
            thiserr = np.mean(better_errs)  # use the mean error as the new error
            if thiserr < besterr:
                bestfit = bettermodel
                besterr = thiserr
                best_inlier_idxs = np.concatenate( (maybe_idxs, also_idxs) )  # update the inliers, adding the new points
        iterations += 1
    if bestfit is None:
        raise ValueError("didn't meet fit acceptance criteria")
    if return_all:
        return bestfit, {'inliers': best_inlier_idxs}
    else:
        return bestfit
 
 
def random_partition(n, n_data):
    """return n random rows of data and the other len(data) - n rows"""
    all_idxs = np.arange(n_data)  # indices 0..n_data-1
    np.random.shuffle(all_idxs)  # shuffle the indices
    idxs1 = all_idxs[:n]
    idxs2 = all_idxs[n:]
    return idxs1, idxs2
 
class LinearLeastSquareModel:
    # Linear least-squares solution, used as the input model for RANSAC
    def __init__(self, input_columns, output_columns, debug = False):
        self.input_columns = input_columns
        self.output_columns = output_columns
        self.debug = debug

    def fit(self, data):
        # np.vstack stacks arrays vertically (row-wise) into a new array
        A = np.vstack( [data[:, i] for i in self.input_columns] ).T  # first column Xi --> rows of Xi
        B = np.vstack( [data[:, i] for i in self.output_columns] ).T  # second column Yi --> rows of Yi
        x, resids, rank, s = sl.lstsq(A, B)  # resids: residual sum of squares
        return x  # return the least-squares solution vector

    def get_error(self, data, model):
        A = np.vstack( [data[:, i] for i in self.input_columns] ).T  # first column Xi --> rows of Xi
        B = np.vstack( [data[:, i] for i in self.output_columns] ).T  # second column Yi --> rows of Yi
        B_fit = np.dot(A, model)  # predicted y values, B_fit = model.k * A + model.b
        err_per_point = np.sum( (B - B_fit) ** 2, axis = 1 )  # sum squared error per row
        return err_per_point
 
def test():
    # generate ideal data
    n_samples = 500  # number of samples
    n_inputs = 1  # number of input variables
    n_outputs = 1  # number of output variables
    A_exact = 20 * np.random.random((n_samples, n_inputs))  # 500 random values between 0 and 20, shape (500, 1)
    perfect_fit = 60 * np.random.normal( size = (n_inputs, n_outputs) )  # random slope
    B_exact = np.dot(A_exact, perfect_fit)  # y = x * k

    # add Gaussian noise; least squares handles this well
    A_noisy = A_exact + np.random.normal( size = A_exact.shape )  # 500 * 1 array, the Xi
    B_noisy = B_exact + np.random.normal( size = B_exact.shape )  # 500 * 1 array, the Yi

    if 1:
        # add "outliers"
        n_outliers = 100
        all_idxs = np.arange( A_noisy.shape[0] )  # indices 0-499
        np.random.shuffle(all_idxs)  # shuffle all_idxs
        outlier_idxs = all_idxs[:n_outliers]  # 100 random outlier indices
        A_noisy[outlier_idxs] = 20 * np.random.random( (n_outliers, n_inputs) )  # Xi with noise and outliers
        B_noisy[outlier_idxs] = 50 * np.random.normal( size = (n_outliers, n_outputs) )  # Yi with noise and outliers
    # setup model
    all_data = np.hstack( (A_noisy, B_noisy) )  # rows of the form ([Xi, Yi] ...), shape (500, 2): 500 rows, 2 columns
    input_columns = range(n_inputs)  # the first column of the array, x: 0
    output_columns = [n_inputs + i for i in range(n_outputs)]  # the last column of the array, y: 1
    debug = False
    model = LinearLeastSquareModel(input_columns, output_columns, debug = debug)  # instantiate the class: the known model, solved by least squares

    linear_fit, resids, rank, s = sl.lstsq(all_data[:, input_columns], all_data[:, output_columns])

    # run the RANSAC algorithm
    ransac_fit, ransac_data = ransac(all_data, model, 50, 1000, 7e3, 300, debug = debug, return_all = True)

    if 1:
        import pylab

        sort_idxs = np.argsort(A_exact[:, 0])
        A_col0_sorted = A_exact[sort_idxs]  # array of rank 2

        if 1:
            pylab.plot( A_noisy[:, 0], B_noisy[:, 0], 'k.', label = 'data' )  # scatter plot
            pylab.plot( A_noisy[ransac_data['inliers'], 0], B_noisy[ransac_data['inliers'], 0], 'bx', label = "RANSAC data" )
        else:
            pylab.plot( A_noisy[non_outlier_idxs, 0], B_noisy[non_outlier_idxs, 0], 'k.', label = 'noisy data' )
            pylab.plot( A_noisy[outlier_idxs, 0], B_noisy[outlier_idxs, 0], 'r.', label = 'outlier data' )

        pylab.plot( A_col0_sorted[:, 0],
                    np.dot(A_col0_sorted, ransac_fit)[:, 0],
                    label = 'RANSAC fit' )
        pylab.plot( A_col0_sorted[:, 0],
                    np.dot(A_col0_sorted, perfect_fit)[:, 0],
                    label = 'exact system' )
        pylab.plot( A_col0_sorted[:, 0],
                    np.dot(A_col0_sorted, linear_fit)[:, 0],
                    label = 'linear fit' )
        pylab.legend()
        pylab.show()
 
if __name__ == "__main__":
    test()

Result:
(figure: the noisy data and RANSAC inliers, with the 'RANSAC fit', 'exact system' and 'linear fit' lines)

4. Image similarity comparison hash algorithm

There are three hashing algorithms for similar image search:

  1. Mean hash algorithm
  2. Difference hash algorithm
  3. Perceptual hash algorithm

What is Hash?

  • A hash function (also called a hash algorithm; English: Hash Function) is a method of creating a small digital "fingerprint" from any kind of data. The hash function compresses a message or data into a digest, which reduces the amount of data and fixes its format. The function scrambles the data to create a fingerprint called a hash value (hash value, hash code, hash sum, or simply hash). The hash value is usually represented by a short string of seemingly random letters and numbers.
  • A hash algorithm maps a binary value of arbitrary length to a shorter, fixed-length binary value, the hash value. A hash value is a unique and extremely compact numerical representation of a piece of data. If you hash a piece of plaintext and then change even a single letter of it, the resulting hash value will be different.
  • A hash algorithm is a function that converts almost any digital file into a seemingly garbled string of numbers and letters.

As a cryptographic function, a hash function has two important characteristics:

  1. Irreversibility: it is very easy to compute the seemingly garbled output string (hash value) from the input, but very, very difficult to deduce the input from the output string.
  2. Uniqueness and unpredictability of the output: as long as the inputs differ even slightly, the resulting hash values are very different (see the small example after this list).
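As a small illustration of point 2 (my own example, using Python's built-in hashlib rather than an image hash):

import hashlib

print(hashlib.md5(b'hello world').hexdigest())
print(hashlib.md5(b'hello worle').hexdigest())  # one letter changed -> a completely different hash value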

Hamming Distance
The Hamming distance between two integers is the number of bit positions at which their binary representations differ.
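A minimal sketch of the idea in Python (counting differing bits with XOR):

def hamming_distance(a, b):
    # XOR leaves a 1 in exactly the bit positions where a and b differ
    return bin(a ^ b).count('1')

print(hamming_distance(1, 4))  # 001 vs 100 -> 2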

Mean Hash Algorithm
Steps

  1. Scaling: The picture is scaled to 8*8, the structure is preserved, and the details are removed.
  2. Grayscale: Convert to a grayscale image.
  3. Average: Calculate the average of all pixels in the grayscale image.
  4. Comparison: if the pixel value is greater than the average value, it will be recorded as 1, otherwise, it will be recorded as 0, with a total of 64 bits.
  5. Generate hash: Combining the 1 and 0 generated in the above steps in order is the fingerprint (hash) of the picture
  6. Compare fingerprints: Compare the fingerprints of the two images and calculate the Hamming distance, that is, how many digits of the two 64-bit hash values ​​are different. The fewer the different digits, the more similar the images are.

Difference Hash Algorithm
Compared with the mean hash algorithm, the difference hash algorithm is basically the same at the beginning and at the end; only the hash-generation step in the middle changes.
Steps

  1. Scaling: the picture is scaled to 8 rows of 9 pixels; the structure is preserved and the details are removed.
  2. Grayscale: convert to a grayscale image.
  3. Average: not needed — this step is listed only for comparison with the mean hash.
  4. Comparison: within each row, if a pixel value is greater than the next pixel value, record 1, otherwise 0. Rows are never compared with the next row; each row of 9 pixels yields 8 differences, and with 8 rows this gives 64 bits in total.
  5. Generate hash: concatenating the 1s and 0s generated above in order gives the fingerprint (hash) of the picture.
  6. Compare fingerprints: compare the fingerprints of two images by computing the Hamming distance, i.e. how many of the two 64-bit hash values' bits differ. The fewer the differing bits, the more similar the images.

Perceptual Hash Algorithm
The mean hash algorithm is too crude and not precise enough; it is better suited to searching for thumbnails. To obtain more accurate results, you can use the perceptual hash algorithm, which uses the DCT (discrete cosine transform) to extract the low-frequency content of the picture (a small sketch follows the steps below).
Steps:

  1. Shrink the picture: 32*32 is a good size, which makes the DCT computation convenient.
  2. Convert to grayscale: convert the scaled image to grayscale.
  3. Compute the DCT: the DCT separates the picture into a collection of frequency components.
  4. Reduce the DCT: the matrix after the DCT is 32*32; keep only the 8*8 block in the upper-left corner, which represents the lowest frequencies of the picture.
  5. Compute the average: compute the average of all values in the reduced DCT.
  6. Further reduce the DCT: record 1 if a value is greater than the average, otherwise 0.
  7. Build the fingerprint: combine the 64 bits; the order can be arbitrary, as long as it is kept consistent.
  8. Finally, compare the fingerprints of the two images by the Hamming distance.
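A minimal sketch of these steps (my own illustration with OpenCV's cv2.dct; the function name phash_sketch is mine, and note that the comparison code later in this article resizes to 64*64 and keeps a 32*32 block instead):

import cv2
import numpy as np

def phash_sketch(img):
    # steps 1-2: shrink to 32*32 and convert to grayscale
    small = cv2.resize(img, (32, 32), interpolation=cv2.INTER_CUBIC)
    gray = cv2.cvtColor(small, cv2.COLOR_BGR2GRAY)
    # step 3: two-dimensional DCT (requires a float input)
    dct = cv2.dct(np.float32(gray))
    # step 4: keep the 8*8 low-frequency block in the upper-left corner
    low = dct[:8, :8]
    # steps 5-7: threshold against the mean and concatenate the 64 bits
    return ''.join('1' if v > low.mean() else '0' for v in low.flatten())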

Code implementation
Implementation of the mean hash and the difference hash:

import cv2
import numpy as np
 
# Mean hash algorithm
def aHash(img):
    # scale to 8*8
    img=cv2.resize(img,(8,8),interpolation=cv2.INTER_CUBIC)
    # convert to grayscale
    gray=cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
    # s is the pixel sum, initialised to 0; hash_str is the hash value, initialised to ''
    s=0
    hash_str=''
    # accumulate the pixel sum
    for i in range(8):
        for j in range(8):
            s=s+gray[i,j]
    # average grayscale
    avg=s/64
    # a gray value above the average becomes 1, otherwise 0, producing the picture's hash
    for i in range(8):
        for j in range(8):
            if  gray[i,j]>avg:
                hash_str=hash_str+'1'
            else:
                hash_str=hash_str+'0'
    return hash_str
 
# Difference hash algorithm
def dHash(img):
    # scale to width 9, height 8
    img=cv2.resize(img,(9,8),interpolation=cv2.INTER_CUBIC)
    # convert to grayscale
    gray=cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
    hash_str=''
    # in each row, a pixel greater than the next pixel becomes 1, otherwise 0, producing the hash
    for i in range(8):
        for j in range(8):
            if   gray[i,j]>gray[i,j+1]:
                hash_str=hash_str+'1'
            else:
                hash_str=hash_str+'0'
    return hash_str
 
# Compare hash values
def cmpHash(hash1,hash2):
    n=0
    # hashes of different lengths mean the arguments are wrong; return -1
    if len(hash1)!=len(hash2):
        return -1
    # compare bit by bit
    for i in range(len(hash1)):
        # count +1 for every differing bit; n is the final distance (the smaller, the more similar)
        if hash1[i]!=hash2[i]:
            n=n+1
    return n

img1=cv2.imread('lenna.png')
img2=cv2.imread('lenna_noise.png')
hash1= aHash(img1)
hash2= aHash(img2)
print(hash1)
print(hash2)
n=cmpHash(hash1,hash2)
print('Mean hash algorithm similarity (Hamming distance):',n)

hash1= dHash(img1)
hash2= dHash(img2)
print(hash1)
print(hash2)
n=cmpHash(hash1,hash2)
print('Difference hash algorithm similarity (Hamming distance):',n)


Comparison of three algorithms:

  • aHash: mean hash. Faster, but sometimes not accurate enough.
  • pHash: perceptual hash. More accurate, but slower.
  • dHash: difference hash. Good accuracy and also very fast.

Material generation

import cv2 as cv
import numpy as np
from PIL import Image
import os.path as path
from PIL import ImageEnhance


def rotate(image):
    def rotate_bound(image, angle):
        # grab the dimensions of the image and then determine the
        # center
        (h, w) = image.shape[:2]
        (cX, cY) = (w // 2, h // 2)

        # grab the rotation matrix (applying the negative of the
        # angle to rotate clockwise), then grab the sine and cosine
        # (i.e., the rotation components of the matrix)
        M = cv.getRotationMatrix2D((cX, cY), -angle, 1.0)
        cos = np.abs(M[0, 0])
        sin = np.abs(M[0, 1])

        # compute the new bounding dimensions of the image
        nW = int((h * sin) + (w * cos))
        nH = int((h * cos) + (w * sin))

        # adjust the rotation matrix to take into account translation
        M[0, 2] += (nW / 2) - cX
        M[1, 2] += (nH / 2) - cY

        # perform the actual rotation and return the image
        return cv.warpAffine(image, M, (nW, nH))

    return rotate_bound(image, 45)


def enhance_color(image):
    enh_col = ImageEnhance.Color(image)
    color = 1.5
    return enh_col.enhance(color)


def blur(image):
    # blur the image
    return cv.blur(image, (15, 1))


def sharp(image):
    # sharpen the image
    kernel = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], np.float32)
    return cv.filter2D(image, -1, kernel=kernel)


def contrast(image):
    def contrast_brightness_image(src1, a, g):
        """
        Roughly adjust contrast and brightness
        :param src1: image
        :param a: contrast
        :param g: brightness
        :return:
        """

        # get the shape values: height, width and number of channels
        h, w, ch = src1.shape

        # create an all-zero array src2 with the same height, width and dtype as the original image (all pixels zero, i.e. a black image)
        src2 = np.zeros([h, w, ch], src1.dtype)
        # see the documentation of addWeighted for details
        return cv.addWeighted(src1, a, src2, 1 - a, g)

    return contrast_brightness_image(image, 1.2, 1)


def resize(image):
    # scale the image
    return cv.resize(image, (0, 0), fx=1.25, fy=1)


def light(image):
    # change the brightness of the image
    return np.uint8(np.clip((1.3 * image + 10), 0, 255))


def save_img(image, img_name, output_path=None):
    # save the image
    cv.imwrite(path.join(output_path, img_name), image, [int(cv.IMWRITE_JPEG_QUALITY), 70])
    pass


def show_img(image):
    cv.imshow('image', image)
    cv.waitKey(0)
    pass


def main():
    data_img_name = 'lenna.png'
    output_path = "./source"
    data_path = path.join(output_path, data_img_name)

    img = cv.imread(data_path)

    # change the brightness of the image
    img_light = light(img)
    # change the size of the image
    img_resize = resize(img)
    # change the contrast of the image
    img_contrast = contrast(img)
    # sharpen
    img_sharp = sharp(img)
    # blur
    img_blur = blur(img)
    # colour enhancement
    img_color = enhance_color(Image.open(data_path))
    # rotation
    img_rotate = rotate(img)
    img_rotate1 = Image.open(data_path).rotate(45)
    # merge two images horizontally (convenient for side-by-side comparison)
    # tmp = np.hstack((img, img_rotate))

    save_img(img_light, "%s_light.jpg" % data_img_name.split(".")[0], output_path)
    save_img(img_resize, "%s_resize.jpg" % data_img_name.split(".")[0], output_path)
    save_img(img_contrast, "%s_contrast.jpg" % data_img_name.split(".")[0], output_path)
    save_img(img_sharp, "%s_sharp.jpg" % data_img_name.split(".")[0], output_path)
    save_img(img_blur, "%s_blur.jpg" % data_img_name.split(".")[0], output_path)
    # save_img(img_rotate, "%s_rotate.jpg" % data_img_name.split(".")[0], output_path)
    # colour enhancement
    img_color.save(path.join(output_path, "%s_color.jpg" % data_img_name.split(".")[0]))
    img_rotate1.save(path.join(output_path, "%s_rotate.jpg" % data_img_name.split(".")[0]))

    show_img(img_rotate)
    pass


if __name__ == '__main__':
    main()

Algorithm comparison code

import cv2
import numpy as np
import time
import os.path as path


def aHash(img, width=8, high=8):
    """
    Mean hash algorithm
    :param img: image data
    :param width: width the image is scaled to
    :param high: height the image is scaled to
    :return: hash sequence
    """
    # scale to 8*8
    img = cv2.resize(img, (width, high), interpolation=cv2.INTER_CUBIC)
    # convert to grayscale
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # s is the pixel sum, initialised to 0; hash_str is the hash value, initialised to ''
    s = 0
    hash_str = ''
    # accumulate the pixel sum
    for i in range(8):
        for j in range(8):
            s = s + gray[i, j]

    # average grayscale
    avg = s / 64
    # a gray value above the average becomes 1, otherwise 0, producing the picture's hash
    for i in range(8):
        for j in range(8):
            if gray[i, j] > avg:
                hash_str = hash_str + '1'
            else:
                hash_str = hash_str + '0'
    return hash_str


def dHash(img, width=9, high=8):
    """
    Difference hash algorithm
    :param img: image data
    :param width: width the image is scaled to
    :param high: height the image is scaled to
    :return: hash sequence
    """
    # scale to width 9, height 8
    img = cv2.resize(img, (width, high), interpolation=cv2.INTER_CUBIC)
    # convert to grayscale
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    hash_str = ''
    # in each row, a pixel greater than the next pixel becomes 1, otherwise 0, producing the hash string
    for i in range(high):
        for j in range(high):
            if gray[i, j] > gray[i, j + 1]:
                hash_str = hash_str + '1'
            else:
                hash_str = hash_str + '0'
    return hash_str


def cmp_hash(hash1, hash2):
    """
    Compare hash values
    :param hash1: hash sequence 1
    :param hash2: hash sequence 2
    :return: similarity
    """
    n = 0
    # hashes of different lengths mean the arguments are wrong; return -1
    if len(hash1) != len(hash2):
        return -1
    # compare bit by bit
    for i in range(len(hash1)):
        # count +1 for every differing bit
        if hash1[i] != hash2[i]:
            n = n + 1

    return 1 - n / len(hash2)


def pHash(img_file, width=64, high=64):
    """
    Perceptual hash algorithm
    :param img_file: image file path
    :param width: width the image is scaled to
    :param high: height the image is scaled to
    :return: perceptual hash sequence of the image
    """
    # load the image as grayscale and resize it to width * high
    img = cv2.imread(img_file, 0)
    img = cv2.resize(img, (width, high), interpolation=cv2.INTER_CUBIC)

    # create a 2D float array
    h, w = img.shape[:2]
    vis0 = np.zeros((h, w), np.float32)
    vis0[:h, :w] = img  # fill in the data

    # two-dimensional DCT
    vis1 = cv2.dct(cv2.dct(vis0))
    vis1.resize(32, 32)

    # flatten the 2D array into a 1D list
    img_list = vis1.flatten()

    # compute the mean
    avg = sum(img_list) * 1. / len(img_list)
    avg_list = ['0' if i > avg else '1' for i in img_list]

    # build the hash value
    return ''.join(['%x' % int(''.join(avg_list[x:x + 4]), 2) for x in range(0, 32 * 32, 4)])


def hamming_dist(s1, s2):
    return 1 - sum([ch1 != ch2 for ch1, ch2 in zip(s1, s2)]) * 1. / (32 * 32 / 4)
    


def concat_info(type_str, score, time):
    # note: the elapsed time passed in is in seconds
    temp = '%s similarity: %.2f %% -----time=%.4f s' % (type_str, score * 100, time)
    print(temp)
    return temp


def test_diff_hash(img1_path, img2_path, loops=1000):
    img1 = cv2.imread(img1_path)
    img2 = cv2.imread(img2_path)
    start_time = time.time()

    for _ in range(loops):
        hash1 = dHash(img1)
        hash2 = dHash(img2)
        cmp_hash(hash1, hash2)

    print(">>> time taken for %s runs: %.4f s." % (loops, time.time() - start_time))


def test_aHash(img1, img2):
    time1 = time.time()
    hash1 = aHash(img1)
    hash2 = aHash(img2)
    n = cmp_hash(hash1, hash2)
    return concat_info("Mean hash algorithm", n, time.time() - time1) + "\n"


def test_dHash(img1, img2):
    time1 = time.time()
    hash1 = dHash(img1)
    hash2 = dHash(img2)
    n = cmp_hash(hash1, hash2)
    return concat_info("Difference hash algorithm", n, time.time() - time1) + "\n"


def test_pHash(img1_path, img2_path):
    time1 = time.time()
    hash1 = pHash(img1_path)
    hash2 = pHash(img2_path)
    n = hamming_dist(hash1, hash2)
    return concat_info("Perceptual hash algorithm", n, time.time() - time1) + "\n"


def deal(img1_path, img2_path):
    info = ''

    img1 = cv2.imread(img1_path)
    img2 = cv2.imread(img2_path)

    # compute the image hash similarities
    info = info + test_aHash(img1, img2)
    info = info + test_dHash(img1, img2)
    info = info + test_pHash(img1_path, img2_path)
    return info


def contact_path(file_name):
    output_path = "./source"
    return path.join(output_path, file_name)


def main():
    data_img_name = 'lenna.png'
    data_img_name_base = data_img_name.split(".")[0]

    base = contact_path(data_img_name)
    light = contact_path("%s_light.jpg" % data_img_name_base)
    resize = contact_path("%s_resize.jpg" % data_img_name_base)
    contrast = contact_path("%s_contrast.jpg" % data_img_name_base)
    sharp = contact_path("%s_sharp.jpg" % data_img_name_base)
    blur = contact_path("%s_blur.jpg" % data_img_name_base)
    color = contact_path("%s_color.jpg" % data_img_name_base)
    rotate = contact_path("%s_rotate.jpg" % data_img_name_base)

    # test the efficiency of the algorithms
    test_diff_hash(base, base)
    test_diff_hash(base, light)
    test_diff_hash(base, resize)
    test_diff_hash(base, contrast)
    test_diff_hash(base, sharp)
    test_diff_hash(base, blur)
    test_diff_hash(base, color)
    test_diff_hash(base, rotate)
    
    # test the accuracy of the algorithms (using base and light as an example)
    deal(base, light)
    

if __name__ == '__main__':
    main()


Result:
(output omitted)

Extension – Discrete Cosine Transform DCT

  • The Discrete Cosine Transform (DCT) is mainly used for data and image compression; it converts a spatial-domain signal to the frequency domain and has good decorrelation performance.
  • The DCT transform itself is lossless. At the same time, because the DCT transform is symmetric, we can apply the inverse DCT after quantization and encoding to recover the original image information at the receiving end (see the small sketch after this list).
  • The DCT is very widely used in image analysis and compression today; it appears in the common JPEG still-image coding standard as well as in MJPEG, MPEG and other video coding standards.
    F(u,v) = c(u)·c(v) · Σ_{i=0}^{N−1} Σ_{j=0}^{N−1} f(i,j) · cos[(2i+1)πu / (2N)] · cos[(2j+1)πv / (2N)],
    with c(u) = sqrt(1/N) for u = 0 and c(u) = sqrt(2/N) otherwise,
    where:
  • F(u,v) is the output transformation result;
  • N is the number of points of the original signal.
  • f(i,j) is the pixel value of the pixel point (i,j) in the original image;
  • c(u), c(v) are DCT coefficients.
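A small sketch (my own example) of the forward and inverse transform with OpenCV, illustrating that the DCT itself is lossless up to floating-point error:

import cv2
import numpy as np

block = np.random.rand(8, 8).astype(np.float32)  # a random 8*8 "image" block
coeffs = cv2.dct(block)      # spatial domain -> frequency domain
restored = cv2.idct(coeffs)  # inverse DCT recovers the block
print(np.allclose(block, restored, atol=1e-5))  # True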

Source: blog.csdn.net/m0_63260018/article/details/131224776