[Python 13] Computer Vision: Basic Image Processing




1 Basic image manipulation and processing

1.1 PIL: Python image processing library

PIL (Python Imaging Library) provides general image handling along with many useful basic image operations. PIL ships with the Anaconda distribution (in current releases as Pillow, the maintained fork that keeps the PIL import name). Anaconda is the recommended way to install it: setup is simple, and the other commonly used libraries are bundled as well.

A brief tutorial on PIL

  • Read in an image:
from PIL import Image
from pylab import *

# Add Chinese font support
from matplotlib.font_manager import FontProperties
font = FontProperties(fname=r"c:\windows\fonts\SimSun.ttc", size=14)
figure()

pil_im = Image.open('E:/python/Python Computer Vision/Image data/empire.jpg')
gray()
subplot(121)
title(u'原图',fontproperties=font)
axis('off')
imshow(pil_im)

pil_im = Image.open('E:/python/Python Computer Vision/Image data/empire.jpg').convert('L')
subplot(122)
title(u'灰度图',fontproperties=font)
axis('off')
imshow(pil_im)

show()


1.1.1 Convert image format - save() function

from PCV.tools.imtools import get_imlist  # PCV module from the original book
from PIL import Image
import os
import pickle

# Get the file names (including extensions) of the images in the test folder
filelist = get_imlist('E:/python/Python Computer Vision/test jpg/')

# Save the list of image file names to imlist.txt (pickled, i.e. serialized)
imlist = open('E:/python/Python Computer Vision/test jpg/imlist.txt','wb+')
pickle.dump(filelist,imlist)
imlist.close()

for infile in filelist:
    outfile = os.path.splitext(infile)[0] + ".png"  # split off the extension and replace it
    if infile != outfile:
        try:
            Image.open(infile).save(outfile)
        except IOError:
            print ("cannot convert", infile)

Here, the test jpg folder is one the author created to hold the .jpg test images. A few lines were added to the book's original code to save the list of image file names; the script then converts all images to .png format.

The open() function in PIL creates a PIL image object, and the save() method writes the image to a file with the given name; PIL picks the output format from the file extension. The loop above changes only the extension to .png, the rest of the file name stays the same.
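
The same conversion can be tried without touching the disk. The following is a minimal sketch, assuming a synthetic test image and an in-memory buffer (both are illustrative choices, not part of the original script). Because a buffer has no file extension, the format is passed to save() explicitly:

```python
import io
from PIL import Image

# Build a small RGB test image in memory so no file on disk is needed.
src = Image.new('RGB', (64, 48), color=(200, 30, 30))

# Save as PNG into an in-memory buffer; a buffer has no extension to
# inspect, so the format is given explicitly.
buf = io.BytesIO()
src.save(buf, format='PNG')

# Re-open the bytes to confirm what was written.
buf.seek(0)
converted = Image.open(buf)
print(converted.format, converted.size, converted.mode)
```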

1.1.2 Create thumbnails

With PIL it is easy to create thumbnails. The thumbnail() method takes a tuple specifying the new size and converts the image in place to a thumbnail that fits within that size.
For example, to create a thumbnail whose longest side is 128 pixels:

pil_im.thumbnail((128,128))
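
Two details of thumbnail() are easy to miss: it modifies the image object itself, and it keeps the aspect ratio, so the result only fits inside the requested box. A small sketch, assuming a synthetic 400×200 image in place of empire.jpg:

```python
from PIL import Image

# A 400x200 stand-in image (hypothetical test data, not empire.jpg).
im = Image.new('RGB', (400, 200), color=(0, 128, 255))

thumb = im.copy()            # keep the original; thumbnail() works in place
thumb.thumbnail((128, 128))  # fit inside a 128x128 box, keeping the aspect ratio

print(im.size, thumb.size)
```

If an exact output size is needed regardless of aspect ratio, resize() (Section 1.1.4) is the method to use instead.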

1.1.3 Copy and paste the image area

Call the crop() method to copy a region from an image. The copied region can then be rotated or otherwise transformed.

box=(100,100,400,400)
region=pil_im.crop(box)

The region is specified by a 4-tuple with coordinates (left, top, right, bottom). PIL uses a coordinate system whose upper left corner is (0, 0). The extracted region can, for example, be rotated and then put back with paste():

region=region.transpose(Image.ROTATE_180)
pil_im.paste(region,box)
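
To see what these calls do to individual pixels, the steps can be repeated on a synthetic image and inspected as an array. This is only a sketch: the 500×500 test pattern is made up, and rotate(180) stands in for transpose(Image.ROTATE_180), which gives the same result for a 180° turn.

```python
import numpy as np
from PIL import Image

# Synthetic 500x500 grayscale image whose value depends on row and column,
# so that moved pixels can be recognized.
rows, cols = np.indices((500, 500))
data = ((rows + 2 * cols) % 256).astype(np.uint8)
im = Image.fromarray(data)

box = (100, 100, 400, 400)       # (left, top, right, bottom)
region = im.crop(box)            # 300x300 copy of the area
region = region.rotate(180)      # turn the copy upside down
im.paste(region, box)            # paste() modifies im in place

result = np.array(im)
print(region.size, result[100, 100], data[399, 399])
```

Note that the right and bottom coordinates are exclusive, so the box above covers 300×300 pixels.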

1.1.4 Resizing and Rotating

  • Resize: call the resize() method with a tuple giving the size of the new image:
out=pil_im.resize((128,128))
  • Rotate: call the rotate() method with the angle in degrees, measured counterclockwise:
out=pil_im.rotate(45)

The code for the above operation is as follows:

from PIL import Image
from pylab import *

# Add Chinese font support
from matplotlib.font_manager import FontProperties

font = FontProperties(fname=r"c:\windows\fonts\SimSun.ttc", size=14)
figure()

# Show the original image
pil_im = Image.open('E:/python/Python Computer Vision/Image data/empire.jpg')
print(pil_im.mode, pil_im.size, pil_im.format)
subplot(231)
title(u'原图', fontproperties=font)
axis('off')
imshow(pil_im)

# Show the grayscale image
pil_im = Image.open('E:/python/Python Computer Vision/Image data/empire.jpg').convert('L')
gray()
subplot(232)
title(u'灰度图', fontproperties=font)
axis('off')
imshow(pil_im)

# Copy and paste a region
pil_im = Image.open('E:/python/Python Computer Vision/Image data/empire.jpg')
box = (100, 100, 400, 400)
region = pil_im.crop(box)
region = region.transpose(Image.ROTATE_180)
pil_im.paste(region, box)
subplot(233)
title(u'复制粘贴区域', fontproperties=font)
axis('off')
imshow(pil_im)

# Thumbnail
pil_im = Image.open('E:/python/Python Computer Vision/Image data/empire.jpg')
size = 128, 128
pil_im.thumbnail(size)
print(pil_im.size)
subplot(234)
title(u'缩略图', fontproperties=font)
axis('off')
imshow(pil_im)
pil_im.save('E:/python/Python Computer Vision/Image data/empire thumbnail.jpg')  # save the thumbnail

# Resize the image
pil_im=Image.open('E:/python/Python Computer Vision/Image data/empire thumbnail.jpg')
pil_im=pil_im.resize(size)
print(pil_im.size)
subplot(235)
title(u'调整尺寸后的图像',fontproperties=font)
axis('off')
imshow(pil_im)

# Rotate the image by 45°
pil_im=Image.open('E:/python/Python Computer Vision/Image data/empire thumbnail.jpg')
pil_im=pil_im.rotate(45)
subplot(236)
title(u'旋转45°后的图像',fontproperties=font)
axis('off')
imshow(pil_im)

show()

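Running the script produces the following figure:
[Figure: the original image, the grayscale image, the copy-and-paste result, the thumbnail, the resized image, and the image rotated by 45°]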

1.2 Matplotlib library

Matplotlib is a good graphics library for working with mathematics and plots, or for drawing points, lines, and curves on images; its plotting capabilities are far more powerful than those of PIL.

matplotlib tutorial

1.2.1 Drawing, Plotting Points and Lines

from PIL import Image
from pylab import *

# Add Chinese font support
from matplotlib.font_manager import FontProperties

font = FontProperties(fname=r"c:\windows\fonts\SimSun.ttc", size=14)

# Read the image into an array
im = array(Image.open('E:/python/Python Computer Vision/Image data/empire.jpg'))
figure()

# Plot with the axes shown
subplot(121)
imshow(im)
x = [100, 100, 400, 400]
y = [200, 500, 200, 500]

# Plot the points with red star markers
plot(x, y, 'r*')


# Plot a line connecting the first two points (blue by default)
plot(x[:2], y[:2])
title(u'绘制empire.jpg', fontproperties=font)

# Plot without the axes
subplot(122)
imshow(im)
x = [100, 100, 400, 400]
y = [200, 500, 200, 500]

plot(x, y, 'r*')
plot(x[:2], y[:2])
axis('off')
title(u'绘制empire.jpg', fontproperties=font)

show()
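# show() opens the graphical user interface (GUI) and creates the figure windows.
# The GUI loop blocks the script until the last figure window is closed.
# Call show() only once per script, usually at the end.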


Many colors and styles are available when plotting, as listed in Tables 1-1, 1-2, and 1-3. Typical usage:

plot(x,y)          # default: blue solid line
plot(x,y,'go-')    # green line with circle markers
plot(x,y,'ks:')    # black dotted line with square markers

Table 1-1 Basic color formatting commands for plotting with PyLab
symbol | color
:-|:-
'b' | blue
'g' | green
'r' | red
'c' | cyan
'm' | magenta
'y' | yellow
'k' | black
'w' | white

Table 1-2 Basic line style formatting commands for plotting with PyLab
symbol | line style
:-|:-
'-' | solid
'--' | dashed
':' | dotted

Table 1-3 Basic marker formatting commands for plotting with PyLab
symbol | marker
:-|:-
'.' | point
'o' | circle
's' | square
'*' | star
'+' | plus
'x' | cross
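
One way to check how a format string is interpreted is to inspect the line objects that plot() returns. The sketch below assumes the off-screen Agg backend and Matplotlib's object-oriented interface rather than pylab; the point coordinates are the ones used earlier in this section.

```python
import matplotlib
matplotlib.use('Agg')    # off-screen backend (assumption: no display is needed)
import matplotlib.pyplot as plt

x = [100, 100, 400, 400]
y = [200, 500, 200, 500]

fig, ax = plt.subplots()
(green_line,) = ax.plot(x, y, 'go-')   # green, circle markers, solid line
(black_line,) = ax.plot(x, y, 'ks:')   # black, square markers, dotted line
(red_stars,) = ax.plot(x, y, 'r*')     # red star markers only

# Each format string is split into a color, a marker and a line style.
for line in (green_line, black_line):
    print(line.get_color(), line.get_marker(), line.get_linestyle())
print(red_stars.get_color(), red_stars.get_marker())
```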

1.2.2 Image contour and histogram

from PIL import Image
from pylab import *

# Add Chinese font support
from matplotlib.font_manager import FontProperties

font = FontProperties(fname=r"c:\windows\fonts\SimSun.ttc", size=14)
# Open the image and convert it to grayscale
im = array(Image.open('E:/python/Python Computer Vision/Image data/empire.jpg').convert('L'))

# Create a new figure
figure()
subplot(121)
# Do not use color information
gray()
# Show the contours with the origin at the upper left corner
contour(im, origin='image')
axis('equal')
axis('off')
title(u'图像轮廓图', fontproperties=font)

subplot(122)
# Plot the histogram with hist()
# The first argument is a one-dimensional array: hist() only accepts 1D input,
# so flatten() is used to convert the array to 1D in row-major order
# The second argument is the number of bins
hist(im.flatten(), 128)
title(u'图像直方图', fontproperties=font)
# plt.xlim([0,250])
# plt.ylim([0,12000])

show()


1.2.3 Interactive annotation

Sometimes users need to interact with an application, for example by marking points in an image or annotating training data. PyLab provides a simple and easy-to-use function for this, ginput().

from PIL import Image
from pylab import *

im = array(Image.open('E:/python/Python Computer Vision/Image data/empire.jpg'))
imshow(im)

print('Please click 3 points')
x = ginput(3)
print('you clicked:', x)
show()

output:

you clicked: 
[(118.4632306896458, 177.58271393177051), 
(118.4632306896458, 177.58271393177051),
(118.4632306896458, 177.58271393177051)]

The code above reads empire.jpg, displays it, and then calls ginput() for interactive annotation. The number of points to collect is set to 3 here; once the user has clicked them, their coordinates are printed.

1.3 NumPy library

NumPy Online Documentation
NumPy is a popular Python package for scientific computing. It provides many useful objects, such as array objects (for representing vectors, matrices, images, and more) and linear algebra functions.

1.3.1 Image Array Representation

In the previous image examples we used the array() function to convert images to NumPy array objects, without explaining what that means. An array is like a list, except that all elements must have the same type. Unless a type is given on creation, it is determined automatically from the data.
For example:

from PIL import Image
from pylab import *

im = array(Image.open('E:/python/Python Computer Vision/Image data/empire.jpg'))
print (im.shape, im.dtype)
im = array(Image.open('E:/python/Python Computer Vision/Image data/empire.jpg').convert('L'),'f')
print (im.shape, im.dtype)

output:

(800, 569, 3) uint8
(800, 569) float32

Explanation:

In each printed line, the tuple is the shape of the image array (rows, columns, color channels) and the string after it is the data type of the array elements.

  1. uint8: the default type, because images are usually encoded as 8-bit unsigned integers
  2. float32: the second image was converted to grayscale and created with the extra 'f' argument, which makes the array floating point
  • To access a single array element, use index access:
value=im[i,j,k]
  • To access several array elements at once, use array slicing, which returns the elements at the specified index ranges:
im[i,:] = im[j,:]     # set the values of row i to the values of row j
im[:,j] = 100         # set all values in column j to 100
im[:100,:50].sum()    # sum of the values in the first 100 rows and first 50 columns
im[50:100,50:100]     # rows 50-100, columns 50-100 (row 100 and column 100 excluded)
im[i].mean()          # mean of the values in row i
im[:,-1]              # last column
im[-2,:]              # second to last row (can also be written im[-2])
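
The slicing rules are easier to follow on an array small enough to check by hand. A minimal sketch, assuming a made-up 4×5 array in place of an image:

```python
import numpy as np

# A small 4x5 stand-in for a grayscale image, with values 0..19.
im = np.arange(20, dtype=np.uint8).reshape(4, 5)

pixel = im[1, 2]            # single element: row 1, column 2
block = im[:2, :3]          # first 2 rows, first 3 columns
last_col = im[:, -1]        # last column
second_last_row = im[-2]    # same as im[-2, :]

block_sum = int(block.sum())
row_mean = float(im[0].mean())

# Slices are views, not copies: assigning through the original array
# is visible in slices that were taken earlier.
im[:, 0] = 100
print(pixel, block_sum, row_mean, second_last_row[0])
```

Because slices are views, call copy() on a slice when an independent array is needed.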

1.3.2 Gray scale transformation

After reading images into NumPy array objects, we can perform any mathematical operation on them. A simple example is a gray-level transformation of an image: consider any function $f$ that maps the interval 0...255 to itself, meaning that the output has the same range as the input.
For example:

from PIL import Image
from numpy import *
from pylab import *

im=array(Image.open('E:/python/Python Computer Vision/Image data/empire.jpg').convert('L'))
print(int(im.min()),int(im.max()))

im2=255-im               # invert the image
print(int(im2.min()),int(im2.max())) # check the minimum/maximum values

im3=(100.0/255)*im+100   # map the pixel values to the interval 100...200
print(int(im3.min()),int(im3.max()))

im4=255.0*(im/255.0)**2  # square the (normalized) pixel values
print(int(im4.min()),int(im4.max()))

figure()
gray()
subplot(131)
imshow(im2)
axis('off')
title(r'$f(x)=255-x$')

subplot(132)
imshow(im3)
axis('off')
title(r'$f(x)=\frac{100}{255}x+100$')

subplot(133)
imshow(im4)
axis('off')
title(r'$f(x)=255(\frac{x}{255})^2$')

show()

output:

3 255
0 252
101 200
0 255


  • The reverse of the array() conversion is done with PIL's fromarray() function:
pil_im=Image.fromarray(im)
  • If an earlier operation changed the data type from "uint8" to another type, convert it back before creating the PIL image:
pil_im=Image.fromarray(uint8(im))
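
As an illustration of the round trip, the sketch below applies the quadratic transform from this section to a made-up 2×3 array and converts the floating point result back to a PIL image. Rounding before the cast is an extra step assumed here; uint8() on its own truncates the fractional part.

```python
import numpy as np
from PIL import Image

# Hypothetical 2x3 grayscale image.
im = np.array([[0, 64, 128], [192, 255, 32]], dtype=np.uint8)

# The quadratic transform from this section yields a float array.
im4 = 255.0 * (im / 255.0) ** 2

# Round and cast back to uint8 before handing the data to PIL.
pil_im = Image.fromarray(np.uint8(np.round(im4)))
back = np.array(pil_im)
print(pil_im.mode, pil_im.size, back.dtype)
```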

1.3.3 Image scaling

NumPy arrays will be our primary tool for manipulating images and data, but there is no easy way to resize matrices. We can write a simple image resizing function using the PIL image object transformation:

from PIL import Image
from numpy import array, uint8

def imresize(im,sz):
    """ Resize an image array using PIL; sz is the new (width, height). """
    pil_im = Image.fromarray(uint8(im))

    return array(pil_im.resize(sz))

The function defined above can be found in imtools.py.
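
One thing to watch when calling such a function: PIL sizes are (width, height), while NumPy shapes are (rows, columns). A short usage sketch with a made-up array (the function is repeated so that the example runs on its own):

```python
import numpy as np
from PIL import Image

def imresize(im, sz):
    """Resize an image array using PIL; sz is (width, height)."""
    pil_im = Image.fromarray(np.uint8(im))
    return np.array(pil_im.resize(sz))

im = np.zeros((100, 50))          # 100 rows (height), 50 columns (width)
small = imresize(im, (20, 40))    # request width 20, height 40
print(im.shape, small.shape)
```

The returned array therefore has shape (40, 20), the reverse of the size tuple that was passed in.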

1.3.4 Histogram equalization

Histogram equalization flattens the gray-level histogram of an image so that all gray values are as equally likely as possible in the transformed image. It is a good way to normalize image intensity before further processing, and it also increases image contrast.

  • Transformation function: the cumulative distribution function (cdf) of the pixel values in the image, normalized so that it maps the range of pixel values to the target range

The following function is a concrete implementation of histogram equalization:

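from numpy import histogram, interp

def histeq(im,nbr_bins=256):
  """ Histogram equalization of a grayscale image. """

  # Compute the image histogram
  # (density=True replaces normed=True, which recent NumPy versions no longer accept)
  imhist,bins = histogram(im.flatten(),nbr_bins,density=True)
  cdf = imhist.cumsum()      # cumulative distribution function
  cdf = 255 * cdf / cdf[-1]  # normalize
  # Dividing by the last element of the cdf (index -1) scales it to 0...1;
  # the factor 255 then maps it to the range 0...255

  # Use linear interpolation of the cdf to compute the new pixel values
  im2 = interp(im.flatten(),bins[:-1],cdf)

  return im2.reshape(im.shape), cdf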

Explanation:

  1. The function takes two parameters:
  • a grayscale image
  • the number of bins to use in the histogram
  2. The function returns:
  • the equalized image
  • the cumulative distribution function used for the pixel value mapping
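
The effect on contrast can be checked without an image file. The sketch below repeats the function in a self-contained form and applies it to a synthetic low-contrast image; the random test data and the seed are assumptions made only for illustration.

```python
import numpy as np

def histeq(im, nbr_bins=256):
    """Histogram equalization of a grayscale image (same steps as above)."""
    imhist, bins = np.histogram(im.flatten(), nbr_bins, density=True)
    cdf = imhist.cumsum()                  # cumulative distribution function
    cdf = 255 * cdf / cdf[-1]              # normalize to 0...255
    im2 = np.interp(im.flatten(), bins[:-1], cdf)
    return im2.reshape(im.shape), cdf

# Synthetic low-contrast image: all gray values lie between 100 and 150.
rng = np.random.default_rng(0)
low_contrast = rng.integers(100, 151, size=(64, 64)).astype(np.uint8)

equalized, cdf = histeq(low_contrast)
print(low_contrast.min(), low_contrast.max())
print(equalized.min(), equalized.max())
```

After equalization the gray values are spread over most of the 0...255 range instead of a band about 50 levels wide.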

Program implementation:

from PIL import Image
from pylab import *
from PCV.tools import imtools

# Add Chinese font support
from matplotlib.font_manager import FontProperties
font = FontProperties(fname=r"c:\windows\fonts\SimSun.ttc", size=14)

im = array(Image.open('E:/python/Python Computer Vision/Image data/empire.jpg').convert('L'))
# Open the image and convert it to grayscale
#im = array(Image.open('../data/AquaTermi_lowcontrast.JPG').convert('L'))
im2, cdf = imtools.histeq(im)

figure()
subplot(2, 2, 1)
axis('off')
gray()
title(u'原始图像', fontproperties=font)
imshow(im)

subplot(2, 2, 2)
axis('off')
title(u'直方图均衡化后的图像', fontproperties=font)
imshow(im2)

subplot(2, 2, 3)
axis('off')
title(u'原始直方图', fontproperties=font)
# density=True replaces normed=True, which newer Matplotlib versions no longer accept
#hist(im.flatten(), 128, cumulative=True, density=True)
hist(im.flatten(), 128, density=True)

subplot(2, 2, 4)
axis('off')
title(u'均衡化后的直方图', fontproperties=font)
#hist(im2.flatten(), 128, cumulative=True, density=True)
hist(im2.flatten(), 128, density=True)

show()

Result:
[Figure: the original image, the image after histogram equalization, and the histograms before and after equalization]

1.3.5 Image Averaging

Averaging an image is a simple method of image noise reduction and is often used to produce artistic effects. Assuming that all images have the same size, we can average the pixels in the same position of the image. Here is an example that demonstrates the average of images:

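from PIL import Image
from numpy import array

def compute_average(imlist):
  """ Compute the average of a list of images. """

  # Open the first image and store it in a floating point array
  averageim = array(Image.open(imlist[0]), 'f')
  count = 1  # number of images actually added

  for imname in imlist[1:]:
    try:
      averageim += array(Image.open(imname))
      count += 1
    except (IOError, ValueError):
      print(imname + '...skipped')
  averageim /= count  # divide by the number of images used, not len(imlist)

  # Return the average as a uint8 array
  return array(averageim, 'uint8')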

Note: images that cannot be opened are skipped, so in the worst case the result is the average of only one or two images.
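
Why averaging reduces noise can be seen on synthetic data. The sketch below makes several assumptions that are not in the original text: a made-up ramp image, Gaussian noise with standard deviation 20, and 16 observations. With independent noise, the error of the average should shrink by roughly the square root of the number of images.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "clean" image: a horizontal ramp from 50 to 200.
clean = np.tile(np.linspace(50, 200, 64), (64, 1))

# Sixteen noisy observations of the same scene (Gaussian noise, sigma = 20).
noisy = [clean + rng.normal(0, 20, clean.shape) for _ in range(16)]

# Accumulate in floating point, then divide by the number of images.
average = np.zeros_like(clean)
for im in noisy:
    average += im
average /= len(noisy)

err_single = np.std(noisy[0] - clean)
err_average = np.std(average - clean)
print(err_single, err_average)
```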

1.3.6 Perform principal component analysis on images

PCA (Principal Component Analysis) is a very useful dimensionality reduction technique. It is optimal in the sense that it represents as much of the variability of the training data as possible while using as few dimensions as possible. Even a small grayscale image of 100×100 pixels has 10,000 dimensions and can be regarded as a point in a 10,000-dimensional space. A megapixel image has millions of dimensions. Because of this high dimensionality, dimensionality reduction is used in many computer vision applications. The projection matrix produced by PCA can be seen as a change of coordinates to a new coordinate system whose coordinates are arranged in descending order of importance.

To apply PCA to image data, the images must first be converted to one-dimensional vectors. This can be done with NumPy's flatten() method.

Stacking the flattened images gives a matrix in which each row represents one image. Before the principal directions are computed, all rows are centered relative to the mean image. The principal components are usually computed with SVD (Singular Value Decomposition); when the dimensionality is high, however, SVD is very slow, and a compact trick is used instead.

Here is the code for the PCA operation:

from PIL import Image
from numpy import *

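def pca(X):
  """ Principal Component Analysis:
    input: matrix X with the training data, one training sample per row
    return: projection matrix (sorted by importance of the dimensions), variance and mean"""

  # Get the dimensions
  num_data,dim = X.shape

  # Center the data
  mean_X = X.mean(axis=0)
  X = X - mean_X

  if dim>num_data:
    # PCA - compact trick used
    M = dot(X,X.T) # covariance matrix
    e,EV = linalg.eigh(M) # eigenvalues and eigenvectors
    tmp = dot(X.T,EV).T # this is the compact trick
    V = tmp[::-1] # reverse, since the last eigenvectors are the ones we want
    S = sqrt(e)[::-1] # reverse, since the eigenvalues are in increasing order
    for i in range(V.shape[1]):
      V[:,i] /= S
  else:
    # PCA - SVD used
    U,S,V = linalg.svd(X)
    V = V[:num_data] # it only makes sense to return the first num_data rows

  # Return the projection matrix, the variance and the mean
  return V,S,mean_X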

The function first centers the data by subtracting the mean in each dimension. It then computes the eigenvectors corresponding to the largest eigenvalues of the covariance matrix, either with the compact trick or with SVD. Here we use the range() function, which takes an integer n and produces the integers 0...(n-1): as a list in Python 2, and as a lazy range object in Python 3. Alternatives are NumPy's arange(), which returns an array, and, in Python 2 only, xrange(), which returns a generator (possibly faster). We use the range() function throughout this book.

If the number of data points is smaller than the dimension of the vectors, we do not use SVD but instead compute the eigenvectors of the (smaller) covariance matrix $XX^T$. The PCA computation can also be made faster by computing only the eigenvectors corresponding to the k largest eigenvalues (k being the number of desired dimensions); for reasons of space, this is left for interested readers to explore. The rows of the matrix V are orthogonal and contain the coordinate directions in order of decreasing variance of the training data.
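
The compact trick rests on a simple fact: if $u$ is an eigenvector of $XX^T$ with eigenvalue $\lambda$, then $X^T u$ is an eigenvector of $X^T X$ with the same eigenvalue. The sketch below checks this numerically against the SVD; the random 5×40 data matrix is an assumption made for illustration and is not the font data.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 40))          # 5 samples, 40 dimensions
X = X - X.mean(axis=0)                # center the data

# Solve the small 5x5 eigenproblem instead of the 40x40 one.
e, EV = np.linalg.eigh(X @ X.T)       # eigenvalues in ascending order
u = EV[:, -1]                         # eigenvector of the largest eigenvalue
v = X.T @ u                           # map it to a 40-dimensional direction
v /= np.linalg.norm(v)

# Reference: first right-singular vector from the SVD of X.
_, S, Vt = np.linalg.svd(X, full_matrices=False)

cosine = abs(v @ Vt[0])               # 1.0 means same direction up to sign
print(cosine, e[-1], S[0] ** 2)
```

The two directions agree up to sign, and the largest eigenvalue equals the square of the largest singular value.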

We next perform PCA on the font images. The fontimages.zip file contains thumbnail images of the character "a" in different fonts. All 2359 fonts can be downloaded for free. Assuming that the file names of these images are stored in a list imlist, and that the previous code is saved in a file pca.py, the principal components can be computed with the following script:

import pickle
from PIL import Image
from numpy import *
from pylab import *
from PCV.tools import imtools,pca

# Uses sparse pca codepath

# Get the list of images and their size
imlist=imtools.get_imlist('E:/python/Python Computer Vision/Image data/fontimages/a_thumbs')
# open one image to get the size
im=array(Image.open(imlist[0]))
# get the size of the images
m,n=im.shape[:2]
# get the number of images
imnbr=len(imlist)
print("The number of images is %d" % imnbr)

# create matrix to store all flattened images
immatrix = array([array(Image.open(imname)).flatten() for imname in imlist],'f')

# Perform PCA
V,S,immean=pca.pca(immatrix)

# Save the mean and principal components
#f = open('../ch01/font_pca_modes.pkl', 'wb')
#pickle.dump(immean,f)
#pickle.dump(V,f)
#f.close()

# Show the images (mean and 7 first modes)
# This gives figure 1-8 (p15) in the book.

figure()
gray()
subplot(241)
axis('off')
imshow(immean.reshape(m,n))
for i in range(7):
    subplot(2,4,i+2)
    imshow(V[i].reshape(m,n))
    axis('off')

show()

Note that after the images have been flattened to one-dimensional vectors, they must be converted back with the reshape() function. Running the code above reproduces Figure 1-8 (p. 15) of the original book: the mean image followed by the first seven modes.

1.3.7 Pickle module

Python's pickle module is very useful if you want to save some results or data for later use. The pickle module can take almost any Python object and convert it to a string representation, a process called pickling. Reconstructing the object from its string representation is called unpickling. The string representation can easily be stored or transmitted.

Let's look at an example. Assuming you want to save the mean image and principal components of the font images from the previous section, this can be done like this:

# Save the mean and principal components
f = open('font_pca_modes.pkl','wb')
pickle.dump(immean,f)
pickle.dump(V,f)
f.close()

As the example shows, several objects can be pickled to the same file. The pickle module has several different protocols for generating .pkl files; if unsure, it is best to read and write them as binary files. To load the data in another Python session, use the load() method:

# Load the mean and principal components
f = open('font_pca_modes.pkl','rb')
immean = pickle.load(f)
V = pickle.load(f)
f.close()

Note that objects must be loaded in the same order in which they were saved. Python 2 also has an optimized version written in C, the cPickle module, which is fully compatible with the standard pickle module; in Python 3 the C implementation is used automatically by pickle. For more information about the pickle module, see the pickle module documentation page at http://docs.python.org/library/pickle.html.

In the rest of this book we will use the with statement to handle file reading and writing. It is a construct introduced in Python 2.5 that automatically handles opening and closing of files (even if errors occur while the file is open). The following example uses with to implement the save and load operations:

# Open the file and save
with open('font_pca_modes.pkl', 'wb') as f:
  pickle.dump(immean,f)
  pickle.dump(V,f)

and

# Open the file and load
with open('font_pca_modes.pkl', 'rb') as f:
  immean = pickle.load(f)
  V = pickle.load(f)

This may look strange at first, but with is a very useful construct. If you don't like it, you can keep using the open and close functions as before.
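
The ordering rule can be verified with a complete round trip. In this sketch the two objects are small stand-in lists rather than the real mean image and principal components, and the file goes to a temporary directory; both are assumptions made so that the example runs anywhere.

```python
import os
import pickle
import tempfile

immean = [0.5, 0.25, 0.125]           # stand-in for the mean image
V = [[1, 0], [0, 1]]                  # stand-in for the principal components

path = os.path.join(tempfile.mkdtemp(), 'font_pca_modes.pkl')

with open(path, 'wb') as f:
    pickle.dump(immean, f)
    pickle.dump(V, f)

with open(path, 'rb') as f:
    first = pickle.load(f)            # objects come back in the order saved
    second = pickle.load(f)
    try:
        pickle.load(f)                # nothing left in the file
        at_end = False
    except EOFError:
        at_end = True

print(first == immean, second == V, at_end)
```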

As an alternative to pickle, NumPy has simple functions for reading and writing text files. They are useful if the data does not contain complicated structures, for example a list of points clicked in an image. To save an array x to a file, use:

savetxt('test.txt',x,'%i')

The last parameter indicates that integer format should be used. Similarly, reading can be done using:

x = loadtxt('test.txt')

You can learn more from the online documentation.

Finally, NumPy has dedicated functions for saving and loading arrays; see save() and load() in the online documentation.
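
A practical difference between the two pairs of functions is what happens to the data type. A small sketch, assuming a made-up array of clicked points and a temporary directory for the files:

```python
import os
import tempfile
import numpy as np

# Hypothetical clicked points, stored as integers.
x = np.array([[118, 177], [240, 96], [15, 300]])
tmpdir = tempfile.mkdtemp()

# Text format: human readable, but loadtxt() returns floats by default.
txt_path = os.path.join(tmpdir, 'test.txt')
np.savetxt(txt_path, x, '%i')
x_txt = np.loadtxt(txt_path)

# Binary .npy format: shape and dtype are preserved.
npy_path = os.path.join(tmpdir, 'test.npy')
np.save(npy_path, x)
x_npy = np.load(npy_path)

print(x_txt.dtype, x_npy.dtype)
```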

1.4 SciPy

SciPy (http://scipy.org/) is an open source toolkit for numerical operations based on NumPy. SciPy provides many efficient operations for numerical integration, optimization, statistics, signal processing, and most importantly for us, image processing.

1.4.1 Blurred image

Gaussian blurring of images is a very classic example of image convolution. Essentially, image blurring is the convolution of the (grayscale) image $I$ with a Gaussian kernel:

$I_\delta=I* G_\delta$

Here $*$ denotes convolution and $G_\delta$ is a 2D Gaussian kernel with standard deviation $\delta$.

  • Filtering module: scipy.ndimage.filters

This module computes such convolutions using fast 1D separation. It is used as follows:

from PIL import Image
from numpy import *
from pylab import *
from scipy.ndimage import filters

# Add Chinese font support
from matplotlib.font_manager import FontProperties
font=FontProperties(fname=r"c:\windows\fonts\SimSun.ttc",size=14)

im=array(Image.open('E:/python/Python Computer Vision/Image data/empire.jpg').convert('L'))

figure()
gray()
axis('off')
subplot(141)
axis('off')
title(u'原图',fontproperties=font)
imshow(im)

for bi,blur in enumerate([2,5,10]):
    im2=zeros(im.shape)
    im2=filters.gaussian_filter(im,blur)
    im2=uint8(im2)
    imNum=str(blur)
    subplot(1,4,2+bi)
    axis('off')
    title(u'标准差为'+imNum,fontproperties=font)
    imshow(im2)

# For a color image, blur each of the three channels separately
#for bi, blur in enumerate([2, 5, 10]):
#  im2 = zeros(im.shape)
#  for i in range(3):
#    im2[:, :, i] = filters.gaussian_filter(im[:, :, i], blur)
#  im2 = np.uint8(im2)
#  subplot(1, 4,  2 + bi)
#  axis('off')
#  imshow(im2)

show()

In the resulting figure, the first panel is the original image, and the second, third, and fourth panels show it blurred with Gaussian standard deviations of 2, 5, and 10. For more details on this module and its parameters, see the scipy.ndimage documentation.
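
The "1D separation" mentioned above means that a 2D Gaussian blur gives the same result as two 1D blurs, one along each axis. This can be checked numerically; the sketch below uses a random float image as a stand-in and calls the functions through the scipy.ndimage namespace.

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(0)
im = rng.random((40, 60))             # random float image as a stand-in

sigma = 2
blurred_2d = ndimage.gaussian_filter(im, sigma)

# The same blur as two 1D passes: along axis 0 first, then along axis 1.
tmp = ndimage.gaussian_filter1d(im, sigma, axis=0)
blurred_sep = ndimage.gaussian_filter1d(tmp, sigma, axis=1)

max_diff = np.abs(blurred_2d - blurred_sep).max()
print(max_diff)
```

Two 1D passes cost far fewer operations per pixel than one pass with a full 2D kernel, which is where the speed comes from.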

1.4.2 Image Derivatives

How the image intensity changes over the image is important information in many applications. The intensity change is described by the $x$ and $y$ derivatives $I_x$ and $I_y$ of the grayscale image $I$ (for color images, the derivatives are usually computed separately for each color channel).

  • The gradient vector of the image is $\nabla I = [I_x, I_y]^T$; at each pixel it points in the direction in which the image intensity changes the most.
  • The gradient has two important properties:
  1. The gradient magnitude, which describes how strong the intensity change is:
    $|\nabla I| = \sqrt{I_x^2 + I_y^2}$
  2. The gradient angle, which gives the direction of the largest intensity change:
    $\alpha = \mathrm{arctan2}(I_y, I_x)$

NumPy's arctan2() function returns the signed angle in radians, in the interval $[-\pi, \pi]$.

The image derivatives can be computed with discrete approximations, most easily implemented as convolutions:
$I_x = I * D_x$ and $I_y = I * D_y$
For $D_x$ and $D_y$, Prewitt filters or Sobel filters are the usual choices.
These derivative filters are easy to implement with the standard convolution operations of the scipy.ndimage.filters module:

from PIL import Image
from pylab import *
from scipy.ndimage import  filters
import numpy

# Add Chinese font support
from matplotlib.font_manager import FontProperties
font=FontProperties(fname=r"c:\windows\fonts\SimSun.ttc",size=14)

im=array(Image.open('E:/python/Python Computer Vision/Image data/empire.jpg').convert('L'))
gray()

subplot(141)
axis('off')
title(u'(a)原图',fontproperties=font)
imshow(im)

# sobel derivative filters
imx=zeros(im.shape)
filters.sobel(im,1,imx)
subplot(142)
axis('off')
title(u'(b)x方向差分',fontproperties=font)
imshow(imx)

imy=zeros(im.shape)
filters.sobel(im,0,imy)
subplot(143)
axis('off')
title(u'(c)y方向差分',fontproperties=font)
imshow(imy)

mag=255-numpy.sqrt(imx**2+imy**2)
subplot(144)
title(u'(d)梯度幅值',fontproperties=font)
axis('off')
imshow(mag)

show()

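The gradient magnitude and angle are easiest to check on an image whose derivatives are known in advance. The sketch below assumes a synthetic horizontal ramp (intensity equal to the column index) and calls sobel() through the scipy.ndimage namespace; the angle is computed with arctan2 applied to the y derivative first and the x derivative second.

```python
import numpy as np
from scipy import ndimage

# Horizontal ramp: intensity equals the column index, constant along rows.
im = np.tile(np.arange(10, dtype=float), (8, 1))

imx = ndimage.sobel(im, axis=1)       # derivative along x (columns)
imy = ndimage.sobel(im, axis=0)       # derivative along y (rows)

magnitude = np.sqrt(imx ** 2 + imy ** 2)
angle = np.arctan2(imy, imx)          # radians, in [-pi, pi]

print(imx[4, 5], imy[4, 5], magnitude[4, 5], angle[4, 5])
```

Away from the left and right borders the x derivative is constant and the y derivative is zero, so the gradient points along the positive x axis (angle 0). The value 8 rather than 1 reflects the fact that the Sobel kernel is not normalized.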

Gaussian derivative filters:

from PIL import Image
from pylab import *
from scipy.ndimage import filters
import numpy

# Add Chinese font support
#from matplotlib.font_manager import FontProperties
#font = FontProperties(fname=r"c:\windows\fonts\SimSun.ttc", size=14)

def imx(im, sigma):
    imgx = zeros(im.shape)
    filters.gaussian_filter(im, sigma, (0, 1), imgx)
    return imgx


def imy(im, sigma):
    imgy = zeros(im.shape)
    filters.gaussian_filter(im, sigma, (1, 0), imgy)
    return imgy


def mag(im, sigma):
    # there's also gaussian_gradient_magnitude()
    # compute the derivatives from the arguments instead of relying on the
    # global variables imgx and imgy
    imgmag = 255 - numpy.sqrt(imx(im, sigma) ** 2 + imy(im, sigma) ** 2)
    return imgmag


im = array(Image.open('E:/python/Python Computer Vision/Image data/empire.jpg').convert('L'))
figure()
gray()

sigma = [2, 5, 10]

for i in  sigma:
    subplot(3, 4, 4*(sigma.index(i))+1)
    axis('off')
    imshow(im)
    imgx=imx(im, i)
    subplot(3, 4, 4*(sigma.index(i))+2)
    axis('off')
    imshow(imgx)
    imgy=imy(im, i)
    subplot(3, 4, 4*(sigma.index(i))+3)
    axis('off')
    imshow(imgy)
    imgmag=mag(im, i)
    subplot(3, 4, 4*(sigma.index(i))+4)
    axis('off')
    imshow(imgmag)

show()


1.4.3 Morphology: object counting

Morphology (or mathematical morphology) is the basic framework and collection of image processing methods for measuring and analyzing basic shapes. Morphology is typically used for binary images, but can also be used for grayscale images. A binary image means that each pixel of the image can only take two values, usually 0 and 1. Binary images are usually the result of thresholding an image when counting objects, or measuring their size. An overview of morphology and how it handles images can be read at http://en.wikipedia.org/wiki/Mathematical_morphology.

The morphology module in scipy.ndimage implements morphological operations.
The measurements module in scipy.ndimage implements counting and measurement functions for binary images.

Here is a simple example of how to use them:

from PIL import Image
from numpy import array, ones
from scipy.ndimage import measurements,morphology

# Load the image and threshold it to make sure it is binary
im = array(Image.open('houses.png').convert('L'))
im = 1*(im<128)

labels, nbr_objects = measurements.label(im)
print("Number of objects:", nbr_objects)

  1. The script above first loads the image and thresholds it to make sure it is binary. Multiplying by 1 converts the boolean array to a binary one.
  2. The label() function then finds the individual objects and assigns integer labels to the pixels according to which object they belong to.
  3. Figure 1-12b shows the labels array as an image. The gray value indicates the object label. As you can see, some of the objects are joined by small connections. Using binary opening, we can remove them:

# Morphological opening to separate the objects better
im_open = morphology.binary_opening(im,ones((9,5)),iterations=2)

labels_open, nbr_objects_open = measurements.label(im_open)
print("Number of objects:", nbr_objects_open)

  • The second argument of binary_opening() specifies the structuring element, an array that indicates which neighbors to use when centered on a pixel.

  • In this case we use 9 pixels in the y direction (4 above, the pixel itself, 4 below) and 5 in the x direction. Any array can be given as the structuring element; its non-zero elements determine which neighbors are used.

  • The iterations parameter determines how many times the operation is applied. Try different values and see how the number of objects changes.

  • The image after opening and the corresponding label image are shown in Figure 1-12c and Figure 1-12d.

  • The binary_closing() function does the reverse.

  • We leave that function and the others in the morphology and measurements modules as an exercise. You can learn more about them from the scipy.ndimage module documentation.

from PIL import Image
from numpy import *
from scipy.ndimage import measurements, morphology
from pylab import *

"""   This is the morphology counting objects example in Section 1.4.  """

# Add Chinese font support (needed for the Chinese plot titles)
from matplotlib.font_manager import FontProperties
font = FontProperties(fname=r"c:\windows\fonts\SimSun.ttc", size=14)

# load image and threshold to make sure it is binary
figure()
gray()
im = array(Image.open('E:/python/Python Computer Vision/Image data/houses.png').convert('L'))
subplot(221)
imshow(im)
axis('off')
title(u'原图', fontproperties=font)  # "Original image"
im = (im < 128)

labels, nbr_objects = measurements.label(im)
print ("Number of objects:", nbr_objects)
subplot(222)
imshow(labels)
axis('off')
title(u'标记后的图', fontproperties=font)  # "Labeled image"

# morphology - opening to separate objects better
im_open = morphology.binary_opening(im, ones((9, 5)), iterations=2)
subplot(223)
imshow(im_open)
axis('off')
title(u'开运算后的图像', fontproperties=font)  # "Image after opening"

labels_open, nbr_objects_open = measurements.label(im_open)
print ("Number of objects:", nbr_objects_open)
subplot(224)
imshow(labels_open)
axis('off')
title(u'开运算后进行标记后的图像', fontproperties=font)  # "Labeled image after opening"

show()

output:

Number of objects: 45
Number of objects: 48


1.4.4 Useful SciPy modules

SciPy contains some useful modules for input and output. Two of them, io and misc, are described below.

1. Read and write .mat files

If you have some data, or downloaded some interesting data sets from the Internet, these data are stored in Matlab's .mat file format, then you can use the scipy.io module to read them.

import scipy.io
data = scipy.io.loadmat('test.mat')

In the code above, the data object contains a dictionary whose keys correspond to the variable names stored in the original .mat file. Since these variables are in array format, they can be conveniently saved to a .mat file. You just create a dictionary with all the variables you want to save, and use the savemat() function:

data = {}
data['x'] = x
scipy.io.savemat('test.mat',data)

Because the script above saves the array x, the variable is still named x when the file is read into Matlab. For more information about the scipy.io module, see the online documentation.
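
As a quick check of the round trip, the sketch below saves an array with savemat() and reads it back with loadmat(). The array contents and the temporary file location are assumptions made for illustration.

```python
import os
import tempfile

import numpy as np
import scipy.io

# Example array; any NumPy array could be used here
x = np.arange(6, dtype=float).reshape(2, 3)

# Write to a temporary directory so no existing file is overwritten
path = os.path.join(tempfile.mkdtemp(), 'test.mat')
scipy.io.savemat(path, {'x': x})

# loadmat() returns a dictionary keyed by variable name
# (it also contains metadata keys such as '__header__')
data = scipy.io.loadmat(path)
x_loaded = data['x']

print(x_loaded.shape, np.array_equal(x, x_loaded))
```

A 2-D array like this one comes back with the same shape and values, so the dictionary key is all that is needed to recover it.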

2. Save the array as an image

Because we manipulate images as array objects, it is very useful to be able to save an array directly as an image file. Many of the images in this book were created this way.

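The imsave() function is loaded from the scipy.misc module. To save an array im to a file, use the following command (note that imsave() was removed from scipy.misc in SciPy 1.2, so this import only works with older SciPy versions):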

from scipy.misc import imsave
imsave('test.jpg',im)
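
On SciPy versions without imsave(), an array can be written through PIL instead, which this chapter already uses. The following is a minimal sketch under stated assumptions: the helper name save_array_as_image is hypothetical, and the linear rescaling of values to the 0-255 range is a choice made here for illustration.

```python
import numpy as np
from PIL import Image

def save_array_as_image(arr, filename):
    """Rescale a 2-D array to 0..255 and save it as a grayscale image.

    Hypothetical helper for illustration; returns the 8-bit array
    that was written.
    """
    arr = np.asarray(arr, dtype=float)
    lo, hi = arr.min(), arr.max()
    # Avoid division by zero for constant images
    scale = 255.0 / (hi - lo) if hi > lo else 0.0
    arr8 = ((arr - lo) * scale).astype(np.uint8)
    Image.fromarray(arr8).save(filename)
    return arr8
```

For example, save_array_as_image(im, 'test.png') writes the array im to a PNG file; the file format is chosen by PIL from the extension.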

The scipy.misc module also contains the famous Lena test image:

lena = scipy.misc.lena()

This call returns the grayscale image as a 512x512 array. (lena() is no longer included in recent SciPy releases.)

All Pylab plots can be saved in various image formats by clicking the "Save" button in the image window.

1.5 Advanced Example: Image Denoising

We end this chapter with a very practical example: image denoising. Image denoising is a processing technique that removes image noise while preserving image details and structures as much as possible. We use the ROF (Rudin-Osher-Fatemi) denoising model here, which was first introduced in [28]. Image denoising is important for many applications, from making your vacation photos look better to improving the quality of satellite imagery. The ROF model has the nice property of making the processed image smoother while maintaining image edges and structural information.

The mathematical foundations and processing techniques of the ROF model are too advanced to be covered in this book. Before describing how to implement the ROF solver based on the algorithm proposed by Chambolle [5], this book first briefly introduces the ROF model.
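
Informally, the ROF model looks for an image U that stays close to the noisy input while having a small total variation (the sum of its gradient magnitudes). The sketch below is a minimal NumPy version of a Chambolle-style iteration, written here for illustration only. It is not the code of PCV's rof.denoise: the function name, the default parameter values, the zero initialization of the dual field, the periodic boundary handling via roll(), and the max_iter cap are all assumptions.

```python
import numpy as np

def rof_denoise_sketch(im, tolerance=0.1, tau=0.125, tv_weight=100, max_iter=500):
    """Illustrative ROF denoising with a Chambolle-style dual iteration.

    Returns the denoised image U and the residual im - U.
    """
    im = np.asarray(im, dtype=float)
    m, n = im.shape
    U = im.copy()
    Px = np.zeros_like(im)  # x-component of the dual field
    Py = np.zeros_like(im)  # y-component of the dual field

    for _ in range(max_iter):
        U_old = U

        # Forward differences of the current estimate (periodic boundaries)
        grad_x = np.roll(U, -1, axis=1) - U
        grad_y = np.roll(U, -1, axis=0) - U

        # Gradient step on the dual field, then projection onto |P| <= 1
        Px_new = Px + (tau / tv_weight) * grad_x
        Py_new = Py + (tau / tv_weight) * grad_y
        norm = np.maximum(1.0, np.sqrt(Px_new**2 + Py_new**2))
        Px = Px_new / norm
        Py = Py_new / norm

        # Divergence of the dual field via backward differences
        div_p = (Px - np.roll(Px, 1, axis=1)) + (Py - np.roll(Py, 1, axis=0))

        # Update the primal variable and measure the change
        U = im + tv_weight * div_p
        error = np.linalg.norm(U - U_old) / np.sqrt(m * n)
        if error < tolerance:
            break

    return U, im - U
```

A call such as U, T = rof_denoise_sketch(im) mirrors the return values used in the examples in this section; a larger tv_weight gives a smoother result.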

Denoising a synthetic example:

from pylab import *
from numpy import *
from numpy import random
from scipy.ndimage import filters
#from scipy.misc import imsave  # removed in SciPy 1.2; only needed by the commented-out imsave() calls below
from PCV.tools import rof

""" This is the de-noising example using ROF in Section 1.5. """

# Add Chinese font support (needed for the Chinese plot titles)
from matplotlib.font_manager import FontProperties
font = FontProperties(fname=r"c:\windows\fonts\SimSun.ttc", size=14)

# create synthetic image with noise
im = zeros((500,500))
im[100:400,100:400] = 128
im[200:300,200:300] = 255
im = im + 30*random.standard_normal((500,500))

U,T = rof.denoise(im,im)
G = filters.gaussian_filter(im,10)


# save the result
#imsave('synth_original.pdf',im)
#imsave('synth_rof.pdf',U)
#imsave('synth_gaussian.pdf',G)


# plot
figure()
gray()

subplot(1,3,1)
imshow(im)
#axis('equal')
axis('off')
title(u'原噪声图像', fontproperties=font)  # "Original noisy image"

subplot(1,3,2)
imshow(G)
#axis('equal')
axis('off')
title(u'高斯模糊后的图像', fontproperties=font)  # "Image after Gaussian blur"

subplot(1,3,3)
imshow(U)
#axis('equal')
axis('off')
title(u'ROF降噪后的图像', fontproperties=font)  # "Image after ROF denoising"

show()


The left panel shows the noisy input image, the middle panel shows the result of Gaussian blurring with a standard deviation of 10, and the right panel shows the image after ROF denoising. The noisy image above is synthetic; now we test on a real image:

from PIL import Image
from pylab import *
from numpy import *
from numpy import random
from scipy.ndimage import filters
#from scipy.misc import imsave  # removed in SciPy 1.2; only needed by the commented-out imsave() calls below
from PCV.tools import rof

""" This is the de-noising example using ROF in Section 1.5. """

# Add Chinese font support (needed for the Chinese plot titles)
from matplotlib.font_manager import FontProperties
font = FontProperties(fname=r"c:\windows\fonts\SimSun.ttc", size=14)

im = array(Image.open('E:/python/Python Computer Vision/Image data/empire.jpg').convert('L'))

U,T = rof.denoise(im,im)
G = filters.gaussian_filter(im,10)


# save the result
#imsave('synth_original.pdf',im)
#imsave('synth_rof.pdf',U)
#imsave('synth_gaussian.pdf',G)


# plot
figure()
gray()

subplot(1,3,1)
imshow(im)
#axis('equal')
axis('off')
title(u'原噪声图像', fontproperties=font)  # "Original noisy image"

subplot(1,3,2)
imshow(G)
#axis('equal')
axis('off')
title(u'高斯模糊后的图像', fontproperties=font)  # "Image after Gaussian blur"

subplot(1,3,3)
imshow(U)
#axis('equal')
axis('off')
title(u'ROF降噪后的图像', fontproperties=font)  # "Image after ROF denoising"

show()

ROF denoising preserves edges and image structure.
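
For comparison, a small experiment shows how strongly Gaussian blurring flattens an edge. This sketch is an illustrative addition; the step image and its size are assumptions. It measures the largest jump between horizontally adjacent pixels before and after a Gaussian blur with standard deviation 10.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

# Synthetic step edge: left half 0, right half 255
step = np.zeros((50, 200))
step[:, 100:] = 255

# Same standard deviation as in the examples above
blurred = gaussian_filter(step, 10)

# Largest jump between horizontally adjacent pixels
max_jump_original = np.abs(np.diff(step, axis=1)).max()
max_jump_blurred = np.abs(np.diff(blurred, axis=1)).max()

print("original edge jump:", max_jump_original)
print("blurred edge jump: ", max_jump_blurred)
```

The sharp jump of 255 in the original is spread over many pixels after blurring, so the largest remaining jump is only a small fraction of it. This is the loss of edge sharpness that an edge-preserving method such as ROF is meant to avoid.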

1.6 Installation of PCV package

  • Download the PCV library from: https://github.com/jesolem/PCV

  • Extract the downloaded file to: C:\Users\Administrator\Desktop\PCV

  • Open cmd and execute the following command:

    (1)cd C:\Users\Administrator\Desktop\PCV

    (2)python setup.py install

  • In PyCharm, run import PCV to test whether the installation succeeded.

If error 1 is reported, NameError: name 'file' is not defined, change fp = file(filename, 'wb') to fp = open(filename, 'wb').

If error 2 is reported, TypeError: write() argument must be str, not bytes, the file was opened in the wrong mode. Modify the earlier open statement so that it opens the file in binary mode.

filelist = get_imlist('E:/python/Python Computer Vision/test jpg/') # Get the image file names (including extensions) in the convert_images_format_test folder
imlist = open('E:/python/Python Computer Vision/test jpg/imlist.txt','wb+')
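
The difference between the two modes can be reproduced in isolation. The sketch below assumes that the data is written with pickle, as in the earlier format-conversion script; the sample file list and the temporary file path are hypothetical.

```python
import os
import pickle
import tempfile

filelist = ['a.jpg', 'b.jpg']  # hypothetical sample data
path = os.path.join(tempfile.mkdtemp(), 'imlist.pkl')

# Text mode: pickle produces bytes, so the write fails in Python 3
try:
    with open(path, 'w') as fp:
        pickle.dump(filelist, fp)
    text_mode_error = None
except TypeError as err:
    text_mode_error = str(err)

# Binary mode: the same call succeeds
with open(path, 'wb') as fp:
    pickle.dump(filelist, fp)
with open(path, 'rb') as fp:
    restored = pickle.load(fp)

print("text mode error:", text_mode_error)
print("restored:", restored)
```

Using with blocks also closes the file automatically, which the plain open() call above leaves to the caller.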


Origin blog.csdn.net/jiaoyangwm/article/details/79293272