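KNN (K-Nearest Neighbors) is one of the most basic and simplest machine learning algorithms. It can be used for both classification and regression. KNN makes its predictions by measuring the distance between feature vectors.
The idea is simple: every n-dimensional input vector corresponds to a point in feature space, and the output is the class label or predicted value for that vector. To predict for a new point, KNN finds the k training samples closest to it and returns the majority label among them (classification) or the average of their values (regression).
KNN is unusual among machine learning algorithms because it has no learning phase in the usual sense. It uses the training data to partition the feature space, and that partition is itself the final model. It requires a sample data set, also called the training set, in which every sample carries a label, so the class of each training sample is known.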
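The prediction rule described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the OpenCV implementation used later: the function name `knn_predict`, the toy points, and the choice of Euclidean distance are all assumptions made for the example.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Pair every training label with its Euclidean distance to the query point
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest training samples
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the labels of those neighbors
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy training set (hypothetical data): two well-separated groups in 2-D
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
                (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_labels = ["A", "A", "A", "B", "B", "B"]

prediction_a = knn_predict(train_points, train_labels, (1.1, 1.0))
prediction_b = knn_predict(train_points, train_labels, (4.9, 5.0))
print(prediction_a, prediction_b)
```

Note that there is no training step: the "model" is just the stored points and labels. For regression, the vote would be replaced by the mean of the neighbors' values.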
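import cv2
import matplotlib.pyplot as plt
import numpy as np

# Load the training image (OpenCV stores it in BGR channel order)
img = cv2.imread("char.png")
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))

# Convert to grayscale and smooth out noise before thresholding
grayImg = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
plt.imshow(grayImg, cmap="gray")
blurImg = cv2.GaussianBlur(grayImg, (3, 3), 0)
plt.imshow(blurImg, cmap="gray")

# Otsu chooses the threshold automatically; with THRESH_BINARY_INV,
# pixels darker than the threshold become white (255)
thresh, binImg = cv2.threshold(blurImg, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)
print(thresh)
plt.imshow(binImg, cmap="gray")

# Find the outer contour of each character (two return values, as in OpenCV 4.x)
contours, hierarchy = cv2.findContours(binImg, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cv2.drawContours(img, contours, -1, (0, 0, 255), 1)
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))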
# Class labels 0..35, one per character class
ValidChars = list(range(36))
img_width = 20
img_height = 30

# One flattened img_width x img_height sample per row
train_data = np.empty((0, img_width * img_height), dtype=np.float32)

# Start with the labels 0..35, then double the array five times:
# 32 repetitions of the sequence, 1152 labels in total.
# The number and order of labels must match the characters extracted from the image.
train_label = np.array(ValidChars, dtype=np.int32)
for i in range(5):
    train_label = np.append(train_label, train_label)
print("label", train_label)
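samples = []
for contour in contours:
    area = cv2.contourArea(contour)
    # Skip tiny contours, which are most likely noise
    if area > 10:
        print("area:", area)
        x, y, w, h = cv2.boundingRect(contour)
        # Mark the character on the color image for visual inspection
        cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)
        # Crop from the single-channel binary image, so that each sample has
        # exactly img_width * img_height values (the color image has three channels)
        roi = binImg[y:y + h, x:x + w]
        resize_roi = cv2.resize(roi, (img_width, img_height))
        roi_float = resize_roi.astype(np.float32)
        samples.append(roi_float.reshape(-1))

# Stack the samples into a matrix with one row per character
train_data = np.array(samples, dtype=np.float32).reshape(-1, img_width * img_height)
print("train_data", train_data.shape)
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))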
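One caveat about the extraction step: `cv2.findContours` does not promise to return contours in reading order, so the i-th sample may not correspond to the i-th label. A possible remedy, sketched here under the assumption that the characters sit in roughly horizontal rows, is to sort the bounding boxes before cropping. The helper `sort_reading_order` and its `row_tolerance` parameter are hypothetical names, not part of OpenCV.

```python
def sort_reading_order(boxes, row_tolerance=10):
    """Sort (x, y, w, h) boxes top-to-bottom, then left-to-right within a row."""
    rows = []
    # Visit boxes from top to bottom by their top edge
    for box in sorted(boxes, key=lambda b: b[1]):
        # A box joins the current row if its top edge is close to
        # the top edge of the first box in that row
        if rows and abs(box[1] - rows[-1][0][1]) <= row_tolerance:
            rows[-1].append(box)
        else:
            rows.append([box])
    # Within each row, order the boxes by their left edge
    return [box for row in rows for box in sorted(row, key=lambda b: b[0])]


# Made-up boxes as cv2.boundingRect would return them, in scrambled order
boxes = [(60, 52, 18, 28), (10, 50, 18, 28), (35, 8, 18, 28),
         (10, 10, 18, 28), (35, 51, 18, 28)]
ordered = sort_reading_order(boxes)
print(ordered)
```

In the loop above, this would mean collecting the bounding boxes first, sorting them, and only then cropping the samples, so that rows of `train_data` and entries of `train_label` line up.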
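knn_model = cv2.ml.KNearest_create()
# Set the training parameter: the number of neighbors to consult
knn_model.setDefaultK(3)
# Prepare the labels for the classifier
train_label = train_label.astype(np.int32)
# train_label = train_label.reshape(-1, 1)
print(train_data)
# There must be exactly one label per training sample
assert train_data.shape[0] == train_label.shape[0], "sample/label count mismatch"
# Each row of train_data is one sample, so the layout is ROW_SAMPLE, not COL_SAMPLE
knn_model.train(train_data, cv2.ml.ROW_SAMPLE, train_label)
knn_model.save("knn_model.xml")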