Machine Learning Handwriting Recognition Project (KNN)

Supervised learning

  • Classification: assigning each instance to the appropriate category
  • Regression: predicting numerical values (for example, fitting the best curve through a given set of data points)
    Explanation: these algorithms are called supervised because they must be told what to predict, that is, the training data includes the known value (class label) of the target variable.

Unsupervised learning

  • Clustering: dividing a data set into multiple groups of similar objects
  • Density estimation: finding statistical values that describe the data

Handwritten digit recognition is a pattern recognition problem: a classification algorithm is trained on labeled samples and then used to predict the class of unknown samples. The flow chart is as follows:
[Figure: project flow chart]
Because the project has to collect its own data first, the overall work can be divided into the following stages:
1. Data collection
2. Feature extraction
3. Model training
4. Model validation, analysis, and tuning
5. Applying the classification model

Data collection

1. Data acquisition tool: drawing software is used to simulate a handwriting input device.
2. Collection process: the canvas is 40*40 pixels; digits are drawn with the brush tool at a thickness of 4 px, with color 1 set to black and color 2 set to white. As shown below:

[Figure: drawing-software settings used for collection]

3. Picture naming rule: each picture is saved as "number_serial number.bmp". The final set of collected and named pictures is shown in the following figure:
[Figure: collected pictures and their file names]

4. Amount of collected data: 10*30 = 300 pictures, that is, 30 pictures for each of the 10 digits.
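
The naming rule above means that each file name carries its own label. Here is a minimal sketch of how that label could be recovered, assuming names such as `3_12.bmp`. The helper name `label_from_filename` is hypothetical and not part of the project code, which simply reads the first character of the file name.

```python
import os


def label_from_filename(file_name):
    """Return the digit label encoded in a "number_serial number.bmp" name."""
    # Drop any directory part and the extension: "data/3_12.bmp" -> "3_12"
    stem = os.path.splitext(os.path.basename(file_name))[0]
    # The label is everything before the first underscore
    digit_text = stem.split("_", 1)[0]
    label = int(digit_text)
    if not 0 <= label <= 9:
        raise ValueError("expected a digit 0-9, got %r" % digit_text)
    return label


print(label_from_filename("3_12.bmp"))  # 3
```

Splitting on the underscore rather than taking the first character keeps the helper from silently accepting a name that does not follow the rule.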

Feature extraction

Because the digit images in this project are simple, the pixel values of the grayscale image can be used directly as the feature values. The original image is 40*40 pixels, which would give 1600 feature values per sample, a relatively large number. Since the stroke of each digit occupies only a limited region of the canvas, the image can be cropped to that region and scaled to 8*8, so each sample has 64 feature values.
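
To illustrate the reduction from 1600 to 64 feature values, here is a minimal NumPy sketch. It shrinks a 40*40 array to 8*8 by averaging each 5*5 block, which is an assumption made to keep the example free of extra dependencies; the project code resizes with `cv2.resize`, which generally produces different pixel values. The function name `to_feature_vector` is hypothetical.

```python
import numpy as np


def to_feature_vector(gray_img, out_size=8):
    """Shrink a grayscale image by block averaging and flatten it to one row."""
    rows, cols = gray_img.shape
    if rows % out_size or cols % out_size:
        raise ValueError("image size must be a multiple of out_size")
    block_r, block_c = rows // out_size, cols // out_size
    # Split into out_size x out_size blocks and average inside each block
    blocks = gray_img.reshape(out_size, block_r, out_size, block_c)
    small = blocks.mean(axis=(1, 3))
    # One row of out_size * out_size feature values per sample
    return small.reshape(1, out_size * out_size)


# A white 40x40 canvas with a black vertical stroke, 4 px wide
canvas = np.full((40, 40), 255, dtype=np.uint8)
canvas[5:35, 18:22] = 0
feature = to_feature_vector(canvas)
print(canvas.size, "->", feature.shape)  # 1600 -> (1, 64)
```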

Model training

The current project uses the KNN algorithm for classification; other classification algorithms can be substituted. Common algorithms by task type:

Supervised learning
  Classification               Regression
  k-Nearest Neighbors          Linear regression
  Naive Bayes                  Locally weighted linear regression
  Support Vector Machines      Ridge regression
  Decision trees               Lasso least-squares regression coefficient estimation

Unsupervised learning
  Clustering                                   Density estimation
  K-means                                      Expectation-maximization (EM) algorithm
  DBSCAN (density-based clustering algorithm)  Parzen window estimation
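
For intuition about the k-nearest-neighbors classifier used in this project, the following is a minimal from-scratch sketch in NumPy. It is not the project's implementation (the project relies on scikit-learn); the function name `knn_predict` and the toy 2-D data are assumptions for illustration only.

```python
import numpy as np


def knn_predict(X_train, y_train, x, k=3):
    """Predict the label of one sample x by majority vote of its k nearest neighbors."""
    # Euclidean distance from x to every training sample
    distances = np.sqrt(((X_train - x) ** 2).sum(axis=1))
    # Indices of the k closest training samples
    nearest = np.argsort(distances)[:k]
    # Majority vote among their labels (ties go to the smallest label)
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]


# Toy data: two well-separated groups of 2-D points
X_train = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                    [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])

print(knn_predict(X_train, y_train, np.array([0.5, 0.5])))  # 0
print(knn_predict(X_train, y_train, np.array([5.5, 5.5])))  # 1
```

With the project's data, each row of `X_train` would hold the 64 pixel features of one image instead of two coordinates.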

Project risk identification, analysis, and optimization strategy

If classification accuracy is relatively low, consider tuning the following:
1. The extracted feature values are not suitable: the images may not have been preprocessed, in which case a preprocessing step can be added. Other feature extraction algorithms are not considered for now.
2. The number of neighbors K in the KNN algorithm is not appropriate: run multiple tests to find a suitable value of K.
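
The search for K in item 2 can be automated by scoring several candidate values with cross-validation. The sketch below is one possible approach, not the author's procedure: it uses scikit-learn's bundled `load_digits` data set (8*8 digit images, 64 features each) as a stand-in for the project's own data file, and the candidate list and the 5-fold setting are arbitrary assumptions.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Stand-in data: 8x8 digit images flattened to 64 features per sample
X, y = load_digits(return_X_y=True)

candidate_ks = [1, 3, 5, 7, 9]
mean_scores = {}
for k in candidate_ks:
    clf = KNeighborsClassifier(n_neighbors=k)
    # Mean accuracy over 5 cross-validation folds
    scores = cross_val_score(clf, X, y, cv=5)
    mean_scores[k] = scores.mean()
    print("K=%d  mean accuracy=%.3f" % (k, mean_scores[k]))

# Keep the K with the highest mean accuracy
best_k = max(mean_scores, key=mean_scores.get)
print("best K:", best_k)
```

Cross-validation is used here instead of a single train/test split because, with only 300 collected pictures, the score of one random split can vary noticeably from run to run.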

Code

"""
Project: handwritten digit recognition
Author: CBX
Date: 2019.01.15
Note: first draft
"""
import os
import cv2
import numpy as np
from sklearn import neighbors
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


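def my_load_data(file_name, test_size):
    """
    Load the project's own data file and split it.
    :param file_name: text file with 64 feature columns followed by 1 label column
    :param test_size: fraction of the samples held out for testing
    :return: the training and test sets after splitting
    """
    data = np.loadtxt(file_name, ndmin=2)  # read the file once, always as 2-D
    X = data[:, :64]   # columns 0-63: pixel feature values
    Y = data[:, 64]    # column 64: digit label
    X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=test_size)
    return X_train, X_test, Y_train, Y_test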


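def img_prepro(src_img):
    """
    Preprocess one grayscale image: crop it to the drawn digit and resize it.
    :param src_img: original image
    :return: preprocessed image
    """
    image = src_img
    row, col = image.shape
    top = row
    bottom = 0
    left = col
    right = 0
    # Find the bounding box of all black (value 0) pixels
    for i in range(row):
        for j in range(col):
            if image[i, j] == 0:
                if i < top:
                    top = i
                if j < left:
                    left = j
                if i > bottom:
                    bottom = i
                if j > right:
                    right = j
    if bottom < top or right < left:
        # No black pixel was found: keep the whole image
        dst_img = image
    else:
        # Crop the image; slice ends are exclusive, so +1 keeps the last
        # inked row and column
        dst_img = image[top:bottom + 1, left:right + 1]
    # Bring every preprocessed image to the same size
    dst_img = cv2.resize(dst_img, (8, 8))
    return dst_img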


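def prepro(dir_name, pre_dir):
    """
    Loop over every image in a folder, preprocess it, and save the result.
    :param dir_name: folder to traverse
    :param pre_dir: folder where the preprocessed images are saved
    :return: True if preprocessing finished without problems, otherwise False
    """
    # Make sure the output folder exists; cv2.imwrite does not create it
    os.makedirs(pre_dir, exist_ok=True)
    # 1. Get all file names in the given folder
    file_name_list = os.listdir(dir_name)
    # 2. Preprocess each image in turn
    for file_name in file_name_list:
        # 2.1 Read the image as grayscale
        img_path = os.path.join(dir_name, file_name)
        img = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE)
        if img is None:
            # cv2.imread returns None when the file cannot be read as an image
            return False
        # 2.2 Preprocess the image
        dst_img = img_prepro(img)
        # 2.3 Save the preprocessed image
        preimg_path = os.path.join(pre_dir, file_name)
        cv2.imwrite(preimg_path, dst_img)
    return True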


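def get_feature(src_img):
    """
    Extract the feature values of a single image.
    :param src_img: grayscale image (2-D array)
    :return: array of shape (1, rows * cols) holding the pixel values
    """
    row, col = src_img.shape
    feature = np.array(src_img).reshape((1, row*col))
    return feature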


def create_feature_file(dir_path, data_file_name):
    """
    Extract the feature values of every image under dir_path and write a data file.
    :param dir_path: folder name
    :param data_file_name: name of the data file to create
    :return: True if no exception occurred
    """
    # 1. Get all file names in the given folder
    file_name_list = os.listdir(dir_path)
    # 2. Extract features from each image in turn
    features = []
    labels = []
    for file_name in file_name_list:
        # 2.1 Read the image as grayscale
        img_path = os.path.join(dir_path, file_name)
        img = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE)
        # 2.2 The target value is the first character of the file name
        y = int(file_name[0])
        # 2.3 Extract the feature values, shape (1, 64)
        features.append(get_feature(img))
        labels.append(y)
    # Stack the samples: X is (n_samples, 64), Y is (n_samples, 1)
    X = np.vstack(features)
    Y = np.array(labels).reshape((-1, 1))
    # The data file stores the features followed by the label on each row
    my_set = np.append(X, Y, axis=1)
    # Save the data
    np.savetxt(data_file_name, my_set)
    return True


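def my_train_model(X_train, Y_train):
    """
    Train the classification model. A KNN classifier is available below
    (commented out); the SVM classifier is the one currently enabled.
    :param X_train: training features
    :param Y_train: training targets
    :return: the trained model
    """
    # KNN algorithm
    # clf = neighbors.KNeighborsClassifier(n_neighbors=1)
    # SVM algorithm
    clf = SVC(C=200, kernel='linear')
    clf.fit(X_train, Y_train)
    return clf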


def my_test_model(clf, X_test, Y_test):
    # clf.score returns the mean accuracy on the test set
    result = clf.score(X_test, Y_test)
    print("Test accuracy: " + str(result))

def my_app(img_name, clf):
    """
    Apply the model.
    :param img_name: file name of the real sample to recognize (classify)
    :param clf: the trained model to use
    :return: classification result
    """
    # 1. Read the image as grayscale
    img = cv2.imread(img_name, cv2.IMREAD_GRAYSCALE)
    # 2. Preprocess the sample
    dst_img = img_prepro(img)
    # 3. Extract the feature values of the sample
    feature = get_feature(dst_img)
    # 4. Predict with the trained model
    result = clf.predict(feature)
    return result

# Program entry point
if __name__ == "__main__":
    # Preprocess the images in the input folder. The preprocessed images are
    # saved so that the result of preprocessing can be checked.
    # prepro("WeMNTS", "PreMNTS")
    # Extract the feature values and store them in the data file
    # create_feature_file("PreMNTS", "mnts_data.txt")
    # Load the feature values
    X_train, X_test, Y_train, Y_test = my_load_data("mnts_data.txt", 0.2)
    # Train the model
    clf = my_train_model(X_train, Y_train)
    # Test the model
    my_test_model(clf, X_test, Y_test)
    # Apply the model
    # result = my_app("9.bmp", clf)
    # print(result)
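
The nested loops in `img_prepro` visit every pixel in Python, which becomes slow for larger images. As a hedged alternative, the bounding box of the black pixels can be found with NumPy array operations. The function name `crop_to_ink` is hypothetical, the sketch leaves out the final resize to 8*8, and it includes the last inked row and column in the crop.

```python
import numpy as np


def crop_to_ink(gray_img, ink_value=0):
    """Crop a grayscale image to the bounding box of pixels equal to ink_value."""
    ink = gray_img == ink_value
    rows = np.flatnonzero(ink.any(axis=1))  # indices of rows containing ink
    cols = np.flatnonzero(ink.any(axis=0))  # indices of columns containing ink
    if rows.size == 0:
        # Blank image: nothing to crop
        return gray_img
    # Slice ends are exclusive, so add 1 to keep the last inked row/column
    return gray_img[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]


canvas = np.full((40, 40), 255, dtype=np.uint8)
canvas[10:30, 18:22] = 0  # a black vertical stroke
print(crop_to_ink(canvas).shape)  # (20, 4)
```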

Origin blog.csdn.net/cbx0916/article/details/130735536