Cat Recognition Using Deep Learning

0. Preface

This is a small assignment for an artificial intelligence course: use deep learning to identify cats in photos. The requirement is to use only a single-layer neural network.

1. Principle

1.1 Principle

In this experiment, an input image is flattened into a vector and fed into a single-layer fully connected neural network for training. After training, the network outputs its prediction for the image.
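The flattening step can be illustrated without any framework. The sketch below is only an illustration: `flattened_size` and `flatten` are hypothetical helpers, and the 3 x 224 x 224 shape is assumed from the 224-pixel crops and the 150528-wide first layer used in the code of section 2.

```python
def flattened_size(shape):
    """Number of values in a tensor of the given shape once it is flattened."""
    size = 1
    for dim in shape:
        size *= dim
    return size


def flatten(nested):
    """Recursively flatten a nested list (a stand-in for an image tensor)."""
    if not isinstance(nested, list):
        return [nested]
    flat = []
    for item in nested:
        flat.extend(flatten(item))
    return flat


# A 224 x 224 RGB image has 3 channels, so the flattened vector has
# 3 * 224 * 224 entries; this is the input width of the first layer.
image_shape = (3, 224, 224)
n_inputs = flattened_size(image_shape)

# Tiny 2-channel, 2 x 2 "image" to show the element order after flattening.
tiny = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
tiny_flat = flatten(tiny)
```

Flattening discards the spatial layout of the pixels, which is the price of feeding an image to a fully connected layer.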

1.2 Linear model

The linear model is one of the most common types of model, and for many tasks it is an effective baseline.

Using the most common error measure for a linear model, we repeatedly adjust w to make the loss smaller, and stop once the error falls below our threshold.
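The formulas for this section were images and are not preserved, so the following is a sketch under assumptions: the model is taken to be the usual y = w * x + b, and the "most common error expression" is taken to be mean squared error. The names `predict` and `mse_loss` are illustrative.

```python
def predict(w, b, x):
    """Linear model (assumed form): y_hat = w * x + b."""
    return w * x + b


def mse_loss(w, b, xs, ys):
    """Mean squared error between predictions and targets (assumed loss)."""
    errors = [(predict(w, b, x) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


# Toy data generated by y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

loss_bad = mse_loss(1.0, 0.0, xs, ys)   # w = 1 is too small, loss is large
loss_good = mse_loss(2.0, 0.0, xs, ys)  # w = 2 fits the data exactly
```

Moving w from 1 toward 2 lowers the loss, which is the adjustment the paragraph above describes.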

1.3 Gradient Descent
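
The adjustment of w described in section 1.2 is usually carried out by gradient descent: compute the gradient of the loss with respect to w, step against it, and stop once the loss is below the threshold. A minimal sketch, assuming a mean squared error loss and a model without bias (y = w * x); the function name, learning rate, and threshold are illustrative choices, not values from the original experiment.

```python
def gradient_descent(xs, ys, lr=0.05, threshold=1e-6, max_steps=10000):
    """Fit y = w * x by gradient descent on the mean squared error."""
    w = 0.0
    n = len(xs)
    loss = float("inf")
    for _ in range(max_steps):
        loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / n
        if loss < threshold:
            break  # error is below the threshold, stop adjusting w
        # d(loss)/dw for the mean squared error
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= lr * grad  # step against the gradient
    return w, loss


# Toy data generated by y = 2x, so the fitted w should approach 2.
w_fit, final_loss = gradient_descent([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

The `optim.SGD` optimizer in section 2 applies the same kind of update to every parameter of the network, with momentum added.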

1.4 Activation function

The activation function used in this experiment is the sigmoid function. An activation function gives the network its ability to model nonlinear relationships. Without one, the network can only express a linear map, and no matter how many hidden layers are added, the whole network is equivalent to a single-layer neural network. Only once an activation function is added can a deep neural network learn hierarchical nonlinear mappings. Below is the formula for the activation function.
$$\sigma(x) = \frac{1}{1 + e^{-x}}$$
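
The claim that stacked linear layers collapse into one can be checked numerically. This is a small sketch with scalar "layers" and arbitrarily chosen weights (w1, b1, w2, b2 are illustrative values, not taken from the experiment).

```python
import math


def sigmoid(x):
    """Sigmoid activation: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def linear(w, b, x):
    return w * x + b


# Arbitrary illustrative weights for two scalar layers.
w1, b1 = 2.0, 1.0
w2, b2 = -3.0, 0.5


def two_linear_layers(x):
    """Two stacked linear layers with no activation in between."""
    return linear(w2, b2, linear(w1, b1, x))


def collapsed_layer(x):
    """The single linear layer that the stack above is equivalent to."""
    return linear(w2 * w1, w2 * b1 + b2, x)


def two_layers_with_sigmoid(x):
    """Same two layers, but with a sigmoid between them."""
    return linear(w2, b2, sigmoid(linear(w1, b1, x)))
```

For any x, `two_linear_layers(x)` and `collapsed_layer(x)` agree, while `two_layers_with_sigmoid` is no longer a straight line in x.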

1.5 Operating environment

The experiments use PyTorch, running on Python 3.7 with CUDA.

python 3.7
cuda

1.6 Dataset

2. Code

import os

import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import transforms, datasets

data_transform = {
    # Training: random crop and horizontal flip for augmentation, then normalize.
    "train": transforms.Compose([transforms.RandomResizedCrop(224),
                                 transforms.RandomHorizontalFlip(),
                                 transforms.ToTensor(),
                                 transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])]),

    # Validation: deterministic resize and center crop, same normalization.
    "val": transforms.Compose([transforms.Resize(256),
                               transforms.CenterCrop(224),
                               transforms.ToTensor(),
                               transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])}

data_root = os.path.abspath(os.path.join(os.getcwd(), "./"))  # get data root path
image_path = os.path.join(data_root, "data", "cats_and_dogs_v2")  # cats-and-dogs data set path
# assert raises an exception when its condition is False, so a missing path
# fails here immediately instead of crashing later in the run.
assert os.path.exists(image_path), "{} path does not exist.".format(image_path)
train_dataset = datasets.ImageFolder(root=os.path.join(image_path, "train"),
                                     transform=data_transform["train"])
train_num = len(train_dataset)
print(train_num)

train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=1, shuffle=True, num_workers=0)



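class Model(torch.nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        # 150528 = 3 * 224 * 224, the length of a flattened 224 x 224 RGB image
        self.lin1 = torch.nn.Linear(150528, 800)
        self.lin2 = torch.nn.Linear(800, 400)
        self.lin3 = torch.nn.Linear(400, 2)  # one output score per class
        self.sigmoid = torch.nn.Sigmoid()

    def forward(self, x):
        x = self.sigmoid(self.lin1(x))
        x = self.sigmoid(self.lin2(x))
        # Return raw logits: CrossEntropyLoss applies log-softmax itself,
        # so a sigmoid here would squash the scores before the loss sees them.
        return self.lin3(x)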


# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = Model().to(device)

criterion = torch.nn.CrossEntropyLoss()  # loss function: cross-entropy
# SGD stands for stochastic gradient descent
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5)


for epoch in range(100):
    correct = 0
    for i, data in enumerate(train_loader, 0):  # train_loader shuffles first, then yields mini-batches
        inputs, labels = data  # inputs is the image tensor, labels holds the class labels
        inputs = inputs.to(device)
        labels = labels.to(device)
        inputs = inputs.reshape(inputs.size(0), -1)  # flatten each image to 150528 values
        y_pred = model(inputs)
        correct += (y_pred.argmax(dim=1) == labels).sum().item()
        loss = criterion(y_pred, labels)

        optimizer.zero_grad()  # clear gradients left over from the previous step
        loss.backward()  # backpropagate the loss
        optimizer.step()  # update the parameters
    print('Epoch %d accuracy: %d %%' % (epoch, 100 * correct / train_num))
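
The "val" transform is defined above, but the code never runs a validation pass, so the printed accuracy is measured on the training images only. Below is a minimal sketch of how held-out accuracy could be measured. `evaluate` is a hypothetical helper, not part of the original code, and it assumes a loader built like train_loader from a separate validation folder (whether such a folder exists in this dataset is not stated). The demo at the end uses a stand-in model and an in-memory batch so that it runs without the dataset.

```python
import torch


def evaluate(model, loader, device):
    """Return the fraction of correctly classified samples in `loader`."""
    model.eval()  # switch layers such as dropout to inference behaviour
    correct, total = 0, 0
    with torch.no_grad():  # no gradients are needed for evaluation
        for inputs, labels in loader:
            inputs = inputs.to(device)
            labels = labels.to(device)
            inputs = inputs.reshape(inputs.size(0), -1)  # flatten each image
            preds = model(inputs).argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.size(0)
    return correct / total


# Stand-in model and in-memory batch (illustrative only): three 1 x 2 x 2
# "images" with labels, in place of a real validation DataLoader.
demo_model = torch.nn.Linear(4, 2)
demo_loader = [(torch.zeros(3, 1, 2, 2), torch.tensor([1, 1, 0]))]
demo_accuracy = evaluate(demo_model, demo_loader, torch.device("cpu"))
```

Calling `model.train()` again before the next training epoch restores training behaviour.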



Origin blog.csdn.net/qq_43471945/article/details/128265053