Intel oneAPI AI Hackathon - Machine Vision Challenge Case

Foreword: The blogger, "Little Mountain Pig", moved into the training business after years of hands-on development work. The nickname comes from Pumbaa ("Peng Peng") in the animated film "The Lion King", who always treats the people and things around him with an optimistic, positive attitude. My technical path has run from Java full-stack engineering to big data development and data mining, and I have now had some success. I would like to share what I have learned along the way, and I hope it helps you in your own studies. I also want to use this effort to build a complete technical library: any exceptions, errors, and caveats related to the technical points of an article will be listed at its end, and you are welcome to contribute material in any form.

  • Please point out any mistakes in the article so that they can be corrected promptly.
  • If you have questions you would like to discuss, please contact me: [email protected].
  • The writing style varies from column to column, and each article is self-contained. Please let me know about any shortcomings.

Keywords in this article: Intel, oneAPI, artificial intelligence, machine vision

1. Activity introduction

A while ago, Intel and CSDN ("Station C") jointly organized the oneAPI artificial intelligence hackathon, in which participants use Intel's official toolkits to solve practical problems, including problems in computer vision. The event is divided into three tracks:

  • Computer Vision Challenge: Detect and Remove Weeds
  • Machine Learning: Predicting Freshwater Quality
  • oneAPI Open Innovation: Use the oneAPI artificial intelligence analysis toolkit to realize any idea

The organizer provided sample source code and open-class video tutorials, and the overall implementation process is described very clearly, which makes it easy to get started quickly. Although the author has long worked in artificial intelligence, he has had little exposure to machine vision. Even so, after consulting the relevant material he was able to implement his idea in a fairly short time, and he feels he has picked up a new skill. A sincere thumbs-up!

2. Environment preparation

1. Hardware requirements

Because the work relies on Intel's toolkits for machine vision, it uses CPU and GPU resources and needs to be developed on an Intel machine. If you do not have a suitable machine, you can use the official cloud environment instead. I tried it as well: you can apply for six months of free access, and the configuration is far from low-end, which is quite generous.

2. Software environment

To get started as quickly as possible, you only need to install a basic Python environment and set up the official case: https://github.com/idz-cn/2023hackathon/tree/main/computer-vision-track
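
Before running the case, it can help to confirm that the packages it relies on are importable. The sketch below is only a convenience check: the module list is my assumption, inferred from the imports that appear in the code of this article, and the official case may require a different set.

```python
import importlib.util

# Module names inferred from the imports used in this article
# (an assumption; the official case may need a different set).
REQUIRED_MODULES = [
    "numpy",
    "cv2",                          # provided by opencv-python
    "matplotlib",
    "sklearn",                      # scikit-learn, for accuracy_score
    "torch",
    "torchvision",
    "intel_extension_for_pytorch",  # imported as ipex
    "neural_compressor",
]


def find_missing(modules):
    """Return the names in `modules` that cannot be located for import."""
    missing = []
    for name in modules:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

Anything reported as missing can then be installed by following the instructions in the official case.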

We can refer to the relevant code and download the required dataset: https://www.mvtec.com/company/research/datasets/mvtec-ad . It contains training pictures for many scenes and can be used directly.

3. Implementation of the plan

The dataset the author used is cable, that is, cable cross-sections. Through training, the model learns which samples are acceptable and which are defective. Some of the code and steps are listed below; for details, please refer to the official case:

1. Dataset preparation

First, we need to prepare the training set (train) and the test set (test), and in each of them separate the acceptable samples (good) from the defective ones (bad). We can pick these pictures from the dataset by hand, select a few at random, or follow the data preparation steps in the case: https://github.com/oneapi-src/visual-quality-inspection#2-data-preparation .
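
The random-selection option can be sketched as follows. This is not the official data preparation script. It assumes the downloaded category folder has the MVTec AD layout (train/good, test/good, and one test/<defect_type> folder per defect), and the function name build_split, the 20% test fraction, and the file-naming scheme are all choices of mine for illustration.

```python
import random
import shutil
from pathlib import Path


def build_split(src_root, dst_root, test_fraction=0.2, seed=0):
    """Copy images into dst_root/{train,test}/{good,bad}.

    Assumes src_root looks like an MVTec AD category folder:
    train/good/*.png, test/good/*.png and test/<defect_type>/*.png.
    Returns the number of files copied per (split, label).
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    good = sorted(src_root.glob("train/good/*.png")) + sorted(src_root.glob("test/good/*.png"))
    bad = sorted(p for p in src_root.glob("test/*/*.png") if p.parent.name != "good")

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    counts = {}
    for label, files in (("good", good), ("bad", bad)):
        files = list(files)
        rng.shuffle(files)
        n_test = int(round(len(files) * test_fraction))
        for split, chunk in (("test", files[:n_test]), ("train", files[n_test:])):
            out_dir = dst_root / split / label
            out_dir.mkdir(parents=True, exist_ok=True)
            for path in chunk:
                # Prefix with the source folders so equal file names cannot collide.
                new_name = f"{path.parent.parent.name}_{path.parent.name}_{path.name}"
                shutil.copy2(path, out_dir / new_name)
            counts[(split, label)] = len(chunk)
    return counts
```

Calling build_split('./cable', './data') would produce the ./data/train and ./data/test folders used by the code in the next step; both paths are placeholders.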

2. Training and Prediction

  • Compare and display two sets of pictures

In the model training phase, we first extract some pictures from the training set for comparison and display:

import os

import cv2
import matplotlib.pyplot as plt
import numpy as np

train_dir = './data/train'  # image folder

# get the list of PNG files from each image class sub-folder
good_imgs = [fn for fn in os.listdir(f'{train_dir}/good') if fn.endswith('.png')]
bad_imgs = [fn for fn in os.listdir(f'{train_dir}/bad') if fn.endswith('.png')]

# randomly select 3 of each
select_norm = np.random.choice(good_imgs, 3, replace=False)
select_pneu = np.random.choice(bad_imgs, 3, replace=False)

# plot a 2 x 3 image matrix: acceptable cables on top, defective ones below
fig = plt.figure(figsize=(10, 10))
for i in range(6):
    if i < 3:
        fp = f'{train_dir}/good/{select_norm[i]}'
        label = 'Acceptable Cable'
    else:
        fp = f'{train_dir}/bad/{select_pneu[i-3]}'
        label = 'Defective Cable'
    ax = fig.add_subplot(2, 3, i + 1)

    # OpenCV loads images as BGR; convert to grayscale before displaying
    fn = cv2.imread(fp)
    fn_gray = cv2.cvtColor(fn, cv2.COLOR_BGR2GRAY)
    plt.imshow(fn_gray, cmap='Greys_r')
    plt.title(label)
    plt.axis('off')
plt.show()

  • Next, read the pictures in as data, that is, as matrices or arrays

# build an n x m matrix: one row per image, one column per pixel
def img2np(path, list_of_filename, size = (64, 64)):
    # NOTE: `size` is not used here; images keep their original resolution
    rows = []
    # iterate through each file
    for fn in list_of_filename:
        fp = path + fn
        current_image = cv2.imread(fp)
        if current_image is None:
            continue  # skip files that OpenCV cannot read
        current_image = cv2.cvtColor(current_image, cv2.COLOR_BGR2GRAY)
        # flatten the image into a vector / 1D array
        rows.append(current_image.ravel())
    # stack the vectors into a single 2D array
    return np.stack(rows)

# run it on our folders
good_images = img2np(f'{train_dir}/good/', good_imgs)
bad_images = img2np(f'{train_dir}/bad/', bad_imgs)

def find_stat_img(full_mat, title, size = (1024, 1024)):
    # calculate the average
    mean_img = np.mean(full_mat, axis = 0)
    mean_img = mean_img.reshape(size)
    var_img = np.var(full_mat, axis = 0)
    var_img = var_img.reshape(size)
    max_img = np.amax(full_mat, axis = 0)
    max_img = max_img.reshape(size)
    min_img = np.amin(full_mat, axis = 0)
    min_img = min_img.reshape(size)
    
    figure, (ax1, ax2, ax3, ax4) = plt.subplots(1, 4, sharey=True, figsize=(15, 15))
    ax1.imshow(var_img, vmin=0, vmax=255, cmap='Greys_r')
    ax1.set_title(f'Variance {title}')
    ax2.imshow(mean_img, vmin=0, vmax=255, cmap='Greys_r')
    ax2.set_title(f'Average {title}')
    ax3.imshow(max_img, vmin=0, vmax=255, cmap='Greys_r')
    ax3.set_title(f'Max {title}')
    ax4.imshow(min_img, vmin=0, vmax=255, cmap='Greys_r')
    ax4.set_title(f'Min {title}')
    plt.show()
    return mean_img, var_img

The size argument passed here must be adjusted to match the actual image size; otherwise reshape cannot rebuild the image. This simple analysis shows the per-pixel variance, mean, maximum, and minimum of a set of images.
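
One way to avoid passing the size by hand is to read it from the images themselves. The sketch below is a simplified variant, not the code from the official case: pixel_stats is a name I made up, it computes only the mean and variance, and it is run on tiny synthetic arrays rather than real cable pictures.

```python
import numpy as np


def pixel_stats(images):
    """Per-pixel mean and variance for equally sized grayscale images."""
    stack = np.stack([np.asarray(img, dtype=np.float64) for img in images])
    height, width = stack.shape[1:]        # taken from the data, not hard-coded
    flat = stack.reshape(len(stack), -1)   # one row per image, one column per pixel
    mean_img = flat.mean(axis=0).reshape(height, width)
    var_img = flat.var(axis=0).reshape(height, width)
    return mean_img, var_img


# Synthetic stand-ins for real images (2 x 3 pixels each).
demo = [np.zeros((2, 3)), np.full((2, 3), 10.0)]
mean_img, var_img = pixel_stats(demo)
print(mean_img.shape)                  # (2, 3)
print(mean_img[0, 0], var_img[0, 0])   # 5.0 25.0
```

Because the height and width come from the stacked array, the same function works unchanged for any image size, as long as all images in one call share it.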

  • Model definition

Next, define the model. The code below is taken directly from the official case:

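import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

# assumed input size for illustration; use the value defined in the official case
INPUT_IMG_SIZE = (224, 224)

class CustomVGG(nn.Module):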

    def __init__(self, n_classes=2):
        super().__init__()
        self.feature_extractor = models.vgg16(pretrained=True).features[:-1]
        self.classification_head = nn.Sequential(
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.AvgPool2d(
                kernel_size=(INPUT_IMG_SIZE[0] // 2 ** 5, INPUT_IMG_SIZE[1] // 2 ** 5)
            ),
            nn.Flatten(),
            nn.Linear(
                in_features=self.feature_extractor[-2].out_channels,
                out_features=n_classes,
            ),
        )
        self._freeze_params()

    def _freeze_params(self):
        for param in self.feature_extractor[:23].parameters():
            param.requires_grad = False

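    def forward(self, x_in):
        """
        In training mode return the raw class scores; in eval mode return the
        class probabilities and a normalized location (heat) map per class.
        """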
        feature_maps = self.feature_extractor(x_in)
        scores = self.classification_head(feature_maps)

        if self.training:
            return scores

        probs = nn.functional.softmax(scores, dim=-1)

        weights = self.classification_head[3].weight
        weights = (
            weights.unsqueeze(-1)
            .unsqueeze(-1)
            .unsqueeze(0)
            .repeat(
                (
                    x_in.size(0),
                    1,
                    1,
                    INPUT_IMG_SIZE[0] // 2 ** 4,
                    INPUT_IMG_SIZE[1] // 2 ** 4,
                )
            )
        )
        feature_maps = feature_maps.unsqueeze(1).repeat((1, probs.size(1), 1, 1, 1))
        location = torch.mul(weights, feature_maps).sum(axis=2)
        location = F.interpolate(location, size=INPUT_IMG_SIZE, mode="bilinear")

        maxs, _ = location.max(dim=-1, keepdim=True)
        maxs, _ = maxs.max(dim=-2, keepdim=True)
        mins, _ = location.min(dim=-1, keepdim=True)
        mins, _ = mins.min(dim=-2, keepdim=True)
        norm_location = (location - mins) / (maxs - mins)

        return probs, norm_location
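
The eval-mode branch of forward builds one heat map per class by weighting the feature maps with the weights of the final linear layer and then scaling each map to the range 0 to 1. The NumPy sketch below isolates just that step for a single image. It is my simplified reading of the code above, not a replacement for it: it skips the bilinear upsampling to the input size and adds a small epsilon that the original does not have, so that a constant map does not divide by zero.

```python
import numpy as np


def class_activation_maps(feature_maps, fc_weights):
    """Weight feature maps by the linear layer and min-max normalize each map.

    feature_maps: (channels, height, width) output of the feature extractor
    fc_weights:   (n_classes, channels) weights of the final linear layer
    returns:      (n_classes, height, width) maps scaled to [0, 1]
    """
    # Weighted sum over the channel axis gives one map per class.
    cams = np.tensordot(fc_weights, feature_maps, axes=([1], [0]))
    mins = cams.min(axis=(1, 2), keepdims=True)
    maxs = cams.max(axis=(1, 2), keepdims=True)
    # The epsilon guards against a constant map, where maxs == mins.
    return (cams - mins) / (maxs - mins + 1e-12)


# Tiny synthetic example: 2 channels, 2 x 2 feature maps, 2 classes.
feature_maps = np.arange(8, dtype=float).reshape(2, 2, 2)
fc_weights = np.array([[1.0, 0.0], [0.0, 1.0]])
print(class_activation_maps(feature_maps, fc_weights).shape)  # (2, 2, 2)
```

High values in a class's map mark the regions that contributed most to that class score, which is what makes the map usable for locating a defect.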

Next, the parameters need to be adjusted repeatedly while checking the model's performance; that process is omitted here.

  • Model training

Train the model with defined parameters, then use the test set to see how it performs.

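import time

import intel_extension_for_pytorch as ipex
import torch.optim as optim

# Model training
# Initialization of the DL architecture along with the optimizer and loss function
# (class_weight, DEVICE, LR, EPOCHS, train and train_loader come from the official case)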
model = CustomVGG()
class_weight = torch.tensor(class_weight).type(torch.FloatTensor).to(DEVICE)
criterion = nn.CrossEntropyLoss(weight=class_weight)
optimizer = optim.Adam(model.parameters(), lr=LR)

# Ipex Optimization
model, optimizer = ipex.optimize(model=model, optimizer=optimizer, dtype=torch.float32)

# Training module
start_time = time.time()
trained_model = train(train_loader, model=model, optimizer=optimizer, criterion=criterion, epochs=EPOCHS,
    device=DEVICE, target_accuracy=TARGET_TRAINING_ACCURACY)
train_time = time.time()-start_time

# Save weights
model_path = f"{subset_name}.pt"
torch.save(trained_model.state_dict(), model_path)
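
The training code passes a class_weight to CrossEntropyLoss, but how that value is chosen is not shown in this article. One common heuristic, which may well differ from what the official case does, is to weight each class inversely to its frequency so that the rarer class counts more in the loss. A minimal sketch, with a made-up label list in which 0 means good and 1 means bad (that encoding is my assumption):

```python
from collections import Counter


def inverse_frequency_weights(labels, n_classes):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    weights = []
    for cls in range(n_classes):
        if counts[cls] == 0:
            raise ValueError(f"class {cls} has no samples")
        weights.append(n_samples / (n_classes * counts[cls]))
    return weights


# Hypothetical, imbalanced label list: 90 good samples and 10 bad ones.
labels = [0] * 90 + [1] * 10
class_weight = inverse_frequency_weights(labels, n_classes=2)
print(class_weight)
```

The resulting list can be converted with torch.tensor in the same way as the class_weight in the training code above.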

3. Evaluation and quantization

  • Model evaluation

A model can be evaluated with several metrics; they can be inspected with helper functions such as evaluate.

y_true, y_pred = evaluate(trained_model, test_loader, DEVICE, labels=True)
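
The internals of evaluate are part of the official case and are not reproduced here. Given the y_true and y_pred it returns, the usual classification metrics can be computed along the following lines. This is a plain-Python sketch of the standard definitions; treating label 1 as the "defective" class is my assumption.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a two-class problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        # rows: true negative / positive class, columns: predicted negative / positive
        "confusion": [[tn, fp], [fn, tp]],
    }


# Made-up labels for illustration only.
print(binary_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 1, 0, 0, 0, 0]))
```

For defect detection, recall on the defective class is often the number to watch, since a missed defect usually costs more than a false alarm.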

  • Model quantization

After training is complete, the model can be quantized and exported so that it can be called with very little code.

import os

import numpy as np
from sklearn.metrics import accuracy_score
from neural_compressor.config import PostTrainingQuantConfig, AccuracyCriterion, TuningCriterion
from neural_compressor import quantization

# INC will not quantize some layers optimized by ipex, such as _IPEXConv2d, 
# so we need to create original model object and load trained weights
model = CustomVGG()
model.load_state_dict(torch.load(model_path))
model.to(DEVICE)
model.eval()

# define evaluation function used by INC
def eval_func(model):
    with torch.no_grad():
        y_true = np.empty(shape=(0,))
        y_pred = np.empty(shape=(0,))

        for inputs, labels in train_loader:
            inputs = inputs.to(DEVICE)
            labels = labels.to(DEVICE)
            preds_probs = model(inputs)[0]
            preds_class = torch.argmax(preds_probs, dim=-1)
            labels = labels.to("cpu").numpy()
            preds_class = preds_class.detach().to("cpu").numpy()
            y_true = np.concatenate((y_true, labels))
            y_pred = np.concatenate((y_pred, preds_class))

    return accuracy_score(y_true, y_pred)


# quantize model
conf = PostTrainingQuantConfig(backend='ipex',
                               accuracy_criterion = AccuracyCriterion(
                                   higher_is_better=True, 
                                   criterion='relative',  
                                   tolerable_loss=0.01))
q_model = quantization.fit(model,
                           conf,
                           calib_dataloader=train_loader,
                           eval_func=eval_func)

# save the quantized model
# a JSON file with the quantization information for each operator is saved alongside it
quantized_model_path = './quantized_models'
os.makedirs(quantized_model_path, exist_ok=True)
q_model.save(quantized_model_path)
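
To make the idea behind post-training quantization more concrete, here is a small NumPy sketch of affine int8 quantization of a single array. It only illustrates the principle of mapping floats to 8-bit integers with a scale and a zero point. It is not how Intel Neural Compressor works internally, which calibrates on data and handles many more details, and the function names are mine.

```python
import numpy as np


def quantize_int8(x):
    """Affine (asymmetric) quantization of a float array to int8."""
    x = np.asarray(x, dtype=np.float32)
    lo, hi = float(x.min()), float(x.max())
    lo, hi = min(lo, 0.0), max(hi, 0.0)   # make sure 0.0 is representable
    scale = (hi - lo) / 255.0 or 1.0      # avoid a zero scale for all-zero input
    zero_point = int(round(-128 - lo / scale))
    q = np.clip(np.round(x / scale) + zero_point, -128, 127).astype(np.int8)
    return q, scale, zero_point


def dequantize_int8(q, scale, zero_point):
    """Map int8 values back to approximate floats."""
    return (q.astype(np.float32) - zero_point) * scale


weights = np.linspace(-1.0, 2.0, 7)
q, scale, zero_point = quantize_int8(weights)
restored = dequantize_int8(q, scale, zero_point)
print("max round-trip error:", float(np.max(np.abs(restored - weights))))
```

The round-trip error stays on the order of the scale, which is why a quantized model can be smaller and faster while losing only a little accuracy.
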

  • Model prediction

Once the quantized model is available (here the in-memory q_model is used directly), you can start predicting:

q_model.to(DEVICE)
q_model.eval()
q_model = ipex.optimize(q_model)

y_true, y_pred = evaluate(q_model, test_loader, DEVICE, labels=True)
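
After predicting with the quantized model, it is worth comparing its accuracy with that of the original model. The sketch below checks whether the relative accuracy drop stays within a tolerance, mirroring the tolerable_loss=0.01 with criterion='relative' used in the quantization config. The formula is my reading of what "relative" means there, the function name is made up, and the accuracy figures are invented for illustration.

```python
def within_relative_tolerance(baseline_acc, quantized_acc, tolerable_loss=0.01):
    """True if the relative accuracy drop is at most `tolerable_loss`."""
    if baseline_acc <= 0:
        raise ValueError("baseline accuracy must be positive")
    relative_drop = (baseline_acc - quantized_acc) / baseline_acc
    return relative_drop <= tolerable_loss


# Invented numbers: FP32 accuracy 0.950 versus two possible INT8 results.
print(within_relative_tolerance(0.950, 0.945))  # True, drop is about 0.5%
print(within_relative_tolerance(0.950, 0.930))  # False, drop is about 2.1%
```

In practice the two accuracies would come from running evaluate once with the trained model and once with q_model on the same test set.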

Origin blog.csdn.net/u012039040/article/details/131339770