How to use DataVec in Deeplearning4j to process images

1. Introduction

    A color image is made up of three channels: red, green and blue. To process an image with deep learning, it must first be converted into a tensor. Each image becomes a three-dimensional tensor whose dimensions are [depth, height, width]; intuitively, this is three matrices stacked together, one per channel. Sometimes you also need more complex operations such as rotating, scaling, cropping, shrinking and padding images. Without a good tool, this whole process is very cumbersome. This post looks at how DL4J's DataVec handles images.
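    To make the [depth, height, width] layout concrete, here is a minimal sketch in plain Java that needs no DL4J dependency. The class `ChwSketch` and the method `toChw` are hypothetical names used only for illustration; they are not part of DataVec. The channel order shown is red, green, blue for readability, and a real loader may order the channels differently.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper, not part of DataVec: unpacks a BufferedImage into a
// [depth, height, width] array with one plane per color channel.
final class ChwSketch {

    static float[][][] toChw(BufferedImage img) {
        int h = img.getHeight();
        int w = img.getWidth();
        float[][][] chw = new float[3][h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int argb = img.getRGB(x, y);          // packed as 0xAARRGGBB
                chw[0][y][x] = (argb >> 16) & 0xFF;   // red plane
                chw[1][y][x] = (argb >> 8) & 0xFF;    // green plane
                chw[2][y][x] = argb & 0xFF;           // blue plane
            }
        }
        return chw;
    }

    public static void main(String[] args) {
        // A 4-pixel-wide, 2-pixel-high test image with a single colored pixel.
        BufferedImage img = new BufferedImage(4, 2, BufferedImage.TYPE_INT_RGB);
        img.setRGB(1, 0, 0xFF8040);
        float[][][] t = toChw(img);
        // prints "3 x 2 x 4" (depth x height x width)
        System.out.println(t.length + " x " + t[0].length + " x " + t[0][0].length);
    }
}
```

    Note that the height comes before the width in the array shape, even though image sizes are usually quoted as width by height.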

                                     

2. datavec-data-image code structure

    datavec-data-image is an image processing library that DL4J builds on top of OpenCV; it makes it easy to turn images into tensors. The code is divided into two important parts: the loader (image loader) and the transform (converter).

    1. Loader: the image loader, which is mainly used to load images and convert them into tensors.

    2. Transform: the converter, which is mainly used for operations such as rotating, scaling and cropping images. Some of the more important converters are listed here.

    ResizeImageTransform: scales the image to a given size

    FlipImageTransform: flips the image, for example mirroring it left to right

    CropImageTransform: crops the image

    BoxImageTransform: fits the image to a fixed size, cropping it if it is larger than the box and padding it with 0 if it is smaller

    PipelineImageTransform: a chain converter that runs an image through a pipeline of transforms, for example scaling first, then rotating, then flipping

    RotateImageTransform: rotates the image by a given angle, such as 30 or 60 degrees
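    The crop-or-pad behavior described for BoxImageTransform can be sketched in plain Java as follows. This is only an illustration under the assumption that the image is centered in the box; `BoxSketch` and `box` are hypothetical names, the code works on a single channel, and it is not the DataVec source.

```java
// Hypothetical sketch, not the DataVec implementation: center-crops or
// zero-pads a single-channel image so the result is exactly outH x outW.
final class BoxSketch {

    static int[][] box(int[][] src, int outH, int outW) {
        int inH = src.length;
        int inW = src[0].length;
        int[][] dst = new int[outH][outW];            // new arrays are zero-filled
        // Position of the source origin inside the box; negative when cropping.
        int offY = Math.floorDiv(outH - inH, 2);
        int offX = Math.floorDiv(outW - inW, 2);
        for (int y = 0; y < inH; y++) {
            int dy = y + offY;
            if (dy < 0 || dy >= outH) continue;       // row is cropped away
            for (int x = 0; x < inW; x++) {
                int dx = x + offX;
                if (dx < 0 || dx >= outW) continue;   // column is cropped away
                dst[dy][dx] = src[y][x];
            }
        }
        return dst;
    }
}
```

    Boxing a 2x2 image into 4x4 leaves a border of zeros around it, while boxing a 4x4 image into 2x2 keeps only the middle four pixels.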

3. Code example

    1. NativeImageLoader reads an image and converts it into a 4-dimensional tensor. It is four-dimensional because a minibatch dimension is added in front. If only one image is read, the minibatch dimension is 1.

// Arguments are height, width, channels
NativeImageLoader loader = new NativeImageLoader(112, 112, 3);
INDArray image = loader.asMatrix(new File("/root/1.jpg"));
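    The extra minibatch dimension can be pictured with plain Java arrays. The sketch below is only an illustration; `BatchSketch`, `stack` and `shape` are hypothetical names and have nothing to do with how INDArray stores data internally.

```java
// Hypothetical helper: groups [depth, height, width] images into one
// [minibatch, depth, height, width] array.
final class BatchSketch {

    static float[][][][] stack(float[][][]... images) {
        float[][][][] batch = new float[images.length][][][];
        for (int i = 0; i < images.length; i++) {
            batch[i] = images[i];                     // shares the planes, no copy
        }
        return batch;
    }

    // Reads the four dimension sizes back out of the nested array.
    static int[] shape(float[][][][] b) {
        return new int[] {b.length, b[0].length, b[0][0].length, b[0][0][0].length};
    }
}
```

    Stacking a single 3-channel 112x112 image gives the shape [1, 3, 112, 112]; stacking two gives [2, 3, 112, 112].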

    2. Flip the image both vertically and horizontally

NativeImageLoader labelLoader = new NativeImageLoader(112, 112, 3, new FlipImageTransform(-1)); // flip top to bottom and left to right
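    The flip mode follows the OpenCV convention: 0 flips top to bottom, a positive value flips left to right, and a negative value does both. A minimal plain-Java sketch of that convention on a single channel looks like this; `FlipSketch` and `flip` are hypothetical names, not DataVec code.

```java
// Hypothetical sketch of the OpenCV-style flip codes on one channel:
// 0 = top to bottom, positive = left to right, negative = both.
final class FlipSketch {

    static int[][] flip(int[][] src, int flipMode) {
        int h = src.length;
        int w = src[0].length;
        boolean reverseRows = flipMode <= 0;   // 0 and negative codes reverse the rows
        boolean reverseCols = flipMode != 0;   // positive and negative codes reverse the columns
        int[][] dst = new int[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int sy = reverseRows ? h - 1 - y : y;
                int sx = reverseCols ? w - 1 - x : x;
                dst[y][x] = src[sy][sx];
            }
        }
        return dst;
    }
}
```

    With the matrix {{1, 2}, {3, 4}}, mode -1 gives {{4, 3}, {2, 1}}, which is the same as rotating the image by 180 degrees.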

    3. Scale the image

NativeImageLoader smallLoader = new NativeImageLoader(112, 112, 3, new ResizeImageTransform(80, 80));
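    To show what scaling does to the pixel grid, here is a small plain-Java sketch that uses nearest-neighbor sampling. This is a deliberately simple stand-in chosen for illustration; the interpolation used by ResizeImageTransform may well be different. `ResizeSketch` and `resizeNearest` are hypothetical names.

```java
// Hypothetical sketch: nearest-neighbor resize of a single-channel image.
// Chosen for simplicity; it is not the interpolation DataVec uses.
final class ResizeSketch {

    static int[][] resizeNearest(int[][] src, int outH, int outW) {
        int inH = src.length;
        int inW = src[0].length;
        int[][] dst = new int[outH][outW];
        for (int y = 0; y < outH; y++) {
            int sy = y * inH / outH;                  // source row for this output row
            for (int x = 0; x < outW; x++) {
                int sx = x * inW / outW;              // source column for this output column
                dst[y][x] = src[sy][sx];
            }
        }
        return dst;
    }
}
```

    Upscaling repeats source pixels and downscaling skips some of them, so the output always has exactly the requested height and width.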

    4. Rotate the image by 60, 90 and 120 degrees

NativeImageLoader loader60 = new NativeImageLoader(112, 112, 3, new RotateImageTransform(60));
NativeImageLoader loader90 = new NativeImageLoader(112, 112, 3, new RotateImageTransform(90));
NativeImageLoader loader120 = new NativeImageLoader(112, 112, 3, new RotateImageTransform(120));

    5. Chain processing: first rotate by 60 degrees, then box the result into a centered 224*224 area

NativeImageLoader pipeline = new NativeImageLoader(112, 112, 3,
        new PipelineImageTransform(new RotateImageTransform(60), new BoxImageTransform(224, 224)));
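    The idea behind a pipeline is simply that each step receives the output of the previous one, so the order of the steps matters. The plain-Java sketch below illustrates this with two toy steps; `PipelineSketch`, `chain`, `flipLeftRight` and `transpose` are hypothetical names and do not mirror the PipelineImageTransform internals.

```java
import java.util.function.UnaryOperator;

// Hypothetical sketch: applies a list of steps in order, each step
// receiving the result of the one before it.
final class PipelineSketch {

    @SafeVarargs
    static <T> UnaryOperator<T> chain(UnaryOperator<T>... steps) {
        return input -> {
            T current = input;
            for (UnaryOperator<T> step : steps) {
                current = step.apply(current);
            }
            return current;
        };
    }

    // Toy step: mirrors every row.
    static int[][] flipLeftRight(int[][] m) {
        int h = m.length, w = m[0].length;
        int[][] out = new int[h][w];
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                out[y][x] = m[y][w - 1 - x];
        return out;
    }

    // Toy step: swaps rows and columns.
    static int[][] transpose(int[][] m) {
        int h = m.length, w = m[0].length;
        int[][] out = new int[w][h];
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                out[x][y] = m[y][x];
        return out;
    }
}
```

    For the matrix {{1, 2}, {3, 4}}, flipping and then transposing gives {{2, 4}, {1, 3}}, while transposing and then flipping gives {{3, 1}, {4, 2}}. In the same way, rotating before boxing is not the same as boxing before rotating.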
		

 

Happiness comes from sharing.

   This post is the author's original work; please credit the source when reposting.
