1. Introduction
As we all know, a color image is made up of three color channels: red, green and blue. To process an image with deep learning, it must first be converted into a tensor. Each image becomes a three-dimensional tensor whose dimensions are [depth, height, width], where depth is the number of channels. Intuitively, this is three stacked matrices, each representing one channel, as shown in the figure below. Sometimes more complex operations are needed, such as rotating, scaling, cropping, shrinking or padding images. Without a good tool, the whole process is very tedious. Today, let's talk about how DL4J's datavec processes images.
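To make the [depth, height, width] layout concrete, here is a minimal sketch in plain Java (standard library only, not the datavec API) that unpacks a `BufferedImage` into three stacked matrices. The names `ChwSketch` and `toChw` are hypothetical and exist only for illustration. The channel order used here is red, green, blue; libraries built on OpenCV often order channels as blue, green, red instead.

```java
import java.awt.image.BufferedImage;

public class ChwSketch {
    /** Unpacks an RGB image into a [channels][height][width] array with values 0..255. */
    static float[][][] toChw(BufferedImage img) {
        int h = img.getHeight();
        int w = img.getWidth();
        float[][][] t = new float[3][h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int rgb = img.getRGB(x, y);       // packed as 0xAARRGGBB
                t[0][y][x] = (rgb >> 16) & 0xFF;  // red matrix
                t[1][y][x] = (rgb >> 8) & 0xFF;   // green matrix
                t[2][y][x] = rgb & 0xFF;          // blue matrix
            }
        }
        return t;
    }

    public static void main(String[] args) {
        // Hypothetical 4x2 test image (width 4, height 2) with one colored pixel.
        BufferedImage img = new BufferedImage(4, 2, BufferedImage.TYPE_INT_RGB);
        img.setRGB(1, 0, 0xFF8040); // R=255, G=128, B=64
        float[][][] t = toChw(img);
        System.out.println(t.length + " x " + t[0].length + " x " + t[0][0].length); // 3 x 2 x 4
    }
}
```

Each of the three outer entries is one height-by-width matrix, which is exactly the "three stacked matrices" picture.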
2. datavec-data-image code structure
datavec-data-image is an image processing library that DL4J builds on top of OpenCV, and it makes it easy to turn images into tensors. The code is divided into two important parts: loader (image loader) and transform (converter).
1. Loader: the image loader, mainly used to load images and convert them into tensors. The overall class structure is as follows
2. Transform: the converter, mainly used for operations on images such as rotation, scaling and cropping. Some of the more important converters are:
ResizeImageTransform: Scales the image to a given size
FlipImageTransform: Flips the image, for example mirroring it left to right
CropImageTransform: Crops the image
BoxImageTransform: Fits the image into a fixed-size box, cropping if the image is larger than the box and filling with 0 if it is smaller
PipelineImageTransform: A chain converter that passes the image through a pipeline of transforms, for example scaling first, then rotating, then flipping
RotateImageTransform: Rotates the image by a given angle, for example 30 or 60 degrees
3. Code example
1. NativeImageLoader reads an image and converts it into a 4-dimensional tensor. It is four-dimensional because a minibatch dimension is added in front. When only one image is read, the minibatch dimension is 1
// height 112, width 112, 3 channels; the resulting shape is [1, 3, 112, 112]
NativeImageLoader loader = new NativeImageLoader(112, 112, 3);
INDArray image = loader.asMatrix(new File("/root/1.jpg"));
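The extra minibatch dimension can be pictured with plain Java arrays. The sketch below uses only the standard library, and `BatchSketch`, `stack` and `shape` are hypothetical names; it is not how datavec stores data internally. It stacks [channels, height, width] images into a [minibatch, channels, height, width] array, so a single image gives a leading dimension of 1.

```java
import java.util.Arrays;

public class BatchSketch {
    /** Stacks same-sized [C][H][W] images into one [N][C][H][W] batch. */
    static float[][][][] stack(float[][][]... images) {
        float[][][][] batch = new float[images.length][][][];
        for (int n = 0; n < images.length; n++) {
            batch[n] = images[n]; // shares the image arrays, no copy is made
        }
        return batch;
    }

    /** Reads the four dimension sizes, analogous to asking a tensor for its shape. */
    static int[] shape(float[][][][] b) {
        return new int[] {b.length, b[0].length, b[0][0].length, b[0][0][0].length};
    }

    public static void main(String[] args) {
        float[][][] one = new float[3][112][112]; // a single blank 3-channel image
        System.out.println(Arrays.toString(shape(stack(one)))); // [1, 3, 112, 112]
    }
}
```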
2. Flip the image both top to bottom and left to right
NativeImageLoader labelLoader = new NativeImageLoader(112, 112, 3, new FlipImageTransform(-1)); // negative flip mode: flip both vertically and horizontally
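The flip mode follows the OpenCV convention: 0 flips around the x-axis (top to bottom), a positive value flips around the y-axis (left to right), and a negative value does both. Here is a minimal plain-Java sketch of that rule on a single channel. The names `FlipSketch` and `flip` are hypothetical, and the code only illustrates the convention; it is not the datavec implementation.

```java
import java.util.Arrays;

public class FlipSketch {
    /** Flips a [height][width] matrix: mode 0 = vertical, mode > 0 = horizontal, mode < 0 = both. */
    static int[][] flip(int[][] m, int mode) {
        int h = m.length;
        int w = m[0].length;
        int[][] out = new int[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int srcY = (mode <= 0) ? h - 1 - y : y; // zero or negative: reverse the rows
                int srcX = (mode != 0) ? w - 1 - x : x; // positive or negative: reverse the columns
                out[y][x] = m[srcY][srcX];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[][] m = {{1, 2, 3}, {4, 5, 6}};
        System.out.println(Arrays.deepToString(flip(m, -1))); // [[6, 5, 4], [3, 2, 1]]
    }
}
```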
3. Scale the picture
NativeImageLoader smallLoader = new NativeImageLoader(112, 112, 3, new ResizeImageTransform(80, 80));
4. Rotate the picture by 60, 90 and 120 degrees
NativeImageLoader loader60 = new NativeImageLoader(112, 112, 3, new RotateImageTransform(60));
NativeImageLoader loader90 = new NativeImageLoader(112, 112, 3, new RotateImageTransform(90));
NativeImageLoader loader120 = new NativeImageLoader(112, 112, 3, new RotateImageTransform(120));
5. Chain processing: first rotate by 60 degrees, then box the result into the center of a 224*224 frame
NativeImageLoader pipeline = new NativeImageLoader(112, 112, 3, new PipelineImageTransform(new RotateImageTransform(60), new BoxImageTransform(224, 224)));
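The crop-or-pad behaviour of the box step can be sketched for one channel in plain Java. The names `BoxSketch` and `box` are hypothetical, and centering the source inside the target is an assumption of this sketch. It shows the idea of cropping what overflows and zero-filling the rest, not the library's exact code.

```java
import java.util.Arrays;

public class BoxSketch {
    /** Centers src in an outH x outW matrix, cropping overflow and zero-filling gaps. */
    static int[][] box(int[][] src, int outH, int outW) {
        int h = src.length;
        int w = src[0].length;
        int[][] out = new int[outH][outW];      // Java zero-initializes, which gives the padding
        int offY = Math.floorDiv(outH - h, 2);  // a negative offset means rows get cropped
        int offX = Math.floorDiv(outW - w, 2);  // a negative offset means columns get cropped
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int ty = y + offY;
                int tx = x + offX;
                if (ty >= 0 && ty < outH && tx >= 0 && tx < outW) {
                    out[ty][tx] = src[y][x];    // copy only pixels that land inside the box
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[][] small = {{1, 2}, {3, 4}};
        // Padding case: prints [[0, 0, 0, 0], [0, 1, 2, 0], [0, 3, 4, 0], [0, 0, 0, 0]]
        System.out.println(Arrays.deepToString(box(small, 4, 4)));
    }
}
```

Calling `box` with a target smaller than the source keeps only the central region, which is the cropping half of the same rule.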
Happiness comes from sharing.
This blog post is the author's original work; please credit the source when reprinting.