Machine Learning (2): Using OpenCV with Node.js for Intelligent Pet Recognition

Preface

OpenCV (Open Source Computer Vision Library) is a computer vision library written in C/C++. It is widely used across platforms for image and video processing, pattern recognition, human-computer interaction, robotics, and related fields.

OpenCV includes a machine learning library that supports traditional algorithms (decision trees, naive Bayes, support vector machines, random forests, and so on), and recent releases have focused on strengthening deep learning support. For example, OpenCV 3.3 promoted the deep neural network (DNN) module from opencv_contrib to the main repository, and OpenCV 3.4, released in December 2017, improved the DNN module further, including R-CNN performance optimizations.

OpenCV is written mainly in C++, and most of its interfaces are C++ as well, although it still retains many C interfaces (with incomplete functionality). Bindings exist for Python, Java, and MATLAB/Octave, and wrappers are available for other languages such as C#, Perl, Haskell, and Ruby. The opencv4nodejs project is a Node.js binding for OpenCV 3. It helps fill the gap in computer vision support for JavaScript and opens up more options in scenarios where Node.js is strong (for example, real-time web applications built on WebSocket push).

The previous article, Machine Learning (1): Implementing Intelligent Recognition of Pet Bloodline Based on TensorFlow, demonstrated an image recognition case. Let's look at how the same task can be implemented with OpenCV and Node.js:

Using OpenCV with Node.js

Environment

$ cmake --version
cmake version 3.10.2

$ brew install cmake

$ brew install opencv3
# dependencies for opencv: eigen, lame, x264, xvid, ffmpeg, libpng,
# libtiff, ilmbase, openexr, gdbm, python, xz, python3, numpy, tbb

$ mkdir project-opencv-demo
$ cd project-opencv-demo
$ npm init
$ npm install --save opencv4nodejs

Load the Inception Model

The TensorFlow Inception model is pre-trained to recognize thousands of objects: given an input image, it outputs a predicted probability for each class. The model consists of two files, a graph file (tensorflow_inception_graph.pb) and a label file (imagenet_comp_graph_label_strings.txt), both of which must be loaded before use.

const cv = require('opencv4nodejs');
//const cv = require('../');
const fs = require('fs');
const path = require('path');

if (!cv.xmodules.dnn) {
  throw new Error('exiting: opencv4nodejs compiled without dnn module');
}

// replace with path where you unzipped inception model
const inceptionModelPath = './models/tf-inception'

const modelFile = path.resolve(inceptionModelPath, 'tensorflow_inception_graph.pb');
const classNamesFile = path.resolve(inceptionModelPath, 'imagenet_comp_graph_label_strings.txt');
if (!fs.existsSync(modelFile) || !fs.existsSync(classNamesFile)) {
  console.log('exiting: could not find inception model');
  console.log('download the model from: https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip');
  return;
}
console.log('load models:'+inceptionModelPath)

// read classNames and store them in an array
const classNames = fs.readFileSync(classNamesFile).toString().split("\n");

// initialize tensorflow inception model from modelFile
const net = cv.readNetFromTensorflow(modelFile);
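
Splitting the label file on "\n" keeps an empty entry at the end when the file finishes with a newline, and leaves a stray "\r" on every name if the file uses Windows line endings. Below is a minimal sketch of a more defensive parser, assuming one label per line. parseClassNames is a hypothetical helper name, not part of opencv4nodejs, and the sample labels are simply taken from the output shown later in this article.

```javascript
// Hypothetical helper: turn the raw label file contents into an array of
// class names whose indices line up with the network's output columns.
function parseClassNames(text) {
  // Accept both Unix (\n) and Windows (\r\n) line endings.
  const names = text.split(/\r?\n/).map((line) => line.trim());
  // Drop only trailing blank entries; removing blanks elsewhere would
  // shift the index of every label that follows them.
  while (names.length > 0 && names[names.length - 1] === '') {
    names.pop();
  }
  return names;
}

// Sample input with mixed line endings and a trailing newline.
console.log(parseClassNames('Pembroke\r\nChihuahua\ntoy terrier\n'));
```

It could stand in for the split("\n") call, for example as parseClassNames(fs.readFileSync(classNamesFile).toString()).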

Image Classification

Read the image, convert it to a blob, set the blob as the network input, and call net.forward(). Here we output only the classes whose probability is higher than 5%.

const classifyImg = (img) => {
  // inception model works with 224 x 224 images, so we resize
  // our input images and pad the image with white pixels to
  // make the images have the same width and height
  const maxImgDim = 224;
  const white = new cv.Vec(255, 255, 255);
  const imgResized = img.resizeToMax(maxImgDim).padToSquare(white);

  // network accepts blobs as input
  const inputBlob = cv.blobFromImage(imgResized);
  net.setInput(inputBlob);

  // forward pass input through entire network, will return
  // classification result as 1xN Mat with confidences of each class
  const outputBlob = net.forward();

  // find all labels with a minimum confidence
  const minConfidence = 0.05;
  const locations =
    outputBlob
      .threshold(minConfidence, 1, cv.THRESH_BINARY)
      .convertTo(cv.CV_8U)
      .findNonZero();

  const result =
    locations.map(pt => ({
      confidence: parseInt(outputBlob.at(0, pt.x) * 100) / 100,
      className: classNames[pt.x]
    }))
      // sort result by confidence
      .sort((r0, r1) => r1.confidence - r0.confidence)
      .map(res => `${res.className} (${res.confidence})`);

  return result;
}
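
The thresholding and sorting steps in classifyImg may be easier to follow on plain numbers. The sketch below mirrors that logic on an ordinary array of confidences, without OpenCV. The function name topPredictions and the sample confidences are made up for illustration.

```javascript
// Hypothetical stand-in for the post-processing in classifyImg, operating
// on a plain array instead of the 1xN output Mat.
function topPredictions(confidences, classNames, minConfidence = 0.05) {
  return confidences
    // remember each confidence's column index (the role of pt.x above)
    .map((confidence, index) => ({ confidence, index }))
    // like THRESH_BINARY, keep values strictly greater than the threshold
    .filter(({ confidence }) => confidence > minConfidence)
    .map(({ confidence, index }) => ({
      // truncate to two decimals, as parseInt(x * 100) / 100 does
      confidence: Math.trunc(confidence * 100) / 100,
      className: classNames[index],
    }))
    // highest confidence first
    .sort((r0, r1) => r1.confidence - r0.confidence);
}

// Made-up confidences for four classes.
const sampleNames = ['Pembroke', 'Chihuahua', 'toy terrier', 'whippet'];
const sampleOutput = [0.01, 0.5, 0.125, 0.25];
topPredictions(sampleOutput, sampleNames)
  .forEach((r) => console.log(`${r.className} (${r.confidence})`));
// Chihuahua (0.5)
// whippet (0.25)
// toy terrier (0.12)
```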

Test

const testData = [
  {
    image: './data/IMG_3560.png',
    label: 'Yan Dog'
  },
  {
    image: './data/IMG_3608.png',
    label: 'Yang Dog'
  }
];

testData.forEach((data) => {
  const img = cv.imread(data.image);
  console.log('%s,%s: ', data.image,data.label);

  const predictions = classifyImg(img);
  predictions.forEach(p => console.log(p));

  //cv.imshowWait('img', img);

  console.log("---------finish---------");
});

IMG_3608.png

$ npm run tf-classify

> node ./tf-classify.js
load models:./models/tf-inception

-------------------------------
./data/IMG_3560.png,Yan Dog:
[ INFO:0] Initialize OpenCL runtime...
Pembroke (0.83)
-------------------------------
./data/IMG_4423.png,Yang Dog:
Chihuahua (0.89)
Pembroke (0.07)
-------------------------------
./data/IMG_3608.png,Yang Dog:
toy terrier (0.22)
American Staffordshire terrier (0.2)
Chihuahua (0.14)
Staffordshire bullterrier (0.12)
whippet (0.05)
-------------------------------

Question: compared with the predictions in the previous article, Machine Learning (1): Implementing Intelligent Recognition of Pet Bloodline Based on TensorFlow, the two sets of results are very close but not identical. Why is that? Stay tuned for follow-up articles.

Appendix: OpenCV Overview

OpenCV version

The first preview version of OpenCV was released at the IEEE Conference on Computer Vision and Pattern Recognition in 2000. An official release now appears roughly every six months, developed by an independent group sponsored by a commercial company.

  • OpenCV 1.0: released in 2006
  • OpenCV 2.0: released in October 2009; major updates include the C++ interface
  • OpenCV 2.3: released in June 2011; major updates include mobile platform compatibility (NDK-Build)
  • OpenCV 3.0: released in June 2015
  • OpenCV 3.3: released in August 2017; major updates include deep learning support (the DNN module was promoted from opencv_contrib to the main repository)
  • OpenCV 3.4: released in December 2017; major updates include DNN module improvements (including R-CNN performance optimization), JavaScript bindings, and OpenCL implementations

# check the version
$ pkg-config --modversion opencv
3.4.0
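
Because the DNN module only joined the main repository in OpenCV 3.3, it can be useful to compare the reported version against that minimum before building opencv4nodejs. Below is a minimal sketch, assuming a plain "major.minor.patch" version string like the one printed above. supportsMainRepoDnn is a hypothetical helper name.

```javascript
// Hypothetical helper: true if an OpenCV version string is at least 3.3,
// the release that moved the DNN module into the main repository.
function supportsMainRepoDnn(version) {
  // A missing minor component (e.g. "4") is treated as 0.
  const [major = 0, minor = 0] = version
    .trim()
    .split('.')
    .map((part) => parseInt(part, 10));
  if (Number.isNaN(major) || Number.isNaN(minor)) {
    throw new Error(`unrecognized version string: ${version}`);
  }
  return major > 3 || (major === 3 && minor >= 3);
}

console.log(supportsMainRepoDnn('3.4.0')); // true
console.log(supportsMainRepoDnn('3.2.0')); // false
```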

OpenCV main modules

  • cv core function library
  • cvaux helper library
  • cxcore data structure and linear algebra library
  • highgui GUI library, including user interface, read/write image and video
  • ml machine learning function library, including statistical models, Bayesian classifiers, nearest neighbors, support vector machines, decision trees, random trees, expectation-maximization, neural networks, etc. See Machine Learning: Machine Learning Algorithms for details.
  • gpu GPU acceleration, GPU modules and data structures, including image processing and analysis modules

Main functions of OpenCV

  • Image data operations (memory allocation and release, image copying, setting, and conversion)
  • Matrix/vector data manipulation and linear algebra operations (matrix product, matrix equation solving, eigenvalues, singular value decomposition)
  • Support for a variety of dynamic data structures (linked lists, queues, sets, trees, graphs)
  • Basic image processing (denoising, edge detection, corner detection, sampling and interpolation, color transformation, morphological processing, histograms, image pyramids)
  • Structural analysis (connected components, contour processing, distance transform, image moments, template matching, Hough transform, polynomial approximation, curve fitting, ellipse fitting, Delaunay triangulation)
  • Image/video input and output (file or camera input, image/video file output)
  • Camera calibration (finding and tracking calibration patterns, parameter calibration, fundamental matrix estimation, homography matrix estimation, stereo matching)
  • Motion analysis (optical flow, motion segmentation, target tracking)

OpenCV basic data types

  • CvPoint: a two-dimensional point with integer coordinates
  • CvSize: the size of a rectangle, in pixels
  • CvRect: a rectangular area defined by the coordinates of its upper-left corner together with its width and height
  • CvScalar: stores pixel values (an array of doubles, not necessarily grayscale values)

typedef struct CvPoint
{
    int x; // x-coordinate of the point in the image
    int y; // y-coordinate of the point in the image
} CvPoint;

typedef struct CvSize
{
    int width;  // width of the rectangle
    int height; // height of the rectangle
} CvSize;

typedef struct CvRect
{
    int x;      // x-coordinate of the upper-left corner
    int y;      // y-coordinate of the upper-left corner
    int width;  // width
    int height; // height
} CvRect;

typedef struct CvScalar
{
    double val[4]; // up to four channel values
} CvScalar;
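
For readers coming from JavaScript, the four structs can be pictured as small plain objects. The sketch below only illustrates their shape: the factory names (makePoint, makeSize, makeRect, makeScalar) and rectContains are hypothetical and are not the opencv4nodejs API, and the edge convention in rectContains is an assumption made for this example.

```javascript
// Integer fields are truncated toward zero, as a C int conversion would do.
const makePoint = (x, y) => ({ x: Math.trunc(x), y: Math.trunc(y) });
const makeSize = (width, height) => ({
  width: Math.trunc(width),
  height: Math.trunc(height),
});
// A rectangle is an upper-left corner plus a size.
const makeRect = (x, y, width, height) => ({
  ...makePoint(x, y),
  ...makeSize(width, height),
});
// Mirrors double val[4]: always four numbers, missing ones default to 0.
const makeScalar = (...values) => ({
  val: [0, 1, 2, 3].map((i) => (values[i] !== undefined ? values[i] : 0)),
});

// True if the point lies inside the rectangle. Assumed convention:
// left/top edges are inclusive, right/bottom edges are exclusive.
const rectContains = (rect, pt) =>
  pt.x >= rect.x && pt.x < rect.x + rect.width &&
  pt.y >= rect.y && pt.y < rect.y + rect.height;

const roi = makeRect(10, 20, 30, 40);
console.log(rectContains(roi, makePoint(10, 20))); // true
console.log(rectContains(roi, makePoint(40, 60))); // false
console.log(makeScalar(255, 255, 255).val);        // [ 255, 255, 255, 0 ]
```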

OpenCV and Machine Learning

OpenCV includes machine learning libraries that support the following algorithms:

  • Boosting
  • Decision tree learning
  • Gradient boosting trees
  • Expectation-maximization algorithm
  • k-nearest neighbor algorithm
  • Naive Bayes classifier
  • Artificial neural networks
  • Random forest
  • Support vector machine (SVM)
  • Deep neural networks (DNN) (the DNN module was promoted from opencv_contrib to the main repository in OpenCV 3.3)

OpenCV resources

Further reading: "The Machine Learning Master"

