Build computer vision models with TensorFlow

What is Computer Vision?

Computer vision (CV) is one of the central tasks of modern artificial intelligence (AI) and machine learning (ML) systems. It is accelerating nearly every area of industry, enabling organizations to transform the way machines and business systems work.

Academically, it is a mature field of computer science, enriched by decades of research. The recent use of deep neural networks has revolutionized the field and given it new life.

Computer vision has various application areas such as:

  • Autonomous driving
  • Medical image analysis and diagnosis
  • Manufacturing defect detection
  • Image and video analysis of surveillance footage
  • Facial recognition for security systems

Of course, there are many challenges associated with CV systems. For example, autonomous driving uses not only object detection, but also object classification, segmentation, motion detection, and more.

Most importantly, these systems are expected to process CV information and make high-confidence decisions within fractions of a second. A higher-level supervisory control system makes the final decisions and is responsible for the overall driving task.

Also, multiple CV systems and algorithms typically play a role in any respectable autonomous driving system. In these cases, the need for parallel processing is high, which puts heavy stress on the underlying computing hardware.

If multiple neural networks run simultaneously, they may share common system memory and compete with each other for a common resource pool.

In the case of medical imaging, the performance of computer vision systems is judged by experienced radiologists and clinical professionals who understand the pathology behind the images. Also, in most cases, the task involves identifying rare diseases with very low prevalence.

This makes the training data sparse and scarce, i.e., not enough training images can be found. Therefore, deep learning (DL) architectures must compensate by adding intelligent processing and architectural complexity.

Why choose TensorFlow for CV?

TensorFlow is a widely used and highly regarded open source Python package from Google that makes building deep learning models for computer vision straightforward. From its official website:

"It has a comprehensive and flexible ecosystem of tools, libraries, and community resources that enable researchers to advance the state of the art in ML, and developers to easily build and deploy ML-powered applications."

With the release of TensorFlow 2.0 and the integration of the Keras library as a high-level API, it is easy to stack layers of neurons and build and train sufficiently complex deep learning architectures.

Build Models Easily
Easily build and train ML models using intuitive, high-level APIs like Keras for instant model iteration and easy debugging.

Powerful ML Production Anywhere
No matter what programming language you use, you can easily train and deploy models in the cloud, locally, or on devices.

Powerful Research Features
A simple and flexible architecture takes new ideas from concept to code. Publish state-of-the-art models with confidence.

Now, of course, TensorFlow can be used to build deep learning models for a variety of applications, including:

  • Object detection
  • Scene segmentation
  • Generative adversarial networks for synthetic images
  • Autoencoders for image compression
  • Recommender systems

However, in this article, we focus on code and practical examples for building a simple object classification model in TensorFlow using a convolutional neural network (CNN).

This covers all the basic components of TensorFlow, such as layers, optimizers, loss functions, training options, and hyperparameter tuning.
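To make these components concrete, here is a minimal sketch of how layers, an optimizer, and a loss function fit together in Keras. It is not the exact model built later in the article: the 300x300 RGB input size, the layer widths, the RMSprop optimizer, and the single sigmoid output (for a two-class problem) are all illustrative assumptions, and `build_cnn` is a hypothetical helper name.

```python
import tensorflow as tf


def build_cnn(input_shape=(300, 300, 3)):
    """Build and compile a small binary image classifier (illustrative sizes)."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        # Layers: convolution + pooling blocks extract visual features.
        tf.keras.layers.Conv2D(16, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(2),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(2),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(2),
        # Classifier head: flatten the feature maps, then output one probability.
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(
        # Optimizer: the learning rate is a hyperparameter worth tuning.
        optimizer=tf.keras.optimizers.RMSprop(learning_rate=1e-3),
        # Loss function: binary cross-entropy matches the sigmoid output.
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model


model = build_cnn()
model.summary()
```

Training options such as the number of epochs and the batch size are then passed to `model.fit`, which is where most of the remaining tuning happens.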

A practical example of an object classification task

Deep learning tasks and model training benefit greatly from dedicated hardware such as graphics processing units (GPUs).

General-purpose CPUs struggle to process large amounts of data, for example when performing linear algebra operations on matrices with tens or hundreds of thousands of floating-point numbers.

Under the hood, deep neural networks are mostly composed of operations like matrix multiplication and vector addition. GPUs (developed mainly for the video game industry) are designed to handle massively parallel computing using thousands of tiny computing cores.

They also have high memory bandwidth to sustain the fast data flow (between the processing unit's cache and the slower main memory) that these computations need as neural networks train for hundreds of epochs. This makes them ideal commodity hardware for handling the computational load of computer vision tasks.
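The matrix multiplication and vector addition mentioned above can be seen in the forward pass of a single fully connected layer, y = Wx + b. The sketch below uses plain Python lists so that every multiply-add is visible. The function name `dense_forward` and the numbers are made up for illustration, and a real framework would run the same arithmetic as one batched GPU operation.

```python
def dense_forward(weights, bias, x):
    """Compute y = W x + b for one input vector using plain Python lists."""
    outputs = []
    for row, b in zip(weights, bias):
        # Each output is an independent dot product plus a bias term,
        # which is why this work spreads well across many small cores.
        outputs.append(sum(w * xi for w, xi in zip(row, x)) + b)
    return outputs


W = [[1.0, 2.0, 3.0],
     [0.0, -1.0, 0.5]]   # 2 output units, 3 input features
b = [0.5, 1.0]
x = [1.0, 0.0, 2.0]

print(dense_forward(W, b, x))  # [7.5, 2.0]
```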

Using and checking the GPU

There are several ways to use GPUs for deep learning tasks. Purchasing a bare-metal server or workstation suits those who want the ultimate in customizability. Renting GPU compute resources in the cloud from AWS or GCP is great for those running short sessions with limited data. Alternatively, you can use a free (but limited) resource such as Google Colaboratory. A larger dataset generally gives a better model, but for this example we use the last option, Google Colab.

We first test whether a GPU is available for training:

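import tensorflow as tf

# Name of a GPU device TensorFlow can see, or an empty string if there is none.
device_name = tf.test.gpu_device_name()

# A Colab GPU runtime is expected to expose its GPU as '/device:GPU:0'.
if device_name != '/device:GPU:0':
  raise SystemError('GPU device not found')

print('Found GPU at: {}'.format(device_name))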

Get the dataset

We use the following command to download the dataset directly into our working environment (Google Colab). If you are working on your local machine (and have already saved the file), modify the code accordingly.

!wget --no-check-certificate \
    https://storage.googleapis.com/laurencemoroney-blog.appspot.com/horse-or-human.zip \
    -O /tmp/horse-or-human.zip
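Outside of Colab, where the `!wget` shell escape may not be available, the same download can be sketched with the Python standard library. This assumes the dataset URL and the `/tmp/horse-or-human.zip` target are the ones the wget command refers to. `download_file` is a hypothetical helper, not part of TensorFlow.

```python
import os
import urllib.request

# Assumed to match the URL used by the wget command.
DATASET_URL = ("https://storage.googleapis.com/"
               "laurencemoroney-blog.appspot.com/horse-or-human.zip")


def download_file(url, dest_path):
    """Download url to dest_path unless the file already exists there."""
    if os.path.exists(dest_path):
        return dest_path  # already saved locally, nothing to do
    os.makedirs(os.path.dirname(dest_path) or ".", exist_ok=True)
    urllib.request.urlretrieve(url, dest_path)
    return dest_path


# Example call (performs a network request):
# download_file(DATASET_URL, "/tmp/horse-or-human.zip")
```

Skipping the download when the file is already present covers the case where the archive has been saved by hand.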

Access compressed file content

The following Python code uses the os library, which gives you access to the file system, and the zipfile library, which allows you to decompress the data.
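A minimal sketch of that step, assuming the archive was saved as `/tmp/horse-or-human.zip`. The target directory `/tmp/horse-or-human` and the helper name `extract_zip` are illustrative choices, not taken from the original code.

```python
import os
import zipfile


def extract_zip(zip_path, dest_dir):
    """Extract every member of zip_path into dest_dir; return the top-level entries."""
    os.makedirs(dest_dir, exist_ok=True)
    with zipfile.ZipFile(zip_path, "r") as archive:
        archive.extractall(dest_dir)
    return sorted(os.listdir(dest_dir))


ZIP_PATH = "/tmp/horse-or-human.zip"  # assumed download location
DEST_DIR = "/tmp/horse-or-human"      # hypothetical target directory

# Only extract if the archive has actually been downloaded.
if os.path.exists(ZIP_PATH):
    print(extract_zip(ZIP_PATH, DEST_DIR))
```

The returned listing shows how the archive is organized on disk, which is worth checking before pointing a data loader at the directory.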

Origin blog.csdn.net/wouderw/article/details/128029039