AutoML speeds up: Google open-sources Model Search, a platform for finding optimal ML models

To help researchers develop machine learning models automatically and efficiently, Google has open-sourced Model Search, an AutoML platform that is not tied to any specific domain. Built on TensorFlow, the platform is highly flexible: it can find the architecture best suited to a given dataset and problem while minimizing programming time and computing resources.


The success of a neural network usually depends on its ability to generalize across a variety of tasks, yet designing such a network is difficult because the research community still does not fully understand how neural networks generalize. What kind of network suits a given problem? How deep should it be? Which layers should be used? Would LSTM layers suffice, would a Transformer work better, or should the two be combined? Would ensembling or distillation improve performance?

The AutoML algorithms that have emerged in recent years can help researchers find suitable neural networks automatically, without manual experimentation. Techniques such as neural architecture search (NAS) use reinforcement learning, evolutionary algorithms, and combinatorial search to construct networks within a given search space. With proper settings, the architectures found by these techniques outperform hand-designed ones. However, these algorithms are computationally expensive, requiring thousands of models to be trained before converging. Moreover, the search spaces they explore are domain-specific and encode substantial human prior knowledge, so they do not transfer well across domains. For example, in image classification, traditional NAS searches for two good building blocks (a convolution block and a downsampling block) and then follows established conventions to assemble a complete network.

To overcome these shortcomings and extend AutoML to the broader research community, Google recently open-sourced Model Search, a platform that builds optimal ML models automatically and efficiently. The platform is not tied to any specific domain, so it is flexible enough to find the architecture best suited to a given dataset and problem while minimizing programming time and computing resources. It is built on TensorFlow and can run on a single machine or in a distributed setup.


GitHub address: https://github.com/google/model_search

Model Search platform overview

The Model Search system consists of multiple trainers, a search algorithm, a transfer learning algorithm, and a database that stores the evaluated models. The system runs training and evaluation experiments for many machine learning models (with different architectures and training methods) in an adaptive, asynchronous fashion. Although each trainer runs its experiments independently, all trainers share the knowledge gained from them.

At the beginning of each round, the search algorithm looks up all completed trials and uses beam search to decide what to try next. It then applies a mutation to one of the best architectures found so far and assigns the resulting model back to a trainer.
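The loop described above can be sketched in a few lines of plain Python. This is an illustrative toy, not the real Model Search code: `mutate`, `evaluate`, and the mock scoring function are hypothetical stand-ins for actual training runs.

```python
import random

# Toy sketch of the Model Search outer loop: completed trials feed a
# beam search that keeps the top-k, a mutation of one of the best
# architectures is proposed, and the new model is "assigned to a trainer".
BLOCKS = ["lstm", "resnet", "transformer", "dense"]

def mutate(architecture):
    """Randomly swap one block or append a new one."""
    arch = list(architecture)
    if random.random() < 0.5 and arch:
        arch[random.randrange(len(arch))] = random.choice(BLOCKS)
    else:
        arch.append(random.choice(BLOCKS))
    return arch

def evaluate(architecture):
    """Stand-in for a real training run; returns a mock score."""
    # Pretend that deeper networks with diverse block types score higher.
    return len(set(architecture)) + 0.1 * len(architecture)

def search(num_trials=20, beam_width=3):
    trials = [(["dense"], evaluate(["dense"]))]  # seed trial
    for _ in range(num_trials):
        # Beam search over completed trials: keep only the top-k.
        beam = sorted(trials, key=lambda t: t[1], reverse=True)[:beam_width]
        parent = random.choice(beam)[0]          # one of the best so far
        child = mutate(parent)                   # propose a variant
        trials.append((child, evaluate(child)))  # run the new "experiment"
    return max(trials, key=lambda t: t[1])
```

In the real system the trials run asynchronously across many trainers; the sequential loop here only captures the beam-search-plus-mutation control flow.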


Schematic of Model Search, showing the distributed search and ensembling process.

The system builds neural network models from a set of predefined blocks, each representing a known micro-architecture such as an LSTM, ResNet, or Transformer layer. By using these pre-existing architectural components, Model Search can leverage the best existing knowledge from NAS research across domains. This approach is also more efficient, because it explores structures rather than their more fundamental, fine-grained components, which reduces the size of the search space.
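A toy calculation (with made-up numbers, not figures from Google's work) shows why searching over whole blocks shrinks the space compared with wiring primitive operations freely:

```python
# Illustrative search-space counts. The numbers are arbitrary; the point
# is that block-level choices multiply far more slowly than wiring
# individual primitive ops at every layer.

def block_space(num_blocks, depth):
    """Architectures built by stacking one of num_blocks at each layer."""
    return num_blocks ** depth

def primitive_space(num_ops, depth, ops_per_layer):
    """Each layer wires several primitive ops, multiplying the choices."""
    return (num_ops ** ops_per_layer) ** depth
```

With 10 known blocks and depth 5, the block-level space has 10^5 candidates, while even a modest primitive space (8 ops, 4 per layer, depth 5) is many orders of magnitude larger.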


Neural network micro-architecture blocks that are known to work well, such as a ResNet block.

In addition, because Model Search is built on TensorFlow, each block can implement any function that takes tensors as input. For example, suppose we want to introduce a new search space built from a set of micro-architectures: the framework will take the newly defined blocks and incorporate them into the search process, so that the algorithm can build the best possible neural network from the provided components. A block can even be a fully defined neural network that already solves a problem of interest, in which case Model Search serves as a powerful ensembling machine.
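The idea that a block is just "any function that takes tensors as input" can be illustrated with a hypothetical registry in plain NumPy; `BLOCK_REGISTRY` and `register_block` are names invented for this sketch and are not part of the Model Search API.

```python
import numpy as np

# A "block" is modeled as a callable from tensor to tensor, so custom
# micro-architectures can be registered alongside built-in ones.
BLOCK_REGISTRY = {}

def register_block(name):
    """Decorator that adds a tensor-to-tensor function to the registry."""
    def wrapper(fn):
        BLOCK_REGISTRY[name] = fn
        return fn
    return wrapper

@register_block("relu_dense")
def relu_dense(x, units=8, seed=0):
    """A dense layer with ReLU, using fixed random weights for the demo."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal((x.shape[-1], units))
    return np.maximum(x @ w, 0.0)

@register_block("residual")
def residual(x):
    # A block may even wrap a full sub-network; here, a trivial skip path.
    return x + 0.5 * x
```

In the actual framework, a registered block would operate on TensorFlow tensors inside the search graph rather than NumPy arrays.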

The search algorithms implemented in Model Search are adaptive, greedy, and incremental, which lets them converge faster than reinforcement learning algorithms. They nevertheless emulate the "explore and exploit" behavior of reinforcement learning: the exploration step searches separately for good candidates, and the exploitation step ensembles the discovered candidates to improve accuracy.
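The "exploit" step can be sketched as simple probability averaging over the best candidates found during exploration; this is an illustrative simplification, not Model Search's actual ensembling code.

```python
import numpy as np

def ensemble_predict(models, x):
    """Average the class probabilities of several candidate models.

    models: list of callables mapping an input batch to class probabilities.
    """
    preds = np.stack([m(x) for m in models])
    return preds.mean(axis=0)
```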

The main search algorithm adaptively modifies one of the best k experiments (where k is specified by the user) after applying random changes to its architecture or training method, such as increasing the depth of the network.


An animation of the network evolving over many experiments.

To further improve efficiency and accuracy, transfer learning is also applied across different internal experiments. Model Search implements transfer learning in two ways: knowledge distillation and weight sharing. Knowledge distillation improves a candidate's accuracy by adding a loss term that matches its predictions to those of a high-performing model. Weight sharing bootstraps some of a candidate's parameters (after mutation) from previously trained models by copying the suitable weights of an earlier model and randomly initializing the rest. This not only speeds up training but also makes it possible to discover more, and better, architectures.
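Both transfer-learning mechanisms can be sketched in NumPy. This is a conceptual illustration under assumed function names (`distillation_loss`, `warm_start`); Model Search implements these ideas inside TensorFlow.

```python
import numpy as np

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Extra loss term pulling the student toward the teacher's predictions."""
    def softmax(z):
        z = z - z.max(axis=-1, keepdims=True)
        e = np.exp(z / temperature)
        return e / e.sum(axis=-1, keepdims=True)
    p_teacher = softmax(teacher_logits)
    p_student = softmax(student_logits)
    # Cross-entropy between the teacher and student distributions.
    return -np.mean(np.sum(p_teacher * np.log(p_student + 1e-9), axis=-1))

def warm_start(parent_weights, child_shapes, seed=0):
    """Weight sharing: copy matching weights, randomly init the rest."""
    rng = np.random.default_rng(seed)
    child = {}
    for name, shape in child_shapes.items():
        if name in parent_weights and parent_weights[name].shape == shape:
            child[name] = parent_weights[name].copy()   # inherit from parent
        else:
            child[name] = rng.standard_normal(shape) * 0.01  # fresh init
    return child
```

A student whose logits match the teacher's incurs a lower distillation loss than one that disagrees, and a mutated child only re-initializes the weights its new shape cannot inherit.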

Experimental results

Model Search improves on production models using only a small number of iterations. In a recent paper, "Improving Keyword Spotting and Language Identification via Neural Architecture Search at Scale", Google researchers demonstrated Model Search's performance in the speech domain, where it discovered models for keyword spotting and language identification. In fewer than 200 iterations, the resulting model outperformed the expert-designed internal SOTA production model while using about 130K fewer trainable parameters (184K vs. 315K).


Accuracy of the models found by Model Search after a given number of iterations, compared with the previous keyword-spotting production model.

Google researchers also used Model Search to find suitable image classification architectures on the CIFAR-10 dataset. Using a set of known convolutional blocks (including convolutions, ResNet blocks, NAS-A cells, fully connected layers, etc.), Model Search reached the 91.83% benchmark accuracy after only 209 trials (i.e., after exploring only 209 models). Previous top architectures required far more trials to reach the same accuracy: NASNet needed 5,807 trials and PNAS needed 1,160.

The code for Model Search is now open source, and researchers can use this flexible, domain-agnostic framework to discover ML models.

Origin: blog.csdn.net/weixin_42137700/article/details/113934444