Comprehensive summary of machine learning hyperparameter tuning (with code)

Public account: Youerhut
Author: Peter
Editor: Peter

Hello everyone, I am Peter~

The topic of this article: hyperparameter tuning for machine learning models.


The article is quite long, so it is recommended to bookmark it~

1. What are machine learning hyperparameters?

Machine learning hyperparameters are parameters whose values are set before the learning process begins, rather than parameters learned from the training data.

Hyperparameters are options that are set outside of model training and are not optimized or changed during training. Instead, they need to be set manually before training and have a large impact on the performance of the model.
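For example, in scikit-learn the hyperparameters are the arguments chosen when the model object is constructed, before it ever sees the data, while the model's learned parameters (such as the coefficients) only appear after training. A minimal sketch:

from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)

# Hyperparameters: chosen by us before training starts
model = LogisticRegression(C=0.5, max_iter=200)

# Parameters: learned from the data during training
model.fit(X, y)
print(model.coef_)               # learned parameters (coefficients)
print(model.get_params()['C'])   # the hyperparameter we set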

2. Why should we tune machine learning hyperparameters?

In machine learning, hyperparameters often need to be selected and tuned for a specific task. For example, in a support vector machine (SVM), an important hyperparameter is the regularization parameter C, which can control the model complexity and affect the generalization ability of the model. When training neural networks, learning rate and batch size are also common hyperparameters, which can affect the convergence speed of the model and the final prediction effect.

Tuning machine learning hyperparameters means finding the combination of hyperparameters that allows the model to perform best on a specific task. Tuning hyperparameters is very important for improving model performance, preventing overfitting, and accelerating convergence.

Different hyperparameter combinations can significantly affect the performance of the model, so it is necessary to find the best hyperparameter combination through tuning.


The following describes four aspects: direct tuning methods such as grid search, tuning tools such as Optuna, AutoML-based tuning, and algorithm-based tuning.

3. Hyperparameter tuning method

Commonly used hyperparameter tuning methods include the following:

  • Grid Search: Grid search is a simple hyperparameter tuning method. It exhaustively evaluates every specified parameter combination on the validation set and finally selects the best-performing combination.

  • Bayesian optimization: Bayesian optimization is an optimization algorithm that uses Bayes theorem and optimization methods to find the global optimal solution. It is suitable for high-dimensional, high-cost, limited sample optimization problems.

  • Random Search: Random search is a hyperparameter tuning method based on random sampling. It finds the optimal solution by randomly selecting a combination of parameters in the parameter space.

3.1 Grid Search

1. What is grid search?

Grid Search is a hyperparameter tuning method that exhausts specified parameter combinations, calculates the performance of each set of parameters on the validation set, and finally selects the parameter combination with the best performance.

Reference: https://pyimagesearch.com/2021/05/24/grid-search-hyperparameter-tuning-with-scikit-learn-gridsearchcv/

Grid search is a simple yet effective tuning method often used to determine the best combination of hyperparameters.

2. Python practice of grid search

from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC
from sklearn.datasets import load_iris

# Load the dataset
iris = load_iris()
X = iris.data
y = iris.target

# Define the model (RBF kernel so that both C and gamma matter)
svm = SVC(kernel='rbf')

# Define the grid of parameter values to search
param_grid = {
    'C': [0.1, 1, 10, 100],
    'gamma': [1e-3, 1e-2, 1e-1, 1],
}

# Create the grid search object
grid_search = GridSearchCV(svm, param_grid, cv=5)

# Run the grid search on the data
grid_search.fit(X, y)

# Print the best parameter combination and the corresponding score
print('Best parameters:', grid_search.best_params_)
print('Best score:', grid_search.best_score_)
  • Grid search is implemented using the GridSearchCV class in the Scikit-learn library

  • A parameter grid (param_grid) is defined, which contains different value combinations of the two hyperparameters C and gamma.

  • Created a GridSearchCV object and passed in the parameter grid, SVM model and cross-validation (cv) parameters

  • Use the best_params_ and best_score_ attributes to output the best parameter combination and its score; the fitted object exposes a few more useful attributes, as shown below
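In addition to best_params_ and best_score_, the fitted GridSearchCV object keeps the full cross-validation table and the refitted best model. A short sketch (attribute names follow scikit-learn's GridSearchCV API):

import pandas as pd

# Full table of every combination that was evaluated
cv_results = pd.DataFrame(grid_search.cv_results_)
print(cv_results[['params', 'mean_test_score', 'rank_test_score']].head())

# Best model, refitted on the whole dataset by default
best_model = grid_search.best_estimator_
print(best_model.predict(X[:5]))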

3.2 Random Search Random Search

1. What is random search?

Random search is an optimization method that searches for possible solutions by randomly generating points within an allowed range and calculating the objective function value for each point. Then, it selects the next point to search based on the objective function value to gradually approach the optimal solution.

e25b0339476646d0a9d7243846ac5697.png

This method is suitable for handling high-dimensional, nonlinear, non-convex or discontinuous optimization problems, especially when the computational cost of exact solutions is very high.

2. Python practice based on random search

import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Load the dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split into training and validation sets
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=42)

# Define the random search function
def random_search(X_train, y_train, X_val, y_val, model, param_space, iteration_num):
    best_score = -1
    best_params = None
    for i in range(iteration_num):
        # Randomly sample one set of hyperparameters from the parameter space
        params = {k: v[np.random.randint(len(v))] for k, v in param_space.items()}
        # Train the model and compute the accuracy on the validation set
        model.set_params(**params)
        model.fit(X_train, y_train)
        score = model.score(X_val, y_val)
        # Update the best solution found so far
        if score > best_score:
            best_score = score
            best_params = params
    return best_score, best_params

# Define the random forest classifier
model = RandomForestClassifier()

# Define the hyperparameter space
param_space = {
    'n_estimators': [100, 200, 300, 400, 500],
    'max_depth': [2, 3, 4, 5, 6],
    'max_features': ['sqrt', 'log2', None],
    'bootstrap': [True, False]
}

# Run the random search
best_score, best_params = random_search(X_train, y_train, X_val, y_val, model, param_space, 100)
print('Best accuracy:', best_score)
print('Best hyperparameters:', best_params)
  • Use a random forest classifier model and define four hyperparameters that need to be optimized: n_estimators, max_depth, max_features and bootstrap

  • Randomly sample 100 sets of hyperparameters from the parameter space, then use the accuracy on the validation set to evaluate the quality of these hyperparameters, and finally output the best accuracy and corresponding best hyperparameters.

Compared with grid search, random search does not enumerate every combination: it samples the parameter space at random, which is usually much cheaper in high-dimensional spaces, at the cost of not guaranteeing that the single best combination is found.

3.3 Bayesian optimization

1. What is Bayesian optimization?

Bayesian optimization is a black-box optimization algorithm used to solve extreme value problems of functions with unknown expressions. It is based on Bayes' theorem and describes the posterior distribution of the objective function by building a probability model, and uses this model to select the next sampling point to maximize the sampling value.

The core idea is to use Gaussian Process Regression (GPR) to model the distribution of the objective function. GPR treats the objective function as a random process defined by a series of observed data points (inputs and outputs) and uses a Gaussian probability model to describe its distribution. Bayesian optimization updates the posterior distribution of the objective function by continuously adding sample points until the posterior distribution closely fits the true distribution.

398509dcb92127fe81be8a62b53e11aa.png

Bayesian optimization has two core processes:

  • Prior Function (PF): PF mainly uses Gaussian process regression to model the prior distribution of the objective function.

  • Acquisition Function (AC): AC mainly includes methods such as Expected Improvement (EI), Probability of Improvement (PI) and Upper Confidence Bound (UCB), which are used to measure how much each candidate point is expected to contribute to the optimization and to select the next sampling point (see the formulas below).
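For reference, a common way to write the two most widely used criteria: with a Gaussian-process surrogate whose posterior mean and standard deviation at a candidate point x are \mu(x) and \sigma(x), and f(x^{+}) the best value observed so far,

\mathrm{EI}(x) = \big(\mu(x) - f(x^{+})\big)\,\Phi(z) + \sigma(x)\,\phi(z), \qquad z = \frac{\mu(x) - f(x^{+})}{\sigma(x)}

\mathrm{UCB}(x) = \mu(x) + \kappa\,\sigma(x)

where \Phi and \phi are the standard normal CDF and PDF, and \kappa trades off exploration against exploitation.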

Bayesian optimization is used in the AutoML algorithm in machine learning to automatically determine the hyperparameters of the machine learning algorithm. It is also a global extreme value search method, especially suitable for high-dimensional nonlinear non-convex functions, with good effect and efficiency.

Bayesian optimization treats the function as a stochastic process that follows a certain distribution. By evaluating the function at points within its domain, it uses Bayes' formula to update the estimated distribution and then locates the most likely position of the extreme point under the new distribution, thereby improving the accuracy of the estimates of the function and its extreme values.

2. Bayesian optimization in python practice

import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C

# Define the objective function (to be maximized)
def f(x):
    return np.sin(5 * x) + np.cos(x)

# Define the Gaussian process regression model used as the surrogate
kernel = C(1.0, (1e-3, 1e3)) * RBF(10, (1e-2, 1e2))
gpr = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=10)

# Expected Improvement acquisition function
def expected_improvement(X_candidates, X_train, y_train):
    # Fit the surrogate on the points observed so far
    gpr.fit(X_train, y_train)
    # Predicted mean and standard deviation at the candidate points
    y_pred, sigma = gpr.predict(X_candidates, return_std=True)
    y_best = y_train.max()
    with np.errstate(divide='ignore', invalid='ignore'):
        z = (y_pred - y_best) / sigma
        ei = (y_pred - y_best) * norm.cdf(z) + sigma * norm.pdf(z)
        ei[sigma == 0.0] = 0.0
    return ei

# Number of iterations and number of initial sample points
n_iter = 20
n_initial = 5

# Initial random samples in the domain
X_train = np.random.uniform(-2 * np.pi, 2 * np.pi, (n_initial, 1))
y_train = f(X_train).ravel()

# Bayesian optimization loop
for i in range(n_iter):
    # Random candidate points in the domain
    X_candidates = np.random.uniform(-2 * np.pi, 2 * np.pi, (100, 1))
    ei = expected_improvement(X_candidates, X_train, y_train)
    # Evaluate the objective at the candidate with the highest expected improvement
    x_next = X_candidates[np.argmax(ei)].reshape(1, -1)
    y_next = f(x_next).ravel()
    # Add the new observation, so the posterior is updated in the next iteration
    X_train = np.vstack((X_train, x_next))
    y_train = np.concatenate((y_train, y_next))
    print('Iter: {}, x: {:.4f}, f(x): {:.4f}, best so far: {:.4f}'.format(
        i, x_next[0, 0], y_next[0], y_train.max()))
  • Define the objective function f

  • Use Gaussian process regression model (GPR) to model the distribution of the objective function

  • Define the acquisition function expected_improvement: it fits the GPR model on the points observed so far and scores each candidate point by how much improvement over the current best value it is expected to bring.

  • In each iteration, the objective is evaluated at the candidate with the highest expected improvement and the new observation is added to the training data, so the posterior distribution of the surrogate model is updated step by step.

4. Hyperparameter tuning tools

4.1 What is a hyperparameter optimization library?

Hyperparameter Optimization Library is a software library or tool for automated hyperparameter optimization. These libraries use different algorithms and techniques to automate the hyperparameter search and optimization process.

Hyperparameter optimization libraries often provide easy-to-use interfaces that allow users to define the hyperparameters and objective functions to be optimized. They use different algorithms and techniques such as grid search, random search, genetic algorithm, Bayesian optimization, etc. to search and optimize the hyperparameter space. The goals of these libraries are to reduce the workload of manual tuning of hyperparameters, improve model performance, and accelerate the training process of machine learning models.

4.2 Common hyperparameter optimization tools

The following are several commonly used hyperparameter optimization libraries:

  • Scikit-Optimize

  • Hyperopt

  • Optuna

  • Spearmint

  • Gaussian Process-based Hyperparameter Optimization (GPGO)

  • Ray Tune

  • GPyOpt

  • SigOpt

  • Keras Tuner

These libraries all provide different algorithms and tools to achieve hyperparameter optimization, and have different features and advantages. You can choose the library that best suits you based on your needs.

4.3 Scikit-Optimize library

1. Introduction to Scikit-Optimize library

Scikit-optimize (skopt) is a Python library built on NumPy, SciPy and scikit-learn for sequential model-based optimization. It aims to provide a simple yet effective tool for optimization problems in machine learning and scientific computing, and offers several optimization routines, including random search and Bayesian optimization with Gaussian-process, random-forest and gradient-boosted-tree surrogates.

9d25aff7edebad491b627172cd4b7c46.png

Scikit-optimize provides many predefined search spaces and objective functions to easily set up hyperparameter optimization tasks. Users can define their own search spaces and objective functions to suit specific machine learning models and tasks. Scikit-optimize also provides visualization tools to help users better understand the optimization process and results.

2. Practical optimization of Scikit-Optimize library based on python

import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from skopt import gp_minimize
from skopt.space import Integer, Categorical
from skopt.utils import use_named_args

# Load the dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define the machine learning model
clf = RandomForestClassifier(random_state=42)

# Define the search space
search_space = [
    Integer(100, 500, name='n_estimators'),
    Integer(3, 10, name='max_depth'),
    Categorical(['sqrt', 'log2'], name='max_features'),
]

# Define the objective function (negated accuracy, since gp_minimize minimizes)
@use_named_args(search_space)
def objective(**params):
    clf.set_params(**params)
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    return -accuracy_score(y_test, y_pred)

# Run the optimization
res = gp_minimize(objective, search_space, n_calls=20, random_state=42)

# Print the best hyperparameter combination and the corresponding validation accuracy
print("Best hyperparameters:", res.x)
print("Max validation accuracy:", -res.fun)

4.4 Hyperopt library

1. Introduction to Hyperopt library

Hyperopt is a Python library for intelligent hyperparameter search for machine learning models. It mainly uses three algorithms: the random search algorithm, the simulated annealing algorithm, and the TPE (Tree-structured Parzen Estimator) algorithm.

64bb7720851366d66f13494e49245a52.png

Install the Hyperopt library: You can use the pip command to install the Hyperopt library:

pip install hyperopt

Steps for usage:

  • Prepare the objective function: The objective function should be an optimizable function that accepts a list of hyperparameters as input and returns a scalar value. In Hyperopt, use fn to specify the objective function.

  • Define the hyperparameter search space: Use Hyperopt's hp module to define the hyperparameter search space. You can use functions such as hp.choice and hp.uniform to define different types of hyperparameters.

  • Optimize using fmin function: Optimize using Hyperopt’s fmin function, which accepts the objective function, hyperparameter search space and optimization algorithm as input and returns the best hyperparameter combination.

Official learning address: https://github.com/hyperopt/hyperopt

2. Practical cases based on python optimization

from hyperopt import hp, fmin, tpe
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Load the dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define the objective function
def objective(params):
    clf = SVC(kernel=params['kernel'], C=params['C'], gamma=params['gamma'])
    clf.fit(X_train, y_train)
    score = clf.score(X_test, y_test)
    return -score  # Hyperopt minimizes the objective, so the accuracy is negated

# Define the hyperparameter search space
space = {
    'kernel': hp.choice('kernel', ['linear', 'poly', 'rbf', 'sigmoid']),
    'C': hp.uniform('C', 0.001, 10),
    'gamma': hp.uniform('gamma', 0.001, 10)
}

# Run the hyperparameter search with the TPE algorithm
best = fmin(fn=objective, space=space, algo=tpe.suggest, max_evals=100)
print('Best parameters: ', best)  # note: for hp.choice parameters, best contains the index of the chosen option

4.5 Optuna library

1. Introduction to Optuna library

Optuna is a library for hyperparameter optimization that supports defining objective functions, searching the hyperparameter space and automatically optimizing.

6270948a66bb143185547b5621ae8291.png

You can use the pip command to install the Optuna library:

pip install Optuna

Steps for usage:

  • Define the search space: Use the distribution functions provided by Optuna to define the search space of the hyperparameters. For example, for a floating-point hyperparameter with a value range of [0, 1], you can use trial.suggest_float to define its search space.

  • Define the objective function: The objective function is the model that needs to be optimized, and can be any callable object, such as Python functions, class methods, etc. The input of the objective function is the value of the hyperparameter, and the output is the performance indicator of the model.

  • Create an Optuna experiment: Create an Optuna experiment object and specify the objective function and search algorithm.

  • Run an Optuna experiment: Run an Optuna experiment to perform a hyperparameter search. After each experiment, Optuna will update the values ​​of the hyperparameters and record the performance indicators of the current experiment. You can set the number of attempts or time to control the size of the search space and the search time limit.

  • Analyze test results: After the test is completed, you can use the visualization tools provided by Optuna to analyze the test results and select the optimal hyperparameter combination.

2. Use cases based on python

import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import SGDRegressor

# Generate the dataset
X, y = make_regression(n_samples=100, n_features=1, noise=0.1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define the objective function
def objective(trial):
    # Define the hyperparameters (valid SGDRegressor parameters)
    eta0 = trial.suggest_float('eta0', 1e-5, 1e-1, log=True)
    alpha = trial.suggest_float('alpha', 1e-6, 1e-2, log=True)
    penalty = trial.suggest_categorical('penalty', ['l2', 'l1', 'elasticnet'])

    # Create the model
    model = SGDRegressor(eta0=eta0, alpha=alpha, penalty=penalty, random_state=42)

    # Train the model
    model.fit(X_train, y_train)

    # Compute the performance metric (mean squared error on the test set)
    loss = np.mean((model.predict(X_test) - y_test) ** 2)
    return loss

# Create an Optuna study object (by default it minimizes the objective) and run the search
import optuna
study = optuna.create_study()
study.optimize(objective, n_trials=100)

# Best hyperparameter combination and best metric value
print(study.best_params)  # best hyperparameter combination
print(study.best_value)   # best (lowest) objective value
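Optuna also ships plotting helpers for analyzing a finished study (they require the plotly package); a brief sketch:

# Visualize the optimization history and the parameter importances of the study
fig1 = optuna.visualization.plot_optimization_history(study)
fig2 = optuna.visualization.plot_param_importances(study)
fig1.show()
fig2.show()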

4.6 Spearmint library

1. Introduction to Spearmint library

Spearmint is a library for performing Bayesian optimization. It is based on the algorithms outlined in the paper "Practical Bayesian Optimization of Machine Learning Algorithms". The library can be used for global optimization, primarily to find configurations that minimize an objective function. (Its use is generally not recommended nowadays.)

721a0591b1c51d46faa46cd16ad6af7a.png

https://github.com/JasperSnoek/spearmint

The Spearmint library provides a way to formulate and run Bayesian optimization problems. It allows users to define objective functions as well as the constraints and bounds that describe the optimization problem. The core algorithms in the library are responsible for optimizing the objective function according to these definitions. Advantages of the Spearmint library include:

  • Flexible objective function expression: Spearmint supports multiple types of data and complex model structures, allowing users to flexibly express various objective functions.

  • High performance: Spearmint improves running speed through efficient implementation and optimization algorithms, allowing it to handle large-scale data sets and complex models.

  • Easy to use: Spearmint provides a simple and easy-to-use interface, allowing users to easily configure and run optimization tasks.

  • Community support: Spearmint is an open source project supported by an active community, which means users can receive feedback and support from other developers.

2. Practical cases based on Spearmint

# Note: Spearmint does not expose a Python optimizer object. An experiment is
# normally driven by a config file (declaring the variables and their bounds)
# plus a script exposing main(job_id, params), which Spearmint calls once per
# evaluation; the sketch below only illustrates that entry-point convention,
# and the variable names are assumptions.
import numpy as np

# Objective: a simple quadratic function of two variables, to be minimized
def objective(x1, x2):
    return x1 ** 2 + x2 ** 2

# Entry point invoked by Spearmint for each suggested configuration;
# params maps each variable declared in the config file to its sampled value(s)
def main(job_id, params):
    return objective(params['var1'][0], params['var2'][0])

4.7 GPGO library

1. Introduction to GPGO

The full name of GPGO is Gaussian Process Optimization for hyperparameter optimization. It uses Gaussian processes to model the objective function; a commonly used open-source implementation is the pyGPGO Python package.

The Gaussian process is a powerful non-parametric Bayesian model that provides a probabilistic framework for hyperparameter optimization that automatically manages the exploration versus exploitation trade-off.

Install before using:

pip install pyGPGO

Steps for usage:

  • First, the objective function to be optimized needs to be defined (e.g., the training loss of a neural network).

  • Secondly, the search space needs to be defined, that is, the possible value range of the hyperparameters.

  • Then, run the optimization process using GPGO. In each iteration, the optimizer selects a set of hyperparameters and evaluates the performance of that set of hyperparameters using an objective function.

  • Finally, the optimizer updates its beliefs about the optimal hyperparameters based on the evaluation results and continues the search until a preset termination condition is reached (e.g., the maximum number of iterations is reached or a satisfactory combination of hyperparameters is found).

2. Python practical cases

# Sketch based on the pyGPGO package (assumed to be the GPGO implementation in use)
import numpy as np
from pyGPGO.covfunc import squaredExponential
from pyGPGO.acquisition import Acquisition
from pyGPGO.surrogates.GaussianProcess import GaussianProcess
from pyGPGO.GPGO import GPGO

# 1 - Objective function to maximize (pyGPGO maximizes by default)
def f(x):
    return -np.sin(3 * x) - x ** 2 + 0.7 * x

# 2 - Surrogate model: Gaussian process with a squared-exponential covariance
sexp = squaredExponential()
gp = GaussianProcess(sexp)

# 3 - Acquisition function (Expected Improvement) and search space
acq = Acquisition(mode='ExpectedImprovement')
param = {'x': ('cont', [-1, 2])}

# 4 - Create the GPGO optimizer and run the optimization process
gpgo = GPGO(gp, acq, f, param)
gpgo.run(max_iter=20)

# Best parameters found and the corresponding objective value
print(gpgo.getResult())
  • Define the objective function f to be maximized

  • Build a Gaussian-process surrogate with a squared-exponential kernel and an Expected Improvement acquisition function

  • Describe the search space as a dictionary mapping each parameter name to its type and range

  • Use the run method to execute the optimization loop and getResult to obtain the best hyperparameters and the best objective value

4.8 Ray Tune library

1 Introduction

Ray Tune is the hyperparameter optimization library of the Ray framework and can be used to optimize the performance of machine learning and deep learning models. It provides a simple, easy-to-use API for defining and running hyperparameter search experiments and integrates with many search algorithm libraries.

221674ef7ce8ba8e9d0153030220558f.png

Ray Tune features include:

  • Supports a variety of hyperparameter optimization algorithms, such as grid search, random search, and Bayesian optimization.

  • It can automatically manage the progress and results of experiments and provide a clear visual interface.

  • It can be easily extended to distributed environments and supports multi-node parallel hyperparameter search.

  • Seamlessly integrates with the Ray framework and can be easily extended to other Ray tasks.

Official learning address: https://docs.ray.io/en/latest/tune/index.html

2. Optimization practical cases provided by the official website:

from ray import tune

def objective(config):  # 1 - define the objective function to optimize
    score = config["a"] ** 2 + config["b"]
    return {"score": score}

search_space = {  # 2 - define the hyperparameter search space
    "a": tune.grid_search([0.001, 0.01, 0.1, 1.0]),
    "b": tune.choice([1, 2, 3]),
}

tuner = tune.Tuner(objective, param_space=search_space)  # 3 - run the search process
results = tuner.fit()
print(results.get_best_result(metric="score", mode="min").config)  # best hyperparameter combination

4.9 Bayesian Optimization library

1 Introduction

Bayesian optimization, also known as Bayesian experimental design or sequential design, is a global optimization technique that uses Bayesian reasoning and Gaussian processes. It is a method for finding the maximum or minimum of an unknown function in as few iterations as possible, and is especially suitable for optimizing expensive functions or situations where a balance between exploration and exploitation is required.

Official learning address: https://github.com/bayesian-optimization/BayesianOptimization

9af62fc220364c9335871483f2d54172.png

Install first:

pip install bayesian-optimization

2. Python practical cases

import numpy as np
from bayes_opt import BayesianOptimization  # import name of the bayesian-optimization package

# Define the objective function to maximize
def target_function(x):
    return np.sin(5 * x) + np.cos(2 * x)

# Define the bounded search space of the input variable
pbounds = {'x': (-5, 5)}

# Create the Bayesian optimizer
optimizer = BayesianOptimization(
    f=target_function,
    pbounds=pbounds,
    random_state=1,
)

# init_points: initial random exploration steps; n_iter: Bayesian optimization steps
optimizer.maximize(init_points=3, n_iter=10)

# Best input found and the corresponding objective value
print(optimizer.max)
  • An objective function target_function is defined, a combination of sine and cosine functions to be maximized;

  • A bounded search space pbounds is defined for the input variable;

  • A BayesianOptimization optimizer is created from the objective function and the bounds;

  • Finally, maximize is called (init_points random exploration steps followed by n_iter Bayesian optimization steps) and the best result is read from optimizer.max

4.10 GPyOpt library

1 Introduction

GPyOpt is a Python library based on GPy for implementing Bayesian optimization. It provides a flexible framework that can handle optimization problems with various types of surrogate models such as Gaussian processes and random forests.

b961f3a8e917b41b961008cbd3858c82.png

The GPyOpt library is designed to solve practical problems, including but not limited to function optimization, hyperparameter optimization, and model parameter tuning in deep learning. Users can customize the surrogate model as needed and can easily integrate it with third-party libraries. In addition, GPyOpt supports different surrogate models and acquisition functions to meet the needs of different application scenarios.

Official learning address: https://sheffieldml.github.io/GPyOpt/

Install directly using pip:

pip install gpyopt

Source code based installation:

git clone https://github.com/SheffieldML/GPyOpt.git
cd GPyOpt
git checkout devel
nosetests GPyOpt/testing

Version requirements for the three main dependent packages:

  • GPy (>=1.0.8)

  • numpy (>=1.7)

  • scipy (>=0.16)

2. Use cases based on python

Use the GPyOpt library to solve a simple function optimization problem: try to find the maximum value of a function

import numpy as np
from GPyOpt.methods import BayesianOptimization

# We want the maximum of g(x) = -x^2; GPyOpt minimizes by default,
# so we minimize f(x) = -g(x) = x^2
def f(x):
    return x ** 2

# Define the search domain of the single continuous variable
domain = [{'name': 'x', 'type': 'continuous', 'domain': (-5, 5)}]

# Define the optimizer
optimizer = BayesianOptimization(f=f, domain=domain)

# Run the optimization with at most 100 iterations
optimizer.run_optimization(max_iter=100)

# Print the best solution and the corresponding value
print('Best x:', optimizer.x_opt)
print('Best f(x):', optimizer.fx_opt)

4.11 SigOpt library

1 Introduction

SigOpt hyperparameter optimization library is a software library for optimizing machine learning models. SigOpt's optimization algorithm uses Bayesian optimization, an optimization algorithm used to find the global optimum, often used to find the best combination of hyperparameters in deep learning models.

a639a08b3243d4cdab7486f0476e360e.png

SigOpt's API makes it easy to tune model hyperparameters and can be integrated with many different machine learning libraries (including TensorFlow, PyTorch, Scikit-learn, etc.). SigOpt also provides a visual interface to help users monitor and adjust the optimization process. By using SigOpt, developers can find optimal hyperparameter combinations faster, improving model performance and accuracy.

Official learning address: https://docs.sigopt.com/intro/main-concepts

pip based installation:

pip install sigopt

2. Practical cases based on python

import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
# SigOpt client (an account and API token are required; the experiment loop
# below follows SigOpt's classic Connection API and is only a hedged sketch)
import sigopt

# Define a simple PyTorch model
class SimpleModel(nn.Module):
  def __init__(self):
    super(SimpleModel, self).__init__()
    self.fc = nn.Linear(784, 10)

  def forward(self, x):
    x = x.view(-1, 784)
    x = self.fc(x)
    return x

# Load the MNIST dataset
transform = transforms.Compose([
  transforms.ToTensor(),                       # convert to tensors
  transforms.Normalize((0.1307,), (0.3081,))   # standardize the data
])
train_dataset = datasets.MNIST(root="data", train=True, transform=transform, download=True)

# Objective: train for one epoch with the suggested hyperparameters and return the accuracy
def objective(assignments):
  train_loader = DataLoader(train_dataset, batch_size=int(assignments["batch_size"]), shuffle=True)
  model = SimpleModel()
  optimizer = torch.optim.Adam(model.parameters(), lr=assignments["learning_rate"])
  criterion = nn.CrossEntropyLoss()
  correct, total = 0, 0
  for inputs, labels in train_loader:
    optimizer.zero_grad()        # clear the gradient buffers
    outputs = model(inputs)      # forward pass
    loss = criterion(outputs, labels)
    loss.backward()              # backward pass
    optimizer.step()             # update the weights
    _, predicted = torch.max(outputs.data, 1)
    total += labels.size(0)
    correct += predicted.eq(labels.data).sum().item()
  return correct / total

# Sketch of the SigOpt loop: define the search space (learning rate and batch size),
# ask for suggestions and report observations (exact field names may differ by version)
conn = sigopt.Connection(client_token="YOUR_API_TOKEN")
experiment = conn.experiments().create(
  name="mnist-tuning",
  parameters=[
    dict(name="learning_rate", type="double", bounds=dict(min=1e-5, max=1e-2)),
    dict(name="batch_size", type="int", bounds=dict(min=32, max=256)),
  ],
  metrics=[dict(name="accuracy", objective="maximize")],
  observation_budget=20,
)
for _ in range(experiment.observation_budget):
  suggestion = conn.experiments(experiment.id).suggestions().create()
  accuracy = objective(suggestion.assignments)
  conn.experiments(experiment.id).observations().create(suggestion=suggestion.id, value=accuracy)

4.12 Keras Tuner

1 Introduction

KerasTuner is an easy-to-use distributed hyperparameter optimization framework that solves some of the pain points when performing hyperparameter searches. It utilizes advanced search and optimization methods, such as HyperBand search and Bayesian optimization, to help find optimal neural network hyperparameters.

4b2182634a69198ef403524a63523470.png

Installation command:

pip install keras-tuner --upgrade

Official learning address: https://keras.io/keras_tuner/

2. Practical cases

import keras_tuner
from tensorflow import keras

# Define the model-building function; hp holds the hyperparameters to search
def build_model(hp):
    model = keras.Sequential()
    model.add(keras.layers.Dense(
        hp.Choice("units", [8, 16, 32]),
        activation="relu"
    ))
    model.add(keras.layers.Dense(1, activation="relu"))
    model.compile(loss="mse")
    return model

tuner = keras_tuner.RandomSearch(
    build_model,
    objective="val_loss",
    max_trials=5
)

# x_train, y_train, x_val, y_val are assumed to have been prepared beforehand
tuner.search(x_train, y_train, epochs=5, validation_data=(x_val, y_val))
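After the search finishes, the tuner can report what it found; a short sketch using KerasTuner's retrieval methods:

# Best hyperparameter values and best model found during the search
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]
print(best_hps.get("units"))
best_model = tuner.get_best_models(num_models=1)[0]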

Complete case: https://www.analyticsvidhya.com/blog/2021/06/tuning-hyperparameters-of-an-artificial-neural-network-leveraging-keras-tuner/

5. Hyperparameter tuning based on AutoML library

5.1 What is the AutoML library?

An Automated Machine Learning Library (AutoML) is a software tool or library designed to automate machine learning workflows. These libraries use different algorithms and techniques to automate the process of machine learning tasks, including data preprocessing, feature selection, model selection, parameter optimization, model evaluation, etc.

The purpose of AutoML is to simplify the process of machine learning so that non-professionals can also apply machine learning, or to help professionals handle machine learning tasks more efficiently. These libraries often provide easy-to-use interfaces and are capable of handling large-scale data sets.

5.2 Common automated machine learning libraries

There are several types of automated machine learning libraries:

  • Auto-Sklearn . Auto-Sklearn is an open source AutoML library built on the scikit-learn package. It finds the best performing model as well as the best set of hyperparameters for a given dataset. It includes some feature engineering techniques such as one-hot encoding, feature normalization, dimensionality reduction, etc. This library is suitable for small and medium-sized data sets, but not for large data sets.

  • H2O AutoML . H2O AutoML is a complete end-to-end machine learning automation tool that can handle various types of data sets, including small and big data, standard and non-standard data. It automates the entire machine learning process, including data preparation, model selection, feature selection, model optimization, etc.

  • Auto-Keras . Auto-Keras is an automatic machine learning library based on the Keras deep learning framework. It aims to provide a highly automated design and training process for deep learning models to solve the problem that users may need to repeatedly write a large amount of code when facing different tasks.

  • AutoGluon . AutoGluon is an open source deep learning automatic machine learning library developed and maintained by the AWS team. It is designed to help developers automate all processes of machine learning, including data preprocessing, feature engineering, model selection, and hyperparameter adjustment. AutoGluon uses a technology called "neural architecture search" to automatically select the best model architecture.

  • Pycaret . PyCaret is an open source, low-code machine learning library designed to simplify machine learning workflows and increase productivity. It is a Python library that encapsulates several popular machine learning libraries and frameworks, such as scikit-learn, XGBoost, LightGBM, CatBoost, spaCy, Optuna, Hyperopt, Ray, etc. PyCaret also provides a set of easy-to-use APIs that can help you complete various machine learning tasks, including data preprocessing, model training, evaluation, and deployment.

  • MLBox . MLBox is a powerful automated machine learning library designed to provide machine learning engineers and researchers with a one-stop machine learning solution. It encapsulates a variety of popular machine learning algorithms and frameworks, such as scikit-learn, XGBoost, LightGBM, CatBoost, spaCy, Optuna, Hyperopt, Ray, etc., and integrates them into a unified framework with simple, easy-to-use APIs, making the machine learning process more efficient and convenient.

  • Auto-PyTorch . Auto-PyTorch is an automatic machine learning library based on PyTorch. It aims to provide automated solutions for hyperparameter search and model selection to help users improve efficiency and accuracy when training deep learning models.

5.3 Auto-Sklearn

1 Introduction

Auto-Sklearn is an open source AutoML library that leverages the popular Scikit-Learn machine learning library for data transformations and machine learning algorithms. It was developed by Matthias Feurer et al. and described in their 2015 paper titled "Efficient and Robust Automated Machine Learning".

0b31688e7be033884402ed0088092b09.png

Auto-Sklearn automatically selects the learning algorithm and its hyperparameters using Bayesian optimization combined with meta-learning, and builds an ensemble of the models evaluated during the search. It also automatically handles data pre- and post-processing.

Installation command:

pip install -i https://pypi.tuna.tsinghua.edu.cn/simple auto-sklearn

Official learning address: https://automl.github.io/auto-sklearn/master/

2. Case study

import autosklearn.classification  # classification model from the automated machine learning library
import sklearn.model_selection
import sklearn.datasets
import sklearn.metrics

if __name__ == "__main__":
    X, y = sklearn.datasets.load_digits(return_X_y=True)
    # Split the data
    X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(X, y, random_state=1)
    # Classifier based on the automated machine learning library
    automl = autosklearn.classification.AutoSklearnClassifier()
    # Train the model and predict
    automl.fit(X_train, y_train)
    y_hat = automl.predict(X_test)
    print("Accuracy score", sklearn.metrics.accuracy_score(y_test, y_hat))

5.4 H2O AutoML

1 Introduction

H2O AutoML is an automated machine learning tool developed by H2O.ai. It enables data analysts and scientists to build high-quality predictive models faster and easier by automating processes and techniques in the field of machine learning.

a09eda396b070da70385b7cdd31ed483.png

H2O AutoML supports a variety of algorithms and model selections, including tree-based methods, linear models, and deep learning models. The tool also provides functions such as automatic feature engineering, model cross-validation, and hyperparameter optimization, which can help users automatically perform processes such as data cleaning, feature engineering, model selection, and tuning, thereby improving the accuracy and efficiency of the model.

Official website learning address: https://docs.h2o.ai/h2o/latest-stable/h2o-docs/automl.html

2. Use cases

import h2o
from h2o.automl import H2OAutoML

# Start H2O
h2o.init()

# Import a binary classification dataset
train = h2o.import_file("https://s3.amazonaws.com/erin-data/higgs/higgs_train_10k.csv")
test = h2o.import_file("https://s3.amazonaws.com/erin-data/higgs/higgs_test_5k.csv")

x = train.columns
y = "response"
x.remove(y)

# Convert the response column to a factor for binary classification
train[y] = train[y].asfactor()
test[y] = test[y].asfactor()

# Train models: at most 20 base models
aml = H2OAutoML(max_models=20, seed=1)
aml.train(x=x, y=y, training_frame=train)

# View the leaderboard
lb = aml.leaderboard
lb.head(rows=lb.nrows)

# Predict
preds = aml.predict(test)
# preds = aml.leader.predict(test)

# Leaderboard output with all extra columns
lb = h2o.automl.get_leaderboard(aml, extra_columns = "ALL")

Get the best single model:

m = aml.leader
# equivalent
m = aml.get_best_model()

# Get the best model using a non-default metric
m = aml.get_best_model(criterion="logloss")
# Get the best XGBoost model based on the default sort metric
xgb = aml.get_best_model(algorithm="xgboost")
# Get the best XGBoost model based on logloss
xgb = aml.get_best_model(algorithm="xgboost", criterion="logloss")
# Get a specific model by its id
m = h2o.get_model("StackedEnsemble_BestOfFamily_AutoML_20191213_174603")

# Get the model's parameter information
xgb.params.keys()
# A specific parameter
xgb.params['ntrees']

Get H2O’s training log and time information:

log = aml.event_log  # event log
info = aml.training_info  # training time info

5.5 Auto-Keras

1 Introduction

Auto-Keras is an automated machine learning (AutoML) tool designed to simplify the construction and optimization of deep learning models. It utilizes Neural Architecture Search (NAS) technology to automatically select appropriate model structures and hyperparameters, thereby greatly reducing the tedious work of manual adjustment.

Auto-Keras uses intelligent search algorithms to automatically search for the best model structure and hyperparameters suitable for the data set, thereby providing the best model performance. It provides a simple and easy-to-use interface, allowing even users without in-depth understanding of deep learning to quickly build powerful deep learning models.


Auto-Keras supports a variety of data types, including traditional structured data, images, text, and time series. It also provides many advanced features, such as automated model selection, a simple user interface, and support for multiple data types, allowing users without specialist knowledge to quickly build powerful machine learning models.

Install Auto-Keras:

pip install autokeras

Version requirements for Python and TensorFlow: Python >= 3.7 and TensorFlow >= 2.8.0.

Official learning address: https://autokeras.com/

2. Practical cases

import numpy as np
import tensorflow as tf
from tensorflow.keras.datasets import mnist

import autokeras as ak

# Load the dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()
print(x_train.shape)  # (60000, 28, 28)
print(y_train.shape)  # (60000,)
print(y_train[:3])  # array([5, 0, 4], dtype=uint8)

# Create the model and train it
clf = ak.ImageClassifier(overwrite=True, max_trials=1)
clf.fit(x_train, y_train, epochs=10)
# Predict with the model
predicted_y = clf.predict(x_test)
# Evaluate the model
clf.evaluate(x_test, y_test)

Use the validation data set:

clf.fit(
    x_train,
    y_train,
    validation_split=0.15,   # fraction of the training data used for validation
    epochs=10,
)

# Manually split off a validation set
split = 50000
x_val = x_train[split:]
y_val = y_train[split:]
x_train = x_train[:split]
y_train = y_train[:split]
clf.fit(
    x_train,
    y_train,
    validation_data=(x_val, y_val),
    epochs=10,
)

Custom search space:

input_node = ak.ImageInput()
output_node = ak.ImageBlock(
    block_type="resnet",
    normalize=True,  # standardize the data
    augment=False    # no data augmentation
)(input_node)

output_node = ak.ClassificationHead()(output_node)

clf = ak.AutoModel(inputs=input_node,
                   outputs=output_node,
                   overwrite=True,
                   max_trials=1)

clf.fit(x_train, y_train, epochs=10)

5.6 AutoGluon

1 Introduction

AutoGluon is an automated machine learning framework developed and maintained by the AWS team. It can automatically adjust hyperparameters and select the best deep learning model to solve regression and classification problems. AutoGluon provides an easy-to-use interface to control automated machine learning processes and can be integrated with other commonly used deep learning frameworks.

ee887b18bbe2c8142a14da58b84eee56.png

The principle of AutoGluon is based on the idea of ​​automated machine learning. It uses a series of algorithms and technologies to realize the automated machine learning process. Among them, AutoGluon uses a technology called "neural architecture search" to automatically select the best model architecture. In addition, AutoGluon also provides some practical functions, such as automatic data enhancement, automatic model selection, automatic adjustment of learning rate, etc.

Installation under Windows 10 system:

https://auto.gluon.ai/stable/install.html

conda create -n myenv python=3.9 -y   # create a virtual environment with the specified Python version
conda activate myenv                  # activate the virtual environment

pip install -U pip
pip install -U setuptools wheel
# Install the CPU builds of torch and torchvision
pip install torch==1.13.1+cpu torchvision==0.14.1+cpu -f https://download.pytorch.org/whl/cpu/torch_stable.html
# Install AutoGluon
pip install autogluon

Official website learning address: https://auto.gluon.ai/stable/index.html

AutoGluon requires Python version 3.8, 3.9, or 3.10 and is available on Linux, MacOS, and Windows.

2. Practical cases

import pandas as pd
import numpy as np
np.random.seed(42)

import warnings
warnings.filterwarnings("ignore")

from autogluon.tabular import TabularDataset, TabularPredictor
# Read the online data
data_url = 'https://raw.githubusercontent.com/mli/ag-docs/main/knot_theory/'
train = TabularDataset(f'{data_url}train.csv')
# Basic information about the data
train.shape  # (10000, 19)
train.columns
# Result
Index(['Unnamed: 0', 'chern_simons', 'cusp_volume',
       'hyperbolic_adjoint_torsion_degree', 'hyperbolic_torsion_degree',
       'injectivity_radius', 'longitudinal_translation',
       'meridinal_translation_imag', 'meridinal_translation_real',
       'short_geodesic_imag_part', 'short_geodesic_real_part', 'Symmetry_0',
       'Symmetry_D3', 'Symmetry_D4', 'Symmetry_D6', 'Symmetry_D8',
       'Symmetry_Z/2 + Z/2', 'volume', 'signature'],
      dtype='object')

# Target variable
label  = "signature"
train[label].describe()
# Train the model
predictor = TabularPredictor(label=label).fit(train)
# Predict
test = TabularDataset(f"{data_url}test.csv")
pred = predictor.predict(test.drop(columns=label))

# Evaluate the model
predictor.evaluate(test, silent=True)
# Result
{'accuracy': 0.9448,
 'balanced_accuracy': 0.7445352845015228,
 'mcc': 0.9323703476874563}

# Compare different models
predictor.leaderboard(test, silent=True)

5.7 PyCaret

1 Introduction

PyCaret is an open source, low-code machine learning library designed to simplify machine learning workflows and increase productivity. It is a Python library that encapsulates several popular machine learning libraries and frameworks, such as scikit-learn, XGBoost, LightGBM, CatBoost, spaCy, Optuna, Hyperopt, Ray, etc.

Official learning address: https://pycaret.org/

b9eb0bdd05dcef83e62fc4b34c5e27d8.png

Install pycaret based on Tsinghua source:

pip install pycaret  -i  https://pypi.tuna.tsinghua.edu.cn/simple

# Install optional dependencies
pip install pycaret[analysis]
pip install pycaret[models]
pip install pycaret[tuner]
pip install pycaret[mlops]
pip install pycaret[parallel]
pip install pycaret[test]
pip install pycaret[analysis,models]

Version requirements:

  • Python 3.7, 3.8, 3.9, and 3.10

  • Ubuntu 16.04 or later

  • Windows 7 or later

2. Practical cases

A practical case on a binary classification problem, using the built-in data set

from pycaret.datasets import get_data
from pycaret.classification import *

data = get_data("diabetes")
# View basic information about the data

# 1 - Functional API
s = setup(data, target="Class variable", session_id=123)

# 2 - OOP API
from pycaret.classification import ClassificationExperiment
s = ClassificationExperiment()
s.setup(data, target="Class variable", session_id=123)

# Compare different models

# Functional API
# best = compare_models()
# OOP API
best = s.compare_models()

# Model analysis

# Functional API
# evaluate_model(best)
# OOP API
s.evaluate_model(best)

# Model prediction

# Functional API
# predict_model(best)
# OOP API
s.predict_model(best)

To save and load models use:

# functional API
# save_model(best, 'my_first_pipeline')
# OOP API
s.save_model(best, 'my_first_pipeline')

# functional API
# loaded_model = load_model('my_first_pipeline')
# OOP API
loaded_model = s.load_model('my_first_pipeline')
print(loaded_model)

5.8 MLBox

1 Introduction

MLBox is a powerful automated machine learning library designed to provide machine learning engineers and researchers with a one-stop machine learning solution.

MLBox provides a variety of data preprocessing, feature engineering, model selection, and hyperparameter optimization functions to help users quickly build and evaluate various machine learning models. It also supports many types of tasks, including classification, regression, clustering, anomaly detection, etc. In addition, MLBox also provides some additional functions, such as model interpretation and model deployment, which can help users better understand and apply machine learning models. It provides the following functions:

  • Quickly perform data reading and distributed data preprocessing/cleaning/formatting.

  • High-reliability feature selection and information leakage detection.

  • Precise hyperparameter optimization in high-dimensional space.

  • State-of-the-art predictive models (Deep Learning, Stacking, LightGBM, etc.) for classification and regression.

  • Predictions with model explanations.

2010664b53593e07aa71466abcfb9ada.jpeg

Official website learning address: https://mlbox.readthedocs.io/en/latest/

Python version requirements: 3.5 - 3.7, 64-bit only (32-bit Windows systems are no longer supported).

pip install mlbox

Source code based installation:

# Linux or macOS
git clone git://github.com/AxeldeRomblay/mlbox
cd MLBox
python setup.py install

2. Practical cases

from mlbox.preprocessing import *
from mlbox.optimisation import *
from mlbox.prediction import *

paths = ["<file_1>.csv", "<file_2>.csv", ..., "<file_n>.csv"]
target_name = "<my_target>"

data = Reader(sep=",").train_test_split(paths, target_name)
data = Drift_thresholder().fit_transform(data)

# Evaluate the default pipeline
opt = Optimiser()
opt.evaluate(None, data)

# Custom search space
space = {
        'ne__numerical_strategy' : {"space" : [0, 'mean']},
        'ce__strategy' : {"space" : ["label_encoding", "random_projection", "entity_embedding"]},
        'fs__strategy' : {"space" : ["variance", "rf_feature_importance"]},
        'fs__threshold': {"search" : "choice", "space" : [0.1, 0.2, 0.3]},
        'est__strategy' : {"space" : ["LightGBM"]},
        'est__max_depth' : {"search" : "choice", "space" : [5,6]},
        'est__subsample' : {"search" : "uniform", "space" : [0.6,0.9]}
        }
best = opt.optimise(space, data, max_evals = 5)

# Predict
Predictor().fit_predict(best, data)

5.9 Auto-PyTorch

1 Introduction

Auto-PyTorch is an automatic machine learning framework that uses PyTorch to implement automatic search of neural network architectures. Auto-PyTorch can automatically select the best neural network architecture based on the characteristics of the data set, thereby optimizing the performance of the model.

9782a2111af591eadb2f24f456916f63.png

Auto-PyTorch's algorithm automatically searches the neural network's architecture, hyperparameters, etc. to find the model that performs best on a specific data set. This automatic search can be accelerated by GPU computing, greatly reducing the tedious work of manually adjusting model parameters and architecture.

To use Auto-PyTorch, you need to install PyTorch and the Auto-PyTorch library first. Then, you can define training and testing data sets by writing simple Python code and call Auto-PyTorch's API for automatic model training and testing.

GitHub official website address: https://github.com/automl/Auto-PyTorch

Install:

pip install autoPyTorch
# For time-series forecasting support
pip install autoPyTorch[forecasting]

Manual installation:

# Create a virtual environment
conda create -n auto-pytorch python=3.8
conda activate auto-pytorch
conda install swig
python setup.py install

2. Practical cases

from autoPyTorch.api.tabular_classification import TabularClassificationTask

# Load the data
import sklearn.model_selection
import sklearn.datasets
import sklearn.metrics
X, y = sklearn.datasets.load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(X, y, random_state=1)

# Instantiate the classification task
api = TabularClassificationTask()

# Automated search
api.search(
    X_train=X_train,
    y_train=y_train,
    X_test=X_test,
    y_test=y_test,
    optimize_metric='accuracy',
    total_walltime_limit=300,
    func_eval_time_limit_secs=50
)

# Compute the accuracy
y_pred = api.predict(X_test)
score = api.score(y_pred, y_test)
print("Accuracy score", score)

6. Algorithm-based hyperparameter tuning

6.1 Algorithm-based hyperparameter tuning

Algorithm-based hyperparameter optimization refers to automatically adjusting hyperparameters by running different algorithms, such as genetic algorithms, particle swarm optimization algorithms, etc., to find the optimal hyperparameter combination. This method uses algorithms to search the hyperparameter space to find the optimal combination of hyperparameters.

Algorithm-based hyperparameter optimization usually requires setting some parameters, such as population size, number of iterations, etc., which will affect the optimization effect. When running the algorithm, the effects of different hyperparameter combinations are evaluated based on certain evaluation criteria, and the optimal hyperparameter combination is gradually searched for.

6.2 Common algorithms used for hyperparameter optimization

  • Bayesian Optimization and HyperBand: A hyperparameter optimization algorithm that combines Bayesian optimization and HyperBand algorithms. The goal of BOHB is to find the optimal combination of hyperparameters within a given budget so that the machine learning model can achieve the best performance on a specific task.

  • Genetic optimization algorithm: Genetic optimization algorithm is a method of searching for optimal solutions by simulating the natural evolution process. It is a global optimization method that can search for the optimal solution within a larger solution space.

  • Gradient optimization algorithm: The gradient optimization algorithm is an optimization algorithm based on gradient descent, used to solve complex optimization problems. It searches for the optimal solution by iteratively adjusting parameters to minimize the loss function: at each iteration, the algorithm updates the parameter values based on the current gradient direction. Examples include momentum gradient descent, the Adam algorithm, and L-BFGS.

  • Population optimization algorithm: The population optimization algorithm is an optimization algorithm based on the principle of biological evolution in nature. It simulates the process of biological evolution and gradually finds the optimal solution through continuous iteration and the process of survival of the fittest. The core idea of ​​the population optimization algorithm is to transform the problem to be optimized into a fitness function, and then gradually find the optimal solution through continuous iteration and the process of survival of the fittest.

6.3 Bayesian Optimization and HyperBand (BOHB)

1 Introduction

Bayesian Optimization and HyperBand mixes the Hyperband algorithm and Bayesian optimization. The BOHB algorithm uses the Bayesian optimization algorithm for sampling and evaluates combinations of hyperparameters to find the best combination of hyperparameters. It combines the advantages of global and local search and can quickly find the optimal hyperparameter combination while being robust and scalable.

The BOHB algorithm is suitable for deep learning tasks, and by selecting appropriate hyperparameters, the performance and accuracy of the model can be significantly improved. It works across different models and datasets, and new hyperparameters and constraints can be easily added.

The steps of the BOHB algorithm are as follows:

  1. Initialization: Select the initial range and distribution of hyperparameters, define evaluation metrics and the number of evaluations.

  2. Run the Bayesian optimization algorithm: Based on the initial range and distribution, hyperparameter combinations are generated and evaluated and selected using the Bayesian optimization algorithm.

  3. Resource allocation and model selection based on the HyperBand algorithm: based on the evaluation results of the Bayesian optimization step, the candidate configurations are divided into several levels, and different amounts of resources are allocated according to the level (see the sketch after this list).

  4. Repeat steps 2 and 3 until the preset number of assessments or budget is reached.

  5. Output the optimal hyperparameter combination: Select the optimal hyperparameter combination from all evaluation results as the final result.
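
To make step 3 more concrete, here is a minimal sketch of the successive-halving resource allocation performed in the spirit of HyperBand. It is an illustrative sketch, not a library API: sample_config() stands for whatever sampling is used (in BOHB, sampling guided by the Bayesian model), evaluate(config, budget) trains and validates a model under the given budget and returns a loss, and values such as n_configs=27 and eta=3 are arbitrary examples.

import numpy as np

# Successive halving: start many configurations on a small budget, then repeatedly
# keep the best 1/eta fraction and give the survivors eta times more budget
def successive_halving(sample_config, evaluate, n_configs=27, min_budget=1, eta=3):
    configs = [sample_config() for _ in range(n_configs)]
    budget = min_budget
    while len(configs) > 1:
        # Evaluate every surviving configuration with the current budget
        losses = np.array([evaluate(cfg, budget) for cfg in configs])
        # Keep the best 1/eta fraction and increase the budget
        keep = max(1, len(configs) // eta)
        configs = [configs[i] for i in np.argsort(losses)[:keep]]
        budget *= eta
    return configs[0]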

References:

https://www.automl.org/blog_bohb/

https://neptune.ai/blog/hyperband-and-bohb-understanding-state-of-the-art-hyperparameter-optimization-algorithms

2. Python practical cases

import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam
from keras.losses import CategoricalCrossentropy
from keras.utils import to_categorical
from sklearn.model_selection import train_test_split
from bandit import Bandit   # illustrative black-box optimizer wrapper used in this example (not a standard package)

# Load the dataset: the last column of data.csv is the target variable, all preceding columns are features
data = np.loadtxt('data.csv', delimiter=',')
X = data[:, :-1]
y = to_categorical(data[:, -1])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Objective function: build and train a model for one hyperparameter combination
# (here assumed to be passed as (units, learning_rate)) and return the validation loss to minimize
def bohb(params):
    units, learning_rate = params
    model = Sequential()
    model.add(Dense(int(units), activation='relu', input_shape=(X_train.shape[1],)))
    model.add(Dense(int(units), activation='relu'))
    model.add(Dense(y_train.shape[1], activation='softmax'))
    model.compile(optimizer=Adam(learning_rate=learning_rate),
                  loss=CategoricalCrossentropy(), metrics=['accuracy'])
    history = model.fit(X_train, y_train, epochs=10, batch_size=32,
                        validation_split=0.2, verbose=0)
    return history.history['val_loss'][-1]

# Define the optimizer and run the search
bandit = Bandit(bohb, n_iter=1000, n_restarts=10, noise=0.1)
bandit.run()

6.4 Genetic Algorithm

1. What is a genetic algorithm?

Genetic optimization algorithm is an optimization algorithm that searches for optimal solutions by simulating the natural evolution process.

This algorithm is mainly inspired by natural selection, crossover (recombination of genetic information) and mutation processes in biological evolution. Genetic optimization algorithms are usually used to solve some complex optimization problems, such as function optimization, combinatorial optimization, parameter optimization in machine learning, etc.

https://cloud.tencent.com/developer/article/1425840

2. Basic steps of genetic algorithm:

  • Initialization: The algorithm first randomly generates a population containing possible solutions.

  • Fitness evaluation: Each individual (that is, each solution in the population) has a corresponding fitness value. This fitness value indicates how good the individual is and is usually derived from the objective function value.

  • Selection: A portion of the individuals in the population is selected for reproduction based on fitness; individuals with higher fitness have a greater chance of being selected.

  • Crossover: The selected individuals generate new individuals through the crossover operation, which simulates genetic recombination in biological evolution.

  • Mutation: To maintain the diversity of the population, some individuals are randomly mutated, simulating genetic mutations in biological evolution.

  • Replacement: Some individuals in the original population are replaced with the newly generated individuals to form a new population.

  • Termination condition: If the termination condition is met (for example, the maximum number of iterations is reached or the best fitness reaches a target accuracy), stop the search; otherwise return to the fitness evaluation step.

3. Python practice of genetic algorithm

import numpy as np

# Objective function to minimize
def f(x):
    return x ** 2

# Genetic algorithm optimizer
def genetic_algorithm(n_pop, n_gen, lower_bound, upper_bound):
    # Initialize the population
    pop = np.random.uniform(lower_bound, upper_bound, n_pop)
    for gen in range(n_gen):
        # Evaluate fitness (lower f(x) is better, so invert it for roulette-wheel selection)
        fit = np.array([f(x) for x in pop])
        prob = 1.0 / (fit + 1e-12)
        prob /= prob.sum()
        # Selection (roulette wheel)
        idx = np.random.choice(np.arange(n_pop), size=n_pop, replace=True, p=prob)
        pop = pop[idx]
        # Crossover (swap adjacent pairs with probability 0.5, then set the first of each pair to their mean)
        for i in range(0, n_pop, 2):
            if np.random.rand() < 0.5:
                pop[i], pop[i+1] = pop[i+1], pop[i]
            pop[i] = (pop[i] + pop[i+1]) / 2.0
        # Mutation (small Gaussian perturbation with probability 0.1)
        for i in range(n_pop):
            if np.random.rand() < 0.1:
                pop[i] += np.random.normal(0, 0.5)
    # Return the best individual in the final population
    fit = np.array([f(x) for x in pop])
    return pop[np.argmin(fit)]

# Parameter settings
n_pop = 100          # population size
n_gen = 100          # number of generations
lower_bound = -10    # lower bound of the search space
upper_bound = 10     # upper bound of the search space

# Run the genetic optimization
best_x = genetic_algorithm(n_pop, n_gen, lower_bound, upper_bound)
print('Best solution: x = %.3f, f(x) = %.3f' % (best_x, f(best_x)))

  • The objective function f(x) = x**2 is defined, and then a genetic algorithm optimization function is implemented

  • A uniformly distributed random population is generated with NumPy and the fitness of each individual is computed

  • In each generation, a new population is produced using roulette-wheel selection (with probabilities inversely proportional to f(x), since the goal is minimization), arithmetic crossover, and random Gaussian mutation. Finally, the individual with the smallest objective value in the last generation is returned as the optimal solution

6.5 Gradient Optimization Algorithm

1. What is gradient optimization?

Gradient-based optimization is a method that utilizes gradient information for optimization. The basic principle is to use the gradient information of the objective function to gradually and iteratively update the parameters to find the minimum (or maximum) value of the objective function.

Specifically, the gradient optimization algorithm determines the direction and step size of the parameter update by computing the gradient of the objective function, that is, the slope of the function at the current point. In each iteration, the algorithm moves the parameters in the opposite direction of the gradient, so the value of the objective function decreases. Through this iterative process, the gradient optimization algorithm gradually approaches the minimum point of the objective function.

Gradient optimization algorithms are often combined with other optimization techniques, such as the momentum method and learning rate annealing; a minimal momentum sketch follows.
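
The sketch below illustrates gradient descent with momentum, assuming a generic gradient function grad(theta); the learning rate, momentum factor, and toy quadratic objective are illustrative choices, not tuned recommendations.

import numpy as np

# Gradient descent with momentum: accumulate a decaying average of past gradients
# and step along that accumulated direction
def momentum_descent(grad, theta0, lr=0.1, momentum=0.9, iters=100):
    theta = np.asarray(theta0, dtype=float)
    velocity = np.zeros_like(theta)
    for _ in range(iters):
        velocity = momentum * velocity - lr * grad(theta)
        theta = theta + velocity
    return theta

# Example: minimize f(theta) = theta[0]**2 + 10 * theta[1]**2
theta_opt = momentum_descent(lambda t: np.array([2 * t[0], 20 * t[1]]), [5.0, 5.0])
print(theta_opt)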

2. Practical Python implementation of a gradient optimization algorithm

import numpy as np

# Load the dataset (XOR-style toy data used only to demonstrate the update mechanics)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([[0], [1], [1], [0]])

# Parameter initialization
theta = np.random.randn(2, 1)
alpha = 0.1   # learning rate
iters = 1000  # number of iterations

# Gradient descent for logistic regression
for i in range(iters):
    # Compute predictions (sigmoid) and the error
    y_pred = 1 / (1 + np.exp(-np.dot(X, theta)))
    error = y_pred - y
    # Compute the gradient of the cross-entropy loss
    gradient = np.dot(X.T, error) / len(X)
    # Update the parameters in the opposite direction of the gradient
    theta -= alpha * gradient

# Output the result
print('Optimal solution: theta =', theta)

6.6 Population-based Optimization Algorithm

1. Principle of population optimization algorithm

The principle of Population-Based Optimization Algorithms (POAs) is based on swarm intelligence and the rules of population evolution: complex optimization problems are solved by algorithms that simulate the biological evolution process. In a population optimization algorithm, each individual represents a possible solution, and the entire population represents the set of candidate solutions. Each individual has a fitness value that indicates how good it is for the optimization problem.

The core idea is to turn the problem to be optimized into a fitness function and then gradually find the optimal solution through repeated iteration and survival of the fittest. In each iteration, the individuals in the population are ranked by fitness and then selected, crossed, and mutated with certain probabilities to generate new individuals. This process is repeated until the optimal solution is found or a preset number of iterations is reached.

Population optimization algorithms have strong global search ability, depend little on the structure of the specific problem, and are robust, making them suitable for complex nonlinear optimization problems. Common population optimization algorithms include the genetic algorithm, particle swarm optimization, and the ant colony algorithm.

2. Practical Python implementation of a population optimization algorithm

import numpy as np

# Fitness function: decode the binary chromosome into an integer and maximize x ** 2
def decode(individual):
    return int(''.join(map(str, individual)), 2)

def fitness(individual):
    return decode(individual) ** 2

# Initialize the population
population_size = 100
gene_length = 10
population = np.random.randint(2, size=(population_size, gene_length))

# Evolution parameters
iters = 100                    # number of generations
crossover_probability = 0.8    # crossover probability
mutation_probability = 0.01    # mutation probability

for i in range(iters):
    # Selection (roulette wheel, proportional to fitness)
    fitness_values = np.array([fitness(ind) for ind in population], dtype=float)
    selection_probability = fitness_values / fitness_values.sum()
    selected = np.random.choice(np.arange(population_size), size=population_size, p=selection_probability)
    population = population[selected]

    # Crossover (single-point crossover on adjacent pairs)
    for j in range(0, population_size - 1, 2):
        if np.random.rand() < crossover_probability:
            pos = np.random.randint(1, gene_length)
            tail_j = population[j, pos:].copy()
            population[j, pos:] = population[j + 1, pos:]
            population[j + 1, pos:] = tail_j

    # Mutation (flip a random bit)
    for j in range(population_size):
        if np.random.rand() < mutation_probability:
            pos = np.random.randint(gene_length)
            population[j, pos] = 1 - population[j, pos]

# Output the best solution found in the final population
fitness_values = np.array([fitness(ind) for ind in population])
best_solution = population[np.argmax(fitness_values)]
print('Best solution:', best_solution, 'fitness:', fitness_values.max())
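
For comparison, particle swarm optimization is also mentioned above as a common population-based method. Below is a minimal PSO sketch that minimizes f(x) = x**2 over a 1-D search space; the swarm size, inertia weight w, and the cognitive/social coefficients c1 and c2 are illustrative assumptions rather than recommended settings.

import numpy as np

# Minimal particle swarm optimization (PSO) for a 1-D objective
def pso(f, lower, upper, n_particles=30, iters=100, w=0.7, c1=1.5, c2=1.5):
    rng = np.random.default_rng(42)
    x = rng.uniform(lower, upper, n_particles)   # particle positions
    v = np.zeros(n_particles)                    # particle velocities
    pbest = x.copy()                             # each particle's best position so far
    pbest_val = f(x)
    gbest = pbest[np.argmin(pbest_val)]          # best position found by the swarm
    for _ in range(iters):
        r1, r2 = rng.random(n_particles), rng.random(n_particles)
        # Velocity update: inertia + attraction to personal best + attraction to global best
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, lower, upper)
        values = f(x)
        improved = values < pbest_val
        pbest[improved] = x[improved]
        pbest_val[improved] = values[improved]
        gbest = pbest[np.argmin(pbest_val)]
    return gbest

best = pso(lambda x: x ** 2, lower=-10, upper=10)
print('PSO best solution:', best)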

