The Maddening Configuration

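If you use hyperopt to search the hyperparameter space of certain algorithms, you will find the configuration hard to write. Take SVM as an example: kernel can be linear, rbf, and so on, and each kernel function comes with its own parameters. rbf also needs gamma, while poly needs three parameters: gamma, coef0, and degree. The gamma parameter, in turn, can be either auto or a number chosen by the programmer. Got a headache yet?
Figure 1: Dependencies among the SVM parameters
A headache is no big deal; the code can be written bit by bit. So, following the hyperopt documentation, I wrote the hyperparameter space configuration below: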

SVM hyperparameter space configuration code

from hyperopt import hp

space = {
    'C': hp.uniform('C', 0.001, 1000),
    'shrinking': hp.choice('shrinking', [True, False]),
    'kernel': hp.choice('kernel', [
        {
            'name': 'rbf',
            'gamma': hp.choice('rbf_gamma', ['auto', hp.uniform('rbf_gamma_uniform',0.0001, 8)])
        },
        {
            'name': 'linear',
        },
        {
            'name': 'sigmoid',
            'gamma': hp.choice('sigmoid_gamma', ['auto', hp.uniform('sigmoid_gamma_uniform',0.0001, 8)]),
            'coef0': hp.uniform('sigmoid_coef0', 0, 10)
        },
        {
            'name': 'poly',
            'gamma': hp.choice('poly_gamma', ['auto', hp.uniform('poly_gamma_uniform',0.0001, 8)]),
            'coef0': hp.uniform('poly_coef0', 0, 10),
            'degree': hp.uniformint('poly_degree', 1, 5),
        }
    ])
}

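There are two main pain points.

First, every hyperopt hyperparameter must declare a label, and the label acts as the hyperparameter's variable name, so it must be unique. I therefore had to laboriously invent a separate name for the gamma variable of each kernel function: rbf_gamma, sigmoid_gamma, and so on. It is enough to make your hair fall out.
Second, the objective function has to be hard-coded by the programmer, as follows:

SVM objective function code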

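import numpy as np
from hyperopt import STATUS_OK
from sklearn import datasets, svm
from sklearn.model_selection import cross_val_score

iris = datasets.load_iris()


def svm_from_cfg(cfg):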
    kernel_cfg=cfg.pop('kernel')
    kernel=kernel_cfg.pop('name')
    cfg.update(kernel_cfg)
    cfg['kernel']=kernel
    clf=svm.SVC(**cfg)
    scores = cross_val_score(clf, iris.data, iris.target, cv=5)
    return {
        'loss':1 - np.mean(scores),
        'status':STATUS_OK
    }

Arise, You Who Refuse to Write Configs

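After all, a programmer's mission is to automate tedious work, so I decided to write a program that puts an end to this tedium.
Looking at the problem afresh, the parameters we need to optimize are simply the parameters of an estimator class, and the dependencies shown in Figure 1 are simply conditional dependencies, which we will call condition_option. We then abstract the estimator itself as estimator_option. So at the top level we write two classes, and every other parameter setting is passed in as kwargs.
After an evening of work, the code was done. The search space can now be configured like this: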

space = estimator_option(
    svm.SVC,
    C=uniform(0.001, 1000),
    shrinking=choice([True, False]),
    kernel=choice([
        condition_option(
            'rbf',
            gamma=choice(['auto', uniform(0.0001, 8)])
        ),
        condition_option(
            'linear'
        ),
        condition_option(
            'sigmoid',
            gamma=choice(['auto', uniform(0.0001, 8)]),
            coef0=uniform(0, 10)
        ),
        condition_option(
            'poly',
            gamma=choice(['auto', uniform(0.0001, 8)]),
            coef0=uniform(0, 10),
            degree=uniformint(1, 5)
        )
    ])
)

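Compared with the earlier code, doesn't this feel much cleaner?
There is also no longer any need to hard-code an objective function for each estimator; the program does that automatically.
For the full code, see: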
https://github.com/TQCAI/hyperopt-simple-cfg/blob/master/simple-cfg.py

Implementation Details

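First, define a set of proxy classes that stand in for the corresponding functions under hyperopt.hp. For example, uniform proxies hp.uniform. Because the proxies carry no label, labels can be assigned dynamically later.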

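from collections import namedtuple
from importlib import import_module


class FacadeMixin():
    def get_kwargs(self):
        raise NotImplementedError()

    def get_function(self):
        # Look up the hyperopt.hp function that shares this class's name
        # (e.g. uniform -> hp.uniform); hyperopt is imported lazily.
        hp = import_module('hyperopt').hp
        return getattr(hp, self.__class__.__name__)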


class uniform(namedtuple('uniform', ['low', 'high']), FacadeMixin):
    def get_kwargs(self):
        return {'low': self.low, 'high': self.high}


class choice(namedtuple('choice', ['options']), FacadeMixin):
    def get_kwargs(self):
        return {'options': self.options}


class uniformint(namedtuple('uniformint', ['low', 'high']), FacadeMixin):
    def get_kwargs(self):
        return {'low': self.low, 'high': self.high}
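
To see why label-free proxies make dynamic labelling possible, here is a minimal sketch of the same idea with no hyperopt dependency. fake_hp is a hypothetical stand-in for hyperopt.hp that merely records its arguments, and Proxy.build is an illustrative name that does not exist in the project. The point is only that the label is supplied when the node is built, not when it is declared.

```python
from collections import namedtuple
from types import SimpleNamespace

# Hypothetical stand-in for hyperopt.hp: the function just records what it receives.
fake_hp = SimpleNamespace(
    uniform=lambda label, low, high: ('uniform', label, low, high),
)


class Proxy:
    def build(self, label):
        # Resolve the target function by the proxy's own class name.
        fn = getattr(fake_hp, type(self).__name__)
        # A namedtuple unpacks into its fields, here (low, high).
        return fn(label, *self)


class uniform(namedtuple('uniform', ['low', 'high']), Proxy):
    pass


node = uniform(0.001, 1000)   # declared without any label
built = node.build('C')       # label attached only at build time
print(built)
```

The same declaration could be built twice under two different labels, which is what lets one gamma definition be reused across several kernels.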

Next, write a condition_option class that inherits from dict:

class condition_option(dict):
    def __init__(self, name, **kwargs):
        super(condition_option, self).__init__()
        self.name = name
        self.update(kwargs)
        self.update({'condition': name})

    def __str__(self):
        return f'condition_option({super().__str__()})'

    def __repr__(self):
        return f'condition_option({super().__repr__()})'

Next comes the most complex part, the recursive parsing.
During the recursion, the label of every hyperparameter is assigned dynamically.

from typing import Union


def __get_prefix(prefix, name):
    if not prefix:
        return str(name)
    else:
        return f'{prefix}_{name}'

def to_hp(x: Union[dict, list, tuple, FacadeMixin], prefix=''):
    if isinstance(x, dict):
        for k, v in x.items():
            cur_prefix = __get_prefix(prefix, k)
            if isinstance(v, (FacadeMixin, dict, list, tuple)):
                x[k] = to_hp(v, cur_prefix)
        return x
    elif isinstance(x, FacadeMixin):
        kwargs: dict = x.get_kwargs()
        for sk, sv in kwargs.items():
            if isinstance(sv, (FacadeMixin, dict, list, tuple)):  # such as options arguments maybe
                cur_prefix = prefix
                if isinstance(sv, FacadeMixin):
                    cur_prefix = __get_prefix(prefix, sv.__class__.__name__)
                sv = to_hp(sv, cur_prefix)
            kwargs[sk] = sv
        values = list(kwargs.values())
        return x.get_function()(prefix, *values)
    elif isinstance(x, (tuple, list)):
        cls = x.__class__
        lst = []
        for ix, ele in enumerate(x):
            if isinstance(ele, (FacadeMixin, dict, list, tuple)):
                cur_prefix = __get_prefix(prefix, ix)
                ele = to_hp(ele, cur_prefix)
            lst.append(ele)
        return cls(lst)
    else:
        raise ValueError(f'Invalid type ({type(x)}) in recursion ')
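
The labelling rule is easier to follow on a small input. The sketch below is a simplified, hyperopt-free rendition of the traversal: Uniform and Choice are stand-ins for the proxy classes, and collect_labels is a hypothetical helper that only lists the labels the recursion would assign instead of building hyperopt expressions. It assumes, as in the configurations shown here, that samplers are nested only through a choice's option list.

```python
from collections import namedtuple

# Stand-ins for the proxy classes; no hyperopt needed for this illustration.
Uniform = namedtuple('Uniform', ['low', 'high'])
Choice = namedtuple('Choice', ['options'])


def join(prefix, name):
    return str(name) if not prefix else f'{prefix}_{name}'


def collect_labels(x, prefix=''):
    """List the labels a to_hp-style recursion would assign, in visit order."""
    labels = []
    if isinstance(x, (Uniform, Choice)):      # check before tuple: namedtuples are tuples
        labels.append(prefix)                 # the sampler takes the current path as label
        if isinstance(x, Choice):
            labels += collect_labels(x.options, prefix)
    elif isinstance(x, dict):
        for key, value in x.items():
            labels += collect_labels(value, join(prefix, key))
    elif isinstance(x, (list, tuple)):
        for index, element in enumerate(x):
            labels += collect_labels(element, join(prefix, index))
    return labels                             # plain values such as 'auto' add nothing


demo_space = {
    'C': Uniform(0.001, 1000),
    'kernel': Choice([
        {'condition': 'rbf', 'gamma': Choice(['auto', Uniform(0.0001, 8)])},
        {'condition': 'linear'},
    ]),
}
labels = collect_labels(demo_space)
print(labels)
```

Each label is the path from the root to the sampler (dictionary keys and list indices joined by underscores), so two gamma parameters under different kernels can never collide.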

Then write an estimator_option class to proxy the estimator (that is, sklearn's SVC and the like):

class estimator_option(dict):
    def __init__(self, estimator, **kwargs):
        super(estimator_option, self).__init__()
        self.estimator = estimator
        self.update(kwargs)
        self.update({'estimator': estimator})

    def __str__(self):
        return f'estimator_option({super().__str__()})'

    def __repr__(self):
        return f'estimator_option({super().__repr__()})'

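The objective function takes a set of hyperparameters, constructs an estimator from them, cross-validates it on the data, and uses the mean error over the validation folds as the optimization objective.
Here, the other parameters inside a condition_option are merged into the estimator's constructor arguments.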


from copy import deepcopy

from hyperopt import STATUS_FAIL


def estimator_from_cfg(cfg: estimator_option):
    cfg_ = deepcopy(cfg)
    estimator = cfg_.pop('estimator')
    for k, v in deepcopy(cfg).items():
        # A sampled condition_option arrives as a dict carrying a 'condition' key.
        if isinstance(v, dict) and 'condition' in v.keys():
            value_name = v.pop('condition')
            cfg_.pop(k)
            cfg_.update(v)                # e.g. gamma, coef0, degree
            cfg_.update({k: value_name})  # e.g. kernel='poly'
    try:
        clf = estimator(**cfg_)
        scores = cross_val_score(clf, iris.data, iris.target, cv=5)
    except Exception:  # a bare except would also swallow KeyboardInterrupt
        print('fail')
        return {'loss': np.inf, 'status': STATUS_FAIL}
    return {
        'loss': 1 - np.mean(scores),
        'status': STATUS_OK,
        'cfg': cfg
    }
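
The merging step can be tried in isolation. The following is a minimal sketch, assuming a sampled configuration shaped like the one hyperopt would hand to the objective function; flatten_conditions is a hypothetical helper name, and the 'estimator' entry is left out to keep the example free of sklearn.

```python
from copy import deepcopy


def flatten_conditions(cfg):
    """Merge every {'condition': name, ...} sub-dict into the top level."""
    flat = {}
    for key, value in cfg.items():
        if isinstance(value, dict) and 'condition' in value:
            extra = deepcopy(value)              # leave the caller's dict untouched
            flat[key] = extra.pop('condition')   # e.g. kernel='poly'
            flat.update(extra)                   # e.g. gamma, coef0, degree
        else:
            flat[key] = value
    return flat


sampled = {
    'C': 2.5,
    'shrinking': True,
    'kernel': {'condition': 'poly', 'gamma': 'auto', 'coef0': 0.5, 'degree': 3},
}
flat = flatten_conditions(sampled)
print(flat)
```

The flattened dictionary is exactly the set of keyword arguments the estimator's constructor expects, which is why no per-estimator objective function is needed.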

Use the to_hp function to translate the user-defined space into sp, a structure hyperopt understands, and we are done!

from hyperopt import Trials, fmin, tpe

sp = to_hp(space)
trials = Trials()
best = fmin(estimator_from_cfg,
            space=sp,
            algo=tpe.suggest,
            max_evals=2000,
            trials=trials)
print(best)

Search results:

100%|██████████| 2000/2000 [01:15<00:00, 26.43trial/s, best loss: 0.006666666666666599]
{'C': 0.8490833723228467, 'kernel': 1, 'shrinking': 0}
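
Note that fmin reports hp.choice parameters as indices into the option list rather than as the options themselves, so 'kernel': 1 refers to the second entry of the kernel list. hyperopt provides space_eval for mapping such a result back onto a search space. The sketch below does the lookup by hand for the two top-level choices so that it runs without hyperopt; the option lists are copied from the space definition above, and the variable names are illustrative.

```python
# Option lists in the same order as in the search space definition.
kernel_options = ['rbf', 'linear', 'sigmoid', 'poly']
shrinking_options = [True, False]

# The result printed by fmin: choice parameters appear as indices.
best = {'C': 0.8490833723228467, 'kernel': 1, 'shrinking': 0}

decoded = {
    'C': best['C'],
    'kernel': kernel_options[best['kernel']],
    'shrinking': shrinking_options[best['shrinking']],
}
print(decoded)
```

Read this way, the best trial used the linear kernel with shrinking enabled, which is also why no gamma, coef0, or degree entries appear in the result.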

The code is at https://github.com/TQCAI/hyperopt-simple-cfg. Stars, forks, and follows are appreciated, and issues are welcome.

