DeepCTR v0.7.1 Release Notes

Hello everyone,

Since the deepctr v0.7.0 release at the end of November last year, I was unable, for personal reasons, to follow up in a timely manner on questions raised in the GitHub issue area, the DeepCTR discussion group, and by email. I would like to apologize here and hope this did not affect anyone's study or work.

After lying flat at home for a few days over the holiday, I finally felt like turning on the computer and tinkering with the code again. This article introduces the main changes in the new v0.7.1 release.

Main features and improvements

The weight_normalization parameter of WeightedSequenceLayer is not exposed in the public API

  • Problem description
    https://github.com/shenweichen/DeepCTR/issues/163

  • Problem analysis

This issue traces back to v0.6.3, where we added support for sequence features with weights.

When we implemented weighted sequence features, we also implemented support for normalizing the weight scores, but users had no way to discover this option in actual use: the model does not normalize by default, and enabling weight score normalization required modifying the source code to set weight_normalization=True on WeightedSequenceLayer.

  • Solution

A weight_norm parameter was added to VarLenSparseFeat. When defining a weighted sequence feature, the user can pass weight_norm as True or False to control whether the weight scores are normalized; the default is True.
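Conceptually, weight normalization applies a masked softmax over the valid positions of the sequence weights, so that padded positions receive zero weight. A minimal NumPy sketch of that idea (an illustration of the technique, not the actual DeepCTR implementation):

```python
import numpy as np

def normalize_weights(weights, mask):
    """Masked softmax over sequence weight scores.

    weights: (batch, maxlen) raw weight scores
    mask:    (batch, maxlen) 1 for valid positions, 0 for padding
    """
    # Push padded positions toward -inf so they contribute ~0 probability
    paddings = np.full_like(weights, -2.0 ** 31)
    scores = np.where(mask.astype(bool), weights, paddings)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(scores) * mask  # zero out padding exactly
    return exp / exp.sum(axis=-1, keepdims=True)

w = np.array([[1.0, 2.0, 0.0]])
m = np.array([[1, 1, 0]])
# Normalized weights over the two valid positions; the padded slot gets 0
print(normalize_weights(w, m))
```

With weight_norm=False, the raw scores would be used as-is instead of the softmax output.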

Bug fixes

Sparse features in the linear feature columns with an embedding dimension other than 1 cause the model to lose its memorization ability

  • Problem description
    https://github.com/shenweichen/DeepCTR/issues/178

  • Problem analysis

This issue was introduced in v0.7.0, where we added support for different embedding dimensions across different feature groups.

In versions prior to v0.7.0, models accepted linear_feature_columns and dnn_feature_columns, representing the memorization of the wide side and the generalization of the deep side, respectively. For a SparseFeat on the wide side, the model automatically set its embedding dimension to 1 to simulate a one-hot memorization feature; on the deep side, the embedding dimension of all SparseFeat feature groups was controlled by the model's embedding_size parameter.

In v0.7.0, the embedding dimension of a feature group must instead be set with the embedding_dim parameter of SparseFeat when defining the feature columns; this parameter defaults to 4.

If the user passes SparseFeat columns into linear_feature_columns without setting embedding_dim=1, the wide side of the model loses its memorization ability.

  • Solution

In get_linear_logit, the method that computes the wide-side logit, embedding_dim=1 is now forcibly applied to every SparseFeat. In other words, whatever embedding_dim a user sets on SparseFeat columns passed into linear_feature_columns is ignored and overridden to 1 by the model.
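The fix can be illustrated with a small, self-contained sketch. The SparseFeat below is a simplified stand-in with only the relevant fields, and force_linear_embedding_dim is a hypothetical helper mirroring what get_linear_logit now does internally:

```python
from collections import namedtuple

# Simplified stand-in for deepctr's SparseFeat (the real one carries more fields)
SparseFeat = namedtuple('SparseFeat', ['name', 'vocabulary_size', 'embedding_dim'])

def force_linear_embedding_dim(feature_columns):
    """Return a copy of the columns with embedding_dim forced to 1,
    mimicking the override applied to the wide side."""
    return [
        fc._replace(embedding_dim=1) if isinstance(fc, SparseFeat) else fc
        for fc in feature_columns
    ]

cols = [SparseFeat('item_id', vocabulary_size=1000, embedding_dim=4)]
fixed = force_linear_embedding_dim(cols)
print(fixed[0].embedding_dim)  # 1, regardless of what the user passed in
```

Note that the user's original columns are left untouched; only the copies used for the wide-side logit are overridden.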

The exception thrown by the version check is hard for users to understand and hinders further use

  • Problem description
    https://github.com/shenweichen/DeepCTR/issues/176

  • Problem analysis

On machines that cannot reach the network, or whose pip configuration has been modified, deepctr's version check throws a long exception that users find hard to understand.

  • Solution

When the version check fails, it now prompts the user to visit the deepctr website to check the version manually, instead of printing the error and exception details.
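The fix amounts to catching any failure and printing a short hint instead of the raw traceback. A hedged sketch of the pattern (the function and message below are illustrative, not the exact deepctr code; the fetcher is passed in so the sketch stays self-contained):

```python
def check_version(current_version, fetch_latest):
    """Best-effort version check: never surface a raw traceback to the user."""
    try:
        latest = fetch_latest()  # may raise on machines without network access
    except Exception:
        # Any failure (offline machine, modified pip config, ...) becomes a hint
        return ("Version check failed. Please visit the DeepCTR website "
                "to check for the latest version manually.")
    if latest != current_version:
        return f"DeepCTR {latest} is available; you are using {current_version}."
    return "You are using the latest DeepCTR version."

def offline():
    raise ConnectionError("no network")

print(check_version("0.7.1", offline))  # friendly hint, no traceback
```

The key design choice is the broad `except Exception`: for a purely informational check, any failure mode should degrade to a one-line message rather than interrupt the user.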

API changes

deepctr.layers.sequence.WeightedSequenceLayer

The default value of weight_normalization in WeightedSequenceLayer is now True

  • Old: deepctr.layers.sequence.WeightedSequenceLayer(weight_normalization=False, supports_masking=False)

  • New: deepctr.layers.sequence.WeightedSequenceLayer(weight_normalization=True, supports_masking=False)

deepctr.inputs.VarLenSparseFeat

Since VarLenSparseFeat and SparseFeat share many parameters, and in many cases the shared parameters take the same values (for example, a user's historical item click sequence and the candidate item being scored), the initialization of VarLenSparseFeat was changed to take a SparseFeat instance plus the other sequence-related parameters.

From the user's perspective, understanding the parameters of SparseFeat plus a few sequence-related parameters is all it takes to use VarLenSparseFeat.

  • Old: VarLenSparseFeat(name, maxlen, vocabulary_size, embedding_dim=4, combiner="mean", use_hash=False, dtype="float32", length_name=None, weight_name=None, embedding_name=None, group_name=DEFAULT_GROUP_NAME)

  • New: VarLenSparseFeat(sparsefeat, maxlen, combiner="mean", length_name=None, weight_name=None, weight_norm=True)
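The new signature composes rather than duplicates: shared attributes live on the wrapped SparseFeat and the sequence feature delegates to it. A minimal self-contained sketch of that pattern (simplified stand-ins with only a few fields, not the real deepctr classes):

```python
from collections import namedtuple

# Simplified stand-in; the real deepctr SparseFeat carries more fields
SparseFeat = namedtuple('SparseFeat', ['name', 'vocabulary_size', 'embedding_dim'])

class VarLenSparseFeat(namedtuple('VarLenSparseFeatBase',
        ['sparsefeat', 'maxlen', 'combiner',
         'length_name', 'weight_name', 'weight_norm'])):

    def __new__(cls, sparsefeat, maxlen, combiner='mean',
                length_name=None, weight_name=None, weight_norm=True):
        return super().__new__(cls, sparsefeat, maxlen, combiner,
                               length_name, weight_name, weight_norm)

    # Shared attributes are delegated to the wrapped SparseFeat
    @property
    def name(self):
        return self.sparsefeat.name

    @property
    def embedding_dim(self):
        return self.sparsefeat.embedding_dim

hist = VarLenSparseFeat(
    SparseFeat('hist_item_id', vocabulary_size=1000, embedding_dim=8),
    maxlen=20, combiner='mean')
print(hist.name, hist.embedding_dim, hist.weight_norm)  # hist_item_id 8 True
```

This way the embedding configuration for, say, the candidate item and the historical click sequence can be defined once on SparseFeat and reused for both.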

That's all for this update. Run pip install -U deepctr to upgrade! I hope everyone will keep supporting the project and sharing feedback. Thanks!

Origin blog.csdn.net/u012151283/article/details/104102846