Converting a custom class to PMML
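In sklearn, I implemented a custom class by inheriting from TransformerMixin and put it into a pipeline. Generating the PMML file with sklearn2pmml then fails with an error saying that the custom transformer class is not supported.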
import os
import time

import lightgbm as lgb
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LogisticRegression
from sklearn2pmml import PMMLPipeline, sklearn2pmml

# NUMERIC_COLS (the feature column names) and df (the training DataFrame)
# are defined elsewhere and are not shown here.

# Custom transformer
class GbdtEncoder(BaseEstimator, TransformerMixin):
    def __init__(self):
        super().__init__()

    def fit(self, X, y=None):
        return self  # nothing else to do

    def transform(self, X, y=None):
        num_leaf = 30
        gbdt_model = lgb.LGBMClassifier(boosting_type='gbdt', num_leaves=num_leaf, reg_alpha=0.0, reg_lambda=1,
                                        max_depth=6, n_estimators=30, objective='binary',
                                        learning_rate=0.06, random_state=20, n_jobs=4)
        # The label travels inside X as the 'success' column.
        y = X['success']
        X = X[NUMERIC_COLS]
        # Note: the GBDT is trained here, inside transform, so it is refit on every call.
        gbdt_model.fit(X, y)
        # Leaf index of every sample in every tree, shape (N, num_trees)
        y_pred = gbdt_model.predict(X, pred_leaf=True)
        transformed_training_matrix = np.zeros([len(y_pred), len(y_pred[0]) * num_leaf],
                                               dtype=np.int64)  # shape: N x (num_trees * num_leaf)
        for i in range(len(y_pred)):
            # Column of the active leaf in each tree's block of num_leaf columns
            temp = np.arange(len(y_pred[0])) * num_leaf + np.array(y_pred[i])
            transformed_training_matrix[i][temp] += 1
        return transformed_training_matrix

pipeline = PMMLPipeline([
    ('gbdt_encoder', GbdtEncoder()),
    ('classifier', LogisticRegression(penalty='l2', C=0.05))
])
X = pd.concat([df[NUMERIC_COLS], df['success']], axis=1)
pipeline.fit(X, df['success'])

today_real = time.strftime('%Y%m%d', time.localtime(time.time()))
print('today_real:', today_real)
save_path = './order_sort'
file_name = 'look_order_cross_city_new.pmml'
sklearn2pmml(pipeline, os.path.join(save_path, file_name))
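As a side note on the encoding step in transform: the per-row loop that builds the one-hot matrix can probably be replaced by a single NumPy indexing operation. The sketch below is only an illustration. The function name one_hot_leaves and the toy input are hypothetical and not part of the original script, and it assumes every leaf index is smaller than num_leaf.

```python
import numpy as np

def one_hot_leaves(leaf_idx, num_leaf):
    """One-hot encode per-tree leaf indices.

    leaf_idx: integer array of shape (n_samples, n_trees); entry [i, t] is
    the leaf that sample i falls into in tree t (assumed < num_leaf).
    Returns an int64 array of shape (n_samples, n_trees * num_leaf).
    """
    leaf_idx = np.asarray(leaf_idx, dtype=np.int64)
    n_samples, n_trees = leaf_idx.shape
    # Each tree owns a block of num_leaf columns; add the leaf index inside it.
    cols = np.arange(n_trees) * num_leaf + leaf_idx
    encoded = np.zeros((n_samples, n_trees * num_leaf), dtype=np.int64)
    # Row indices broadcast against cols, so each row gets one 1 per tree.
    rows = np.arange(n_samples)[:, None]
    encoded[rows, cols] = 1
    return encoded

# Hypothetical toy input: 2 samples, 3 trees, 4 leaves per tree.
demo = one_hot_leaves([[0, 3, 1], [2, 2, 0]], num_leaf=4)
print(demo)
```

This only changes how the matrix is filled. It does not make the custom class convertible to PMML.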
Running the script raises the following error:
java.lang.IllegalArgumentException: The transformer object (Python class __main__.GbdtEncoder) is not a supported Transformer
    at org.jpmml.sklearn.CastFunction.apply(CastFunction.java:43)
    at sklearn.pipeline.Pipeline$1.apply(Pipeline.java:121)
    at sklearn.pipeline.Pipeline$1.apply(Pipeline.java:112)
    at com.google.common.collect.Lists$TransformingRandomAccessList.get(Lists.java:599)
    at sklearn.TransformerUtil.getHead(TransformerUtil.java:35)
    at sklearn2pmml.pipeline.PMMLPipeline.encodePMML(PMMLPipeline.java:189)
    at org.jpmml.sklearn.Main.run(Main.java:145)
    at org.jpmml.sklearn.Main.main(Main.java:94)
Caused by: java.lang.ClassCastException: Cannot cast net.razorvine.pickle.objects.ClassDict to sklearn.Transformer
    at java.lang.Class.cast(Class.java:3369)
    at org.jpmml.sklearn.CastFunction.apply(CastFunction.java:41)
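The trace suggests the cause: the Java converter unpickles each pipeline step and casts it to a Java class it knows. __main__.GbdtEncoder has no Java counterpart, so it arrives as a generic ClassDict and the cast to sklearn.Transformer fails. One possible workaround is to build the GBDT + LR model only from classes the converter ships with. The sketch below assumes that the installed sklearn2pmml version provides sklearn2pmml.ensemble.GBDTLRClassifier and that it accepts a LightGBM classifier as its first argument. The function name build_gbdt_lr_pipeline is hypothetical, and this has not been checked against the author's environment.

```python
def build_gbdt_lr_pipeline():
    """Build a GBDT + LR pipeline without any user-defined transformer class."""
    # Imported lazily so the sketch can be loaded without the optional
    # dependencies (lightgbm, sklearn2pmml) installed.
    from lightgbm import LGBMClassifier
    from sklearn.linear_model import LogisticRegression
    # Assumption: this class exists in the installed sklearn2pmml version.
    from sklearn2pmml.ensemble import GBDTLRClassifier
    from sklearn2pmml.pipeline import PMMLPipeline

    # Same hyperparameters as the GBDT inside GbdtEncoder.transform.
    gbdt = LGBMClassifier(boosting_type='gbdt', num_leaves=30, reg_alpha=0.0,
                          reg_lambda=1, max_depth=6, n_estimators=30,
                          objective='binary', learning_rate=0.06,
                          random_state=20, n_jobs=4)
    lr = LogisticRegression(penalty='l2', C=0.05)
    # The ensemble is expected to fit the GBDT, encode its leaf indices,
    # and fit the LR on the encoded leaves.
    return PMMLPipeline([('classifier', GBDTLRClassifier(gbdt, lr))])
```

Under these assumptions, the pipeline would be fitted with pipeline.fit(df[NUMERIC_COLS], df['success']) and exported with the same sklearn2pmml call as before. The label no longer needs to be passed inside X.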
sklearn custom transformers [TransformerMixin, BaseEstimator, fit_transform, fit, transform]
https://blog.csdn.net/ITW_633/article/details/83414898
https://blog.csdn.net/weixin_32549789/article/details/84937921
Using custom functions in a PMMLPipeline
https://blog.csdn.net/weixin_38569817/article/details/87810658
Using Pipeline in sklearn
https://www.jianshu.com/p/9c2c8c8ef42d
Storing a LightGBM model as PMML and calling it from Java
https://blog.csdn.net/luoyexuge/article/details/80087952
Putting Python models into production: exploring how sklearn2pmml converts custom functions
https://blog.csdn.net/weixin_42835182/article/details/82218765