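PMML model files, called from Java

Converting a custom class to PMML

In sklearn, I implemented a custom class by inheriting from TransformerMixin and put it into a pipeline. Generating the PMML file with sklearn2pmml then fails with an error saying that the custom transformer class is not supported.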

import os
import time

import lightgbm as lgb
import numpy as np
import pandas as pd
import sklearn2pmml
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LogisticRegression
from sklearn2pmml.pipeline import PMMLPipeline


# Custom transformer implementation
class GbdtEncoder(BaseEstimator, TransformerMixin):
    def __init__(self):
        super().__init__()

    def fit(self, X, y=None):
        return self  # nothing else to do

    def transform(self, X, y=None):
        num_leaf = 30
        gbdt_model = lgb.LGBMClassifier(boosting_type='gbdt', num_leaves=num_leaf, reg_alpha=0.0, reg_lambda=1,
                                        max_depth=6, n_estimators=30, objective='binary',
                                        learning_rate=0.06, random_state=20, n_jobs=4)
        # X carries the label column 'success' next to the features, so split them here.
        # Note that the GBDT is trained inside transform(), i.e. again on every call.
        y = X['success']
        X = X[NUMERIC_COLS]
        gbdt_model.fit(X, y)
        # Leaf index of every sample in every tree, shape (N, num_trees)
        y_pred = gbdt_model.predict(X, pred_leaf=True)

        # One-hot encode the leaf indices, shape (N, num_trees * num_leaf)
        transformed_training_matrix = np.zeros([len(y_pred), (len(y_pred[0]) * num_leaf)],
                                               dtype=np.int64)
        for i in range(0, len(y_pred)):
            temp = np.arange(len(y_pred[0])) * num_leaf + np.array(y_pred[i])
            transformed_training_matrix[i][temp] += 1

        return transformed_training_matrix

pipeline = PMMLPipeline([
    ('gbdt_encoder', GbdtEncoder()),
    ('classifier', LogisticRegression(penalty='l2', C=0.05))
])
# The label column is appended to the features because GbdtEncoder.transform() reads it from X
X = pd.concat([df[NUMERIC_COLS], df['success']], axis=1)
pipeline.fit(X, df['success'])
today_real = time.strftime('%Y%m%d')
print('today_real:', today_real)
save_path = './order_sort'
file_name = 'look_order_cross_city_new.pmml'
os.makedirs(save_path, exist_ok=True)  # make sure the output directory exists
sklearn2pmml.sklearn2pmml(pipeline, os.path.join(save_path, file_name))
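
The leaf-index encoding done by the loop in transform() can also be written without a Python loop. The following is only a minimal sketch of that step, not part of the original code: the function name one_hot_leaves and the toy input are made up for illustration, and it assumes the leaf indices come as an (n_samples, n_trees) integer array, which is the shape predict(X, pred_leaf=True) returns.

```python
import numpy as np


def one_hot_leaves(leaf_idx, num_leaves):
    """One-hot encode per-tree leaf indices (hypothetical helper).

    leaf_idx: integer array of shape (n_samples, n_trees); entry [i, t] is
    the leaf that sample i falls into in tree t.
    Returns an int64 array of shape (n_samples, n_trees * num_leaves).
    """
    leaf_idx = np.asarray(leaf_idx)
    n_samples, n_trees = leaf_idx.shape
    encoded = np.zeros((n_samples, n_trees * num_leaves), dtype=np.int64)
    # Each tree owns a block of num_leaves columns; offset the leaf index by
    # the start of that block (broadcasts (n_trees,) against (n_samples, n_trees)).
    cols = np.arange(n_trees) * num_leaves + leaf_idx
    rows = np.arange(n_samples)[:, None]
    encoded[rows, cols] = 1
    return encoded


# Toy input: 2 samples, 2 trees, 3 leaves per tree
leaves = np.array([[0, 2],
                   [1, 0]])
print(one_hot_leaves(leaves, num_leaves=3))
```

For the toy input, sample 0 lands in leaf 0 of tree 0 and leaf 2 of tree 1, so columns 0 and 5 are set; sample 1 sets columns 1 and 3.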

Error

java.lang.IllegalArgumentException: The transformer object (Python class __main__.GbdtEncoder) is not a supported Transformer
	at org.jpmml.sklearn.CastFunction.apply(CastFunction.java:43)
	at sklearn.pipeline.Pipeline$1.apply(Pipeline.java:121)
	at sklearn.pipeline.Pipeline$1.apply(Pipeline.java:112)
	at com.google.common.collect.Lists$TransformingRandomAccessList.get(Lists.java:599)
	at sklearn.TransformerUtil.getHead(TransformerUtil.java:35)
	at sklearn2pmml.pipeline.PMMLPipeline.encodePMML(PMMLPipeline.java:189)
	at org.jpmml.sklearn.Main.run(Main.java:145)
	at org.jpmml.sklearn.Main.main(Main.java:94)
Caused by: java.lang.ClassCastException: Cannot cast net.razorvine.pickle.objects.ClassDict to sklearn.Transformer
	at java.lang.Class.cast(Class.java:3369)
	at org.jpmml.sklearn.CastFunction.apply(CastFunction.java:41)
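
The trace shows where the conversion stops: the converter cannot cast the unpickled GbdtEncoder object (a class defined in __main__) to one of its own Transformer classes. One possible workaround, sketched here under assumptions and not taken from the original post, is to drop the custom class and use the GBDTLRClassifier from sklearn2pmml.ensemble, which wraps a GBDT and a logistic regression in a single estimator that sklearn2pmml can convert. The function name build_gbdt_lr_pipeline is hypothetical, the hyperparameters are copied from the code above, and the sketch assumes an installed sklearn2pmml version that ships the sklearn2pmml.ensemble module.

```python
def build_gbdt_lr_pipeline():
    """Build a GBDT + LR pipeline from classes that sklearn2pmml ships itself.

    Hypothetical helper. The imports sit inside the function so that the
    optional dependencies (lightgbm, sklearn2pmml) are only needed when the
    pipeline is actually built.
    """
    from lightgbm import LGBMClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn2pmml.ensemble import GBDTLRClassifier
    from sklearn2pmml.pipeline import PMMLPipeline

    # Same GBDT settings as in GbdtEncoder.transform() above
    gbdt = LGBMClassifier(boosting_type='gbdt', num_leaves=30, max_depth=6,
                          n_estimators=30, objective='binary',
                          learning_rate=0.06, random_state=20, n_jobs=4)
    # Same logistic regression as the original 'classifier' step
    lr = LogisticRegression(penalty='l2', C=0.05)
    # GBDTLRClassifier(gbdt, lr) trains the GBDT and then the LR on its leaf encoding
    return PMMLPipeline([('classifier', GBDTLRClassifier(gbdt, lr))])
```

With such a pipeline the label no longer has to travel inside X: it would be fitted with pipeline.fit(df[NUMERIC_COLS], df['success']) and exported with the same sklearn2pmml.sklearn2pmml(...) call as before.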

sklearn custom transformers [TransformerMixin, BaseEstimator, fit_transform, fit, transform]

https://blog.csdn.net/ITW_633/article/details/83414898

https://blog.csdn.net/weixin_32549789/article/details/84937921

Using custom functions in a PMMLPipeline

https://blog.csdn.net/weixin_38569817/article/details/87810658

Using Pipeline in sklearn

https://www.jianshu.com/p/9c2c8c8ef42d

Saving a lightgbm model as PMML and calling it from Java

https://blog.csdn.net/luoyexuge/article/details/80087952

Taking Python models to production: exploring how sklearn2pmml converts custom functions

https://blog.csdn.net/weixin_42835182/article/details/82218765

Notes on using pyspark ml with PMML (worth studying the whole workflow)

https://yuerblog.cc/2018/07/17/pyspark-ml-and-pmml-usage/

Reposted from blog.csdn.net/u013385018/article/details/103132605