How to retrain machine learning models on SAP Leonardo

Jerry's previous two articles described how to consume, through a RESTful API, the machine learning models pre-trained on SAP Leonardo.

As Jerry mentioned there, the Product Image Classification API supports only 29 product categories:

 

If the application we are developing needs to support additional product categories, we have to provide our own pictures of those categories and retrain the model.

 

Here are the steps to retrain a machine learning model on SAP Leonardo.

Suppose we want the retrained Product Image Classification model to recognize different kinds of flowers. We first need a large number of flower pictures. The TensorFlow website kindly provides an archive for learners who want to practice model training, and it contains a large number of pictures of various flowers.

http://download.tensorflow.org/example_images/flower_photos.tgz

A data set that SAP Leonardo accepts for retraining must follow the hierarchy shown below: three folders named training, validation and test, each containing subfolders named after the product categories, with the data divided among the three folders in a ratio of 8:1:1.
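The split described above can be scripted instead of done by hand. The following is only a minimal sketch: it assumes the extracted archive contains one subfolder per flower category, and the function name split_dataset and its parameters are made up for this illustration, not part of any SAP tool.

```python
import random
import shutil
from pathlib import Path


def split_dataset(source_dir, target_dir, ratios=(8, 1, 1), seed=0):
    """Copy category folders from source_dir into training/validation/test.

    Returns the number of files copied into each of the three folders.
    """
    split_names = ("training", "validation", "test")
    total = sum(ratios)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    counts = {name: 0 for name in split_names}
    for class_dir in sorted(Path(source_dir).iterdir()):
        if not class_dir.is_dir():
            continue  # skip stray files that are not category folders
        images = sorted(p for p in class_dir.iterdir() if p.is_file())
        rng.shuffle(images)
        # Cumulative cut points, so every image lands in exactly one folder.
        first_cut = len(images) * ratios[0] // total
        second_cut = len(images) * (ratios[0] + ratios[1]) // total
        parts = (images[:first_cut],
                 images[first_cut:second_cut],
                 images[second_cut:])
        for name, files in zip(split_names, parts):
            destination = Path(target_dir) / name / class_dir.name
            destination.mkdir(parents=True, exist_ok=True)
            for image in files:
                shutil.copy2(image, destination / image.name)
            counts[name] += len(files)
    return counts
```

Calling split_dataset("flower_photos", "flowersjerry") would then produce a flowersjerry folder with the three required subfolders; both folder names here are only examples.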

 

Once the training data is ready, the next step is to upload it to the online storage that SAP Leonardo provides for models.

Jerry's previous article, about consuming these services in the SAP Cloud Platform CloudFoundry environment, described how to create a SAP Leonardo machine learning service instance on SAP Cloud Platform. The service key of this instance contains an IMAGE_RETRAIN_API_URL, which can be used to obtain the url of the online storage:

 

Send an HTTP GET request to this url to obtain the url of the online storage:

 

Paste this url into a browser and log in with the accessKey and secretKey returned in Postman. The online storage can then be accessed through a web interface:

 

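The next step is to upload the local training files to this online storage, which is hosted on AWS.

First use the command mc config host to define a remote site named sapjerrys3, and bind to it the AWS online storage url, accessKey and secret obtained from Postman in the previous step: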

 

Then upload the files from the command line:

mc.exe cp -r C:\Code\MachineLearningStudy\flowersjerry sapjerrys3\data

After a little more than ten minutes, the upload is complete:

 

The uploaded training files can now be seen in the AWS online storage from the browser.

 

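Now we can submit a background job so that Leonardo processes the uploaded files. ABAP consultants can think of this as defining and submitting a background job in Netweaver transaction SM36. Send an HTTP POST request. Apart from jobName, dataset and modelName shown in the figure below, which you have to maintain yourself, all other fields use the default values defined on the SAP website.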
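For readers who prefer a script to Postman, the request body can be assembled as below. This is only a sketch under assumptions: the helper build_retrain_job_body is made up for illustration, the job name flowerjob is hypothetical, and the remaining default fields are not reproduced here, so they have to be copied from the SAP documentation and passed in through the defaults parameter.

```python
import json


def build_retrain_job_body(job_name, dataset, model_name, defaults=None):
    """Return the JSON body of the retraining request as a string."""
    # Fields taken unchanged from the SAP documentation go in `defaults`;
    # they are not listed here because this article does not spell them out.
    body = dict(defaults or {})
    # The three fields this article says you must maintain yourself.
    body.update({
        "jobName": job_name,
        "dataset": dataset,
        "modelName": model_name,
    })
    return json.dumps(body, indent=2)


# "flowerjob" is a hypothetical job name; the other two values are the
# uploaded folder and the model name used in this article.
print(build_retrain_job_body("flowerjob", "flowersjerry", "flowerjerrymodel"))
```

The resulting string can be pasted into Postman or sent with any HTTP client as the body of the POST request.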

 

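This request returns a background job ID. Note it down, append it to the end of the url, and send an HTTP GET request to query the execution status of the job. When Jerry retrained the model, the job status changed to SUCCEEDED after about five minutes.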
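Instead of resending the GET request by hand, the status check can be wrapped in a small polling loop. The sketch below is an illustration only: wait_for_job is a made-up helper, and fetch_status stands for whatever function you write to send the GET request and extract the status string from the response. Only the final states SUCCEEDED and FAILED come from this article.

```python
import time


def wait_for_job(fetch_status, interval_seconds=30, max_attempts=40,
                 sleep=time.sleep):
    """Call fetch_status() repeatedly until the job has finished.

    Returns "SUCCEEDED" or "FAILED"; raises TimeoutError if neither
    state is reached within max_attempts checks.
    """
    for attempt in range(max_attempts):
        status = fetch_status()
        if status in ("SUCCEEDED", "FAILED"):
            return status
        if attempt < max_attempts - 1:
            sleep(interval_seconds)  # wait before asking again
    raise TimeoutError(f"job still not finished after {max_attempts} checks")


# Demo with a stand-in for the real HTTP call; the intermediate status
# name "RUNNING" is invented for this demo.
fake_statuses = iter(["RUNNING", "RUNNING", "SUCCEEDED"])
print(wait_for_job(lambda: next(fake_statuses), sleep=lambda seconds: None))
```

Passing the sleep function as a parameter keeps the loop easy to try out without actually waiting.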

 

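Because the exercises in the previous article and in this one were both done in the CloudFoundry environment of SAP Cloud Platform, we can also use the cf command line to query the execution status of these jobs: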

cf sapml retraining jobs -m image

 

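If a job ends with the status FAILED, look in the AWS online storage for the folder named after the job. It contains detailed training logs that can be used for error analysis: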

 

Before the trained model can actually be used, it still has to be deployed, which is similar to the "activation" step in ABAP Netweaver.

 

Like submitting the training job, model deployment is an asynchronous step. After submitting the deployment request, we get a deployment job ID: ms-26c5a22c-6d07-4164-8222-a4182969162d

 

The deployment status of the model can be queried with this deployment job ID:

 

Once the deployment succeeds, we can consume the model through the RESTful API. The url has the following format:

https://mlfinternalproduction-image-classifier.cfapps.sap.hana.ondemand.com/api/v2/image/classification/models/<model name>/versions/1
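To avoid typing mistakes when filling in the model name, the url can be assembled in code. This is a small sketch based only on the format above; the helper name model_endpoint and the default version of 1 are choices made for this example.

```python
from urllib.parse import quote

# Base part of the url, taken from the format shown in this article.
BASE_URL = (
    "https://mlfinternalproduction-image-classifier.cfapps.sap.hana.ondemand.com"
    "/api/v2/image/classification/models"
)


def model_endpoint(model_name, version=1):
    """Return the classification url for a retrained model and version."""
    # quote() escapes characters that are not allowed in a url path segment.
    return f"{BASE_URL}/{quote(model_name, safe='')}/versions/{version}"


print(model_endpoint("flowerjerrymodel"))
```

The printed url is the one to which the HTTP POST request with the picture is sent.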

I picked a sunflower photo at random from the internet.

 

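I sent this picture as the parameter of an HTTP POST request to flowerjerrymodel, the model I had retrained and deployed. The result shows that the retrained model considers the picture about 87% likely to be a sunflower.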

 

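Next, if time permits, Jerry plans to collect some Alien pictures for training, to see whether SAP Leonardo can recognize the Alien ornaments hanging on my desk. Thanks for reading.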

 


Origin jerrywang-sap.iteye.com/blog/2442950