PaddleDTX supports one-click startup of a test network through docker compose. The official website lists the following environment requirements:
- docker, recommended version 18.03+
- docker-compose, recommended version 1.26.0+
- On a Mac, Docker Desktop must be configured with at least 4GB of runtime memory
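The requirements above can be checked before starting the network. The following is a minimal sketch, not part of PaddleDTX: the helper names version_ge and check_tool are hypothetical, and the comparison assumes a sort that supports the -V (version sort) option, as GNU coreutils does.

```shell
#!/bin/sh
# Hypothetical pre-flight check; not shipped with PaddleDTX.

# Succeeds if version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# check_tool TOOL MIN_VERSION
# Reports whether TOOL is installed and at least MIN_VERSION.
check_tool() {
    tool=$1
    min=$2
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "$tool: not installed"
        return 1
    fi
    # Take the first dotted number in the output of `TOOL --version`
    found=$("$tool" --version 2>/dev/null | grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1)
    if [ -z "$found" ]; then
        echo "$tool: could not determine version"
        return 1
    fi
    if version_ge "$found" "$min"; then
        echo "$tool $found: ok (recommended $min+)"
    else
        echo "$tool $found: older than recommended $min"
        return 1
    fi
}

# Minimum versions are the ones recommended in the list above.
check_tool docker 18.03 || true
check_tool docker-compose 1.26.0 || true
```

The Docker Desktop memory setting on a Mac is not checked by this sketch and still has to be verified in the Docker Desktop preferences.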
Service start and stop
1. The one-click startup script is in the PaddleDTX/scripts directory, and the specific startup steps are as follows:
$ git clone git@github.com:PaddlePaddle/PaddleDTX.git
$ cd PaddleDTX/scripts
$ sh network_up.sh start
To start DAI on a Fabric blockchain network, use the following command:
$ sh network_up.sh start -b fabric
To start DAI on an IPFS storage network, use the following command:
$ sh network_up.sh start -s ipfs
To enable the three-party DNN algorithm, use the following command:
$ sh network_up.sh start -p true
A visualization interface is also supported. When starting paddledtx-visual, specify a computing node IP address that the browser can reach. For example, to start PaddleDTX and its visualization service on the machine whose host is "106.13.169.234", the command is as follows:
$ ./network_up.sh start -h 106.13.169.234
After the visualization service has started, open http://106.13.169.234:8233/ in a browser to access it. For the node configuration required before use, refer to the network startup instructions.
2. Use scripts to quickly destroy the network:
$ sh network_up.sh stop
Destroy the Fabric-based DAI network:
$ sh network_up.sh stop -b fabric
The startup has succeeded when the message "PaddleDTX starts successfully !" is printed. Users can then quickly try out PaddleDTX through the ./paddledtx_test.sh script.
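The success check can also be scripted. The sketch below is an illustration under stated assumptions: the helper names startup_succeeded and start_and_check and the LOG_FILE variable are hypothetical, and the exact wording of the success message is taken from the text above and may differ between releases. Any extra arguments are passed unchanged to network_up.sh.

```shell
#!/bin/sh
# Hypothetical helpers; not shipped with PaddleDTX.

# Succeeds if the text on stdin contains the startup success message.
# The message wording is an assumption based on the guide text.
startup_succeeded() {
    grep -q 'PaddleDTX starts successfully'
}

# start_and_check [ARGS...]
# Starts the network from the scripts directory, keeps a log of the
# output, and reports whether the success message appeared.
start_and_check() {
    log=${LOG_FILE:-network_up.log}
    # Extra arguments (for example: -b fabric) are forwarded as given
    sh network_up.sh start "$@" 2>&1 | tee "$log"
    if startup_succeeded < "$log"; then
        echo "network is up; continue with ./paddledtx_test.sh"
    else
        echo "success message not found; see $log" >&2
        return 1
    fi
}
```

After loading these functions in the PaddleDTX/scripts directory, a Fabric-based start would be:
$ start_and_check -b fabric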
Sample upload and task release
PaddleDTX currently supports vertical linear regression, vertical logistic regression, and vertical neural networks. The following uses multiple linear regression as an example to demonstrate how training and prediction tasks are released.
1. Upload training and prediction sample files
$ ./paddledtx_test.sh upload_sample_files
This command uploads a total of 14 files: 8 sample files required by data holders A/B to release vertical linear regression and vertical logistic regression training and prediction tasks, and 6 sample files required by data holders A/B/C to release vertical deep neural network training and prediction tasks. The sample files are stored in xdb with two replicas.
2. Start the vertical linear regression training task. The value of vlLinTrainfiles is the "Vertical linear train sample files" value obtained in step 1
# Without model evaluation
$ ./paddledtx_test.sh start_vl_linear_train -f $vlLinTrainfiles
# With model evaluation
$ ./paddledtx_test.sh start_vl_linear_train -f $vlLinTrainfiles -e true
# With dynamic model evaluation
$ ./paddledtx_test.sh start_vl_linear_train -f $vlLinTrainfiles -l true
After the task is released successfully, the training task ID is returned. By default, model evaluation uses k-fold cross-validation, and dynamic model evaluation uses a random split of the data.
3. Start the vertical linear regression prediction task. The value of vlLinPredictfiles is the "Vertical linear prediction sample files" value obtained in step 1, and the value of linearModelTaskId is the training task ID returned in step 2. Before releasing the prediction task, ensure that the training task from step 2 has finished
$ ./paddledtx_test.sh start_vl_linear_predict -f $vlLinPredictfiles -m $linearModelTaskId
The prediction task computes quickly. Once it completes, the prediction result is downloaded automatically and the RMSE is calculated as the measure of prediction quality.
4. View the task status, where taskID is the ID of the computing task
$ ./paddledtx_test.sh gettaskbyid -i $taskID
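Because the prediction task in step 3 must wait for the training task in step 2, the status query can be wrapped in a polling loop. The sketch below is generic and makes no claim about the real output of gettaskbyid: the helper name poll_until is hypothetical, and the pattern to wait for (for example a status word such as "Finished") is an assumption that should be checked against the actual command output.

```shell
#!/bin/sh
# Hypothetical helper; not shipped with PaddleDTX.

# poll_until PATTERN MAX_TRIES INTERVAL COMMAND [ARGS...]
# Runs COMMAND repeatedly until its output matches PATTERN (an extended
# regular expression) or MAX_TRIES attempts have been made. On a match,
# the matching output is printed and the function succeeds.
poll_until() {
    pattern=$1
    max_tries=$2
    interval=$3
    shift 3
    try=1
    while [ "$try" -le "$max_tries" ]; do
        # A failing command simply counts as "no match yet"
        output=$("$@" 2>&1) || true
        if printf '%s\n' "$output" | grep -Eq "$pattern"; then
            printf '%s\n' "$output"
            return 0
        fi
        try=$((try + 1))
        sleep "$interval"
    done
    echo "gave up after $max_tries attempts" >&2
    return 1
}
```

Assuming the status output contains the word "Finished" once training is done, waiting up to 10 minutes with a query every 10 seconds would look like this:
$ poll_until 'Finished' 60 10 ./paddledtx_test.sh gettaskbyid -i $linearModelTaskId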
The procedures for vertical logistic regression and the vertical neural network are similar and are not repeated here.
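For reference, the RMSE reported after a prediction task is the square root of the mean squared difference between predicted and actual values: RMSE = sqrt((1/n) * sum((predicted_i - actual_i)^2)). The sketch below recomputes it by hand. The helper name rmse is hypothetical, and the input layout (one "predicted actual" pair per line, separated by whitespace) is an assumption, not the format of the files PaddleDTX downloads.

```shell
#!/bin/sh
# Hypothetical helper; not shipped with PaddleDTX.

# rmse: read "predicted actual" pairs from stdin, one pair per line,
# and print the root-mean-square error with 4 decimal places.
rmse() {
    awk '
        # Accumulate the squared error of every line with two fields
        NF >= 2 { d = $1 - $2; sum += d * d; n++ }
        END {
            if (n == 0) { print "no data" > "/dev/stderr"; exit 1 }
            printf "%.4f\n", sqrt(sum / n)
        }
    '
}
```

For two pairs with errors of 0 and 2, the mean squared error is 2 and the RMSE is its square root:
$ printf '1 1\n2 4\n' | rmse
1.4142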