Spark Machine Learning Toolchain: MLflow

This article is translated from https://github.com/openthings/mlflow
Article URL: https://my.oschina.net/u/2306127/blog/1825638, by openthings, 2018.06.07.
The MLflow project was created by Databricks.
- Official homepage: https://www.mlflow.org/
- Official documentation: https://www.mlflow.org/docs/latest/index.html

MLflow Overview and Documentation

MLflow is an open source platform for managing the end-to-end machine learning lifecycle. It tackles three primary functions:

- Tracking experiments to record and compare parameters and results (MLflow Tracking).
- Packaging ML code in a reusable, reproducible form in order to share with other data scientists or transfer to production (MLflow Projects).
- Managing and deploying models from a variety of ML libraries to a variety of model serving and inference platforms (MLflow Models).

MLflow is library-agnostic. You can use it with any machine learning library, and in any programming language, since all functions are accessible through a REST API and CLI. For convenience, the project also includes a Python API.

Get started using the Quickstart or by reading about the key concepts.

Quickstart: MLflow Alpha Release

⚠️ Note

The current MLflow release is alpha, which means that the APIs and storage formats may change at any time!

Installation

Install MLflow from PyPI with pip install mlflow.

MLflow requires conda to be on the PATH for the projects feature.
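
Both prerequisites can be checked from Python before you try the projects feature. The following is a minimal sketch that uses only the standard library; the helper name check_prerequisites is hypothetical and is not part of MLflow.

```python
import importlib.util
import shutil


def check_prerequisites():
    """Report whether the mlflow package and the conda executable are available.

    Hypothetical helper for illustration; it is not part of MLflow.
    """
    return {
        # True if `import mlflow` would find an installed package.
        "mlflow_installed": importlib.util.find_spec("mlflow") is not None,
        # True if a `conda` executable can be found on the PATH.
        "conda_on_path": shutil.which("conda") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'yes' if ok else 'no'}")
```

If conda_on_path is reported as "no", the tracking features may still work, but per the note above the projects feature needs conda on the PATH.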

Documentation

Official documentation for MLflow can be found at https://mlflow.org/docs/latest/index.html.

Running a Sample App With the Tracking API

The programs in the example directory use the MLflow Tracking API. For instance, run:

python example/quickstart/test.py

This program uses the MLflow log API, which stores tracking data in ./mlruns. You can then view that data with the Tracking UI.
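
As a rough illustration of what such a program does, the sketch below logs a few parameters and metrics. It assumes the mlflow.log_param and mlflow.log_metric functions of the Python API. The helper log_run, the DryRunTracker stand-in, and the example values are hypothetical and are not taken from example/quickstart/test.py.

```python
class DryRunTracker:
    """Hypothetical stand-in that records calls instead of writing to ./mlruns."""

    def __init__(self):
        self.calls = []

    def log_param(self, key, value):
        self.calls.append(("param", key, value))

    def log_metric(self, key, value):
        self.calls.append(("metric", key, value))


def log_run(params, metrics, tracker=None):
    """Log every parameter and metric through a tracker object.

    With tracker=None the real mlflow module is used (needs `pip install mlflow`).
    """
    if tracker is None:
        import mlflow  # imported lazily so the dry run works without MLflow

        tracker = mlflow
    for key, value in params.items():
        tracker.log_param(key, value)
    for key, value in metrics.items():
        tracker.log_metric(key, value)
    return tracker


if __name__ == "__main__":
    # Dry run: nothing is written to disk.
    dry = log_run({"alpha": 0.4}, {"score": 0.666}, tracker=DryRunTracker())
    print(dry.calls)
```

Calling log_run without a tracker argument hands the same values to MLflow itself, which stores them under ./mlruns as described above.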

Launching the Tracking UI

The MLflow Tracking UI shows the runs logged in ./mlruns and is served at http://localhost:5000. Start it with:

mlflow ui

Running a Project from a URI

The mlflow run command lets you run a project packaged with an MLproject file from a local path or a Git URI:

mlflow run example/tutorial -P alpha=0.4

mlflow run git@github.com:databricks/mlflow-example.git -P alpha=0.4

See example/tutorial for a sample project with an MLproject file.
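
When a project has to be run for several parameter values, the command line can be assembled programmatically. The sketch below builds the same mlflow run invocation shown above, using only the -P flag that appears in this section. build_run_command and run_project are hypothetical helpers, and actually executing the command assumes the mlflow CLI is installed.

```python
import subprocess


def build_run_command(project_uri, params):
    """Return the argv list for `mlflow run <uri> -P key=value ...`."""
    command = ["mlflow", "run", project_uri]
    for key, value in params.items():
        # Each project parameter is passed as its own -P key=value pair.
        command += ["-P", f"{key}={value}"]
    return command


def run_project(project_uri, params):
    """Execute the project; raises CalledProcessError if the run fails."""
    subprocess.run(build_run_command(project_uri, params), check=True)


if __name__ == "__main__":
    # Print the commands for a small, illustrative sweep over alpha.
    for alpha in (0.3, 0.4, 0.5):
        print(" ".join(build_run_command("example/tutorial", {"alpha": alpha})))
```

Keeping command construction separate from execution makes it easy to inspect the exact command before anything is run.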

Saving and Serving Models

To illustrate managing models, the mlflow.sklearn package can log Scikit-learn models as MLflow artifacts and then load them again for serving. There is an example training application in example/quickstart/test_sklearn.py that you can run as follows:

$ python example/quickstart/test_sklearn.py
Score: 0.666
Model saved in run <run-id>

$ mlflow sklearn serve -r <run-id> model

$ curl -d '[{"x": 1}, {"x": -1}]' -H 'Content-Type: application/json' -X POST localhost:5000/invocations
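
The curl call can also be issued from Python. The sketch below builds the same POST request with the standard library. The helper names and the default base URL are illustrative assumptions, and the format of the server's response is not described here, so the body is returned as raw text.

```python
import json
import urllib.request


def build_invocation_request(rows, base_url="http://localhost:5000"):
    """Build the POST request that the curl example sends to /invocations."""
    return urllib.request.Request(
        url=f"{base_url}/invocations",
        data=json.dumps(rows).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def invoke(rows, base_url="http://localhost:5000"):
    """Send the request to a running model server and return the raw response text."""
    with urllib.request.urlopen(build_invocation_request(rows, base_url)) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Only build and print the request; invoke() needs a running server.
    request = build_invocation_request([{"x": 1}, {"x": -1}])
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Call invoke([{"x": 1}, {"x": -1}]) only once the model server from the previous step is running.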

Contributing

We happily welcome contributions. Please see our contribution guide for details.