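Spark Machine Learning Toolchain - MLflow
- This article is translated from https://github.com/openthings/mlflow
- Article URL: https://my.oschina.net/u/2306127/blog/1825638, by openthings, 2018.06.07.
- The mlflow project was created by Databricks.
MLflow Overview and Documentation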
MLflow is an open source platform for managing the end-to-end machine learning lifecycle. It tackles three primary functions:
- Tracking experiments to record and compare parameters and results (MLflow Tracking).
- Packaging ML code in a reusable, reproducible form in order to share with other data scientists or transfer to production (MLflow Projects).
- Managing and deploying models from a variety of ML libraries to a variety of model serving and inference platforms (MLflow Models).
MLflow is library-agnostic. You can use it with any machine learning library, and in any programming language, since all functions are accessible through a REST API and CLI. For convenience, the project also includes a Python API.
Get started using the Quickstart or by reading about the key concepts.
- Quickstart
- Tutorial
- Concepts
- MLflow Tracking
- MLflow Projects
- MLflow Models
- Command-Line Interface
- Python API
- REST API
Quickstart - MLflow Alpha Release
⚠️ Note
The current MLflow release is in alpha, which means that APIs and storage formats may change at any time!
Installation
Install MLflow from PyPI via pip install mlflow
MLflow requires conda to be on the PATH for the projects feature.
Documentation
Official documentation for MLflow can be found at https://mlflow.org/docs/latest/index.html.
Running a Sample App With the Tracking API
The programs in example use the MLflow Tracking API. For instance, run:
python example/quickstart/test.py
This program uses the MLflow logging API, which stores tracking data in ./mlruns. The data can then be viewed with the Tracking UI.
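The kind of logging calls such a program makes can be sketched as follows. This is a minimal illustration, not the contents of example/quickstart/test.py: the function name log_quickstart_run, the tracker argument, and the parameter and metric names are assumptions made for this example. mlflow.log_param and mlflow.log_metric belong to the Python API, but because the release is alpha their behavior may change.

```python
def log_quickstart_run(alpha, score, tracker=None):
    """Log one parameter and one metric through an MLflow-style tracker.

    `tracker` is a hypothetical hook for this sketch: any object that
    exposes log_param(key, value) and log_metric(key, value). When it is
    omitted, the real `mlflow` module is imported and used instead.
    """
    if tracker is None:
        import mlflow  # requires `pip install mlflow`
        tracker = mlflow
    tracker.log_param("alpha", alpha)    # an input of the run
    tracker.log_metric("score", score)   # a result of the run
```

With MLflow installed, calling log_quickstart_run(alpha=0.4, score=0.666) should record a run under ./mlruns that the Tracking UI can display.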
Launching the Tracking UI
The MLflow Tracking UI will show runs logged in ./mlruns at http://localhost:5000. Start it with:
mlflow ui
Running a Project from a URI
The mlflow run command lets you run a project packaged with an MLproject file from a local path or a Git URI:
mlflow run example/tutorial -P alpha=0.4
mlflow run git@github.com:databricks/mlflow-example.git -P alpha=0.4
See example/tutorial for a sample project with an MLproject file.
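When the project is launched from a script rather than a shell, the same command line can be assembled programmatically. The sketch below only wraps the mlflow run CLI shown above; the helper names build_run_command and run_project are hypothetical and not part of MLflow.

```python
import subprocess


def build_run_command(project_uri, params):
    """Return the argument list for `mlflow run <uri> -P key=value ...`."""
    command = ["mlflow", "run", project_uri]
    for key, value in params.items():
        # Each project parameter becomes its own -P key=value pair.
        command += ["-P", "{}={}".format(key, value)]
    return command


def run_project(project_uri, params):
    """Invoke the CLI; raises CalledProcessError on a non-zero exit code."""
    subprocess.check_call(build_run_command(project_uri, params))
```

For example, build_run_command("example/tutorial", {"alpha": 0.4}) yields the argument list for the first command above. Passing a list instead of a single shell string avoids quoting problems in parameter values.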
Saving and Serving Models
To illustrate managing models, the mlflow.sklearn package can log scikit-learn models as MLflow artifacts and then load them again for serving. There is an example training application in example/quickstart/test_sklearn.py that you can run as follows:
$ python example/quickstart/test_sklearn.py
Score: 0.666
Model saved in run <run-id>
$ mlflow sklearn serve -r <run-id> model
$ curl -d '[{"x": 1}, {"x": -1}]' -H 'Content-Type: application/json' -X POST localhost:5000/invocations
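The curl request can also be issued from Python using only the standard library. This is a sketch under assumptions: the helper names build_invocation_request and invoke are hypothetical, the host and port are taken from the curl command above, and the response is assumed to be a JSON body, which this alpha release does not guarantee.

```python
import json
import urllib.request


def build_invocation_request(records, base_url="http://localhost:5000"):
    """Build a POST request carrying `records` as a JSON list of rows."""
    body = json.dumps(records).encode("utf-8")
    return urllib.request.Request(
        base_url + "/invocations",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def invoke(records, base_url="http://localhost:5000"):
    """Send the request to a running model server and decode the reply.

    Assumes the server answers with JSON; adjust if the format differs.
    """
    request = build_invocation_request(records, base_url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the model server from the previous step running, invoke([{"x": 1}, {"x": -1}]) sends the same payload as the curl command.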
Contributing
We happily welcome contributions. Please see our contribution guide for details.