LinkedIn open-sources Dagli, a Java machine learning library optimized for the JVM

LinkedIn has announced the open-sourcing of Dagli, a machine learning library for Java (and other JVM languages). Its development team says that Dagli makes it easy to write model pipelines that are resistant to bugs, readable, modifiable, maintainable, and easy to deploy, without incurring technical debt. Dagli takes full advantage of modern multi-core CPUs and increasingly powerful GPUs to train real-world models efficiently on a single machine.

LinkedIn noted that many excellent machine learning tools have appeared in recent years, such as TensorFlow, PyTorch, DeepLearning4J, and CNTK for neural networks; Spark and Kubeflow for very-large-scale data pipelines; and scikit-learn, ML.NET, and most recently Tribuo for a variety of common models.

However, a model is usually only one part of an integrated pipeline (which also includes feature transformers), and building, training, and deploying such pipelines to production is still more cumbersome than it should be. Supporting both training and inference often requires duplicated or redundant work and brittle "glue" code, which complicates the future evolution and maintenance of the model and creates long-term technical debt.

This is why LinkedIn launched Dagli: the company hopes it can address the technical debt that model pipelines tend to create.

LinkedIn believes that both experienced machine learning engineers and developers who are new to machine learning can use Dagli to build models. For experienced engineers, Dagli offers an easy path to models that are efficient and production-ready, that can be maintained over the long term and extended when needed, and that integrate with an existing JVM-based technology stack.

For engineers who are new to machine learning, Dagli provides an intuitive, easy-to-use API that works alongside familiar JVM tools and helps avoid common logic errors. Dagli represents a machine learning pipeline as a directed acyclic graph (DAG), and the same graph is used for both training and inference, so there is no need to define one pipeline for training and a separate one for inference.
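To make the "one definition for training and inference" idea concrete, here is a minimal sketch in plain Java. It does not use Dagli, and every name in it (`SharedPipelineSketch`, `exclamationCount`, `TrainedPipeline`, the toy threshold model) is a hypothetical stand-in chosen for illustration. The point is only that the feature transformer is written once and is called by both the training code and the prediction code, so the two cannot drift apart.

```java
import java.util.List;

// Hypothetical sketch: these classes are illustrative and are NOT Dagli's API.
final class SharedPipelineSketch {

    /** A labeled training example. */
    static final class Example {
        final String text;
        final boolean positive;

        Example(String text, boolean positive) {
            this.text = text;
            this.positive = positive;
        }
    }

    /** The single feature transformer, shared by training and inference. */
    static double exclamationCount(String text) {
        return text.chars().filter(c -> c == '!').count();
    }

    /** A trained pipeline: the shared transformer plus a learned threshold. */
    static final class TrainedPipeline {
        final double threshold;

        TrainedPipeline(double threshold) {
            this.threshold = threshold;
        }

        boolean predict(String text) {
            // Inference calls the same transformer that training used.
            return exclamationCount(text) > threshold;
        }
    }

    /** Toy training: put the threshold midway between the two class means. */
    static TrainedPipeline train(List<Example> examples) {
        double positiveMean = examples.stream()
                .filter(e -> e.positive)
                .mapToDouble(e -> exclamationCount(e.text))
                .average().orElse(0.0);
        double negativeMean = examples.stream()
                .filter(e -> !e.positive)
                .mapToDouble(e -> exclamationCount(e.text))
                .average().orElse(0.0);
        return new TrainedPipeline((positiveMean + negativeMean) / 2.0);
    }

    public static void main(String[] args) {
        List<Example> data = List.of(
                new Example("great!!!", true),
                new Example("wow!!", true),
                new Example("fine", false),
                new Example("ok!", false));

        TrainedPipeline pipeline = train(data);
        System.out.println("threshold = " + pipeline.threshold);
        System.out.println(pipeline.predict("amazing!!"));
        System.out.println(pipeline.predict("meh"));
    }
}
```

In a pipeline split into separate training and serving code, a change to the feature logic would have to be made in two places; here there is only one place to change.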

Dagli uses simple, readable ML pipeline definitions and relies heavily on static typing and immutability, a design intended to rule out most potential logic errors from the start. In addition, Dagli is highly portable: it can be used on servers, in Hadoop, from a CLI, in an IDE, or in any other JVM context on any platform.
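How static typing and immutability can catch pipeline mistakes is easier to see in code. The following is a small plain-Java sketch under stated assumptions: the `Step` class and its `then` and `withName` methods are hypothetical and are not taken from Dagli. Each step carries its input and output types as generic parameters, so wiring steps in the wrong order is a compile-time error, and "modifying" a step returns a new object instead of changing one that other code may already share.

```java
import java.util.function.Function;

// Hypothetical sketch: these types are illustrative and are NOT Dagli's API.
final class TypedImmutableSketch {

    /** An immutable pipeline step with statically typed input and output. */
    static final class Step<I, O> {
        private final String name;
        private final Function<I, O> function;

        Step(String name, Function<I, O> function) {
            this.name = name;
            this.function = function;
        }

        /** Returns a renamed copy; this instance is never modified. */
        Step<I, O> withName(String newName) {
            return new Step<>(newName, function);
        }

        /** Chains a step; the compiler checks that our output type is next's input type. */
        <R> Step<I, R> then(Step<O, R> next) {
            return new Step<>(name + " -> " + next.name, function.andThen(next.function));
        }

        O apply(I input) {
            return function.apply(input);
        }

        String name() {
            return name;
        }
    }

    public static void main(String[] args) {
        Step<String, Integer> length = new Step<>("length", String::length);
        Step<Integer, Boolean> isLong = new Step<>("isLong", n -> n > 5);

        Step<String, Boolean> pipeline = length.then(isLong);
        // isLong.then(length) would not compile: a Boolean output cannot feed a String input.

        Step<String, Boolean> renamed = pipeline.withName("longTextDetector");
        System.out.println(pipeline.name()); // the original step keeps its name
        System.out.println(renamed.name());
        System.out.println(renamed.apply("machine learning"));
    }
}
```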

Origin www.oschina.net/news/120669/linkedin-dagli