Angel 3.2.0 released! Graph computing capabilities strengthened again

Version 3.2.0 of the Angel project has been released!

Angel is Tencent's first open source AI project. After several iterations, Angel graduated from the LF AI Foundation (under the Linux Foundation) in 2019. As a third-generation high-performance machine learning platform, Angel provides full-stack machine learning capabilities and is dedicated to solving two problems: training high-dimensional sparse large models and large-scale distributed graph computing.

In version 3.1.0, Angel introduced graph computing capabilities for the first time, providing a large number of out-of-the-box graph algorithms that have attracted wide attention and adoption in the industry. This release continues to strengthen those capabilities: compared with the previous version, we have done a great deal of optimization and added new features. If you are interested, please download and try it out; we look forward to your feedback.

The main new features are as follows:

 

Layered abstraction and flexible extension for graph computing

Angel 3.1.0 provided a large number of out-of-the-box graph algorithms, but in practice some users reported that they needed to customize them or develop new algorithms for their own business scenarios. We therefore made a systematic three-layer abstraction of the graph computing framework: the graph computing engine layer, the graph operator layer, and the graph algorithm layer. The operator layer provides more than a dozen commonly used operator abstractions, such as init, get, walker, and sample, as well as custom operator interfaces. Based on this abstraction, users can quickly and flexibly extend existing algorithms or implement custom graph algorithms at the algorithm layer.
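To make the layering concrete, here is a minimal, self-contained Scala sketch of the idea: operators are reusable units over graph state, and an algorithm is just a composition of operators. The trait and class names are illustrative only, not Angel's actual interfaces.

```scala
// Hypothetical sketch of the three-layer abstraction; the names here
// are illustrative, not Angel's real API.

// Operator layer: each operator is a reusable unit over the graph state.
trait GraphOperator[I, O] {
  def apply(input: I): O
}

// An "init"-style operator: build a neighbor table from an edge list.
class InitNeighborTable(edges: Seq[(Long, Long)])
    extends GraphOperator[Unit, Map[Long, Seq[Long]]] {
  def apply(input: Unit): Map[Long, Seq[Long]] =
    edges.groupBy(_._1).map { case (src, es) => src -> es.map(_._2) }
}

// A "sample"-style operator: sample up to k neighbors with replacement.
class SampleNeighbors(table: Map[Long, Seq[Long]], k: Int, seed: Long = 42L)
    extends GraphOperator[Long, Seq[Long]] {
  private val rnd = new scala.util.Random(seed)
  def apply(node: Long): Seq[Long] = {
    val ns = table.getOrElse(node, Seq.empty)
    Seq.fill(math.min(k, ns.size))(ns(rnd.nextInt(ns.size)))
  }
}

// Algorithm layer: a custom algorithm composes operators.
object CustomAlgoDemo {
  def main(args: Array[String]): Unit = {
    val edges   = Seq((1L, 2L), (1L, 3L), (2L, 3L), (3L, 1L))
    val table   = new InitNeighborTable(edges).apply(())
    val sampler = new SampleNeighbors(table, k = 2)
    println(s"sampled neighbors of 1: ${sampler.apply(1L)}")
  }
}
```

The point of the layering is that a new algorithm only needs to compose (or add) operators; the engine layer underneath stays untouched.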

Mixed parameter server and MPI running mode

Graph algorithms fall mainly into three categories: traditional graph mining algorithms, graph representation learning algorithms, and graph neural network algorithms. Each category has a different computation pattern and places different demands on the computing platform, which is why graph computing solutions are so fragmented and why it is hard for a single platform to support every type of algorithm. Angel is a computing platform based on the parameter server model, and in past versions we made many optimizations and functional enhancements to the parameter server, such as algorithm-flow optimization, custom PS functions, and computation pushdown. Angel can support all three categories of algorithms at once, and most of them perform well, but a few are still not computationally efficient. This is mainly due to limitations of the parameter server mode: data interaction is not direct enough, duplicated storage wastes memory, the number of connections explodes in large-scale jobs, and dense model aggregation is not optimally efficient.

For these reasons, in version 3.2.0 we began exploring the next-generation graph computing framework, trying to combine the advantages of the parameter server mode and the MPI mode. Concretely, Angel PS is started inside the Worker (or Executor) in an embedded way, and the network communication topology is optimized so that nodes can adopt the most efficient communication method for the algorithm at hand; within a single model, the PS mode and the ring communication topology common in MPI can be used at the same time. This feature is still experimental, and version 3.2.0 first tries it on the walking algorithms.
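The following toy Scala sketch (illustrative only; this is not Angel's internal implementation) contrasts the two aggregation paths such a mixed mode can choose between: a ring-style all-reduce for dense aggregation among workers, and PS-style push/pull merging for sparse updates.

```scala
// Conceptual sketch of the mixed PS/MPI idea. Dense aggregation suits a
// ring all-reduce among workers; sparse updates suit the PS path.

object MixedCommDemo {
  // Ring all-reduce over dense gradients: every one of the n workers ends
  // up with the elementwise sum. Computed directly here as a stand-in for
  // the ring result (a real ring moves O(size/n) data per link per step).
  def ringAllReduce(grads: Array[Array[Double]]): Array[Array[Double]] = {
    val sum = grads.transpose.map(_.sum)
    Array.fill(grads.length)(sum.clone())
  }

  // PS-style aggregation of sparse updates: workers push (index, delta)
  // pairs, the server merges them, workers pull the merged result.
  def psAggregate(updates: Seq[Map[Int, Double]]): Map[Int, Double] =
    updates.flatten.groupBy(_._1).map { case (i, kv) => i -> kv.map(_._2).sum }

  def main(args: Array[String]): Unit = {
    val dense = Array(Array(1.0, 2.0), Array(3.0, 4.0))
    println(ringAllReduce(dense).head.mkString(","))  // 4.0,6.0 on every worker
    val sparse = Seq(Map(0 -> 1.0, 5 -> 2.0), Map(5 -> 3.0))
    println(psAggregate(sparse))                      // Map(0 -> 1.0, 5 -> 5.0)
  }
}
```

The design intuition: ring all-reduce keeps every link busy and avoids a central bottleneck for dense models, while the PS path avoids shipping full dense vectors when only a few indices changed.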

Adaptive model partitioning

There are generally two ways to route model partitions: range and hash, each with its own advantages and disadvantages. Range partitioning uses little memory and routes quickly, but it easily leads to unbalanced computing load; it also works best when node ids are numeric and encoded in a contiguous space, which requires preprocessing before graph training. Hash partitioning solves the load imbalance, supports node ids of any type without any encoding preprocessing, and makes incremental training of graph algorithms easy to support, but its memory usage is considerably higher. We have optimized the partition routing of the parameter server model to support both range and hash partitioning, so that during actual graph algorithm training the appropriate partitioning method can be selected adaptively according to the computational characteristics of each algorithm. This effectively addresses the preprocessing, storage and computation load imbalance, and incremental training problems in graph training.
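A toy Scala sketch of the two routing schemes, assuming a simple router interface (the names are illustrative, not Angel's actual router classes):

```scala
// Toy sketch of range vs hash partition routing.

trait PartitionRouter[K] { def route(key: K, numParts: Int): Int }

// Range routing: assumes numeric, roughly contiguous ids; cheap to
// compute and compact to store, but skewed ids skew the load.
class RangeRouter(maxId: Long) extends PartitionRouter[Long] {
  def route(id: Long, numParts: Int): Int =
    math.min(((id * numParts) / (maxId + 1)).toInt, numParts - 1)
}

// Hash routing: works for any key type and balances load without
// re-encoding ids, at the cost of extra per-key memory on the servers.
class HashRouter[K] extends PartitionRouter[K] {
  def route(key: K, numParts: Int): Int =
    Math.floorMod(key.hashCode(), numParts)
}

object RouterDemo extends App {
  val range = new RangeRouter(maxId = 999L)
  val hash  = new HashRouter[String]
  println(range.route(250L, 4))      // 1: ids 0..999 split into 4 ranges
  println(hash.route("user_250", 4)) // stable hash bucket for a string id
}
```

An adaptive scheme along these lines can pick the router per model: range when ids are already dense integers, hash when ids are arbitrary strings or the load would otherwise skew.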

Support for complex heterogeneous Graph Embedding

In many real business scenarios, graph networks are complex and heterogeneous, and common homogeneous GNN algorithms struggle to learn effective representations on them, so more complex heterogeneous Graph Embedding is needed. A graph computing platform that supports complex heterogeneous GNN algorithms faces several challenges. One is storage: the network may contain many different types of nodes, each node may carry multiple attributes, and there may likewise be multiple types of edges, each with multiple attributes. Another is computation: with multiple types of nodes, edges, and attributes, the platform must provide a variety of operators that support complex operations and can be combined for computation. We have enriched and extended the graph storage structures and computation modes, and provided flexible custom PS function interfaces for complex operations, so the storage and computation of complex heterogeneous graph networks is well supported, including high-dimensional sparse graph node features; representation learning on heterogeneous graphs can thus be performed easily. We have also added five out-of-the-box heterogeneous graph neural network algorithms: HAN, heterogeneous GAT, heterogeneous GraphSage, IGMC edge prediction, and heterogeneous Bipartite GraphSage.
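As a rough illustration of the typed-storage problem, here is a minimal, hypothetical Scala sketch of a heterogeneous graph store with typed, attributed nodes and edges and a type-aware neighbor lookup; this is a sketch of the concept, not Angel's storage format.

```scala
// Hypothetical heterogeneous graph store: typed nodes/edges with attributes.

case class Node(id: Long, nodeType: String, feats: Map[String, Float])
case class Edge(src: Long, dst: Long, edgeType: String, feats: Map[String, Float])

class HeteroGraph(nodes: Seq[Node], edges: Seq[Edge]) {
  private val nodeById = nodes.map(n => n.id -> n).toMap
  // Index edges by (srcId, edgeType) so type-aware operators (e.g. the
  // per-relation neighbor aggregation a heterogeneous GraphSage needs)
  // can pull only the relevant relation.
  private val adj = edges.groupBy(e => (e.src, e.edgeType))

  def neighbors(src: Long, edgeType: String): Seq[Node] =
    adj.getOrElse((src, edgeType), Seq.empty).flatMap(e => nodeById.get(e.dst))
}

object HeteroDemo extends App {
  val nodes = Seq(
    Node(1L, "user", Map("age" -> 30f)),
    Node(2L, "item", Map("price" -> 9.9f)))
  val edges = Seq(Edge(1L, 2L, "clicked", Map("ts" -> 1f)))
  val g = new HeteroGraph(nodes, edges)
  println(g.neighbors(1L, "clicked").map(_.nodeType)) // List(item)
}
```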

High-performance optimization for large graphs with hundreds of billions of edges

Large-scale graph algorithms place higher demands on fault tolerance and computing performance. We made dedicated performance optimizations for training large graphs with hundreds of billions of edges and ran performance tests on a shared production cluster. For the K-core and common-friends algorithms, memory consumption dropped by 30% while computing performance improved by 3x.

Rich machine learning algorithm library

Added more than a dozen feature engineering methods and the multi-task learning algorithm ESMM.

 

For more details, please refer to the official release notes:

https://github.com/Angel-ML/angel/releases/tag/Release-3.2.0
