Detailed explanation of the five major update highlights of Alluxio version 2.9

Alluxio, developer of the world's first open source data orchestration software, recently announced the official release of the free, open source 2.9 Community Edition and the 2.9 Enterprise Edition.

This article gives a quick rundown of the highlights in 2.9:

The 2.9 general availability (GA) release offers strong stability, good support, and enterprise-grade features. This article introduces Alluxio's new architecture and how it helps leading enterprises grow and stay agile when running big data analytics and AI workloads across regions, compute engines, and storage systems.

Alluxio 2.9 adds cross-environment cluster synchronization to support a horizontally scalable, multi-tenant architecture; significantly improves the tooling and guides for deploying on Kubernetes, which makes Alluxio easier to manage; and enhances the S3 API for better security and performance.

Enterprises can use Alluxio to build a multi-cloud data platform that spans compute engines and storage systems. Alluxio can be deployed on any cloud platform, such as AWS, GCP, and Azure, alongside Spark, Presto, Trino, PyTorch, and TensorFlow. It can also be deployed on Kubernetes in private data centers or public clouds.

Highlights of Alluxio Community Edition

The following features are available in both the Community Edition and the Enterprise Edition of Alluxio 2.9.

Master node health status monitoring

The Alluxio master now periodically checks its resource usage, including CPU and memory, together with several internal data structures that are critical to performance, and infers the overall state of the system from them. The health status of the master node is reported by the master.system.status metric as one of the following:

  • idle
  • normal operation
  • busy
  • overload
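A monitoring script might act on these states along the following lines. This is only a minimal sketch: the `MasterStatus` enum and the `should_alert` helper are hypothetical names, the four members mirror the states listed above rather than the literal values Alluxio reports, and fetching the metric value from the master is left out.

```python
from enum import IntEnum


class MasterStatus(IntEnum):
    """Health states in increasing order of load (illustrative names,
    not the literal values reported by Alluxio)."""
    IDLE = 0
    NORMAL = 1
    BUSY = 2
    OVERLOAD = 3


def should_alert(status: MasterStatus,
                 threshold: MasterStatus = MasterStatus.BUSY) -> bool:
    """Return True when the reported state is at or above the threshold."""
    return status >= threshold


if __name__ == "__main__":
    for status in MasterStatus:
        print(f"{status.name:<8} alert={should_alert(status)}")
```

Ordering the states lets the alert threshold be tuned per deployment, for example alerting only on the most loaded state in a test cluster.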

For how to use this function, see the Alluxio documentation, which describes the monitoring feature in more detail.

Paged storage on Worker nodes (experimental feature)

The new version supports finer-grained storage. Previously, Alluxio supported only 64MB block storage. The new version adds page-level storage with 1MB pages, so data can be cached on Alluxio worker nodes at a finer granularity.

This feature is designed to improve cache efficiency and, with it, performance: it reduces read amplification when an application accesses data in the underlying storage for the first time.
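The effect on read amplification can be illustrated with some simple arithmetic. The sketch below assumes a simplified cache model in which a cold read loads every cache unit (block or page) that the request touches in full; the helper names are hypothetical, and Alluxio's actual read path may behave differently.

```python
MB = 1024 * 1024
KB = 1024


def bytes_fetched(offset: int, length: int, unit_size: int) -> int:
    """Bytes loaded from underlying storage on a cold read, assuming the
    cache fills every unit overlapped by [offset, offset + length)."""
    first_unit = offset // unit_size
    last_unit = (offset + length - 1) // unit_size
    return (last_unit - first_unit + 1) * unit_size


def read_amplification(offset: int, length: int, unit_size: int) -> float:
    """Ratio of bytes fetched from storage to bytes the application asked for."""
    return bytes_fetched(offset, length, unit_size) / length


if __name__ == "__main__":
    offset, length = 10 * MB, 128 * KB
    for label, unit in (("64MB block", 64 * MB), ("1MB page", 1 * MB)):
        print(label, read_amplification(offset, length, unit))
```

Under this model, a 128KB read that falls inside a single unit pulls 64MB from storage with block-level caching but only 1MB with page-level caching, an amplification of 512 versus 8.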

See the documentation for how to use this feature.

Highlights of Alluxio Enterprise Edition

The following features are available only in Alluxio Enterprise Edition.

Added cross-environment cluster synchronization function

Tenant isolation effectively prevents contention between different teams accessing shared data lake storage. The new cross-cluster synchronization feature improves scalability when multiple Alluxio clusters are deployed across tenants or across environments on Kubernetes.

Multiple Alluxio clusters are federated through metadata synchronization. Each Alluxio instance tracks its own metadata modifications and communicates them to the other instances, so metadata is kept in sync automatically. This feature is especially useful in a satellite cluster architecture, where data producers can be isolated from data consumers while updating the data lake.
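The idea of clusters telling each other about their own metadata changes can be sketched with a toy model. Everything here is an illustrative assumption, not Alluxio's implementation: the `Cluster` class, its methods, the shared `data_lake` dictionary standing in for the underlying storage, and the use of a simple version counter as "metadata".

```python
class Cluster:
    """Toy cluster that caches metadata and notifies peers of its own writes."""

    def __init__(self, name, data_lake):
        self.name = name
        self.data_lake = data_lake  # shared dict: path -> version
        self.peers = []
        self.cached = {}            # local metadata cache: path -> version
        self.stale = set()          # paths a peer reported as modified

    def connect(self, other):
        """Link two clusters so each learns about the other's writes."""
        self.peers.append(other)
        other.peers.append(self)

    def write(self, path):
        """Update the data lake, then tell every peer which path changed."""
        version = self.data_lake.get(path, 0) + 1
        self.data_lake[path] = version
        self.cached[path] = version
        self.stale.discard(path)
        for peer in self.peers:
            peer.on_peer_update(path)

    def on_peer_update(self, path):
        """Mark a path as needing a refresh before its next read."""
        self.stale.add(path)

    def read(self, path):
        """Serve from cache unless the path is unknown or marked stale."""
        if path in self.stale or path not in self.cached:
            self.cached[path] = self.data_lake[path]
            self.stale.discard(path)
        return self.cached[path]


if __name__ == "__main__":
    lake = {}
    producer, consumer = Cluster("producer", lake), Cluster("consumer", lake)
    producer.connect(consumer)
    producer.write("/tables/events")
    print(consumer.read("/tables/events"))
```

In this model the consumer cluster never polls the data lake; it refreshes a path only after the producer cluster has reported a change to it, which is the property that keeps producers and consumers isolated yet consistent.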

Before starting the deployment, see the documentation for details.

Added Kubernetes Operator to improve the manageability of Alluxio

Running Alluxio on Kubernetes helps standardize deployment strategies, making the data stack portable to any environment. The new version adds the Alluxio Operator, which simplifies the deployment and management of multiple Alluxio clusters.

Admins can now deploy and manage Alluxio through CRDs (Custom Resource Definitions). Using the Alluxio Operator reduces the burden of managing multiple Alluxio instances.
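To give a feel for what managing several clusters through custom resources might look like, the sketch below generates one manifest per tenant from a single template. Only the top-level layout (`apiVersion`, `kind`, `metadata`, `spec`) is standard Kubernetes; the API group, the `AlluxioCluster` kind, and the `workerCount` field are hypothetical placeholders, not the Operator's actual schema.

```python
import json


def cluster_manifest(name: str, namespace: str, workers: int) -> dict:
    """Build a custom resource manifest for one Alluxio cluster.

    The apiVersion, kind, and spec fields are hypothetical placeholders;
    consult the Operator documentation for the real schema.
    """
    return {
        "apiVersion": "example.com/v1alpha1",  # hypothetical group/version
        "kind": "AlluxioCluster",              # hypothetical kind
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"workerCount": workers},      # hypothetical field
    }


if __name__ == "__main__":
    # One cluster per tenant, each in its own namespace.
    tenants = {"team-a": 3, "team-b": 5}
    for tenant, workers in tenants.items():
        manifest = cluster_manifest(f"alluxio-{tenant}", tenant, workers)
        print(json.dumps(manifest, indent=2))
```

Because each cluster is described declaratively, adding a tenant amounts to adding one entry to the mapping rather than repeating a manual installation.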

Before starting the deployment, see the documentation for details.

Strengthened S3 API security

The new version further strengthens the S3 API. Administrators can centrally manage authentication and access control policies through a unified namespace, giving consistent security protection across on-premises storage and heterogeneous cloud storage.

The new version adds support for an open authentication protocol in the S3 API, ensuring that Alluxio user requests are authenticated before they are processed. This capability allows data platform teams to connect to identity management systems such as PingFederate and use single sign-on (SSO).

Before starting the deployment, see the documentation for details.

Origin my.oschina.net/u/5904778/blog/5601712