By optimizing the snapshot creation and recovery mechanism, when Alluxio uses the embedded log, the worst-case failover time of the Alluxio Master is shortened to 5% of the original (i.e. 95%). This mechanism limits the total number of log entries that the system needs to replay in the event of a failover, enabling administrators to reduce typical failover times by 50% .
When an Alluxio cluster manages more than 100 million files at the same time, these optimizations can shorten the failover time from minutes to tens of seconds, thus avoiding planned downtime. This feature has been verified in production scenarios with a large number of small files .
The Alluxio metadata sync mechanism is an internal mechanism used to keep files and directories in the Alluxio namespace consistent with the underlying data source, and is usually triggered inadvertently when listing or preloading large directories. In the past, some users have observed a spike in memory resource consumption on the Alluxio Master due to the metadata synchronization mechanism.
High resource consumption can lead to over-provisioning of resources. In version 2.10, the memory requirement of Alluxio Master can be reduced by 90% when the synchronization interval is short , and the end-to-end performance can be improved by 2 times .
Preloading via load operations can improve SLAs for analytics workloads accessing remote data at predictable times, such as fixed times of day, and can also be used to speed up model training and reduce deployment times. These features result in 10% less cross-cluster resources required to achieve the same or higher throughput compared to version 2.9 .
This article is shared from the WeChat public account - Alluxio (Alluxio_China).
If there is any infringement, please contact [email protected] to delete it.
This article participates in the " OSC Source Creation Program ". You are welcome to join in and share it.