Alluxio releases new AI product: an AI training solution that integrates seamlessly with low-cost object storage

(October 19, 2023, Beijing) Alluxio, a data platform company serving a wide range of data-driven workloads, has launched Alluxio Enterprise AI, a new high-performance data platform designed to meet the growing demands that artificial intelligence (AI) and machine learning (ML) workloads place on enterprise data infrastructure. The Alluxio Enterprise AI platform comprehensively optimizes the performance, data accessibility, scalability, and cost-effectiveness of enterprise AI and analytics infrastructure, powering the development of the next generation of data-intensive applications such as generative AI, computer vision, natural language processing, large language models, and high-performance data analytics.

To stay competitive and stand out, companies are working hard to modernize their data and AI infrastructure. In the process, they have also realized that traditional data infrastructure can no longer meet the needs of the next generation of data-intensive AI workloads. Challenges that commonly arise as AI projects advance, such as low performance, poor data accessibility, GPU scarcity, complex data engineering, and underutilized resources, seriously hinder enterprises from extracting value from their data. According to Gartner® research, the value of actionable AI lies in the ability to quickly develop, deploy, adjust, and maintain it across the enterprise's various environments, and engineering complexity and faster time-to-market requirements call for more flexible AI engineering. Gartner also predicts that by 2026, enterprises that use AI engineering to build and manage adaptive AI systems will surpass their peers in AI model operability by at least 25%.

Li Haoyuan, founder and CEO of Alluxio, said: "Alluxio empowers the world's leading enterprise customers with the most advanced big data and AI platform, and today we have taken another big step forward. Alluxio Enterprise AI provides customers with efficient AI solutions that help enterprises accelerate AI workloads and maximize the value of data. Future business leaders will know how to leverage transformative AI to become data-driven, building and maintaining AI infrastructure with the latest technology to achieve ultra-high performance, seamless access, and convenient management."

With this release, Alluxio has expanded from a single product to a portfolio of two products, Alluxio Enterprise AI and Alluxio Enterprise Data, to fully meet the diverse needs of analytics and AI. As a brand-new product, Alluxio Enterprise AI builds on the distributed systems experience that Alluxio Enterprise Edition has accumulated over many years, and adopts a new architecture optimized for AI/ML workloads. Alluxio Enterprise Data is the next-generation version of Alluxio Enterprise Edition for big data (offered in parallel with Alluxio Enterprise AI) and will continue to be the ideal choice for enterprises focused on analytics workloads.

Accelerate end-to-end machine learning workflows

Alluxio Enterprise AI enables enterprises' AI infrastructure to run on existing data lakes with high performance, seamless data access, scalability, and cost-effectiveness. It helps leaders and practitioners in data and AI achieve four key goals of AI projects:

  1. High-performance model training and deployment to quickly produce business results;
  2. Seamless access to data across regions and cloud workloads;
  3. Unlimited scalability that has been rigorously tested internally at Internet giants;
  4. No need for expensive dedicated storage, with deployment on the existing technology stack to ensure maximum return on investment.

After adopting Alluxio Enterprise AI, enterprises can expect training up to 20 times faster than with commercial object storage, model serving up to 10 times faster, GPU utilization above 90%, and AI infrastructure cost savings of up to 90%.

Alluxio Enterprise AI has a distributed system architecture with decentralized metadata, which eliminates performance bottlenecks when accessing massive numbers of small files (common in AI workloads). Regardless of file size or file count, it provides unlimited scalability beyond that of traditional architectures. Unlike caching designed for traditional analytics, its distributed cache is tailored to the I/O patterns of AI workloads. It also supports analytics workloads and complete machine learning workflows, from data ingestion and ETL (extract, transform, load) through preprocessing, training, and serving.

Alluxio Enterprise AI includes the following key features:

  • High-performance model training and model serving - Alluxio Enterprise AI significantly improves enterprise model training and serving performance on existing data lakes. An enhanced API set for model training delivers up to 20x better performance than commercial object storage. For model serving, Alluxio provides ultra-high concurrency, achieving up to a 10x speedup when models from offline training clusters are served for online inference.
  • Intelligent distributed caching for AI workload I/O patterns - Alluxio Enterprise AI's distributed caching feature enables the AI engine to read and write data through the high-performance Alluxio cache instead of slow data lake storage. Alluxio's intelligent caching strategy is specifically tailored to the I/O patterns of the AI engine, including sequential access to large files, random access to large files, and access to massive numbers of small files. This optimization delivers high throughput and low latency to data-hungry GPUs. Because the training cluster continuously obtains data from the high-performance distributed cache, it can achieve more than 90% GPU utilization.
  • Seamless data access for AI workloads across on-premises and cloud environments - Alluxio Enterprise AI provides enterprises with a unified management interface to easily manage AI workloads across different infrastructure environments. The product provides a source of truth for machine learning workflows, essentially eliminating the bottleneck of siloed data in large enterprise data lakes. With Alluxio Enterprise AI as a standard data access layer, enterprises can seamlessly share data across different business units and geographies.
  • A new distributed system architecture that has been rigorously tested at large scale - The Alluxio Enterprise AI platform is built on the innovative decentralized architecture DORA (Decentralized Object Repository Architecture). This architecture provides an infinitely scalable foundation for AI workloads, allowing the AI platform to handle up to 100 billion objects on commercial object storage, including Amazon S3. The new architecture leverages Alluxio's proven expertise in distributed systems to address the growing challenges of system scalability, metadata management, high availability, and performance.
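The caching behavior described in the list above can be illustrated with a generic read-through cache. This is only a minimal sketch of the general technique, not Alluxio's implementation: the class `ReadThroughCache`, the helper `run_epochs`, and the in-memory dictionary standing in for the cache are all hypothetical names and simplifications introduced for this example.

```python
from typing import Callable, Dict, Iterable, Tuple


class ReadThroughCache:
    """Generic read-through cache (illustrative only, not Alluxio's code).

    `fetch_from_store` stands in for a slow read from data lake storage.
    """

    def __init__(self, fetch_from_store: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_store
        self._cache: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def read(self, key: str) -> bytes:
        # Serve from the cache when the object is already present.
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        # Otherwise fetch once from the backing store and keep a copy.
        self.misses += 1
        data = self._fetch(key)
        self._cache[key] = data
        return data


def run_epochs(cache: ReadThroughCache, keys: Iterable[str], epochs: int) -> Tuple[int, int]:
    """Read every key once per epoch and return (hits, misses)."""
    keys = list(keys)
    for _ in range(epochs):
        for key in keys:
            cache.read(key)
    return cache.hits, cache.misses
```

In this sketch only the first pass over the data reaches the backing store; later passes are served from the cache, which is the effect the feature list attributes to repeated reads during training.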

"As organizations expand the use of AI across their businesses, optimizing performance, cost, and GPU utilization for next-generation workloads will become critical," said Mike Leone, analyst at Enterprise Strategy Group. "With its uniquely advantaged product, Alluxio can truly help data and AI teams achieve higher performance, seamless data access, and convenient management of model training and model serving."

“We work closely with Alluxio, and the Alluxio platform is critical to our data infrastructure,” said Rob Collins, Director of Analytics Cloud Engineering at Aunalytics. “Aunalytics is very excited about Alluxio’s new distributed system for enterprise AI and is optimistic about the new product's huge potential in the AI industry.”

“The large language model we trained in-house supports our Q&A application and recommendation engine, greatly enhancing user experience and engagement,” said Hu Mengyu, a software engineer on the Zhihu data platform team. “Alluxio is at the core of our AI infrastructure. After adopting Alluxio as the data access layer, our model training performance increased 3 times, deployment performance increased 10 times, and GPU utilization doubled. Alluxio's Enterprise AI platform uses the new DORA architecture to support access to massive numbers of small files, which we are looking forward to. As the AI wave arrives, Alluxio's new products give us more confidence in supporting AI applications.”

Deploy Alluxio in machine learning workflows

Gartner research shows that data accessibility and data volume/complexity are among the top three challenges organizations encounter when applying AI technology. Alluxio Enterprise AI can be added to existing AI infrastructure consisting of AI computing engines and data lake storage. Alluxio sits between compute and storage, working across model training and model serving in machine learning workflows for maximum speed and optimal cost. For example, using PyTorch as the training and serving engine and Amazon S3 as the existing data lake:

  • Model training: When a user trains a model, the PyTorch data loader loads datasets from the virtual local path /mnt/alluxio_fuse/training_datasets. The data loader does not load data directly from S3, but from the Alluxio cache. During training, the cached dataset is reused across multiple epochs, so overall training speed is no longer bottlenecked by access to S3. In other words, Alluxio accelerates training by shortening data loading time, eliminating GPU idle time, and improving GPU utilization. After model training is complete, PyTorch writes the model file to S3 through Alluxio.
  • Model serving: The latest trained model needs to be deployed to the inference cluster. Multiple TorchServe instances read model files from S3 concurrently. Alluxio caches these latest model files from S3 and provides them to the inference cluster with low latency. As a result, downstream AI applications can use the latest models for inference as soon as they are available.
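Because the training data appears under a local path, reading it needs nothing beyond ordinary file I/O. The sketch below shows one way a map-style dataset over such a directory might look. The class name `FuseFileDataset` is hypothetical, and returning raw bytes is a simplification; the block itself uses only the Python standard library so that it runs without PyTorch installed.

```python
import os
from typing import List, Tuple


class FuseFileDataset:
    """Map-style dataset over all files under a directory.

    Hypothetical helper for illustration. The directory could be a FUSE
    mount such as /mnt/alluxio_fuse/training_datasets, but any local
    directory works the same way.
    """

    def __init__(self, root: str) -> None:
        # Collect file paths once, sorted so that indexing is deterministic.
        self.paths: List[str] = sorted(
            os.path.join(dirpath, name)
            for dirpath, _dirs, names in os.walk(root)
            for name in names
        )

    def __len__(self) -> int:
        return len(self.paths)

    def __getitem__(self, index: int) -> Tuple[str, bytes]:
        # A plain POSIX read; the caller does not need to know what
        # serves the bytes behind the mount point.
        path = self.paths[index]
        with open(path, "rb") as f:
            return path, f.read()
```

Since the class implements `__len__` and `__getitem__`, an instance created with `FuseFileDataset("/mnt/alluxio_fuse/training_datasets")` could be passed to `torch.utils.data.DataLoader` where PyTorch is available. A real training pipeline would decode the bytes into tensors rather than return them raw.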

Platform integration with existing systems

To integrate Alluxio with existing platforms, users can deploy an Alluxio cluster between compute engines and storage systems. On the computing engine side, Alluxio can be seamlessly integrated with popular machine learning frameworks such as PyTorch, Apache Spark, TensorFlow, and Ray. Enterprises can integrate Alluxio with these computing frameworks through REST API, POSIX API or S3 API.

On the storage side, Alluxio can connect to any type of file system or object storage, wherever it is located (on-premises, in the cloud, or both). Supported storage systems include OSS, COS, BOS, OBS, Amazon S3, Google GCS, Azure Blob Storage, MinIO, Ceph, HDFS, etc.

Alluxio runs on-premises and in the cloud, on physical machines or in containerized environments. Supported cloud platforms include Alibaba Cloud, Tencent Cloud, Baidu Cloud, Huawei Cloud, AWS, GCP, Azure Cloud, etc.

Download resources

Alluxio Enterprise AI download link: https://www.alluxio.io/download/

AI Infra Day

At AI Infra Day on October 25 (US Western time), Alluxio will publicly demonstrate its newly released Alluxio Enterprise AI platform for the first time. AI Infra Day is an online event for developers that focuses on the challenges of building high-performance, scalable, and cost-effective AI infrastructure, and on the solutions available. Special guests include Wanchao Liang (Meta), Sally (Mihyoung) Lee (Uber), and Fan Bin (Alluxio). Registration for the event is now open: https://www.alluxio.io/ai-infra-day-2023/.

About Alluxio

Alluxio is the world's leading provider of high-performance data platforms for analytics and AI, accelerating time to value for enterprise AI products and maximizing return on infrastructure investment. The Alluxio data platform sits between compute and storage systems, providing a unified view of workloads on the data platform at every stage of the data workflow. The platform provides high-performance data access no matter where the data resides, simplifies data engineering, improves GPU utilization, and reduces cloud computing and storage costs. Enterprises can significantly accelerate model training and model serving and build AI infrastructure on existing data lakes without using dedicated storage.

With the support of leading investors, Alluxio serves technology, Internet, financial, and telecommunications companies worldwide. Currently, 9 of the top 10 Internet companies in the world use Alluxio. For more information, please visit http://www.alluxio.com.cn.

Origin www.oschina.net/news/262541