Archive for Alluxio

Alluxio Closes Strong Q2 with Customer Growth, Sub-Millisecond Latency Capability for AI Data & Record MLPerf Storage v2.0 Benchmark Results

Posted in Commentary with tags on August 27, 2025 by itnerd

 Alluxio today announced strong results for the second quarter of its 2026 fiscal year. During the quarter, the company launched Alluxio Enterprise AI 3.7, a major release that delivers sub-millisecond TTFB (time to first byte) latency for AI workloads accessing data on cloud storage.

Alluxio also reported new customer wins across multiple industries and AI use cases, including model training, model deployment, and feature store query acceleration. In addition, the MLPerf Storage v2.0 benchmark results underscored Alluxio’s leadership in AI infrastructure performance, with the platform achieving exceptional GPU utilization and I/O acceleration across diverse training and checkpointing workloads.

Key Features of Alluxio Enterprise AI 3.7

  • Ultra-Low Latency Caching for Cloud Storage – Alluxio AI 3.7 introduces a distributed, transparent caching layer that reduces latency to sub-millisecond levels while retrieving AI data from cloud storage. It achieves up to 45× lower latency than S3 Standard and 5× lower latency than S3 Express One Zone, plus up to 11.5 GiB/s (98.7 Gbps) throughput per worker node, with linear scalability as nodes are added.
  • Enhanced Cache Preloading – The Alluxio Distributed Cache Preloader now supports parallel loading, delivering up to 5× faster cache preloading to ensure hot data availability for faster AI training and inference cold starts.
  • Role-Based Access Control (RBAC) for S3 Access – New granular RBAC capabilities allow tight integration with identity providers (OIDC/OAuth 2.0, Apache Ranger), controlling user authentication, authorization, and permitted operations on cached data.
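The transparent caching idea behind the sub-millisecond latency claim can be illustrated as a read-through cache: serve hot objects from fast local storage and fall back to cloud storage only on a miss. The sketch below uses a plain dict as a stand-in for both the cache and the object store; it is not Alluxio's actual implementation.

```python
class ReadThroughCache:
    """Minimal read-through cache sketch: serve hot objects from local
    memory, fall back to the (slow) backing store on a miss, and
    populate the cache so subsequent reads take the fast path."""

    def __init__(self, backing_store):
        self.backing_store = backing_store  # e.g. an object-store client
        self.cache = {}
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self.cache:          # fast path: local memory
            self.hits += 1
            return self.cache[key]
        self.misses += 1               # slow path: round trip to cloud storage
        data = self.backing_store[key]
        self.cache[key] = data         # populate for future reads
        return data

# Usage: two reads of the same key -- the first misses, the second hits.
store = {"s3://bucket/model.bin": b"weights"}
cache = ReadThroughCache(store)
cache.read("s3://bucket/model.bin")
cache.read("s3://bucket/model.bin")
print(cache.hits, cache.misses)  # -> 1 1
```

The latency win comes entirely from the second read skipping the backing-store round trip; preloading (the second bullet above) amounts to populating the cache before the first read arrives.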

Customer Momentum in H1 2025

The first half of 2025 saw record market adoption of Alluxio AI, with customer growth exceeding 50% compared to the previous period. Organizations across tech, finance, e-commerce, and media sectors have increasingly deployed Alluxio’s AI acceleration platform to enhance training throughput, streamline feature store access, and speed inference workflows. With growing deployments across hybrid and multi-cloud environments, demand for Alluxio AI reflects rapidly rising expectations for high-performance, low-latency AI data infrastructure. Notable customers added in the half include:

  • Salesforce
  • Dyna Robotics
  • Geely

Substantial I/O Performance Gains Confirmed in MLPerf Storage v2.0 Benchmark

Alluxio’s distributed caching architecture underscores its commitment to maximizing GPU efficiency and AI workload performance. In the MLPerf Storage v2.0 benchmarks:

  • Training Throughput
    • ResNet50: 24.14 GiB/s supporting 128 accelerators with 99.57% GPU utilization, scaling linearly from 1 to 8 clients and 2 to 8 workers.
    • 3D-Unet: 23.16 GiB/s with 8 accelerators, 99.02% GPU utilization, similarly scaling linearly.
    • CosmoFlow: 4.31 GiB/s with 8 accelerators at 74.97% GPU utilization, nearly doubling performance when scaling clients.
  • LLM Checkpointing
    • Llama3-8B: 4.29 GiB/s read and 4.54 GiB/s write (read/write times: 24.44s and 23.14s).
    • Llama3-70B: 33.29 GiB/s read and 36.67 GiB/s write (read/write times: 27.39s and 24.86s).
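As a sanity check, the checkpoint sizes implied by the figures above follow from size = throughput × transfer time. These are back-of-the-envelope derivations from the reported numbers, not sizes published by the benchmark.

```python
# Derive approximate checkpoint sizes from the reported read throughput
# and read times (size = rate * time). Derived figures, not reported ones.
checkpoints = {
    "Llama3-8B":  {"read_gibps": 4.29,  "read_s": 24.44},
    "Llama3-70B": {"read_gibps": 33.29, "read_s": 27.39},
}
for name, c in checkpoints.items():
    size_gib = c["read_gibps"] * c["read_s"]
    print(f"{name}: ~{size_gib:.0f} GiB checkpoint")
# -> Llama3-8B: ~105 GiB, Llama3-70B: ~912 GiB
```

Roughly 13-14 bytes per parameter in both cases, which is consistent with checkpoints that carry optimizer state alongside the model weights.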

Availability 

Alluxio Enterprise AI version 3.7 is available here: https://www.alluxio.io/demo

Alluxio Enterprise AI 3.6 Accelerates Model Distribution, Optimizes Model Training Checkpoint Writing, and Enhances Multi-Tenancy Support

Posted in Commentary with tags on May 20, 2025 by itnerd

Alluxio today announced the release of Alluxio Enterprise AI 3.6, delivering breakthrough capabilities for model distribution, model training checkpoint writing optimization, and enhanced multi-tenancy support. This latest version enables organizations to dramatically accelerate AI model deployment cycles, reduce training time, and ensure seamless data access across cloud environments.

AI-driven organizations face increasing challenges as model sizes grow and inference infrastructures span multiple regions. Distributing large models from training to production environments introduces significant latency issues and escalating cloud costs, while lengthy checkpoint writing processes substantially slow down the model training cycle.

Alluxio Enterprise AI version 3.6 includes the following key features:

●      High-Performance Model Distribution – Alluxio Enterprise AI 3.6 leverages the Alluxio Distributed Cache to accelerate model distribution workloads. By placing a cache in each region, model files need only be copied from the Model Repository to the Alluxio Distributed Cache once per region rather than once per server. Inference servers then retrieve models directly from the cache, with further optimizations including local caching on inference servers and memory pool utilization. Benchmarks demonstrate strong throughput, with the Alluxio AI Acceleration Platform achieving 32 GiB/s, exceeding the 11.6 GiB/s of available network bandwidth by over 20 GiB/s.

●      Fast Model Training Checkpoint Writing – Building on the CACHE_ONLY Write Mode introduced earlier, version 3.6 debuts a new ASYNC write mode that delivers up to 9 GB/s write throughput in 100 Gbps network environments. Checkpoints are written to the Alluxio cache instead of directly to the underlying file system, avoiding network and storage bottlenecks; the checkpoint files are then flushed to the underlying file system asynchronously, significantly reducing the time model training spends paused on checkpointing.

●      New Management Console – Alluxio 3.6 introduces a comprehensive web-based Management Console designed to enhance observability and simplify administration. The console displays key cluster information, including cache usage, coordinator and worker status, and critical metrics such as read/write throughput and cache hit rates. Administrators can also manage mount tables, configure quotas, set priority and TTL policies, submit cache jobs, and collect diagnostic information directly through the interface without command-line tools.
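The ASYNC write mode described above can be sketched as write-to-cache plus a background flush: the training loop returns as soon as the checkpoint lands in the fast cache, and a worker thread persists it to durable storage off the hot path. The classes below are a minimal stand-in, not Alluxio's actual API.

```python
import queue
import threading

class AsyncCheckpointWriter:
    """Sketch of ASYNC-style checkpoint writing: writes go to a fast
    in-memory cache and return immediately; a background thread drains
    a queue and persists each file to the slow backing store."""

    def __init__(self, backing_store):
        self.cache = {}
        self.backing_store = backing_store
        self._pending = queue.Queue()
        self._worker = threading.Thread(target=self._flush_loop, daemon=True)
        self._worker.start()

    def write(self, path, data):
        self.cache[path] = data        # fast path: training resumes here
        self._pending.put(path)        # durable write happens off the hot path

    def _flush_loop(self):
        while True:
            path = self._pending.get()
            if path is None:           # shutdown sentinel
                break
            self.backing_store[path] = self.cache[path]  # slow durable write
            self._pending.task_done()

    def close(self):
        self._pending.put(None)
        self._worker.join()

# Usage: the checkpoint is in the cache immediately and in durable
# storage shortly after; close() waits for the flush to finish.
durable = {}
writer = AsyncCheckpointWriter(durable)
writer.write("ckpt/step-1000.pt", b"model state")
writer.close()
print("ckpt/step-1000.pt" in durable)  # -> True
```

The design choice mirrors the release note: the GPU-blocking portion of a checkpoint is only the fast cache write, while durability is deferred to the asynchronous flush.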

This release also introduces several enhancements for Alluxio administrators:

●      Multi-Tenancy Support – This release brings robust multi-tenancy capabilities through seamless integration with Open Policy Agent (OPA). Administrators can now define fine-grained role-based access controls for multiple teams using a single, secure Alluxio cache.

●      Multi-Availability Zone Failover Support – Alluxio Enterprise AI 3.6 adds support for data access failover in multi-Availability Zone architectures, ensuring high availability and stronger data access resilience.

●      Virtual Path Support in FUSE – The new virtual path support allows users to define custom access paths to data resources, creating an abstraction layer that masks physical data locations in underlying storage systems.
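Virtual path mapping of this kind can be illustrated with a longest-prefix resolver that translates user-facing paths into physical storage URIs. The mount table below is a hypothetical example, not Alluxio's configuration syntax.

```python
# Sketch of virtual-path resolution: user-facing paths map to physical
# storage URIs by longest matching prefix, so nested mounts shadow
# their parents. Mount points here are made-up examples.
MOUNTS = {
    "/datasets/images": "s3://prod-bucket/vision/images",
    "/datasets":        "s3://prod-bucket/raw",
    "/models":          "gs://ml-models/releases",
}

def resolve(virtual_path):
    # Try the longest prefixes first so the most specific mount wins.
    for prefix in sorted(MOUNTS, key=len, reverse=True):
        if virtual_path == prefix or virtual_path.startswith(prefix + "/"):
            return MOUNTS[prefix] + virtual_path[len(prefix):]
    raise KeyError(f"no mount covers {virtual_path}")

print(resolve("/datasets/images/cat.jpg"))    # -> s3://prod-bucket/vision/images/cat.jpg
print(resolve("/models/llama3/weights.bin"))  # -> gs://ml-models/releases/llama3/weights.bin
```

Because callers only ever see the virtual namespace, the physical layout can be reorganized (or moved between stores) by editing the mount table alone.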

Availability 

Alluxio Enterprise AI version 3.6 is available for download here: https://www.alluxio.io/demo

2025 Predictions by Haoyuan Li, Founder and CEO, Alluxio

Posted in Commentary with tags on November 18, 2024 by itnerd

Here are some 2025 Technology Predictions about major developments from Haoyuan Li, founder and CEO of Alluxio. This is what he sees in AI/ML, Data & Analytics, Cloud, Modern Data Center and DevOps in 2025. 

Multi-Modal Training Will Become More Mainstream – In 2025, multi-modal training, which integrates different types of data—such as text, images, audio, and video—will become a more dominant approach in model training. This shift is driven by the need for AI systems to better understand and process the complexity of real-world data, allowing for richer and more context-aware applications. For example, multi-modal models can improve use cases like autonomous driving, where understanding visual, auditory, and textual information is critical. The rise of these models will also spur demand for more advanced hardware and storage solutions, as the complexity of training environments continues to grow.

Pre-Training Will Become a Key Differentiator for Organizations Adopting LLMs – By 2025, pre-training will emerge as a crucial differentiator among organizations developing large language models (LLMs). As the AI landscape evolves, access to vast amounts of high-quality data—especially industry-specific data—will become a major competitive advantage. Companies that can effectively harness big data infrastructure to leverage their large-scale datasets will be better positioned to fine-tune their models and deliver more effective, specialized solutions. However, this also introduces a significant bottleneck. Preparing and curating the right data for pre-training is increasingly complex, and companies without robust big data infrastructure will struggle to keep up. Efficiently handling this data preparation, cleaning, and transformation process will become a critical challenge in the race to develop more powerful and relevant LLMs.

Overcoming Data Access Challenges Becomes Critical for AI Success – In 2025, organizations will face increasing pressure to solve data access challenges as AI workloads become more demanding and distributed. The explosion of data across multiple clouds, regions, and storage systems has created significant bottlenecks in data availability and movement, particularly for compute-intensive AI training. Organizations will need to efficiently manage data access across their distributed environments while minimizing data movement and duplication. We’ll see an increased focus on technologies that can provide fast, concurrent access to data regardless of its location while maintaining data locality for performance. The ability to overcome these data access challenges will become a key differentiator for organizations scaling their AI initiatives.

AI-Driven Cloud Economics Reshape Infrastructure Decisions – In 2025, organizations will fundamentally reshape their cloud strategies around AI economics. The focus will shift from traditional cloud cost optimization to AI-specific ROI optimization. Organizations will develop sophisticated modeling capabilities to understand and predict AI workload costs across different infrastructure options. This will lead to more nuanced hybrid deployment strategies where companies carefully balance the cost-performance trade-offs of training and inference workloads across cloud providers and on-premises infrastructure.

Maximizing GPU Utilization Becomes the New Standard – In 2025, as AI model training datasets continue to grow exponentially, maximizing GPU utilization will become the primary design goal for modern datacenters. Organizations will face mounting pressure to optimize their expensive GPU infrastructure investments. This shift will drive innovations in hardware and software design to sustain the massive read bandwidths necessary for training and minimize checkpoint-saving times that cause training pauses. Success will be measured by how effectively datacenters can keep their GPU resources busy while managing larger model checkpoints and growing data requirements.

MLOps Evolution to AIOps – In 2025, we’ll see the evolution from traditional MLOps to comprehensive AIOps platforms that manage the entire AI system lifecycle. These platforms will integrate sophisticated monitoring and automation capabilities for both models and infrastructure, enabling predictive maintenance and automatic optimization of AI systems. Teams will adopt practices that treat AI models as living systems rather than static deployments, with continuous learning and adaptation capabilities built into the deployment pipeline. This shift will require new tools and practices for version control, testing, and deployment that can handle the complexity of multi-modal models and distributed training environments.

Today Is World Backup Day

Posted in Commentary with tags on March 31, 2024 by itnerd

 World Backup Day 2024 is today. 

Founded in 2011 by Ismail Jadun, a digital strategy and research consultant, World Backup Day is an annual event aimed at raising awareness about the importance of regularly backing up personal and professional data to prevent data loss. The day encourages individuals and businesses to take the pledge to secure their data by creating copies in different locations, ensuring that important information is protected against unforeseen events.

Carl D’Halluin, CTO of Datadobi; Oleksandr Maidaniuk, VP of Technology at Intellias; Bin Fan, Chief Architect and VP of Open Source at Alluxio; and Molly Presley, SVP of Global Marketing at Hammerspace, had this to say about this important day: 

Carl D’Halluin, CTO, Datadobi

“This World Backup Day, I want to remind everyone that protecting your data with backups isn’t just a technical formality. Given the virtually unavoidable risks of ransomware, malicious or accidental deletions, and countless other threats – it’s absolutely crucial for the health of your business.

The first step? Get your arms around your data. You cannot protect it if you do not know what you have. Then…

A well-thought-out and tested data backup strategy, together with a combination of robust data security and management solutions, can significantly enhance operations resilience. Add to that the crucial but sometimes missed step of a “golden copy” (i.e., an immutable copy of your business-critical data in a secure and remote site) and your business will be protected today, as well as ideally positioned to support business continuity well into the future.”

Oleksandr Maidaniuk, VP of Technology, Intellias

“Data is the virtual lifeblood of today’s organizations, so as World Backup Day 2024 rolls around, we need to appreciate how crucial regular data backups are for keeping our businesses running without interruption, even in the face of a simple outage or a manmade or natural disaster.

Of course, implementing a seamless backup and disaster recovery (DR) strategy is easier said than done, due to the complicated interplay of technological, regulatory, and operational factors. The heterogeneous nature of data and technology platforms and the increasingly complicated and stringent compliance mandates combined with the need to minimize – if not eliminate – downtime requires a nuanced approach.

At the end of the day, it all boils down to knowing how to strike the perfect balance between protecting all our data thoroughly and using our resources wisely. This way, we can get back on our feet fast after any setback without disturbing our daily work. Savvy folks in data management understand that if we don’t have this kind of know-how already in our team, we might need to team up with a reliable partner. This partner should be all about giving businesses the latest, customized backup solutions that do more than just keep data safe; they should fit exactly with what we need and want to achieve. The ideal partner will be just that – a partner that acts as an extension of your internal capabilities – enabling you to leverage advanced technologies like cloud storage, automation, and AI and in doing so, enhance the resilience of your businesses, making data protection seamless and reliable. On World Backup Day and every day, let’s pledge to prioritize backup, DR, and business continuity to ensure our data remains safe, our operations resilient, and our future secure.”

Bin Fan, Chief Architect and VP of Open Source, Alluxio

“Every year, the amount of data we produce increases significantly. World Backup Day is a call to action, urging us to reconsider our strategies for simplifying backup and recovery to keep pace with the significant increase in data production each year.

As we scale data storage, timely data movement is a necessity, whether for archiving data in more economical storage or for duplicating data to another center as part of a disaster recovery plan. However, this process can be complex and operationally heavy. We should keep optimizing and streamlining data movement across multiple storage systems.

On this World Backup Day, let’s commit to exploring more efficient and effective ways to protect and manage our growing data, ensuring we’re prepared for any unforeseen circumstances that may arise.”

Molly Presley, SVP of Global Marketing, Hammerspace:

“On this World Backup Day, it’s important to remember the increasing role of automation in accurately identifying, protecting, and utilizing an organization’s data assets. In our current data-focused society, detailed, actionable metadata is crucial for utilizing data fully. However, managing vast amounts of unstructured data across various storage systems, locations, and multiple cloud platforms can be difficult and require significant time and effort. Furthermore, as the number of devices that generate data increases, relying solely on manual processes is time-consuming and risky.

Implementing global-level data protection services with automated policies allows organizations to identify newly created data across the entire data environment, automate data copy creation controls and data services, and ensure global data protection on any infrastructure as well as compliance with corporate governance requirements. Automated, global-level data protection empowers organizations to simplify their data management and unlock the full potential of their data. It will become the new norm for data protection.”

Here’s Some 2024 Predictions From Alluxio

Posted in Commentary with tags on December 22, 2023 by itnerd

Alluxio has the following 2024 Technology Predictions about major developments that are in the pipeline. Haoyuan Li, founder and CEO, describes what he sees in AI/ML, Data & Analytics, Cloud, DevOps and Storage in 2024.

AI/ML 

Compute Power is the New Oil

The soaring demand for GPUs has outpaced industry-wide supply, making specialized compute with the right configuration a scarce resource. Compute power has now become the new oil, and organizations are wielding it as a competitive edge. In 2024, we anticipate even greater innovation and adoption of technologies to enhance compute efficiency and scale capacity as AI workloads continue to explode. In addition, specialized AI hardware, like TPUs, ASICs, FPGAs and neuromorphic chips, will become more accessible.

Moving GenAI from Pilots to Production

GenAI is influencing organizations’ investment decisions. While early GenAI pilots show promise, most organizations remain cautious about full production deployment due to limited hands-on experience and rapid evolution. In 2023, most organizations ran small, targeted trials to assess benefits and risks carefully. As GenAI technologies mature and become more democratized through pre-trained models, cloud computing, and open-source tools, budget allocations will shift more heavily toward GenAI in 2024.

Balancing In-House and Vendor-Provided LLMs

To leverage the power of LLMs, organizations need to decide between building their own models, utilizing a closed-source model like GPT-4 via APIs, or fine-tuning a pre-trained open-source LLM. In 2024, as LLMs keep iterating, organizations will not want to be “locked in” to one model or one vendor. They will likely adopt a hybrid approach, balancing the use of pre-trained models with developing in-house custom models when there are tighter privacy, IP ownership, and security requirements. 

Green AI

In 2024, more organizations will recognize the pressing sustainability challenges posed by AI projects as adoption accelerates. Technological advancements like optimized data architectures, reduced data copies, and renewable energy tapping will help. However, technology alone is not enough. Organizations will also need to implement governance processes and human-centered values that ensure AI projects drive business value without negatively impacting the environment. Organizations that proactively embrace green AI principles in 2024 will gain a competitive advantage and build public trust.

Data & Analytics 

Overcoming Data Silo Challenges

Data silos remain a challenge for organizations, with analytics and AI systems spread across regions, clouds, and platforms, resulting in widespread data duplication and separate governance models. In 2024, to accelerate time-to-insights and scale analytics and AI initiatives, organizations will increasingly need to manage distributed data. More will develop data strategies for unified management of scattered data through flexible orchestration, abstraction, and virtualization.

Cloud 

Cloud Cost Optimization Will be More Strategic in 2024

In 2024, cloud cost optimization will become more strategic. Beyond tactical cost management, such as rightsizing and adopting spot instances, organizations will undertake more strategic evaluations and optimizations. These efforts will modernize and optimize cloud-deployed systems for cost-efficiency, with some workloads potentially reverting to on-premises. Cloud ROI depends on holistic optimization spanning architecture designs, cost monitoring, negotiations with cloud vendors, and continuous re-evaluation.

Hybrid and Multi-cloud Acceleration

In 2024, the adoption of hybrid and multi-cloud strategies is expected to accelerate, both for strategic and tactical reasons. From a strategic standpoint, organizations will aim to avoid vendor lock-in and will want to retain sensitive data on-premises while still utilizing the scalable resources offered by cloud services. Tactically, due to the continued scarcity of GPUs, companies will seek to access GPUs or specific resources and services that are unique to certain cloud providers. A seamless combination of cross-region and cross-cloud services will become essential, enabling businesses to enhance performance, flexibility, and efficiency without compromising data sovereignty.

DevOps

The Integration of DevOps and MLOps to Streamline AI Projects

In 2024, MLOps will increasingly integrate with DevOps to create more streamlined workflows for AI projects. The combination of MLOps and DevOps creates a set of processes and automated tools for managing data, code, and models, enhancing the efficiency of machine learning platforms. Data scientists and software developers will be free to move on to high-value projects without manually overseeing models. The trend is driven by the push to streamline the delivery of models to production and reduce time-to-value.

Storage

From Specialized Storage to Optimized Commodity Storage for AI Platforms

The growth of AI workloads has driven the adoption of specialized high-performance computing (HPC) storage optimized for speed and throughput. But in 2024, we expect a shift towards commoditized storage. Cloud object stores, NVMe flash, and other storage solutions will be optimized for cost-efficient scalability. The high cost and complexity of specialized storage will give way to flexible, cheaper, easy-to-manage commodity storage tailored for AI needs, allowing more organizations to store and process data-intensive workloads using cost-effective solutions.