How constant frustration drives storage developments

Comment Storage is always in flux, from underlying media to new ways to access on-drive storage and specialized large language model databases. Workloads need more data. They need it faster. They need it for less money. And these are the three frustrations that have spurred the 18 developments that are transforming the storage landscape and making it possible to store more data, and access it faster and more cheaply.

We’ll start at the media level. Solidigm signalled the dawn of the very high capacity SSD movement with its P5336 61.44TB drive in July 2023, which used QLC NAND. This was followed by Samsung’s BM1743 last July, also using QLC flash with a 61.44TB capacity. Solidigm and Phison then upped the stakes, both launching 128TB-class SSDs (122.88TB usable) in November, Solidigm using the PCIe gen 4 bus while Phison chose the faster PCIe gen 5. And, of course, we have Pure Storage with its proprietary flash drives offering 150TB of capacity, with 300TB on its roadmap.

High-capacity NAND could also get closer to processors. Western Digital’s soon-to-be-spun-off Sandisk is talking about a High Bandwidth Flash (HBF) concept, with NAND dies stacked in layers above an interposer unit connecting them to a GPU, or other processor, in the same way as High Bandwidth Memory.

PLC (penta-level cell) NAND, with its 5 bits/cell capacity, is not being productized for now. Increases in 3D NAND layer count, past the 300-layer level, provide sufficient capacity headroom with QLC (4 bits/cell) technology.

Processors could escape memory capacity limits by using 3D DRAM, if it gets commercialized. Research and development projects in this area are multiplying, with Neo Semiconductor being one example. The Compute Express Link (CXL) area is fast developing as well, with processors gaining the ability to access shared external pools of memory – witness UnifabriX and Panmnesia. We await widespread use of the PCIe gen 5 and 6 buses and CXL-capable system software.

There is a risk that CXL external memory pooling could fail to cross the chasm into mainstream adoption, like composable systems, as the actual benefits may turn out to be not that great.

The legacy hard disk drive market is undergoing a transition as well, to heat-assisted magnetic recording (HAMR) and raw platter capacities of 3.2TB and above. This needs new formulations of magnetic recording media that stably retain very small bit areas at room temperature, with laser heating of each bit area during writes to lower its resistance to magnetic polarity change. Seagate’s transition to HAMR is well underway, after many years of development, while Western Digital has just signalled its migration to HAMR in a year’s time. We understand the third HDD manufacturer, Toshiba, will follow suit.

We are looking at disk drives heading to 40TB-plus capacities in the next few years, and retaining a 6x TCO advantage over SSDs out to 2030, if Western Digital is right in its assumptions.
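As a rough illustration of the arithmetic involved, here is a minimal sketch relating per-platter capacity to drive capacity. The 3.2TB-per-platter figure comes from the HAMR transition above; the ten-platter stack is an assumption about a typical 3.5-inch drive, not a vendor specification.

```python
# Rough, illustrative arithmetic only.
platters = 10                 # assumed platter count for a 3.5-inch drive
tb_per_platter = 3.2          # per-platter capacity cited above

print(f"Drive capacity: {platters * tb_per_platter:.1f} TB")      # ~32 TB
print(f"Per-platter TB needed for a 40TB drive: {40 / platters:.1f}")  # ~4 TB
```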

New forms of laser-written storage on glass or ceramic platters are being developed, with the hope of providing faster archival data access than tape plus an even longer life. We have Cerabyte with its ceramic platters, which has picked up an investment from Pure Storage, Optera with fluorescence-based storage, and Folio Photonics. Tape still has capacity advances in the LTO roadmap, but there is just one drive manufacturer, IBM, and, overall, tape is a complacent legacy technology that could be disrupted if one of these laser-written technologies succeeds.

SSD data access at the file level has been accelerated with Nvidia’s GPUDirect Storage protocol, in which data is transferred by RDMA directly from a drive into a GPU server’s memory with no storage array controller or x86 server CPU involvement. Now the same technique is being used in GPUDirect for objects, and promises to make the data retained in SSD-based object storage systems available to GPUs for LLM (Large Language Model) training and inference. Suppliers like Cloudian, Scality, and MinIO are pressing ahead with fast object data access, as is VAST Data.

This will probably encourage some object storage systems to migrate to SSD storage and away from HDDs.
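As a sketch of what this looks like from application code, the snippet below uses Nvidia’s KvikIO Python bindings for GPUDirect Storage to read a file straight into GPU memory. The file path is hypothetical and the exact API surface may vary between KvikIO versions; treat this as an assumption-laden illustration, not a reference implementation.

```python
import cupy
import kvikio

# Destination buffer allocated in GPU memory
buf = cupy.empty(1024 * 1024, dtype=cupy.uint8)

# Read a (hypothetical) file on a local NVMe SSD directly into GPU memory,
# avoiding the usual bounce buffer in host RAM
f = kvikio.CuFile("/mnt/nvme/training_shard0.bin", "r")
nbytes = f.read(buf)
f.close()

print(f"{nbytes} bytes transferred into GPU memory")
```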

A new scale-out storage array architecture has been pioneered by VAST Data: Disaggregated Shared Everything (DASE), with separate controllers and storage nodes linked across an NVMe fabric and all controllers able to see all the drives. HPE has its own take on DASE with its Alletra MP X10000, Quantum’s Myriad OS is comparable array software, and NetApp has an internal ONTAP Data Platform for AI development.

Such systems scale out much further than traditional clusters and can provide high levels of performance. With parallel NFS, for example, they can reach parallel file system speeds.
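A toy model of the shared-everything idea is sketched below. It assumes nothing about any vendor’s actual software; the point it illustrates is simply that when every controller can reach every drive over the fabric, any controller can serve any volume, and controllers and capacity can be scaled independently.

```python
# Drives reachable by every controller over the (simulated) NVMe fabric
DRIVES = {f"nvme{i}": {} for i in range(8)}

class Controller:
    """A stateless controller: it owns no drives exclusively."""
    def __init__(self, name):
        self.name = name

    def _place(self, volume, block):
        # Simplistic deterministic placement so all controllers agree
        return f"nvme{hash((volume, block)) % len(DRIVES)}"

    def write(self, volume, block, data):
        DRIVES[self._place(volume, block)][(volume, block)] = data

    def read(self, volume, block):
        return DRIVES[self._place(volume, block)].get((volume, block))

c1, c2 = Controller("ctrl-1"), Controller("ctrl-2")
c1.write("vol0", 42, b"hello")
print(c2.read("vol0", 42))   # a different controller sees the same data: b'hello'
```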

Another storage development is the spread of key-value storage technology to underlie traditional storage protocols such as block, file, and object. HPE (X10000 software) and VAST Data are active in this area. The resulting systems can, in theory, have any data access protocol layered on top of a KV engine and provide faster protocol data access than systems with one or more access protocols implemented on top of, for example, object storage. Ceph comes to mind here.
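Here is a minimal sketch of that layering idea, with a Python dict standing in for the key-value engine and deliberately simplified block and object "personalities" on top. Real systems add persistence, sharding, metadata, and much more; every name here is illustrative.

```python
kv = {}   # stand-in for the underlying key-value engine

class BlockLayer:
    """Block protocol personality: addresses data by logical block address."""
    def write_block(self, lba: int, data: bytes):
        kv[("block", lba)] = data
    def read_block(self, lba: int) -> bytes:
        return kv[("block", lba)]

class ObjectLayer:
    """Object protocol personality: addresses data by bucket and object name."""
    def put(self, bucket: str, name: str, data: bytes):
        kv[("object", bucket, name)] = data
    def get(self, bucket: str, name: str) -> bytes:
        return kv[("object", bucket, name)]

BlockLayer().write_block(7, b"\x00" * 512)
ObjectLayer().put("backups", "db.dump", b"...")
print(len(kv))   # both protocols land in the same KV engine: 2
```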

Public cloud block storage is being accelerated by using ephemeral instances, with Silk and Volumez providing software to achieve this. Silk is the most mature and is focussed solely on accelerating databases, such as Oracle, in the cloud. Volumez has lately developed a strong focus on the generative AI use case. A third cloud block storage startup, Lucidity, aims to dynamically optimize cloud block storage and save costs.

GPU server access to stored data is being accelerated by moving the data from shared external storage to a GPU server’s local drives – so-called tier zero storage. Data on these drives can be loaded into the GPU server’s memory faster than data from external storage arrays. This concept is being heavily promoted by Hammerspace, and a Microsoft Azure AI supercomputer uses it as well, with checkpoint data going to these drives.
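A minimal sketch of the tier zero pattern, with hypothetical local and shared paths: checkpoint to the GPU server’s local NVMe drive first so training can resume quickly, then drain the copy to external storage in the background. This is an illustration of the idea, not any vendor’s mechanism.

```python
import shutil
import threading

LOCAL_TIER0 = "/mnt/local_nvme/ckpt_step1000.bin"       # GPU server's local drive (assumed path)
SHARED_STORE = "/mnt/external_array/ckpt_step1000.bin"  # external storage array (assumed path)

def save_checkpoint(state: bytes):
    # Fast local write: the training job can continue as soon as this returns
    with open(LOCAL_TIER0, "wb") as f:
        f.write(state)
    # Copy the checkpoint out to shared storage asynchronously
    threading.Thread(target=shutil.copyfile, args=(LOCAL_TIER0, SHARED_STORE)).start()
```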

Hammerspace is an example of another storage industry development, data orchestration, in which data – files or objects – is made available to data centers, remote sites, or edge devices from within a global namespace, with distributed metadata and some kind of caching playing a role in making distant data access seem local. Arcitecta is another supplier of data orchestration software. We expect data management suppliers, such as Komprise, and cloud file services players derived from the old sync ’n share collaboration software technology, like CTERA, Nasuni, and Panzura, to enter this field as well.
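As an illustration of the orchestration idea, here is a toy global namespace in which metadata records where each file currently lives and reads are served from a local cache when possible. All paths, site names, and functions are hypothetical.

```python
# Global namespace: path -> site currently holding the data (illustrative entries)
METADATA = {
    "/projects/genomics/run42.dat": "site-london",
    "/projects/genomics/run43.dat": "site-local",
}
LOCAL_CACHE = {}

def fetch_from_site(site: str, path: str) -> bytes:
    # Stand-in for a WAN transfer from the remote site
    return f"<data for {path} pulled from {site}>".encode()

def read(path: str) -> bytes:
    if path in LOCAL_CACHE:                        # distant data already cached locally
        return LOCAL_CACHE[path]
    data = fetch_from_site(METADATA[path], path)   # otherwise fetch it and cache it
    LOCAL_CACHE[path] = data
    return data

print(read("/projects/genomics/run42.dat")[:30])
```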

We are seeing the rise of vector database storage, holding multi-dimensional vectors – embeddings – of unstructured data items, such as word fragments and words, parts of images, videos, and audio recordings. Such vectors are used by LLMs in their semantic search activities, and dedicated vector database suppliers such as Pinecone and Zilliz have started up. They say they offer best-of-breed facilities, whereas multi-data-type database suppliers, such as SingleStore, are compromised because they can’t focus everything on optimising for vector access.
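At its core, a vector database stores embedding vectors and returns the ones nearest to a query, typically by cosine similarity. A minimal sketch, using random stand-in embeddings rather than output from a real embedding model:

```python
import numpy as np

rng = np.random.default_rng(0)
chunks = ["invoice text", "product photo caption", "support call transcript"]

# One embedding per data item (random stand-ins here), normalised for cosine similarity
vectors = rng.normal(size=(len(chunks), 384))
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)

def search(query_vec: np.ndarray, k: int = 2):
    query_vec = query_vec / np.linalg.norm(query_vec)
    scores = vectors @ query_vec                     # cosine similarity against every item
    top = np.argsort(scores)[::-1][:k]               # indices of the k closest items
    return [(chunks[i], float(scores[i])) for i in top]

print(search(rng.normal(size=384)))
```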

AI data pipeline technology is being developed to find, select, filter, and feed data to LLMs for training and inference. Many suppliers are developing this capability, including Komprise, Hammerspace, and the vector database companies, both dedicated and multi-type.
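A minimal sketch of one pipeline stage, with illustrative directory, file-type, and chunk-size choices: find candidate files, filter them, split them into chunks, and hand the chunks on for embedding or training.

```python
from pathlib import Path

def pipeline(root: str, exts=(".txt", ".md"), chunk_chars=1000):
    for path in Path(root).rglob("*"):                  # find candidate files
        if path.suffix not in exts or path.stat().st_size == 0:
            continue                                     # select / filter
        text = path.read_text(errors="ignore")
        for i in range(0, len(text), chunk_chars):       # split into chunks
            yield {"source": str(path), "chunk": text[i:i + chunk_chars]}

# Usage (hypothetical corpus directory; each record would be fed to an
# embedding model or training job):
# for record in pipeline("/data/corpus"):
#     handle(record)
```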

AI, as we can see, is transforming the storage industry. It is affecting backup suppliers as they see their vast troves of backup data becoming a great resource for an organization’s Retrieval-Augmented Generation (RAG)-influenced LLMs, with RAG aimed at making generally trained LLMs applicable to proprietary data sources and less likely to produce inaccurate responses (hallucinate).
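A minimal sketch of the RAG pattern just described, with a crude word-overlap retriever standing in for the embedding search a real system would use: relevant stored chunks are retrieved and prepended to the prompt so the LLM answers from proprietary data rather than guessing. The corpus and question are made-up examples.

```python
def retrieve(question: str, corpus: list[str], k: int = 2) -> list[str]:
    # Crude relevance score: number of words shared between question and chunk
    q_words = set(question.lower().split())
    scored = sorted(corpus, key=lambda c: -len(q_words & set(c.lower().split())))
    return scored[:k]

corpus = [
    "Q3 backup policy: retain 90 days",
    "VPN setup guide for contractors",
    "Q3 revenue was $4M, up 12 percent",
]
question = "What was Q3 revenue?"

context = "\n".join(retrieve(question, corpus))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)   # this prompt would then be sent to the LLM of choice
```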

Backup of SaaS application data is being developed and promoted by Commvault, HYCU, Rubrik, Salesforce, Veeam, and many others, and will, we think, grow and grow and grow.

Another development that is ongoing, but happening at a slower rate, is disaggregated or Web3 storage, in which an on-premises data center’s spare storage capacity is made available to a public cloud storage company, which sells it at a substantially lower price than mainstream public cloud storage such as AWS S3 or Azure Blob. Suppliers such as Storj and Cubbit are active here.

Lastly, the data protection industry may have started showing signs of consolidation, with Cohesity buying Veritas. We stress the “may.”

The storage industry is multi-faceted and fast-developing, because the amount of data to be stored is rising and rising, as are its costs, and the speed of access is a constant obstacle to getting processor work done. It is these three pressures that cause the constant frustration which drives businesses to improve and reinvent storage.