Looking to accelerate AI? Start with the right mix of storage

  • February 13, 2024 | Author: Peter Krass

That’s right, storage might be the solution to speeding up your AI systems.

Why? Because today’s AI and HPC workloads demand a delicate storage balance. On the one hand, they need flash storage for high performance. On the other, they also need object storage for data that, though large, is used less frequently.

Supermicro and AMD are here to help with a reference architecture that’s been tested and validated at customer sites.

Called the Scale-Out Storage Reference Architecture, it offers a way to deliver massive amounts of data at high bandwidth and low latency to data-intensive applications. The architecture also defines how to manage data life-cycle concerns, including migration and cold-storage retention.

At a high level, Supermicro’s reference architecture addresses three important demands for AI and HPC storage:

  • Data lake: It needs to be large enough for all current and historical data.
  • All-flash storage tier: Caches input for application servers and delivers high bandwidth to meet demand.
  • Specialized application servers: Offering support that ranges from AMD EPYC high-core-count CPUs to GPU-dense systems.

Tiers for less tears

At this point, you might be wondering how one storage system can provide both high performance and vast data stores. The answer: Supermicro’s solution offers a storage architecture in 3 tiers:

  • All flash: Stores active data that needs the highest speeds of storage and access. This typically accounts for just 10% to 20% of an organization’s data. For the highest bandwidth networking, clusters are connected with either 400 GbE or 400 Gbps InfiniBand. This tier is supported by the Weka data platform, a distributed parallel file system that connects to the object tier.
  • Object: Long-term, capacity-optimized storage. Essentially, it acts as a cache for the application tier. These systems offer high-density drives with relatively low bandwidth and networking typically in the 100 GbE range. This tier is managed by Quantum ActiveScale Object Storage Software, a scalable, always-on, long-term data repository.
  • Application: This is where your data-intensive workloads, such as machine-learning training, reside. This tier uses 400 Gbps InfiniBand networking to access data in the all-flash tier.
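
As a rough illustration of the 10% to 20% figure above, here is a short Python sketch that estimates how much all-flash capacity a given data set might call for. The function name and the 10 PB example are hypothetical, and the sketch assumes the flash tier holds only the active share of the data; real sizing depends on the workload.

```python
def flash_tier_range_pb(total_pb, low=0.10, high=0.20):
    """Return the (min, max) all-flash capacity in PB for a data set.

    Assumption: active data is 10% to 20% of the total, per the article.
    """
    if total_pb < 0:
        raise ValueError("total_pb must be non-negative")
    return total_pb * low, total_pb * high


# Hypothetical example: an organization with 10 PB of total data.
min_pb, max_pb = flash_tier_range_pb(10)
print(f"All-flash tier: {min_pb:.1f} to {max_pb:.1f} PB")
```

The rest of the data would sit in the capacity-optimized object tier.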

What’s more, the entire architecture is modular, meaning you can adjust the capacity of each tier to suit customer needs. You can also swap in different kinds of products, for example open-source or commercial software.

To give you an idea of what’s possible, here’s a real-life example. One of the world’s largest semiconductor makers has deployed the Supermicro reference architecture. Its goal: use AI to automate the detection of chip-wafer defects. Using the reference architecture, the company was able to fill a software installation with 25 PB of data in just 3 weeks, according to Supermicro.
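
To put that figure in perspective, the Python sketch below works out the average ingest rate it implies. It is back-of-the-envelope arithmetic, not a Supermicro benchmark: it assumes decimal petabytes (10^15 bytes), exactly 21 days, and a load that runs around the clock. The function name is hypothetical.

```python
SECONDS_PER_DAY = 86_400


def sustained_rate_gb_per_s(petabytes, days):
    """Average rate in GB/s needed to move `petabytes` in `days`.

    Assumes decimal units (1 PB = 1e15 bytes, 1 GB = 1e9 bytes)
    and a continuous, uninterrupted transfer.
    """
    total_bytes = petabytes * 1e15
    return total_bytes / (days * SECONDS_PER_DAY) / 1e9


rate = sustained_rate_gb_per_s(25, 21)
print(f"Average ingest rate: {rate:.1f} GB/s")  # about 13.8 GB/s
```

Any pauses in the load would push the required peak rate higher than this average.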

Storage galore

Supermicro offers more than just the reference architecture. The company also offers storage servers powered by the latest AMD EPYC processors. These servers can deliver flash storage that is ideal for active data. And they can handle high-capacity storage on physical disks.

That includes the Supermicro Storage A+ Server ASG-2115S-NE332R. It’s a 2U rackmount device powered by an AMD EPYC 9004 series processor with 3D V-Cache technology.

This storage server has 32 bays for E3.S hot-swap NVMe drives. (E3.S is a form factor designed to optimize the flash density of SSDs.) The server’s total storage capacity comes to an impressive 480 TB. It also offers native PCIe 5.0 performance.
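
For a sense of what those numbers mean per drive, here is a quick Python calculation. It assumes the 480 TB total is spread evenly across all 32 bays, which the article does not state.

```python
BAYS = 32        # drive bays, per the article
TOTAL_TB = 480   # total capacity in TB, per the article

# Assumption: every bay holds a drive of the same size.
per_bay_tb = TOTAL_TB / BAYS
print(f"{per_bay_tb:.0f} TB per bay")  # 15 TB per bay
```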

Of course, every organization has unique workloads and requirements. Supermicro can help you here, too. Its engineering team stands ready to help you size, design and implement a storage system optimized to meet your customers’ performance and capacity demands.
