
Performance Intensive Computing

Capture the full potential of IT


AMD presents its vision for the AI future: open, collaborative, for everyone

Check out the highlights of AMD’s Advancing AI event—including new GPUs, software and developer resources.


AMD advanced its AI vision at the “Advancing AI” event on June 12. The event, held live in the Silicon Valley city of San Jose, Calif., as well as online, featured presentations by top AMD executives and partners.

As many of the speakers made clear, AMD’s vision for AI is that it be open, developer-friendly, collaborative and useful to all.

AMD certainly believes the market opportunity is huge. During the day’s keynote, CEO Lisa Su said AMD now believes the total addressable market (TAM) for data-center AI will exceed $500 billion by as soon as 2028.

And that’s not all. Su also said she expects AI to move beyond the data center, finding new uses in edge computers, PCs, smartphones and other devices.

To deliver on this vision, Su explained, AMD is taking a three-pronged approach to AI:

  • Offer a broad portfolio of compute solutions.
  • Invest in an open development ecosystem.
  • Deliver full-stack solutions via investments and acquisitions.

The event, lasting over two hours, was also filled with announcements. Here are the highlights.

New: AMD Instinct MI350 Series

At the Advancing AI event, CEO Su formally announced the company’s AMD Instinct MI350 Series GPUs.

There are two models, the MI350X and MI355X. Though both are based on the same silicon, the MI355X supports higher thermals.

These GPUs, Su explained, are based on AMD’s 4th gen Instinct architecture, and each GPU comprises 10 chiplets containing a total of 185 billion transistors. The new Instinct solutions can be used for both AI training and AI inference, and they can also be configured in either liquid- or air-cooled systems.

Su said the MI355X delivers a massive 35x generational increase in AI performance over the previous-generation Instinct MI300. For AI training, the Instinct MI355X offers up to 3x more throughput than the Instinct MI300. And compared with a leading competitive GPU, the new AMD GPU can generate up to 40% more tokens per dollar.

AMD’s event also featured several representatives of companies already using AMD Instinct MI300 GPUs. They included Microsoft, Meta and Oracle.

Introducing ROCm 7 and AMD Developer Cloud

Vamsi Boppana, AMD’s senior VP of AI, announced ROCm 7, the latest version of AMD’s open-source AI software stack. ROCm 7 features improved support for industry-standard frameworks; expanded hardware compatibility; and new development tools, drivers, APIs and libraries to accelerate AI development and deployment.

Earlier in the day, CEO Su said AMD’s software efforts “are all about the developer experience.” To that end, Boppana introduced the AMD Developer Cloud, a new service designed for rapid, high-performance AI development.

He also said AMD is giving developers a 25-hour credit on the Developer Cloud with “no strings.” The new AMD Developer Cloud is generally available now.

Road Map: Instinct MI400, Helios rack, Venice CPU, Vulcano NIC

During the last segment of the AMD event, Su gave attendees a sneak peek at several forthcoming products:

  • Instinct MI400 Series: This GPU is being designed for both large-scale AI inference and training. It will be the heart of the Helios rack solution (see below) and provide what Su described as “the engine for the next generation of AI.” Expect performance of up to 40 petaflops, 432GB of HBM4 memory, and bandwidth of 19.6TB/sec.
  • Helios: The code name for a unified AI rack solution coming in 2026. As Su explained it, Helios will be a rack configuration that functions like a single AI engine, incorporating AMD’s EPYC CPU, Instinct GPU, Pensando Pollara network interface card (NIC) and ROCm software. Specs include up to 72 GPUs in a rack and 31TB of HBM3 memory.
  • Venice: This is the code name for the next generation of AMD EPYC server CPUs, Su said. They’ll be built on a 2nm process node, feature up to 256 cores, and offer a 1.7x performance boost over the current generation.
  • Vulcano: A future NIC, it will be built on a 3nm process and deliver speeds of up to 800Gb/sec.

Tech Explainer: What’s a NIC? And how can it empower AI?

With the acceleration of AI, the network interface card is playing a new, leading role.


The humble network interface card (NIC) is getting a status boost from AI.

At a fundamental level, the NIC enables one computing device to communicate with others across a network. That network could be a rendering farm run by a small multimedia production house, an enterprise-level data center, or a global network like the internet.

From smartphones to supercomputers, most modern devices use a NIC for this purpose. On laptops, phones and other mobile devices, the NIC typically connects via a wireless antenna. For servers in enterprise data centers, it’s more common to connect the hardware infrastructure with Ethernet cables.

Each NIC—or NIC port, in the case of an enterprise NIC—has its own media access control (MAC) address. This unique identifier enables the NIC to send and receive the packets addressed to it. Each packet, in turn, is a small chunk of a much larger data set; breaking data into packets is what lets it move across the network at high speed.
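Curious what that identifier looks like in practice? Here’s a minimal Python sketch, using only the standard library, that reads and formats the local machine’s MAC address (note that uuid.getnode() can fall back to a random value on systems where no hardware address is found):

    import uuid

    # uuid.getnode() returns the host's 48-bit hardware (MAC) address as an integer.
    mac = uuid.getnode()

    # Format it as the familiar six colon-separated octets, e.g. '3c:7d:0a:12:34:56'.
    print(":".join(f"{(mac >> shift) & 0xff:02x}" for shift in range(40, -1, -8)))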

Networking for the Enterprise

At the enterprise level, everything needs to be highly capable and powerful, and the NIC is no exception. Organizations operating full-scale data centers rely on NICs to do far more than just send emails and sniff packets (the term used to describe how a NIC “watches” a data stream, collecting only the data addressed to its MAC address).

Today’s NICs are also designed to handle complex networking tasks onboard, relieving the host CPU so it can work more efficiently. This process, known as smart offloading, relies on several functions:

  • TCP segmentation offloading: This breaks big data into small packets.
  • Checksum offloading: Here, the NIC independently checks for errors in the data. (A minimal software version of this check appears after this list.)
  • Receive side scaling: This helps balance network traffic across multiple processor cores, preventing them from getting bogged down.
  • Remote Direct Memory Access (RDMA): This process bypasses the CPU and sends data directly to GPU memory.
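
To make the second item above concrete, here’s a short Python sketch of the classic RFC 1071 ones’-complement checksum used by IP, TCP and UDP. This is the kind of arithmetic that checksum offloading moves off the CPU and onto the NIC; it’s an illustration of the calculation, not AMD driver code:

    import struct

    def internet_checksum(data: bytes) -> int:
        # RFC 1071: ones'-complement sum of 16-bit words, carries folded back in.
        if len(data) % 2:
            data += b"\x00"          # pad odd-length input with a zero byte
        total = 0
        for (word,) in struct.iter_unpack("!H", data):
            total += word
            total = (total & 0xFFFF) + (total >> 16)   # fold any carry back in
        return ~total & 0xFFFF                          # ones' complement of the sum

    print(hex(internet_checksum(b"example payload")))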

Important as these capabilities are, they become even more vital when dealing with AI and machine learning (ML) workloads. By taking pressure off the CPU, modern NICs enable the rest of the system to focus on running these advanced applications and processing their scads of data.

This symbiotic relationship also helps lower a server’s operating temperature and reduce its power usage. By relieving the CPU of networking chores, the NIC raises efficiency throughout the entire system.

Enter the AI NIC

Countless organizations both big and small are clamoring to stake their claims in the AI era. Some are creating entirely new AI and ML applications; others are using the latest AI tools to develop new products that better serve their customers.

Either way, these organizations must deal with the challenges now facing traditional Ethernet networks in AI clusters. Remember, Ethernet was invented over 50 years ago.

AMD has a solution: a revolutionary NIC created specifically for AI workloads, the AMD AI NIC. Recently released, this card is designed to provide the intense communication capabilities demanded by AI and ML models. That includes tightly coupled parallel processing, rapid data transfers and low-latency communications.

AMD says its AI NIC offers a significant advancement in addressing the issues IT managers face as they attempt to reconcile the broad compatibility of an aging network technology with modern AI workloads. It’s a specialized network accelerator explicitly designed to optimize data transfer within back-end AI networks for GPU-to-GPU communication.

To address the challenges of AI workloads, what’s needed is a network that can support distributed computing over multiple GPU nodes with low jitter and RDMA. The AMD AI NIC is designed to manage the unique communication patterns of AI workloads and offer high throughput across all available links. It also offers congestion avoidance, reduced tail latency, scalable performance, and fast job-completion times.

Validated NIC

Following rigorous validation by the engineers at Supermicro, the AMD AI NIC is now supported on the Supermicro 8U GPU Server (AS -8126GS-TNMR). This behemoth is designed specifically for AI, deep learning, high-performance computing (HPC), industrial automation, retail and climate modeling.

In this configuration, AMD’s smart AI-focused NIC can offload networking tasks. This lets the Supermicro SuperServer’s dual AMD EPYC 9000-series processors run at even higher efficiency.

In the Supermicro server, the new AMD AI NIC occupies one of the myriad PCI Express x16 slots. Other optional high-performance PCIe cards include a CPU-to-GPU interconnect and up to eight AMD Instinct GPU accelerators.

In the NIC of time

A chain is only as strong as its weakest link. The chain that connects our ever-expanding global network of AI operations is strengthened by the advent of NICs focused on AI.

As NICs grow more powerful, these advanced network interface cards will help fuel the expansion of the AI/ML applications that power our homes, offices, and everything in between. They’ll also help us bypass communication bottlenecks and speed time to market.

For SMBs and enterprises alike, that’s good news indeed.

Meet AMD’s new EPYC CPUs for SMBs—and Supermicro servers that support them

AMD introduced the AMD EPYC 4005 series processors for SMBs and cloud service providers. And Supermicro announced that the new AMD processors are now shipping in several of its servers.


AMD this week introduced the AMD EPYC 4005 series processors. These are purpose-built CPUs designed to bring enterprise-level features and performance to small and medium businesses.

And Supermicro, wasting no time, also announced that several of its servers are now shipping with the new AMD EPYC 4005 CPUs.

EPYC 4005

The new AMD EPYC 4005 series processors are intended for on-prem users and cloud service providers who need powerful but cost-effective solutions, including servers in a compact 3U form factor.

Target customers include SMBs, departmental and branch-office server users, and hosted IT service providers. Typical workloads for servers powered by the new CPUs will include general-purpose computing, dedicated hosting, code development, retail edge deployments, and content creation, AMD says.

“We’re delivering the right balance of performance, simplicity, and affordability,” says Derek Dicker, AMD’s corporate VP of enterprise and HPC. “That gives our customers and system partners the ability to deploy enterprise-class solutions that solve everyday business challenges.”

The new processors feature AMD’s ‘Zen 5’ core architecture and come in a single-socket package. Depending on model, they offer anywhere from 6 to 16 cores; up to 192GB of dual-channel DDR5 memory; 28 lanes of PCIe Gen 5 connectivity; and boost clocks of up to 5.7 GHz. One model in the AMD EPYC 4005 line also includes integrated AMD 3D V-Cache technology for a larger 128MB L3 cache and lower latency.

On a standard 42U rack, servers powered by AMD EPYC 4005 can provide up to 2,080 cores (that’s 13 3U servers x 10 nodes/server x 16 cores/node). That level of density can reduce a user’s space requirements while also lowering their TCO.
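As a quick sketch, the rack arithmetic in that parenthetical works out like this:

    servers_per_rack = 13   # 3U MicroCloud servers fit 13 to a standard 42U rack (39U)
    nodes_per_server = 10   # 10-node MicroCloud configuration
    cores_per_node   = 16   # top-end AMD EPYC 4005 SKU
    print(servers_per_rack * nodes_per_server * cores_per_node)   # 2,080 cores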

The new AMD CPUs follow the AMD EPYC 4004 series, introduced this time last year. The EPYC 4004 processors, still available from AMD, use the same AM5 socket as the 4005s.

Supermicro Servers

Also this week, Supermicro announced that several of its servers are now shipping with the new AMD EPYC 4005 series processors. Supermicro also introduced a new MicroCloud 3U server that’s available in 10-node and 5-node versions, both powered by the AMD EPYC 4005 CPUs.

"Supermicro continues to deliver first-to-market innovative rack-scale solutions for a wide range of use cases,” says Mory Lin, Supermicro’s VP of IoT, embedded and edge computing.

Like the AMD EPYC 4005 CPUs, the Supermicro servers are intended for SMBs, departmental and branch offices, and hosted IT service providers.

The new Supermicro MicroCloud 10-node server features single-socket AMD processors (your choice of either 4004 or the new 4005) as well as support for one single-width GPU accelerator card.

Supermicro’s new 5-node MicroCloud server also offers a choice of AMD EPYC 4004 or 4005 series processor. In contrast to the 10-node server, the 5-node version supports one double-width GPU accelerator card.

Supermicro has also added support for the new AMD EPYC 4005 series processors to several of its existing server lines. These servers include 1U, 2U and tower servers.

Have SMB, branch or hosting customers looking for affordable compute power? Tell them about the new AMD EPYC 4005 series processors and the Supermicro servers now shipping with them.

 

To make room for AI, modernize your data center

A new report finds the latest AMD-powered Supermicro servers can modernize the data center, lowering TCO and making room for AI systems.


Did you know that dramatic improvements in processor power can enable your corporate customers to lower their total cost of ownership (TCO) by consolidating servers and modernizing their data centers?

Server consolidation is a hot topic in the context of AI. Many data centers are full and running with all the power that’s available. So how can they make room for new AI systems? Also, how can they get the kind of power that today’s AI systems require?

One answer: consolidation.

Four in One

All this is especially relevant in light of a new report from Principled Technologies.

The report, prepared for AMD, finds that an organization that upgrades to new Supermicro servers powered by the current 5th generation AMD EPYC processors can consolidate servers on a 4:1 ratio.

In other words, the level of performance that previously required four older servers can now be delivered with just one.

Further, Principled found that organizations that make this upgrade can also free up data-center space; lower operating costs by up to $2.8 million over five years; shrink power-consumption levels; and reduce the maintenance load on sys admins.

Testing Procedures

Here’s how Principled figured all this out. To start, they obtained two systems: an older legacy server and a new Supermicro server powered by 5th Gen AMD EPYC processors.

Next, Principled’s researchers compared the transactional database performance of the two servers. They did this with HammerDB TPROC-C, an open-source benchmarking tool for online transaction processing (OLTP) workloads.

To ensure the systems were sufficiently loaded, Principled also measured both servers’ CPU and power utilization rates, pushing both servers to 80% CPU core utilization.

Then Principled calculated a consolidation ratio. That is, how many of the older servers would be needed to do the same level of work as just one new server?

Finally, Principled calculated the expected 5-year costs for software licensing, power, space and maintenance. These calculations were made for both the older and new Supermicro servers, so they could be compared.

The Results

So what did Principled find? Here are the key results:

  • Performance upgrades: The new server, based on AMD 5th Gen EPYC processors, is much more powerful. To match the database performance of just one new server, the testers required four of the older servers.
  • Lower operating costs: Consolidating those four older servers onto just one new server could lower an organization’s TCO by over 60%, saving up to an estimated $2.8 million over five years. The estimated 5-year TCO for the legacy server was $4.68 million, compared with $1.78 million for the new system. (The sketch after this list reproduces these percentages.)
  • Lower software license costs: Much of the savings would come from consolidating software licenses. They’re typically charged on a per-core basis, and the new test server needed only about a third as many cores as did the four older systems: 96 cores on the new system, compared with a total of 256 cores on the four older servers.
  • Reduced power consumption: To run the same benchmark, the new system needed only about 40% of the power required by the four older servers.
  • Lower space and cooling requirements: Space savings were calculated by comparing data-center footprint costs, taking into account the 4:1 consolidation and the rack space needed; cooling costs were factored in, too. The percentage savings were dramatic, even if the absolute dollar figures were small: the new system’s space costs were just $476, or 75% lower than the legacy system’s $1,904.
  • Reduced maintenance costs: This was estimated with the assumption that one full-time sys admin with an annual salary of roughly $100K is responsible for 100 servers. The savings here brought a cost of over $26K for the older setup down to about $6,500 for the new, for a reduction of 75%.
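
If you want to sanity-check those percentages, here’s a quick Python sketch using the rounded figures cited above (approximations drawn from this article, not Principled’s exact inputs):

    # Five-year figures as reported above (dollar values rounded).
    legacy = {"tco": 4_680_000, "cores": 256, "space": 1_904, "maintenance": 26_000}
    new    = {"tco": 1_780_000, "cores":  96, "space":   476, "maintenance":  6_500}

    for metric in legacy:
        reduction = 1 - new[metric] / legacy[metric]
        print(f"{metric:12s} reduction: {reduction:.0%}")

    # tco ~62%, cores ~62%, space 75%, maintenance 75% -- matching the results above.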

Implicit in the results, though not actually calculated, is the way these reductions could also free up funding, floor space and other resources that organizations can then use for new AI systems.

So if your customers are grappling with finding new resources for AI, tell them about these test results. Upgrading to servers based on the latest processors could be the answer.

AI across AMD’s entire portfolio? Believe it!

A little over a year ago, AMD CTO Mark Papermaster said the company’s strategy was to offer AI everywhere. Now learn how AMD, with help from Supermicro, is bringing this strategy to life.


A year in the fast-moving world of artificial intelligence can seem like a lifetime.

Consider:

  • A year ago, ChatGPT had fewer than 200 million weekly active users. Now this Generative AI tool has 400 million weekly users, according to developer OpenAI.
  • A year ago, no one outside of China had heard of DeepSeek. Now its GenAI chatbot is disrupting the AI industry, challenging the way some mainstream tools function.
  • About a year ago, AMD CTO Mark Papermaster said his company’s new strategy called for AI across the entire product portfolio. Now AMD, with help from Supermicro, offers AI power for the data center, cloud and desktop. AMD also offers a robust open AI stack.

‘We’re Thrilled’

AMD’s Papermaster made his comments in Feb. 2024 during a fireside chat hosted by stock research firm Arete Research.

During the interview, CTO Papermaster acknowledged that most early customers for AMD’s AI hardware were big cloud hyperscalers, including AWS, Google Cloud and Microsoft Azure. But he also said new customers were coming, including both enterprises and individual endpoint users.

“We’re thrilled to bring AI across our entire portfolio,” Papermaster said.                                                                          

So how has AMD done? According to the company’s financial results for both the fourth quarter and the full year 2024, pretty good.

Aggressive Investments

During AMD’s recent report on its Q4:24 and full-year ’24 financial results, CFO Jean Hu mentioned that the company is “investing aggressively in AI.” She wasn’t kidding, as the following items show:

  • AMD is accelerating its AI software road map. The company released ROCm 6.3, which includes enhancements for faster AI inferencing on AMD Instinct GPUs. The company also shared an update on its plans for the ROCm software stack.
  • AMD announced a new GPU system in 2024, the AMD Instinct MI325X. Designed for GenAI performance, it’s built on the AMD CDNA3 architecture and offers up to 256GB of HBM3E memory and up to 6TB/sec. of bandwidth.
  • To provide a scalable AI infrastructure, AMD has expanded its partnerships. These partnerships involve companies that include Aleph, IBM, Fujitsu and Vultr. IBM, for one, plans to deploy AMD MI300X GPUs to power GenAI and HPC applications on its cloud offering.
  • AMD is offering AI power for PCs. The company added AI capabilities to its Ryzen line of processors. Dell, among other PC vendors, has agreed to use these AMD CPUs in its Dell Pro notebook and desktop systems.

Supermicro Servers

AMD partner Supermicro is on the AI case, too. The company now offers several AMD-powered servers designed specifically for HPC and AI workloads.

These include an 8U 8-GPU system with AMD Instinct MI300X GPUs. It’s designed to handle some of the largest AI and GenAI models.

There’s also a Supermicro liquid-cooled 2U 4-way server. This system is powered by the AMD Instinct MI300A, which combines CPUs and GPUs, and it’s designed to support workloads that converge HPC and AI.

Put it all together, and you can see how AMD is implementing AI across its entire portfolio.

Tech Explainer: What is edge computing — and why does it matter?

Edge computing, once exotic, is now a core aspect of modern IT infrastructures. 


Edge computing is a vital aspect of our modern IT infrastructure. Its use can reduce latency, minimize bandwidth usage, and shorten response times.

This distributed computing methodology enables organizations to process data closer to its source and make decisions faster. This is referred to as operating at the edge.

For contrast, you can compare this with operating at the core, which refers to data being sent to centralized data centers and cloud environments for processing.

The edge is also a big and fast-growing business. Last year, global spending on edge computing rose by 14%, totaling $228 billion, according to market watcher IDC.

Looking ahead, IDC predicts this spend will increase to $378 billion by 2028, for a five-year compound annual growth rate (CAGR) of nearly 18%. Driving this growth will be high demand for real-time analytics, automation and enhanced customer experiences.

How does edge computing work?

Fundamentally, edge computing operates pretty much the same way that other types of computing do. The big difference is the location of the computing infrastructure relative to devices that collect the data.

For instance, a telecommunications provider like Verizon operates at the edge to better serve its customers. Rather than sending customer data to a central location, a telco can process it closer to the source.

An edge node’s proximity to end users can dramatically reduce the time it takes to transfer information to and from each user. This time is referred to as latency. And moving computing to the edge can reduce it. Edge computing can also lower data-error rates and demand for costly data-center space.

For a telco application of edge computing, the flow of data would look something like this:

1.   Users working with their smartphones, PCs and other devices create and request data. Because this happens in their homes, offices or anywhere else they happen to be, the data is said to have been created at the edge.

2.   Next, this customer data is processed by what are known as edge nodes. These are edge computing infrastructure devices placed near primary data sources.

3.   Next, the edge nodes filter the user data with algorithms and AI-enabled processing. Then the nodes send to the cloud only the most relevant data. This helps reduce bandwidth usage and costs.
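
Step 3 is the heart of the edge model: filter locally, forward selectively. Here’s a minimal Python sketch of that idea, with hypothetical sensor data and only standard-library code:

    import json
    import statistics

    readings = [21.1, 21.3, 48.9, 21.2, 21.4, 53.0]   # hypothetical sensor temps

    def filter_at_edge(values, threshold=2.0):
        # Keep only readings far from the local median -- the 'relevant' data.
        median = statistics.median(values)
        return [v for v in values if abs(v - median) > threshold]

    anomalies = filter_at_edge(readings)
    payload = json.dumps({"anomalies": anomalies})

    # Only this small payload -- not the full stream -- would be sent on to the
    # cloud (e.g., via an HTTP POST), cutting bandwidth usage and costs.
    print(payload)   # {"anomalies": [48.9, 53.0]}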

Edge is Everywhere

Many verticals now rely on edge computing to increase efficiency and better serve their customers. These include energy providers, game developers and IoT appliance manufacturers.

One big vertical for the edge is retail, where major brands rely on edge computing to collect data from shoppers in real time. This helps retailers manage their stock, identify new sales opportunities, reduce shrinkage (that is, theft), and offer unique deals to their customers.

Other areas for the edge include “smart roads.” Here, roadside sensors are used to collect and process data locally to assess traffic conditions and maintenance. In addition, the reduced latency and hyper-locality provided by edge computing can speed communications, paring precious seconds when first responders are called to the scene of an accident.

Inner Workings

Like most modern computers, edge nodes rely on a laundry list of digital components. At the top of that list is a processor like the AMD EPYC Embedded 9004 and 8004 series.

AMD’s latest embedded processors are designed to balance performance and efficiency. The company’s ‘Zen 4’ and ‘Zen 4c’ 5-nanometer core architecture is optimized for always-on embedded systems. And with up to 96 cores operating as fast as 4.15 GHz, these processors can handle the AI-heavy workloads increasingly common to edge computing.

Zoom out from the smallest component to the largest, and you’re likely to find a density- and power-optimized edge platform like the Supermicro H13 WIO.

Systems like these are designed specifically for edge operations. Powered by either AC or DC for maximum deployment flexibility, the H13 WIO can operate at a scant 80 watts TDP. Yet to handle the most resource-intensive applications, it can scale up to 64 cores.

Getting Edgier

The near future of edge computing promises to be fascinating. As more users sign up for new services, enterprises will have to expand their edge networks to keep up with demand.

What tools will they use? To find out, see the latest edge tech from AMD and Supermicro at this year’s MWC, which kicks off in Barcelona, Spain, on March 3.

AMD Instinct MI300A blends GPU, CPU for super-speedy AI/HPC

CPU or GPU for AI and HPC? You can get the best of both with the AMD Instinct MI300A.


The AMD Instinct MI300A is the world’s first data center accelerated processing unit (APU) for high-performance computing and AI, integrating both CPU and GPU cores on a single package.

That makes the AMD Instinct MI300A highly efficient at running both HPC and AI workloads. It also makes the MI300A powerful enough to accelerate training the latest AI models.

Introduced about a year ago, the AMD Instinct MI300A accelerator is shipping soon. So are two Supermicro servers—one a liquid-cooled 2U system, the other an air-cooled 4U—each powered by four MI300A units.

Under the Hood

The technology of the AMD Instinct MI300A is impressive. Each MI300A integrates 24 AMD ‘Zen 4’ x86 CPU cores with 228 AMD CDNA 3 high-throughput GPU compute units.

You also get 128GB of unified HBM3 memory. This presents a single shared address space to CPU and GPU, all of which are interconnected into the coherent 4th Gen AMD Infinity architecture.

Also, the AMD Instinct MI300A is designed to be used in a multi-unit configuration. This means you can connect up to four of them in a single server.

To make this work, each APU has 1 TB/sec. of bidirectional connectivity through eight 128 GB/sec. AMD Infinity Fabric interfaces. Four of the interfaces are dedicated Infinity Fabric links. The other four can be flexibly assigned to deliver either Infinity Fabric or PCIe Gen 5 connectivity.

In a typical four-APU configuration, six interfaces are dedicated to inter-GPU Infinity Fabric connectivity. That supplies a total of 384 GB/sec. of peer-to-peer connectivity per APU. One interface is assigned to support x16 PCIe Gen 5 connectivity to external I/O devices. In addition, each MI300A includes two x4 interfaces to storage, such as M.2 boot drives, plus two USB Gen 2 or 3 interfaces.

Converged Computing

There’s more. The AMD Instinct MI300A was designed to handle today’s convergence of HPC and AI applications at scale.

To meet the increasing demands of AI applications, the APU is optimized for widely used data types. These include FP64, FP32, FP16, BF16, TF32, FP8 and INT8.

The MI300A also supports native hardware sparsity for efficiently gathering data from sparse matrices. This saves power and compute cycles, and it also lowers memory use.

Another element of the design aims at high efficiency by eliminating time-consuming data copy operations. The MI300A can easily shift tasks between the CPU and GPU. And it’s all supported by AMD’s ROCm 6 open software platform, built for HPC, AI and machine learning workloads.

Finally, virtualized environments are supported on the MI300A through SR-IOV to share resources with up to three partitions per APU. SR-IOV—short for single-root, input/output virtualization—is an extension of the PCIe spec. It allows a device to separate access to its resources among various PCIe functions. The goal: improved manageability and performance.
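
On Linux, SR-IOV is exposed through sysfs, so the idea fits in a few lines. Here’s a minimal Python sketch (the PCIe address is hypothetical, and writing sriov_numvfs requires root):

    from pathlib import Path

    dev = Path("/sys/bus/pci/devices/0000:03:00.0")   # hypothetical SR-IOV device

    # How many virtual functions the device can expose:
    total_vfs = int((dev / "sriov_totalvfs").read_text())
    print(f"device supports up to {total_vfs} virtual functions")

    # Carve the device into three VFs -- e.g., the MI300A's three partitions
    # per APU mentioned above. Writing '0' tears them back down.
    (dev / "sriov_numvfs").write_text("3")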

Fun fact: The AMD Instinct MI300A is a key design component of the El Capitan supercomputer recently dedicated by Lawrence Livermore National Laboratory. This system can process over two quintillion (10^18) calculations per second.

Supermicro Servers

As mentioned above, Supermicro now offers two server systems based on the AMD Instinct MI300A APU. They’re 2U and 4U systems.

These servers both take advantage of AMD’s integration features by combining four MI300A units in a single system. That gives you a total of 912 GPU compute units, 96 ‘Zen 4’ CPU cores, and 512GB of HBM3 memory.

Supermicro says these systems can push HPC processing to Exascale levels, meaning they’re very, very fast. “Flop” is short for floating point operations per second, and “exa” indicates a 1 with 18 zeros after it. That’s fast.

Supermicro’s 2U server (model number AS -2145GH-TNMR-LCC) is liquid-cooled and aimed at HPC workloads. Supermicro says its direct-to-chip liquid-cooling technology delivers an attractive TCO, with data center energy cost savings of over 51%. The company also cites a 70% reduction in fan power usage, compared with air-cooled solutions.

If you’re looking for big HPC horsepower, Supermicro’s got your back with this 2U system. The company’s rack-scale integration is optimized with dual AIOM (advanced I/O modules) and 400G networking. This means you can create a high-density supercomputing cluster with as many as 21 of Supermicro’s 2U systems in a 48U rack. With each system combining four MI300A units, that would give you a total of 84 APUs.

The other Supermicro server (model number AS -4145GH-TNMR) is an air-cooled 4U system, also equipped with four AMD Instinct MI300A accelerators, and it’s intended for converged HPC-AI workloads. The system’s mechanical airflow design keeps thermal throttling at bay; if that’s not enough, the system also has 10 heavy-duty 80mm fans.

Tech Explainer: CPUs and GPUs for AI training and inferencing

Which is best for AI – a CPU or a GPU? Like much in life, it depends.


While central processing units and graphics processing units serve different roles in AI training and inferencing, both roles are vital to AI workloads.

CPUs and GPUs were both invented long before the AI era. But each has found new purpose as the robots conduct more of our day-to-day business.

Each has its tradeoffs. Most CPUs are less expensive than GPUs, and they typically require less electric power. But that doesn’t mean CPUs are always the best choice for AI workloads. Like lots of things in life, it depends.

Two Steps to AI

A typical AI application involves a two-step process. First training. Then inferencing.

Before an AI model can be deployed, it must be trained for its task. That task could be suggesting which movie to watch next on Netflix or detecting counterfeit currency in a retail environment.

Once the AI model has been deployed, it can begin the inferencing process. In this stage, the AI application interfaces with users, devices and other models. Then it autonomously makes predictions and decisions based on new input.

For example, Netflix’s recommendation engine is powered by an AI model. The AI was first trained to consider your watching history and stated preferences, as well as to review newly available content. Then the AI employs inferencing—what we might call reasoning—to suggest a new movie or TV show you’re likely to enjoy.
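
In code, the two steps boil down to fit-then-predict. Here’s a toy Python sketch using scikit-learn and made-up viewing data (nothing like Netflix’s actual system):

    from sklearn.linear_model import LogisticRegression

    # Step 1 -- training: each row flags the genres a user watches;
    # the label says whether that user enjoyed a given title.
    watch_history = [[1, 0, 1], [0, 1, 0], [1, 1, 1], [0, 0, 1]]
    enjoyed = [1, 0, 1, 0]
    model = LogisticRegression().fit(watch_history, enjoyed)

    # Step 2 -- inferencing: make a prediction for a brand-new user.
    print(model.predict([[1, 0, 0]]))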

AI Training

GPU architectures like that of the AMD Instinct MI325X accelerator offer highly parallel processing. In other words, a GPU can perform many calculations simultaneously.

The AMD Instinct MI325X has more than 300 GPU compute units. They make the accelerator faster and more adept at both processing large datasets and handling the repetitious numerical operations common to the training process.

These capabilities also mean GPUs can accelerate the training process. That’s especially true for large models, such as those that underpin the networks used for deep learning.

CPUs, by contrast, excel at general-purpose tasks. Compared with a GPU, a CPU will be better at completing sequential tasks that require logic or decision-making. For this reason, a CPU’s role in AI training is mostly limited to data preprocessing and coordinating GPU tasks.
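
A small Python/NumPy illustration of the difference: a matrix multiply is millions of independent dot products that parallel hardware can chew through at once, while computing even one of them as a sequential loop is slow. (Timings are illustrative and machine-dependent.)

    import time
    import numpy as np

    a = np.random.rand(2048, 2048).astype(np.float32)
    b = np.random.rand(2048, 2048).astype(np.float32)

    t0 = time.perf_counter()
    c = a @ b   # one call, ~8.6 billion multiply-adds, executed in parallel
    print(f"vectorized matmul: {time.perf_counter() - t0:.3f}s")

    t0 = time.perf_counter()
    one = sum(a[0, k] * b[k, 0] for k in range(2048))   # ONE of 2048 x 2048 outputs
    print(f"single dot product, pure Python: {time.perf_counter() - t0:.3f}s")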

AI Inferencing

However, when it comes to AI inferencing, CPUs play a much more significant role. Often, inferencing can be a relatively lightweight workload, because it’s not highly parallel. A good example is the AI capability present in modern edge devices such as the latest iOS and Android smartphones.

As mentioned above, the average CPU also consumes less power than a GPU. That makes a CPU a better choice in situations where heat and battery life are important.

However, not all inferencing applications are lightweight, and such workloads may not be appropriate for CPUs. One example is autonomous vehicles, which require massive parallel processing in real time to ensure safety and optimum efficiency.

In these cases, GPUs will play a bigger role in the AI inferencing process, despite their higher cost and power requirements.

Powerful GPUs are already used for AI inferencing at the core. Examples include large-scale cloud services such as AWS, Google Cloud and Microsoft Azure.

Enterprise Grade

Enterprises often conduct AI training and inferencing on a scale so massive, it eclipses anything found in edge environments. In these cases, IT engineers must rely on hugely powerful systems.

One example is the Supermicro AS -8125GS-TNMR2 server. This 8U behemoth—weighing in at 225 pounds—can host up to eight AMD Instinct MI300X accelerators. And it’s equipped with dual AMD EPYC processors, the customer’s choice of either the 9004 or 9005 series.

To handle some of the world’s most demanding AI workloads, Supermicro’s server is packed with an astonishing amount of tech. Alongside its eight GPUs and pair of EPYC processors, the server offers up to 6TB of ECC DDR5 memory and 18 hot-swap 2.5-inch NVMe and SATA drives.

That makes the Supermicro system one of the most capable and powerful servers now available. And as AI evolves, tech leaders including AMD and Supermicro will undoubtedly produce more powerful CPUs, GPUs and servers to meet the growing demand.

What will the next generation of AI training and inferencing technology look like? To find out, you won’t have to wait long.

2024: A look back at the year’s best

Let's look back at 2024, a year when AI was everywhere, AMD introduced its 5th Gen EPYC processors, and Supermicro led with liquid cooling.


You couldn't call 2024 boring.

If anything, the year was almost too exciting, too packed with important events, and moving much too fast.

Looking back, a handful of 2024’s technology events stand out. Here are a few of our favorite things.

AI Everywhere

In March AMD’s chief technology officer, Mark Papermaster, made some startling predictions that turned out to be absolutely true.

Speaking at an investors’ event sponsored by Arete Research, Papermaster said, “We’re thrilled to bring AI across our entire product portfolio.” AMD has indeed done that, offering AI capabilities from PCs to servers to high-performance GPU accelerators.

Papermaster also said the buildout of AI is an event as big as the launch of the internet. That certainly sounds right.

He also said AMD believes the total addressable market for AI through 2027 to be $400 billion. If anything, that was too conservative. More recently, consultants Bain & Co. predicted that figure will reach $780 billion to $990 billion.

Back in March, Papermaster said AMD had increased its projection for full-year AI sales from $2 billion to $3.5 billion. That’s probably too low, too.

AMD recently reported revenue of $3.5 billion for its data-center group for just the third quarter alone. The company attributed at least some of the group’s 122% year-on-year increase to the strong ramp of AMD Instinct GPU shipments.

5th Gen AMD EPYC Processors

October saw AMD introduce the fifth generation of its powerful line of EPYC server processors.

The 5th Gen AMD EPYC processors use the company’s new ‘Zen 5’ core architecture. It includes over 25 SKUs offering anywhere from 8 to 192 cores. And the line includes a model—the AMD EPYC 9575F—designed specifically to work with GPU-powered AI solutions.

The market has taken notice. During the October event, AMD CEO Lisa Su told the audience that about one in three servers worldwide (34%) are now powered by AMD EPYC processors. And Supermicro launched its new H14 line of servers that will use the new EPYC processors.

Supermicro Liquid Cooling

As servers gain power to add AI and other compute-intensive capabilities, they also run hotter. For data-center operators, that presents multiple challenges. One big one is cost: air conditioning is expensive. What’s more, AC may be unable to cool the new generation of servers.

Supermicro has a solution: liquid cooling. For some time, the company has offered liquid cooling as a data-center option.

In November the company took a new step in this direction. It announced a server that comes with liquid cooling only.

The server in question is the Supermicro 2U 4-node FlexTwin, model number AS -2126FT-HE-LCC. It’s a high-performance, hot-swappable, high-density compute system designed for HPC workloads.

Each 2U system comprises 4 nodes, and each node is powered by dual AMD EPYC 9005 processors. (The previous-gen AMD EPYC 9004s are supported, too.)

To keep cool, the FlexTwin server uses a direct-to-chip (D2C) cold plate liquid cooling setup. Each system also runs 16 counter-rotating fans. Supermicro says this cooling arrangement can remove up to 90% of server-generated heat.

AMD Instinct MI325X Accelerator

A big piece of AMD’s product portfolio for AI is its Instinct line of accelerators. This year the company promised to maintain a yearly cadence of new Instinct models.

Sure enough, in October the company introduced the AMD Instinct MI325X Accelerator. It’s designed for Generative AI performance and working with large language models (LLMs). The system offers 256GB of HBM3E memory and up to 6TB/sec. of memory bandwidth.

Looking ahead, AMD expects to formally introduce the line’s next member, the AMD Instinct MI350, in the second half of next year. AMD has said the new accelerator will be powered by a new AMD CDNA 4 architecture, and will improve AI inferencing performance by up to 35x compared with the older Instinct MI300.

Supermicro Edge Server

A lot of computing now happens at the edge, far beyond either the office or corporate data center.

Even more edge computing is on tap. Market watcher IDC predicts double-digit growth in edge-computing spending through 2028, when it believes worldwide sales will hit $378 billion.

Supermicro is on it. At the 2024 MWC, held in February in Barcelona, the company introduced an edge server designed for the kind of edge data centers run by telcos.

Known officially as the Supermicro A+ Server AS -1115SV-WTNRT, it’s a 1U short-depth server powered by a single AMD EPYC 8004 processor with up to 64 cores. That’s edgy.

Happy Holidays from all of us at Performance Intensive Computing. We look forward to serving you in 2025.

Faster is better. Supermicro with 5th Gen AMD is faster

Supermicro servers powered by the latest AMD processors are up to 9 times faster than a previous generation, according to a recent benchmark.


When it comes to servers, faster is just about always better.

With faster processors, workloads get completed in less time. End users get their questions answered sooner. Demanding high-performance computing (HPC) and AI applications run more smoothly. And multiple servers get all their jobs done more rapidly.

And if you’ve installed, set up or managed one of these faster systems, you’ll look pretty smart.

That’s why the latest benchmark results from Supermicro are so impressive, and also so important.

The tests show that Supermicro servers powered by the latest AMD processors are up to 9 times faster than a previous generation. These systems can make your customers happy—and make you look good.

SPEC Check

The benchmarks in question are those of the Standard Performance Evaluation Corp., better known as SPEC. It’s a nonprofit consortium that sets benchmarks for running complete applications.

Supermicro ran its servers on SPEC’s CPU 2017 benchmark, a suite of 43 benchmarks that measure and compare compute-intensive performance. All of them stress a system’s CPU, memory subsystem and compiler—emphasizing all three of these components working together, not just the processor.

To provide a comparative measure of integer and floating-point compute-intensive performance, the benchmark uses two main metrics. The first is speed: how much time a server needs to complete a single task. The second is throughput: how much work the server completes while running multiple concurrent copies of a benchmark.

The results are given as comparative scores. In general, higher is better.
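
SPEC-style composite scores are geometric means of per-benchmark ratios (reference runtime divided by measured runtime), so no single benchmark can dominate the result. A quick Python sketch with hypothetical ratios:

    from math import prod

    ratios = [12.4, 9.8, 15.1, 11.0]   # hypothetical per-benchmark ratios

    # Geometric mean: the nth root of the product of n ratios. Higher is better.
    score = prod(ratios) ** (1 / len(ratios))
    print(f"composite score: {score:.1f}")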

Super Server

The server tested was the Supermicro H14 Hyper server, model number AS -2126HS-TN. It’s powered by dual AMD EPYC 9965 processors and loaded with 1.5TB of memory.

This server has been designed for applications that include HPC, cloud computing, AI inferencing and machine learning.

In the floating-point measure, the new server, when compared with a Supermicro server powered by an earlier-gen AMD EPYC 7601, was 8x faster.

In the integer rate measure, compared with a circa-2018 Supermicro server, it’s almost 9x faster.

Impressive results. And remember, when it comes to servers, faster is better.
