Performance Intensive Computing

Capture the full potential of IT

Healthcare in the spotlight: Big challenges, big tech


To meet some of their industry’s toughest challenges, healthcare providers are turning to advanced technology.

Learn More about this topic
  • Applications:
  • Featured Technologies:

Healthcare providers face some tough challenges. Advanced technology can help.

As a recent report from consultants McKinsey & Co. points out, healthcare providers are dealing with some big challenges. These include rising costs, workforce shortages, an aging population, and increased competition from nontraditional parties.

Another challenge: Consumers expect their healthcare providers to offer new capabilities, such as digital scheduling and telemedicine, as well as better experiences.

One way healthcare providers hope to meet both sets of challenges is with advanced technology. Three-quarters of U.S. healthcare providers increased their IT spending in the last year, according to a survey conducted by consultants Bain & Co. The same survey found that 15% of healthcare providers already have an AI strategy in place, up from just 5% in 2023.

Generative AI is showing potential, too. Another survey, this one done by McKinsey, finds that over 70% of healthcare organizations are now either pursuing GenAI proofs-of-concept or are already implementing GenAI solutions.

Dynamic Duo

There’s a catch to all this: As healthcare providers adopt AI, they’re finding that the required datasets and advanced analytics don’t run well on their legacy IT systems.

To help, Supermicro and AMD are working together. They’re offering healthcare providers heavy-duty compute delivered at rack scale.

Supermicro servers powered by AMD Instinct MI300X GPUs are designed to accelerate AI and HPC workloads in healthcare. They offer the levels of performance, density and efficiency healthcare providers need to improve patient outcomes.

The AMD Instinct MI300X is built to deliver high performance for GenAI workloads and HPC applications. It packs no fewer than 304 high-throughput compute units. You also get AI-specific functions and 192GB of HBM3 memory, all based on AMD’s CDNA 3 architecture.

Healthcare providers can use Supermicro servers powered by AMD GPUs for next-generation research and treatments. These could include advanced drug discovery, enhanced diagnostics and imaging, risk assessments and personalized care, and increased patient support with self-service tools and real-time edge analytics.

Supermicro points out that its servers powered by AMD Instinct GPUs deliver massive compute with rack-scale flexibility, as well as high levels of power efficiency.

Performance:

  • The powerful combination of CPUs, GPUs and HBM3 memory accelerates HPC and AI workloads.
  • HBM3 memory offers capacities of up to 192GB dedicated to the GPUs.
  • Complete solutions ship pre-validated, ready for instant deployment.
  • Double-precision (FP64) performance reaches up to 163.4 TFLOPS.

Flexibility:

  • Proven AI building-block architecture streamlines deployment at scale for the largest AI models.
  • An open AI ecosystem with AMD ROCm open software.
  • A unified computing platform with AMD Instinct MI300X plus AMD Infinity fabric and infrastructure.
  • A modular design and build helps users move faster to the correct configuration.

Efficiency:

  • Dual-zone cooling innovation, used by some of the most efficient systems on the Green500 supercomputer list.
  • Improved density with 3rd Gen AMD CDNA, delivering 19,456 stream cores.
  • Chip-level power intelligence enables the AMD Instinct MI300X to deliver strong performance per watt.
  • Purpose-built silicon design of the 3rd Gen AMD CDNA combines 5nm and 6nm fabrication processes.

Are your healthcare clients looking to unleash the potential of their data? Then tell them about Supermicro systems powered by the AMD MI300X GPUs.

Do More:

 

Featured videos


Events


Find AMD & Supermicro Elsewhere

Tech Explainer: What is Quantum Computing, and How Does It Work?


Quantum computing promises to solve problems faster by investigating many possible solutions at once. That’s far easier said than done.


Quantum computing has the potential to alter life as we know it. If, that is, we can figure out how to make the technology work on a massive scale.

This emerging technology is full of promise. At least in theory, it’s powerful enough to help us cure our most insidious diseases, usher in an era of artificial general intelligence (AGI), and enable us to explore neighboring galaxies.

Way, Way Faster

Quantum computing offers a way to solve these kinds of highly complex problems by investigating many possible solutions at once.

To understand why this is so important, imagine a robot that’s attempting to find its way through an enormous maze. First, the robot acts as a human might, investigating each possible route, one at a time. Because the maze is so big and has so many possible pathways, this method could take the robot days, weeks or even years to complete.

Now imagine that instead, the robot can instantaneously clone itself, sending each new instance to investigate a potential route. This method would produce results many orders of magnitude faster than the one-at-a-time method.

And that is the promise offered by quantum computing.

Quantum Mechanics

To do all this heavy lifting, quantum computers behave in ways that may seem mysterious.

As you probably know, today’s standard computers operate using bits—binary switches that at any given moment have a value of either 0 or 1. But quantum computers run differently. They employ qubits (short for quantum bits), each of which can represent 0, 1—or both at the same time.

The ability of a particle to be in two states at once? Yes. It’s a fundamental aspect of quantum mechanics known as superposition.

Leveraging this ability at the bit level enables quantum computers to significantly reduce the time they need to solve problems. Particularly valuable examples of this include defeating encryption, decoding human physiology, even theorizing the mechanics of light-speed travel.

In other words, Star Trek stuff, pure and simple.
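For readers who think in code, superposition can be illustrated with a toy simulation. This is a minimal sketch, not tied to any hardware discussed here: it models a single qubit as two amplitudes and applies a Hadamard gate, the standard operation for putting a definite 0 into an equal superposition.

```python
import math

# Toy model: a qubit is a pair of amplitudes [a, b], where |a|^2 is the
# probability of measuring 0 and |b|^2 the probability of measuring 1.
def hadamard(state):
    """Apply a Hadamard gate, which puts a basis state into superposition."""
    a, b = state
    s = 1 / math.sqrt(2)
    return [s * (a + b), s * (a - b)]

def probabilities(state):
    """Convert amplitudes into measurement probabilities."""
    return [abs(amp) ** 2 for amp in state]

qubit = [1, 0]               # a definite 0, like a classical bit
qubit = hadamard(qubit)      # now 0 and 1 at the same time
print(probabilities(qubit))  # an even 50/50 split: roughly [0.5, 0.5]
```

Measuring the qubit would collapse it to 0 or 1, each with probability 0.5; real quantum hardware manipulates these amplitudes physically rather than simulating them.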

Not So Fast?

So why can’t you buy a quantum computer at your local Best Buy? It turns out that many factors have kept the promise of quantum computing just out of reach.

One of the most prevalent is errors at the qubit level. Qubits have a nasty habit of exchanging information with their environment.

By analogy, imagine spinning a basketball on your fingertip, Harlem Globetrotter style. The fast-spinning ball exists in a delicate state. Even tiny disturbances—such as air currents or ambient vibrations—could make the ball wobble and eventually fall.

A similar situation exists for quantum computers. Small environmental disturbances can corrupt qubits, and the errors compound as systems scale: the more qubits you use, the more errors you get. Cross a certain threshold, and the errors render a quantum computer no more powerful than today’s standard computers.

Engineers are making progress in their efforts to solve this problem. For example, a French startup with the unlikely name of Alice & Bob was recently funded to the tune of €100 million to develop a new approach to quantum error correction.

Similarly, Google recently announced Willow, a new quantum computing chip the company says can reduce errors exponentially as it scales up. If a recent blog post by Hartmut Neven, lead of Google Quantum AI, is right, then it would seem Google has solved a 30-year-old challenge in quantum error correction.

The Key: R&D

AMD is also attempting to knock down some common quantum computing roadblocks.

The company filed a patent in 2021 titled “Look Ahead Teleportation for Reliable Computation in Multi-SIMD Quantum Processor.” AMD says this breakthrough improves quantum computing system reliability and reduces the number of required qubits. These efforts could revolutionize quantum computing scalability and error correction.

AMD has also created the Zynq UltraScale+ RFSoC, the industry’s only single-chip adaptable radio platform. The Zynq creates high-accuracy, high-speed pulse sequences to control qubits.

Companies like AMD partner Riverlane are using this cutting-edge technology to better control qubits and reduce errors.

When Will We Be There?

Not even a quantum computer can predict the future. But some experts say we could still be 10 to 20 years away from deploying quantum computing on a scale comparable to the ubiquity of the computers we use today.

In the near term, the most powerful tech companies—including AMD and Supermicro—will be working to harness the massive power of qubits.

To achieve their loftiest goals, however, they’ll need to revolutionize scalability and error correction. Only then can we deploy not just hundreds of qubits, but millions.

Once that code is cracked, there’s no telling where we’ll go from there.

 


Research Roundup: IT & cloud infrastructure spending rise, tech jobs stay strong, 2 security threats worsen


Catch up on the latest IT industry trends and statistics from leading market watchers and analysts.


Three of every four CFOs plan to increase their organizations’ IT spending this year. Spending on cloud infrastructure services rose 20% last year. Unemployment among IT workers is lower than the national average. And two types of cyber attacks are bigger threats than ever.

That’s some of the latest from leading IT industry watchers and researchers. And here’s your Performance Intensive Computing roundup.

CFOs: More IT Spending

If it’s true that a rising tide lifts all boats, you might prepare to set sail now. A new survey finds that a majority of corporate CFOs plan to boost their technology budgets this year.

The survey, conducted this past fall by research group Gartner, reached just over 300 CFOs and other senior finance leaders. Gartner published its findings this month, and they include:

  • Over three-quarters of CFOs surveyed (77%) plan to boost spending in the technology category this year.
  • Nearly half the CFOs (47%) plan to increase technology spending by 10% or more this year compared with last year.
  • Nearly a third (30%) plan to increase technology spending by 4% to 9% year-on-year.
  • And fewer than one in 10 CFOs (9%) plan to decrease technology spending this year.

Cloud Infrastructure: Spending Rises

One of those lifted ships: cloud infrastructure services.

In the fourth quarter of 2024, global spending on these services rose 20% year-on-year, according to new metrics from market watcher Canalys.

Global spending for the full year also rose 20%, Canalys said. Spending on cloud infrastructure services hit $321.3 billion last year, up from $267.7 billion in 2023.

The key driver of the growth? That would be AI. The technology “significantly accelerated” cloud adoption, Canalys says.

Looking ahead, Canalys expects global spending on cloud infrastructure services this year to rise by another 19%.

Tech Employment: Mostly Strong

Also on an upswing: technology employment.

New figures from the U.S. Bureau of Labor Statistics show that across all sectors of the U.S. economy, tech occupations grew by about 228,000 jobs.

Within the tech industry alone, the picture was more mixed. More than 13,700 jobs were filled in IT services and software development, but in telecom, 7,900 workers lost their jobs.

Tech is still a good industry to work in. The industry’s unemployment rate in January was 2.9%, compared with a national rate of 4%.

“Tech hiring activity was solid across the key categories,” says Tim Herbert, chief research officer at CompTIA, an industry trade group. “Employers continue to balance the need for foundational tech talent and skills with the push into next-gen fields.”

Security: Phishing, DDoS Both Worsen

Two kinds of cyber threats are getting worse:

  • The number of phishing attempts blocked worldwide last year by Kaspersky rose 26% over the previous year.
  • Distributed Denial of Service (DDoS) attacks increased by 82% last year, according to a new report from Zayo Group.

Kaspersky, a cybersecurity and digital privacy company, says it blocked more than 893 million phishing attempts last year, up from 710 million in 2023.

In many instances, the attackers mimicked the websites and social media feeds of well-known brands, including Airbnb, Booking and TikTok. Others falsely presented product giveaways from celebrities. In one, actress Jennifer Aniston was falsely shown promoting a giveaway of 10,000 laptop computers — a giveaway that did not exist.

Separately, Zayo Group, a provider of communications infrastructure, has published its biannual DDoS insights report, and the findings aren’t pretty. The attack volume rose from 90,000 incidents in 2023 to 165,000 incidents last year.

In a DDoS attack, the bad guys make a machine or network resource unavailable by disrupting the services of a host connected to a network. Often they do this by flooding the target system with requests, overloading it and preventing legitimate requests from being fulfilled.

In one worrisome change, the bad guys are increasing the scale of their DDoS attacks by using large botnets, compromised IoT devices and AI.

“The sophistication of DDoS attacks continues to grow,” says Max Clauson, a senior VP at Zayo. “Cybercriminals are finding ways to exploit cloud services, higher-bandwidth availability, and new vulnerabilities in software and network protocols.”

Also, Zayo finds the targets of DDoS attacks are shifting:

  • Telecom is still the most targeted sector, representing 42% of all observed incidents. But that’s down from 48% in 2023.
  • Attacks on the finance industry grew. In 2023, finance represented just 3.5% of all observed incidents. In 2024, that share doubled to 7%.
  • In healthcare, the total number of DDoS attacks more than tripled from 2023 to 2024, rising by a whopping 223%.

 


Tech Explainer: What is edge computing — and why does it matter?


Edge computing, once exotic, is now a core aspect of modern IT infrastructures. 


Edge computing is a vital aspect of our modern IT infrastructure. Its use can reduce latency, minimize bandwidth usage, and shorten response times.

This distributed computing methodology enables organizations to process data closer to its source and make decisions faster. This is referred to as operating at the edge.

By contrast, operating at the core refers to data being sent to centralized data centers and cloud environments for processing.

The edge is also a big and fast-growing business. Last year, global spending on edge computing rose by 14%, totaling $228 billion, according to market watcher IDC.

Looking ahead, IDC predicts this spend will increase to $378 billion by 2028, for a five-year compound annual growth rate (CAGR) of nearly 18%. Driving this growth will be high demand for real-time analytics, automation and enhanced customer experiences.

How does edge computing work?

Fundamentally, edge computing operates pretty much the same way that other types of computing do. The big difference is the location of the computing infrastructure relative to devices that collect the data.

For instance, a telecommunications provider like Verizon operates at the edge to better serve its customers. Rather than sending customer data to a central location, a telco can process it closer to the source.

An edge node’s proximity to end users can dramatically reduce the time it takes to transfer information to and from each user. This time is referred to as latency. And moving computing to the edge can reduce it. Edge computing can also lower data-error rates and demand for costly data-center space.

For a telco application of edge computing, the flow of data would look something like this:

1.   Users working with their smartphones, PCs and other devices create and request data. Because this happens in their homes, offices or anywhere else they happen to be, the data is said to have been created at the edge.

2.   Next, this customer data is processed by what are known as edge nodes. These are edge computing infrastructure devices placed near primary data sources.

3.   Next, the edge nodes filter the user data with algorithms and AI-enabled processing. Then the nodes send to the cloud only the most relevant data. This helps reduce bandwidth usage and costs.
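The three steps above can be sketched in a few lines. This is a hypothetical illustration (the function names and threshold are invented, not any vendor’s API) of an edge node filtering raw readings locally and forwarding only a small summary to the cloud:

```python
# Hypothetical sketch of an edge node: process data near its source,
# then send upstream only the relevant, condensed result.
def filter_at_edge(readings, threshold):
    """Keep only readings that exceed the alert threshold (local processing)."""
    return [r for r in readings if r > threshold]

def summarize(readings):
    """Condense raw data into the small payload actually sent to the cloud."""
    return {
        "count": len(readings),
        "max": max(readings),
        "avg": sum(readings) / len(readings),
    }

raw = [12, 14, 95, 13, 88, 11]    # data created at the edge (step 1)
alerts = filter_at_edge(raw, 80)  # processed by the edge node (step 2)
payload = summarize(alerts)       # only this summary goes to the cloud (step 3)
print(payload)  # {'count': 2, 'max': 95, 'avg': 91.5}
```

Only the three-field summary crosses the network; the six raw readings stay at the edge, which is where the bandwidth and latency savings come from.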

Edge is Everywhere

Many verticals now rely on edge computing to increase efficiency and better serve their customers. These include energy providers, game developers and IoT appliance manufacturers.

One big vertical for the edge is retail, where major brands rely on edge computing to collect data from shoppers in real time. This helps retailers manage their stock, identify new sales opportunities, reduce shrinkage (that is, theft), and offer unique deals to their customers.

Other areas for the edge include “smart roads.” Here, roadside sensors are used to collect and process data locally to assess traffic conditions and maintenance. In addition, the reduced latency and hyper-locality provided by edge computing can speed communications, paring precious seconds when first responders are called to the scene of an accident.

Inner Workings

Like most modern computers, edge nodes rely on a laundry list of digital components. At the top of that list is a processor like the AMD EPYC Embedded 9004 and 8004 series.

AMD’s latest embedded processors are designed to balance performance and efficiency. The company’s ‘Zen 4’ and ‘Zen 4c’ 5-nanometer core architecture is optimized for always-on embedded systems. And with up to 96 cores operating as fast as 4.15 GHz, these processors can handle the AI-heavy workloads increasingly common to edge computing.

Zoom out from the smallest component to the largest, and you’re likely to find a density- and power-optimized edge platform like the Supermicro H13 WIO.

Systems like these are designed specifically for edge operations. Powered by either AC or DC current for maximum flexibility, the H13 WIO can operate at a scant 80 watts TDP. Yet to handle the most resource-intensive applications, it can scale up to 64 cores.

Getting Edgier

The near future of edge computing promises to be fascinating. As more users sign up for new services, enterprises will have to expand their edge networks to keep up with demand.

What tools will they use? To find out, see the latest edge tech from AMD and Supermicro at this year’s MWC, which kicks off in Barcelona, Spain, on March 3.



AMD Instinct MI300A blends GPU, CPU for super-speedy AI/HPC


CPU or GPU for AI and HPC? You can get the best of both with the AMD Instinct MI300A.


The AMD Instinct MI300A is the world’s first data center accelerated processing unit (APU) for high-performance computing and AI, integrating both CPU and GPU cores on a single package.

That makes the AMD Instinct MI300A highly efficient at running both HPC and AI workloads. It also makes the MI300A powerful enough to accelerate training the latest AI models.

Introduced about a year ago, the AMD Instinct MI300A accelerator is shipping soon. So are two Supermicro servers—one a liquid-cooled 2U system, the other an air-cooled 4U—each powered by four MI300A units.

Under the Hood

The technology of the AMD Instinct MI300A is impressive. Each MI300A integrates 24 AMD ‘Zen 4’ x86 CPU cores with 228 AMD CDNA 3 high-throughput GPU compute units.

You also get 128GB of unified HBM3 memory, which presents a single shared address space to the CPU and GPU cores. All of it is interconnected via the coherent 4th Gen AMD Infinity architecture.

Also, the AMD Instinct MI300A is designed to be used in a multi-unit configuration. This means you can connect up to four of them in a single server.

To make this work, each APU has 1 TB/sec. of bidirectional connectivity through eight 128 GB/sec. AMD Infinity Fabric interfaces. Four of the interfaces are dedicated Infinity Fabric links. The other four can be flexibly assigned to deliver either Infinity Fabric or PCIe Gen 5 connectivity.

In a typical four-APU configuration, six interfaces are dedicated to inter-GPU Infinity Fabric connectivity. That supplies a total of 384 GB/sec. of peer-to-peer connectivity per APU. One interface is assigned to support x16 PCIe Gen 5 connectivity to external I/O devices. In addition, each MI300A includes two x4 interfaces to storage, such as M.2 boot drives, plus two USB Gen 2 or 3 interfaces.
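Those bandwidth figures can be sanity-checked with simple arithmetic: eight Infinity Fabric interfaces at 128 GB/sec each account for the quoted 1 TB/sec aggregate.

```python
# Per-APU connectivity quoted above: eight interfaces at 128 GB/sec each.
LINK_BW_GBPS = 128
NUM_LINKS = 8

total = LINK_BW_GBPS * NUM_LINKS
print(total)  # 1024 GB/sec, i.e. roughly 1 TB/sec of bidirectional connectivity
```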

Converged Computing

There’s more. The AMD Instinct MI300A was designed to handle today’s convergence of HPC and AI applications at scale.

To meet the increasing demands of AI applications, the APU is optimized for widely used data types. These include FP64, FP32, FP16, BF16, TF32, FP8 and INT8.
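To see why these reduced-precision types matter, here is a small standard-library Python illustration (not an AMD API): re-encoding a 64-bit float as IEEE 754 half precision (FP16) cuts memory per value from 8 bytes to 2, at the cost of a small rounding error that many AI workloads tolerate.

```python
import struct

# Illustrative only: the memory/accuracy tradeoff behind formats like FP16.
value = 3.14159265358979          # stored natively as FP64 (8 bytes)
packed = struct.pack('e', value)  # 'e' = IEEE 754 half precision (FP16)
fp16_value = struct.unpack('e', packed)[0]

print(len(packed))                # 2 bytes instead of 8: a 4x memory saving
print(abs(value - fp16_value))    # small rounding error from the narrower format
```

The same value in FP16 occupies a quarter of the memory, which is why accelerators support these types natively for AI work.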

The MI300A also supports native hardware sparsity for efficiently gathering data from sparse matrices. This saves power and compute cycles, and it also lowers memory use.

Another element of the design aims at high efficiency by eliminating time-consuming data copy operations. The MI300A can easily offload tasks between the CPU and GPU. And it’s all supported by AMD’s ROCm 6 open software platform, built for HPC, AI and machine learning workloads.

Finally, virtualized environments are supported on the MI300A through SR-IOV to share resources with up to three partitions per APU. SR-IOV—short for single-root, input/output virtualization—is an extension of the PCIe spec. It allows a device to separate access to its resources among various PCIe functions. The goal: improved manageability and performance.

Fun fact: The AMD Instinct MI300A is a key design component of the El Capitan supercomputer recently dedicated at Lawrence Livermore National Laboratory. This system can process over two quintillion (2 x 10^18) calculations per second.

Supermicro Servers

As mentioned above, Supermicro now offers two server systems based on the AMD Instinct MI300A APU. They’re 2U and 4U systems.

These servers both take advantage of AMD’s integration features by combining four MI300A units in a single system. That gives you a total of 912 GPU compute units, 96 CPU cores, and 512GB of HBM3 memory.

Supermicro says these systems can push HPC processing to exascale levels, meaning they’re very, very fast. “Flops” is short for floating-point operations per second, and “exa” denotes a 1 followed by 18 zeros. That’s fast.

Supermicro’s 2U server (model number AS-2145GH-TNMR-LCC) is liquid-cooled and aimed at HPC workloads. Supermicro says its direct-to-chip liquid-cooling technology lowers total cost of ownership, delivering data center energy cost savings of over 51%. The company also cites a 70% reduction in fan power usage compared with air-cooled solutions.

If you’re looking for big HPC horsepower, Supermicro’s got your back with this 2U system. The company’s rack-scale integration is optimized with dual AIOM (advanced I/O modules) and 400G networking. This means you can create a high-density supercomputing cluster with as many as 21 of Supermicro’s 2U systems in a 48U rack. With each system combining four MI300A units, that would give you a total of 84 APUs.

The other Supermicro server (model number AS-4145GH-TNMR) is an air-cooled 4U system, also equipped with four AMD Instinct MI300A accelerators, and it’s intended for converged HPC-AI workloads. The system’s mechanical airflow design keeps thermal throttling at bay; if that’s not enough, the system also has 10 heavy-duty 80mm fans.



Research Roundup: AI edition


Catch up on the latest AI trends spotted by leading IT market watchers.


Spending on artificial intelligence infrastructure is exploding. So is spending on AI for supply chains. But disappointing results from early GenAI tests are causing some CIOs to worry about the ROI.

That’s some of the latest intelligence from leading IT market watchers and researchers. And here’s your research roundup.

AI Infrastructure: $100B and Beyond

Behind every AI implementation is the need for high-end infrastructure. And spending on this type of equipment is expected to grow rapidly.

Market watcher IDC predicts that global spending on AI infrastructure will exceed $100 billion by 2028. Last year this spending totaled roughly $70 billion.

The AI infrastructure market has enjoyed double-digit growth for the last four and a half years, driven primarily by investments in servers, IDC says. In the first half of 2024, servers accounted for nearly 90% of all AI infrastructure spending.

Covered in IDC’s definition of AI infrastructure are servers and storage used for AI platforms, AI and AI-enabled applications, and AI applications development & deployment software.

AI Servers: $200B

That could be only the tip of the iceberg. Gartner researchers now predict worldwide spending on AI-optimized servers will top $200 billion this year. They also say that’s more than double what’s expected to be spent on more traditional servers.

About 70% of that $200 billion will be spent not by end users, but instead by big IT services companies and hyperscalers, Gartner expects. By 2028, the hyperscalers—large cloud providers including AWS, Google Cloud and Microsoft Azure—will operate AI-optimized servers collectively worth about $1 trillion.

Worth noting: This AI spending is part of an even bigger trend. Gartner predicts overall IT spending will rise this year by nearly 10%, reaching a global total of $5.6 trillion.

AI for Supply Chain: Huge

The use of AI in supply chain management is growing at a super-fast compound annual growth rate (CAGR) of 30%. This spending will jump from $3.5 billion in 2023 to $22.7 billion by 2030, according to a new forecast from ResearchAndMarkets.

Supply chain health became a major concern during the pandemic. Now companies realize they need supply chains that are resilient, adaptable and efficient. And AI can help.

The fastest-growing supply chain sector for AI is expected to be forecasting. There, AI can be used to predict future demand for various products. These forecasts can then be used by manufacturers and their partners to optimize inventories and production plans.

GenAI: Where’s the Value?

This year, Generative AI will fail to create its expected value, predicts ABI Research.

Many GenAI proof-of-concept trials have been disappointing, with failure rates as high as 80% to 90%, ABI says. This is seriously cooling some red-hot expectations.

As a result, some enterprise CIOs will turn away from GenAI. Instead, ABI expects, they’ll adopt more traditional AI approaches that solve business problems and deliver a clearer ROI.

ABI’s jaundiced view of GenAI gets some support from Gartner. In its 2025 IT market forecast, Gartner says GenAI is sliding toward the “trough of disillusionment.”

That phrase comes from Gartner’s Hype Cycle, a model holding that most innovations progress through a pattern of over-enthusiasm and disillusionment, followed by eventual productivity.

While businesses may still be searching for GenAI’s ROI, a growing number of teens are certainly finding it. About one in four U.S. teens (26%) used ChatGPT for schoolwork last year, according to a new Pew Research Center survey. That’s double the percentage of teens who did so in 2023.

AI’s New Mandate: Trust

A much more positive view comes from Accenture’s 25th annual technology vision report. The consulting firm says AI is being adopted across enterprises faster than any prior technology.

What’s more, nearly 70% of executives polled by Accenture said they believe AI brings new urgency to re-invention and how tech systems and processes are designed, built and run.

An even bigger group, 80% of those polled, told Accenture that natural language processing (NLP) will increase collaboration between humans and robots.

One possible barrier to AI progress is the matter of trust. More than 75% of the executives polled by Accenture believe AI’s true benefits must be built on a foundation of trust.

Accenture CEO Julie Sweet agrees. “Unlocking the benefits of AI,” she says, “will only be possible if leaders seize the opportunity to inject and develop trust in its performance and outcomes.” 

 


Tech Explainer: CPUs and GPUs for AI training and inferencing


Which is best for AI – a CPU or a GPU? Like much in life, it depends.


While central processing units and graphics processing units serve different roles in AI training and inferencing, both roles are vital to AI workloads.

CPUs and GPUs were both invented long before the AI era. But each has found new purpose as the robots conduct more of our day-to-day business.

Each has its tradeoffs. Most CPUs are less expensive than GPUs, and they typically require less electric power. But that doesn’t mean CPUs are always the best choice for AI workloads. Like lots of things in life, it depends.

Two Steps to AI

A typical AI application involves a two-step process. First training. Then inferencing.

Before an AI model can be deployed, it must be trained for its intended task. That task could include suggesting which movie to watch next on Netflix or detecting fake currency in a retail environment.

Once the AI model has been deployed, it can begin the inferencing process. In this stage, the AI application interfaces with users, devices and other models. Then it autonomously makes predictions and decisions based on new input.

For example, Netflix’s recommendation engine is powered by an AI model. The AI was first trained to consider your watching history and stated preferences, as well as to review newly available content. Then the AI employs inferencing—what we might call reasoning—to suggest a new movie or TV show you’re likely to enjoy.

AI Training

GPU architectures like the one found in the AMD Instinct MI325X accelerator offer highly parallel processing. In other words, a GPU can perform many calculations simultaneously.

The AMD Instinct MI325X has more than 300 GPU compute units. They make the accelerator faster and more adept at both processing large datasets and handling the repetitious numerical operations common to the training process.

These capabilities also mean GPUs can accelerate the training process. That’s especially true for large models, such as those that underpin the networks used for deep learning.

CPUs, by contrast, excel at general-purpose tasks. Compared with a GPU, a CPU will be better at completing sequential tasks that require logic or decision-making. For this reason, a CPU’s role in AI training is mostly limited to data preprocessing and coordinating GPU tasks.
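The contrast can be made concrete with a small sketch (generic Python, not tied to any AMD hardware). The first function’s multiplications are independent of one another, so parallel hardware like a GPU could execute them all at once; the second function’s loop carries a dependency from step to step, the kind of sequential, decision-making logic a CPU handles well.

```python
# Conceptual sketch: parallel-friendly math vs. inherently sequential logic.

def elementwise_multiply(weights, inputs):
    """Each multiplication is independent of the others, so a GPU could
    run all of them at the same time across its compute units."""
    return [w * x for w, x in zip(weights, inputs)]

def running_decision(values, limit):
    """Each step depends on the previous total, so the work must run
    one step at a time -- a natural fit for a CPU."""
    total = 0
    for v in values:
        total += v
        if total > limit:
            return "stop"
    return "continue"

print(elementwise_multiply([2, 3, 4], [10, 10, 10]))  # [20, 30, 40]
print(running_decision([5, 6, 7], 10))                # stop
```

Training a neural network is dominated by the first kind of work, which is why GPUs shine there.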

AI Inferencing

However, when it comes to AI inferencing, CPUs play a much more significant role. Often, inferencing can be a relatively lightweight workload, because it’s not highly parallel. A good example is the AI capability present in modern edge devices such as the latest iOS and Android smartphones.

A typical CPU also consumes less power than a GPU. That makes a CPU the better choice in situations where heat and battery life are important.

However, not all inferencing applications are lightweight, and such workloads may not be appropriate for CPUs. One example is autonomous vehicles. They will require massive parallel processing in real-time to ensure safety and optimum efficiency.

In these cases, GPUs will play a bigger role in the AI inferencing process, despite their higher cost and power requirements.

Powerful GPUs are already used for AI inferencing at the core. Examples include large-scale cloud services such as AWS, Google Cloud and Microsoft Azure.

Enterprise Grade

Enterprises often conduct AI training and inferencing on a scale so massive that it eclipses anything found in edge environments. In these cases, IT engineers must rely on hugely powerful systems.

One example is the Supermicro AS-8125GS-TNMR2 server. This 8U behemoth—weighing in at 225 pounds—accommodates up to eight AMD Instinct MI300X accelerators. And it’s equipped with dual AMD EPYC processors, the customer’s choice of either the 9004 or 9005 series.

To handle some of the world’s most demanding AI workloads, Supermicro’s server is packed with an astonishing amount of tech. In addition to its eight GPUs and two CPUs, the server has room for 6TB of ECC DDR5 memory and 18 hot-swap 2.5-inch NVMe and SATA drives.

That makes the Supermicro system one of the most capable and powerful servers now available. And as AI evolves, tech leaders including AMD and Supermicro will undoubtedly produce more powerful CPUs, GPUs and servers to meet the growing demand.

What will the next generation of AI training and inferencing technology look like? To find out, you won’t have to wait long.

AMD’s new ROCm 6.3 makes GPU programming even better

AMD recently introduced version 6.3 of ROCm, its open software stack for GPU programming. New features include expanded OS support and other optimizations.

There’s a new version of AMD ROCm, the open software stack designed to enable GPU programming from low-level kernel all the way up to end-user applications.  

The latest version, ROCm 6.3, adds features that include expanded operating system support, an open-source toolkit and more.

Rock On

AMD ROCm provides the tools for HIP (the heterogeneous-computing interface for portability), OpenCL and OpenMP. These include compilers, APIs, libraries for high-level functions, debuggers, profilers and runtimes.

ROCm is optimized for Generative AI and HPC applications, and migrating existing code to it is straightforward. Developers can use ROCm to fine-tune workloads, while partners and OEMs can integrate seamlessly with AMD to create innovative solutions.

The latest release builds on ROCm 6, which AMD introduced last year. Version 6 added expanded support for AMD Instinct MI300A and MI300X accelerators, key AI support features, optimized performance, and an expanded support ecosystem.

The senior VP of AMD’s AI group, Vamsi Boppana, wrote in a recent blog post: “Our vision is for AMD ROCm to be the industry’s premier open AI stack, enabling choice and rapid innovation.”

New Features

Here’s some of what’s new in AMD ROCm 6.3:

  • rocJPEG: A high-performance JPEG decode SDK for AMD GPUs.
  • ROCm compute profiler and system profiler: Previously known as Omniperf and Omnitrace, these have been renamed to reflect their new direction as part of the ROCm software stack.
  • Shark AI toolkit: This open-source toolkit is for high-performance serving of GenAI and LLMs. The initial release includes support for the AMD Instinct MI300.
  • PyTorch 2.4 support: PyTorch is a machine learning library used for applications such as computer vision and natural language processing. Originally developed by Meta AI, it’s now part of the Linux Foundation umbrella.
  • Expanded OS support: This includes added support for Ubuntu 24.04.2 and 22.04.5; RHEL 9.5; and Oracle Linux 8.10. In addition, ROCm 6.3.1 includes support for both Debian 12 and the AMD Instinct MI325X accelerator.
  • Documentation updates: ROCm 6.3 offers clearer, more comprehensive guidance for a wider variety of use cases and user needs.

Super for Supermicro

Developers can use ROCm 6.3 to fine-tune workloads and create solutions for Supermicro GPU systems based on AMD Instinct MI300 accelerators.

Supermicro offers three such systems:

Are your customers building AI and HPC systems? Then tell them about the new features offered by AMD ROCm 6.3.


2024: A look back at the year’s best

Let's look back at 2024, a year when AI was everywhere, AMD introduced its 5th Gen EPYC processors, and Supermicro led with liquid cooling.

You couldn't call 2024 boring.

If anything, the year was almost too exciting, too packed with important events, and moving much too fast.

Looking back, a handful of 2024’s technology events stand out. Here are a few of our favorite things.

AI Everywhere

In March AMD’s chief technology officer, Mark Papermaster, made some startling predictions that turned out to be absolutely true.

Speaking at an investors’ event sponsored by Arete Research, Papermaster said, “We’re thrilled to bring AI across our entire product portfolio.” AMD has indeed done that, offering AI capabilities from PCs to servers to high-performance GPU accelerators.

Papermaster also said the buildout of AI is an event as big as the launch of the internet. That certainly sounds right.

He also said AMD believes the total addressable market for AI through 2027 to be $400 billion. If anything, that was too conservative. More recently, consultants Bain & Co. predicted that figure will reach $780 billion to $990 billion.

Back in March, Papermaster said AMD had increased its projection for full-year AI sales from $2 billion to $3.5 billion. That’s probably too low, too.

AMD recently reported revenue of $3.5 billion for its data-center group for just the third quarter alone. The company attributed at least some of the group’s 122% year-on-year increase to the strong ramp of AMD Instinct GPU shipments.

5th Gen AMD EPYC Processors

October saw AMD introduce the fifth generation of its powerful line of EPYC server processors.

The 5th Gen AMD EPYC processors use the company’s new ‘Zen 5’ core architecture. It includes over 25 SKUs offering anywhere from 8 to 192 cores. And the line includes a model—the AMD EPYC 9575F—designed specifically to work with GPU-powered AI solutions.

The market has taken notice. During the October event, AMD CEO Lisa Su told the audience that roughly one in three servers worldwide (34%) is now powered by AMD EPYC processors. And Supermicro launched its new H14 line of servers that will use the new EPYC processors.

Supermicro Liquid Cooling

As servers gain power to add AI and other compute-intensive capabilities, they also run hotter. For data-center operators, that presents multiple challenges. One big one is cost: air conditioning is expensive. What’s more, AC may be unable to cool the new generation of servers.

Supermicro has a solution: liquid cooling. For some time, the company has offered liquid cooling as a data-center option.

In November the company took a new step in this direction. It announced a server that comes with liquid cooling only.

The server in question is the Supermicro 2U 4-node FlexTwin, model number AS-2126FT-HE-LCC. It’s a high-performance, hot-swappable, high-density compute system designed for HPC workloads.

Each 2U system comprises 4 nodes, and each node is powered by dual AMD EPYC 9005 processors. (The previous-gen AMD EPYC 9004s are supported, too.)

To keep cool, the FlexTwin server uses a direct-to-chip (D2C) cold plate liquid cooling setup. Each system also runs 16 counter-rotating fans. Supermicro says this cooling arrangement can remove up to 90% of server-generated heat.

AMD Instinct MI325X Accelerator

A big piece of AMD’s product portfolio for AI is its Instinct line of accelerators. This year the company promised to maintain a yearly cadence of new Instinct models.

Sure enough, in October the company introduced the AMD Instinct MI325X Accelerator. It’s designed for Generative AI performance and working with large language models (LLMs). The system offers 256GB of HBM3E memory and up to 6TB/sec. of memory bandwidth.

Looking ahead, AMD expects to formally introduce the line’s next member, the AMD Instinct MI350, in the second half of next year. AMD has said the new accelerator will be powered by a new AMD CDNA 4 architecture, and will improve AI inferencing performance by up to 35x compared with the older Instinct MI300.

Supermicro Edge Server

A lot of computing now happens at the edge, far beyond either the office or corporate data center.

Even more edge computing is on tap. Market watcher IDC predicts double-digit growth in edge-computing spending through 2028, when it believes worldwide sales will hit $378 billion.

Supermicro is on it. At the 2024 MWC, held in February in Barcelona, the company introduced an edge server designed for the kind of edge data centers run by telcos.

Known officially as the Supermicro A+ Server AS-1115SV-WTNRT, it’s a 1U short-depth server powered by a single AMD EPYC 8004 processor with up to 64 cores. That’s edgy.

Happy Holidays from all of us at Performance Intensive Computing. We look forward to serving you in 2025.


Faster is better. Supermicro with 5th Gen AMD is faster

Supermicro servers powered by the latest AMD processors are up to 9 times faster than a previous generation, according to a recent benchmark.

When it comes to servers, faster is just about always better.

With faster processors, workloads get completed in less time. End users get their questions answered sooner. Demanding high-performance computing (HPC) and AI applications run more smoothly. And multiple servers get all their jobs done more rapidly.

And if you’ve installed, set up or managed one of these faster systems, you’ll look pretty smart.

That’s why the latest benchmark results from Supermicro are so impressive, and also so important.

The tests show that Supermicro servers powered by the latest AMD processors are up to 9 times faster than a previous generation. These systems can make your customer happy—and make you look good.

SPEC Check

The benchmarks in question are those of the Standard Performance Evaluation Corp., better known as SPEC. It’s a nonprofit consortium that sets benchmarks for running complete applications.

Supermicro ran its servers on SPEC’s CPU 2017 benchmark, a suite of 43 benchmarks that measure and compare compute-intensive performance. All of them stress a system’s CPU, memory subsystem and compiler—emphasizing these three components working together, not just the processor.

To provide a comparative measure of integer and floating-point compute-intensive performance, the benchmark uses two main metrics. The first is speed: how much time a server needs to complete a single task. The second is throughput: how much work the server gets done while running multiple concurrent copies of a task.

The results are given as comparative scores. In general, higher is better.
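SPEC CPU 2017 produces each overall score as the geometric mean of per-benchmark ratios measured against a reference machine. The short sketch below shows how an overall speedup like "8x" falls out of such scores; the per-benchmark ratios are made up for illustration.

```python
import math

def spec_style_score(ratios):
    """Geometric mean of per-benchmark ratios, the aggregation
    SPEC CPU 2017 uses to produce a single overall score."""
    return math.prod(ratios) ** (1.0 / len(ratios))

# Hypothetical per-benchmark ratios for an older and a newer server:
old_server = [40.0, 50.0, 45.0]
new_server = [330.0, 380.0, 370.0]

speedup = spec_style_score(new_server) / spec_style_score(old_server)
print(round(speedup, 1))  # roughly 8x
```

The geometric mean keeps one outlier benchmark from dominating the score, which is why comparing two servers' overall SPEC scores is a reasonable shorthand for "how much faster" one is.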

Super Server

The server tested was the Supermicro H14 Hyper server, model number AS-2126HS-TN. It’s powered by dual AMD EPYC 9965 processors and loaded with 1.5TB of memory.

This server has been designed for applications that include HPC, cloud computing, AI inferencing and machine learning.

In the floating-point measure, the new server was 8x faster than a Supermicro server powered by an earlier-generation AMD EPYC 7601.

In the Integer Rate measure, it was almost 9x faster than a circa-2018 Supermicro server.

Impressive results. And remember, when it comes to servers, faster is better.
