Performance Intensive Computing

Capture the full potential of IT

Tech Explainer: Green Computing, Part 3 – Why you should reduce, reuse & recycle

The new 3Rs of green computing are reduce, reuse and recycle.

To help your customers meet their environmental, social and governance (ESG) goals, it pays to focus on the 3 Rs of green computing—reduce, reuse and recycle.

Sure, pursuing these goals can require some additional R&D and reorganization. But tech titans such as AMD and Supermicro are helping.

AMD, Supermicro and their vast supply chains are working to create a new virtuous circle. More efficient tech is being created using recycled materials, reused where possible, and then once again turned into recycled material.

For you and your customers, the path to green computing can lead to better corporate citizenship as well as higher efficiencies and lower costs.

Green server design

New disaggregated server technology is now available from manufacturers like Supermicro. This tech makes it possible for organizations of every size to increase their energy efficiency, better utilize data-center space, and reduce capital expenditures.

Supermicro’s SuperBlade, BigTwin and EDSFF SuperStorage are exemplars of disaggregated server design. The SuperBlade multi-node server, for instance, can house up to 20 server blades and 40 CPUs. And it’s available in 4U, 6U and 8U rack enclosures.

These designs allow for larger, more efficient shared fans and power supplies. And along with the chassis itself, many elements can remain in service long past the lifespans of the silicon components they support. In some cases, an updated server blade can be used in an existing chassis.

Remote reprogramming

Innovative technologies like adaptive computing enable organizations to adopt a holistic approach to green computing at the core, the edge and in end-user devices.

For instance, AMD’s adaptive computing initiative offers the ability to optimize hardware based on applications. Then your customers can get continuous updates after production deployment, adapting to new requirements without needing new hardware.

The key to adaptive computing is the Field Programmable Gate Array (FPGA). It’s essentially a blank canvas of hardware, capable of being configured into a multitude of different functions. Even after an FPGA has been deployed, engineers can remotely access the component to reprogram various hardware elements.

The FPGA reprogramming process can be as simple as applying security patches and bug fixes—or as complex as a wholesale change in core functionality. Either way, the green computing bona fides of adaptive computing are the same.

What’s more, adaptive tech like FPGAs significantly reduces e-waste. This helps to lower an organization’s overall carbon footprint by obviating the manufacturing and transportation necessary to replace hardware already deployed.

Adaptive computing also enables organizations to increase energy efficiency. Deploying cutting-edge tech like the AMD Instinct MI250X Accelerator to complete AI training or inferencing can significantly reduce the overall electricity needed to complete a task.

Radical recycling

Even in organizations with the best green computing initiatives, elements of the hardware infrastructure will eventually be ready for retirement. When the time comes, these organizations have yet another opportunity to go green—by properly recycling.

Some servers can be repurposed for other, less-demanding tasks, extending their lifespan. For example, a system once used for HPC applications that no longer delivers the required FP64 performance could be repurposed to host a database or email application.

Quite a lot of today’s computer hardware can be recycled. This includes glass from monitors; plastic and aluminum from cases; copper in power supplies; precious metals used in circuitry; even the cardboard, wood and other materials used in packaging.

If that seems like too much work, there are now third-party organizations that will oversee your customers’ recycling efforts for a fee. Later, if all goes according to plan, these recycled materials will find their way back into the manufacturing supply chain.

Tech suppliers are working to make recycling even easier. For example, AMD is one of the many tech leaders whose commitment to environmental sustainability extends across its entire value chain. For AMD, that includes using environmentally preferable packing materials, such as recycled materials and non-toxic dyes.

Are you 3R?

Your customers understand that establishing and adhering to ESG goals is more than just a good idea. In fact, it’s vital to the survival of humanity.

Efforts like those of AMD and Supermicro are helping to establish a green computing revolution—and not a moment too soon.

In other words, pursuing green computing’s 3 Rs will be well worth the effort.

Also read:

Tech Explainer: Green Computing, Part 1 – What does the data center demand?

Tech Explainer: Green Computing, Part 2 – Holistic strategies

Implementing a green data center (Supermicro white paper)

Interview: How NEC Germany keeps up with the changing HPC market

In an interview, Oliver Tennert, director of HPC marketing and post-sales at NEC Germany, explains how the company keeps pace with a fast-developing market.

Featured Companies:
NEC Germany

The market for high performance computing (HPC) is changing, meaning system integrators that serve HPC customers need to change too.

To learn more, PIC managing editor Peter Krass spoke recently with Oliver Tennert, NEC Germany’s director of HPC marketing and post-sales. NEC Germany works with hardware vendors that include AMD processors and Supermicro servers. This interview has been lightly edited for clarity.

First, please tell me about NEC Germany and its relationship with parent company NEC Corp.?

I work for NEC Germany, which is a subsidiary of NEC Europe. Our parent company, NEC Corp., is a Japanese company with a focus on telecommunications, which is still a major part of our business. Today NEC has about 100,000 employees around the world.

HPC as a business within NEC is done primarily by NEC Germany and our counterparts at NEC Corp. in Japan. The Japanese operation covers HPC in Asia, and we cover EMEA, mainly Europe.

What kinds of HPC workloads and applications do your customers run?

It’s probably 60:40 — that is, about 60% of our customers are in academia, including universities, research facilities, and even DWD, Germany’s weather-forecasting service. The remaining 40% are industrial, including automotive and engineering companies.

The typical HPC use cases of our customers come in two categories. The most important HPC category of course is simulation. That can mean simulating physical processes. For example, what does a car crash look like under certain parameters? These simulations are done in great detail.

Our other important HPC category is data analytics. For example, that could mean genomic analysis.

How do you work with AMD and Supermicro?

To understand this, you first have to understand how NEC’s HPC business works. For us, there are two aspects to the business.

One, we’ve got our own vector technology. Our NEC vector engine is a PCIe card designed and produced in Japan. The latest incarnation of our vector supercomputer is the NEC SX-Aurora TSUBASA. It was designed to run applications that are vectorizable and that benefit from high bandwidth to main memory. One of our big customers in this area is the German weather service, DWD.

The other part of the business is what we call “pizza boxes,” the x86 architecture. For this, we need industry-standard servers, including processors from AMD and servers from Supermicro.

For that second part of the business, what is NEC’s role?

The answer has to do with how the HPC business works operationally. If a customer intends to purchase a new HPC cluster, typically they need expert advice on designing an optimized HPC environment. What they do know is the application they run. And what they want to know is, ‘How do we get the best, most optimized system for this application?’

This implies doing a lot of configuration. Essentially, we optimize the design based on many different components. Even if we know that an AMD processor is the best for a particular task, still, there are dozens of combinations of processor SKUs and server model types which offer different price/performance ratios. The same applies to certain data-storage solutions. For HPC, storage is more than just picking an SSD. What’s needed is a completely different kind of technology.

Configuring and setting up such a complex solution takes a lot of expertise. We’re being asked to run benchmarks. That means the customer says, ‘Here’s my application, please run it on some specific configurations, and tell me which one offers the best price/performance ratio.’ This takes a lot of time and resources. For example, you need the systems on hand to just try it out. And the complete tender process—from pre-sales discussions to actual ordering and delivery—can take anywhere from weeks to months.
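The benchmark comparison described above can be sketched roughly as follows. This is only an illustration: the configuration labels, prices and runtimes are made-up placeholders, not NEC, AMD or Supermicro data, and "runs per hour per 1,000 EUR" is just one assumed way to express a price/performance ratio.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkResult:
    config: str        # hypothetical label for a processor SKU + server model combination
    price_eur: float   # hypothetical system price
    runtime_s: float   # wall-clock time of the customer's application on this configuration

    @property
    def runs_per_hour(self) -> float:
        # Performance score: a shorter runtime gives a higher score
        return 3600.0 / self.runtime_s

    @property
    def perf_per_keur(self) -> float:
        # Price/performance: runs per hour for every 1,000 EUR spent
        return self.runs_per_hour / (self.price_eur / 1000.0)


def best_price_performance(results):
    """Return the configuration with the highest performance per 1,000 EUR."""
    return max(results, key=lambda r: r.perf_per_keur)


# Placeholder measurements, for illustration only
results = [
    BenchmarkResult("config-A", price_eur=20000, runtime_s=1800),
    BenchmarkResult("config-B", price_eur=30000, runtime_s=1000),
    BenchmarkResult("config-C", price_eur=45000, runtime_s=900),
]

best = best_price_performance(results)
print(best.config, round(best.perf_per_keur, 3))
```

Note that the fastest configuration in this toy data set (config-C) is not the one with the best ratio, which is why the full set of candidate systems has to be benchmarked rather than guessed at.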

And this is just to bid, right? After all this work, you still might not get the order?

Yes, that can happen. There are lots of factors that influence your chances. In general, if you have a good working relationship with a private customer, it’s easier. They have more discretion than academic or public customers. For public bids, everything must be more transparent, because it’s more strictly regulated. Normally, that means you have more work, because you have to test more setups. Your competition will be doing the same.

When working with the second group, the private industry customers, do customers specify parts from specific vendors, such as AMD and Supermicro?

It depends on the factors that will influence the customer’s final selection. Price and performance, that’s one thing. Power consumption is another. Then, sometimes, it’s the vendors. Also, certain projects are more attractive to certain vendors because of market visibility—so-called lighthouse projects. That can have an influence on the conditions we get from vendors. Vendors also honor the amount of effort we have put in to getting the customer in the first place. So there are all sorts of external factors that can influence the final system design.

Also, today, the majority of HPC solutions are similar from an architectural point of view. So competing vendors differentiate themselves by how well they optimize the standard components, rather than by providing a competing architecture. As a result, the soft skills, such as the ability to implement HPC solutions in an efficient and professional way, also have a large influence on the final order.

How about power consumption and cooling? Are these important considerations for your HPC customers?

It’s become absolutely vital. As a rule of thumb, we can say that the larger an HPC project is going to be, the more likely that it is going to be cooled by liquid.

In the past, you had a server room that you cooled with air conditioning. But those times are nearly gone. Today, when you think of a larger HPC installation—say, 1,000 or 2,000 nodes—you’re talking about a megawatt of power being consumed, or even more. And that also needs to be cooled.

The challenge in cooling a large environment is to get the heat away from the server and out of the room to somewhere else, whether outside or to a larger cooling system. This cannot be done by traditional cooling with air. Air is too inefficient for transporting heat. Water is much better. It’s a more efficient means for moving heat from Point A to Point B.

How are you cooling HPC systems with liquid?

There are a few ways to do this. There’s cold-water cooling, mainly indirect. You bring in water with what’s known as an “inlet temperature” of about 10 C and it cools down the air inside the server racks, with the heat getting carried away with the water now at about 15 or 20 C. The issue is, first you need energy just to cool the water down to 10 C. Also, there’s not much you can do with water at 15 or 20 C. It’s too warm for cooling anything else, but too cool for heating a room.

That’s why the new approach is to use hot-water cooling, mainly direct. It sounds like a paradox. But what might seem hot to a human being is in fact pretty cool for a CPU. For a CPU, an ambient temperature of 50 or 60 C is fine; it would be absolutely not fine for a human being. So if you have an inlet temperature for water of, say, 40 or 45 C, that will cool the CPU, which runs at an internal temperature of 80 or 90 C. The outbound temperature of the water is then maybe 50 C. Then it becomes interesting. At that temperature, you can heat a building. You can reuse the heat, rather than just throwing it away. So this kind of infrastructure is becoming more important and more interesting.
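To put rough numbers on the hot-water approach, the heat carried away by water follows Q = m * c_p * ΔT. The sketch below applies that formula to the 1 megawatt load and the inlet/outlet temperatures mentioned in the interview. It assumes all of the power ends up as heat in the water loop and uses an approximate specific heat for water, so treat the results as back-of-the-envelope estimates, not design values.

```python
WATER_SPECIFIC_HEAT = 4186.0  # J/(kg*K), approximate value for liquid water


def required_flow_kg_per_s(heat_load_w, inlet_c, outlet_c):
    """Mass flow of water needed to carry away heat_load_w watts.

    Rearranged from Q = m_dot * c_p * delta_T.
    """
    delta_t = outlet_c - inlet_c
    if delta_t <= 0:
        raise ValueError("outlet temperature must be higher than inlet temperature")
    return heat_load_w / (WATER_SPECIFIC_HEAT * delta_t)


HEAT_LOAD_W = 1_000_000  # "a megawatt of power", assumed to be fully converted to heat

# Inlet of 40 C or 45 C and outlet of about 50 C, as quoted in the interview
flow_40_to_50 = required_flow_kg_per_s(HEAT_LOAD_W, 40, 50)
flow_45_to_50 = required_flow_kg_per_s(HEAT_LOAD_W, 45, 50)

print(f"40 C -> 50 C: {flow_40_to_50:.1f} kg/s")
print(f"45 C -> 50 C: {flow_45_to_50:.1f} kg/s")
```

A smaller temperature rise across the servers means proportionally more water has to be pumped for the same heat load.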

Looking ahead, what are some of your top projects for the future?

Public customers such as research universities have to replace their HPC systems every three to five years. That’s the normal cycle. In that time the hardware becomes obsolete, especially as the vendors optimize their power consumption to performance ratio more and more. So it’s a steady flow of new projects. For our industrial customers, the same applies, though the procurement cycle may vary.

We’re also starting to see the use of computational HPC capacity from the cloud. Normally, when people think of cloud, they think of public clouds from Amazon, Microsoft, etc. But for HPC, there are interim approaches as well. A decade ago, there was the idea of a dedicated public cloud. Essentially, this meant a dedicated capacity that was for the customer’s exclusive use, but was owned by someone other than the customer. Now, between the dedicated cloud and public cloud, there are all these shades of grey. In the past two years, we’ve implemented several larger installations of this “grey-shaded” cloud approach. So more and more, we’re entering the service-oriented market.

There is a larger trend away from customers wanting to own a system, and toward customers just wanting to utilize capacity. Vendors with expertise in HPC have to change as well, which means changing the business and the way they work with customers. It boils down to, Who owns the hardware? And what does the customer buy, hardware or just services? That doesn’t make you a public-cloud provider. It just means you take over responsibility for this particular customer environment. You have a different business model, contract type, and set of responsibilities.

How AMD and Supermicro are working together to help you deliver AI

AMD and Supermicro are jointly offering high-performance AI alternatives with superior price and performance.

When it comes to building AI systems for your customers, a certain GPU provider with a trillion-dollar valuation isn’t the only game in town. You should also consider the dynamic duo of AMD and Supermicro, which are jointly offering high-performance AI alternatives with superior price and performance.

Supermicro’s Universal GPU systems are designed specifically for large-scale AI and high-performance computing (HPC) applications. Some of these modular designs come equipped with AMD’s Instinct MI250 Accelerator and have the option of being powered by dual AMD EPYC processors.

AMD, with a newly formed AI group led by Victor Peng, is working hard to enable AI across many environments. The company has developed an open software stack for AI, and it has also expanded its partnerships with AI software and framework suppliers that now include the PyTorch Foundation and Hugging Face.

AI accelerators

In addition, AMD’s Instinct MI300A data-center accelerator is due to ship in this year’s fourth quarter. It’s the successor to AMD’s MI200 series, which is based on the company’s CDNA 2 architecture and its first multi-die GPU, and which powers some of today’s fastest supercomputers.

The forthcoming Instinct MI300A is based on AMD’s CDNA 3 architecture for AI and HPC workloads, which uses 5nm and 6nm process tech and advanced chiplet packaging. Under the MI300A’s hood, you’ll find 24 processor cores with Zen 4 tech, as well as 128GB of HBM3 memory that’s shared by the CPU and GPU. And it supports AMD ROCm 5, a production-ready, open source HPC and AI software stack.

Earlier this month, AMD introduced another member of the series, the AMD Instinct MI300X. It replaces three Zen 4 CPU chiplets with two CDNA 3 chiplets to create a GPU-only system. Announced at AMD’s recent Data Center and AI Technology Premier event, the MI300X is optimized for large language models (LLMs) and other forms of AI.

To accommodate the demanding memory needs of generative AI workloads, the new AMD Instinct MI300X also adds 64GB of HBM3 memory, for a new total of 192GB. This means the system can run large models directly in memory, reducing the number of GPUs needed, speeding performance, and reducing the user’s total cost of ownership (TCO).

AMD also recently introduced the AMD Instinct Platform, which puts eight MI300X systems and 1.5TB of memory in a standard Open Compute Project (OCP) infrastructure. It’s designed to drop into an end user’s current IT infrastructure with only minimal changes.
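The memory figures above can be checked with simple arithmetic. The sketch below also estimates how many 192GB accelerators would be needed just to hold a model's weights. That estimate is an assumption-laden illustration: it assumes 2 bytes per parameter (16-bit weights), treats 1GB as one billion bytes, and ignores activations, caches and optimizer state, so real requirements will be higher.

```python
import math

MI300X_MEMORY_GB = 192   # per-accelerator HBM3 capacity quoted in the article
GPUS_PER_PLATFORM = 8    # accelerators in one AMD Instinct Platform

# Eight accelerators at 192GB each
platform_memory_gb = MI300X_MEMORY_GB * GPUS_PER_PLATFORM
platform_memory_tb = platform_memory_gb / 1024


def gpus_needed_for_weights(params_billion, bytes_per_param=2, memory_gb=MI300X_MEMORY_GB):
    """Minimum accelerators needed to hold the weights alone (illustrative lower bound)."""
    weights_gb = params_billion * bytes_per_param  # 1e9 params * bytes, expressed in GB
    return math.ceil(weights_gb / memory_gb)


print(platform_memory_gb, "GB, or about", platform_memory_tb, "TB per platform")
print("11B parameters:", gpus_needed_for_weights(11), "accelerator(s)")
print("200B parameters:", gpus_needed_for_weights(200), "accelerator(s)")
```

The parameter counts in the usage lines are illustrative model sizes, not MI300X benchmark results.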

All this is coming soon. The AMD MI300A started sampling with select customers earlier this quarter. The MI300X and Instinct Platform are both set to begin sampling in the third quarter. Production of the hardware products is expected to ramp in the fourth quarter.

KT’s cloud

All that may sound good in theory, but how does the AMD + Supermicro combination work in the real world of AI?

Just ask KT Cloud, a South Korea-based provider of cloud services that include infrastructure, platform and software as a service (IaaS, PaaS, SaaS). With the rise of customer interest in AI, KT Cloud set out to develop new XaaS customer offerings around AI, while also developing its own in-house AI models.

However, as KT embarked on this AI journey, the company quickly encountered three major challenges:

The high cost of AI GPU accelerators: KT Cloud would need hundreds of thousands of new GPU servers.

Inefficient use of GPU resources in the cloud: Few cloud providers offer GPU virtualization due to overhead. As a result, most cloud-based GPUs are visible to only 1 virtual machine, meaning they cannot be shared by multiple users.

Difficulty using large GPU clusters: KT is training Korean-language models using literally billions of parameters, requiring more than 1,000 GPUs. But this is complex: Users would need to manually apply parallelization strategies and optimization techniques.

The solution: KT worked with Moreh Inc., a South Korean developer of AI software, and AMD to design a novel platform architecture powered by AMD’s Instinct MI250 Accelerators and Moreh’s software.

Moreh developed the entire AI software stack, from the PyTorch and TensorFlow APIs down to GPU-accelerated primitive operations. This overcomes the limitations of cloud services and large AI model training.

Users do not need to insert or modify even a single line of existing source code for the MoAI platform. They also do not need to change the method of running a PyTorch/TensorFlow program.

Did it work?

In a word, yes. To test the setup, KT developed a Korean language model with 11 billion parameters. Training was then done on two machines: one using Nvidia GPUs, the other being the AMD/Moreh cluster equipped with AMD Instinct MI250 accelerators, Supermicro Universal GPU systems, and the Moreh AI platform software.

Compared with the Nvidia system, the Moreh solution with AMD Instinct accelerators showed 116% throughput (as measured by tokens trained per second), and 2.05x higher cost-effectiveness (measured as throughput per dollar).
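Taken together, the two ratios above imply a relative system cost. The following back-of-the-envelope sketch assumes cost-effectiveness is exactly throughput divided by cost, as the article defines it. The resulting cost ratio is derived here for illustration and is not a figure reported by KT, Moreh or AMD.

```python
# Ratios quoted in the article (AMD/Moreh system relative to the Nvidia system)
throughput_ratio = 1.16           # 116% throughput, in tokens trained per second
cost_effectiveness_ratio = 2.05   # throughput per dollar

# cost_effectiveness = throughput / cost  =>  cost = throughput / cost_effectiveness
implied_cost_ratio = throughput_ratio / cost_effectiveness_ratio

print(f"Implied relative cost: {implied_cost_ratio:.2f}x the Nvidia system")
```

Under that assumption, most of the cost-effectiveness advantage comes from lower cost rather than from the 16% throughput gain.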

Other gains are expected, too. “With cost-effective AMD Instinct accelerators and a pay-as-you-go pricing model, KT Cloud expects to be able to reduce the effective price of its GPU cloud service by 70%,” says JooSung Kim, VP of KT Cloud.

Based on this test, KT built a larger AMD/Moreh cluster of 300 nodes—with a total of 1,200 AMD MI250 GPUs—to train the next version of the Korean language model with 200 billion parameters.

It delivers a theoretical peak performance of 434.5 petaflops for fp16/bf16 (a native 16-bit format for mixed-precision training) matrix operations. That should make it one of the top-tier GPU supercomputers in the world.
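The cluster figures above can be cross-checked with a little arithmetic. The per-node and per-GPU values below are derived from the totals quoted in the article, assuming GPUs are spread evenly across nodes. They are not separately quoted specifications.

```python
NODES = 300
TOTAL_GPUS = 1200
PEAK_PETAFLOPS = 434.5   # theoretical fp16/bf16 peak for the whole cluster

gpus_per_node = TOTAL_GPUS // NODES

# 1 petaflops = 1,000 teraflops
peak_teraflops_per_gpu = PEAK_PETAFLOPS * 1000 / TOTAL_GPUS

print(gpus_per_node, "GPUs per node")
print(f"{peak_teraflops_per_gpu:.1f} teraflops peak per GPU (derived)")
```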

Do more:

Check out Supermicro Universal GPU systems

Explore AMD Instinct MI Series Accelerators

Read the case study: KT Cloud set to expand AI potential with AMD Instinct accelerators

Watch the video of AMD CEO Lisa Su’s presentation on new Instinct MI Series hardware (starts at 1:18:00)

Tech Explainer: Green Computing, Part 1 - What does the data center demand?

The ultimate goal of Green Computing is net-zero emissions. To get there, organizations can and must innovate, conducting an ongoing campaign to increase efficiency and reduce waste.

The Green Computing movement has begun in earnest and not a moment too soon. As humanity faces the existential threat of climate crisis, technology needs to be part of the solution. Green computing is a big step in the right direction.

The ultimate goal of Green Computing is net-zero emissions. It’s a symbiotic relationship between technology and nature in which both SMBs and enterprises can offset carbon emissions, drastically reduce pollution, and reuse/recycle the materials that make up their products and services.

To get there, the tech industry will need to first take a long, hard look at the energy it uses and the waste it produces. Using that information, individual organizations can and must innovate, conducting an ongoing campaign to increase efficiency and reduce waste.

It’s a lofty goal, sure. But after all the self-inflicted damage we’ve done since the dawn of the Industrial Revolution, we simply have no choice.

The data-center conundrum

All digital technology requires electricity to operate. But data centers use more than their share.

Here’s a startling fact: Each year, the world’s data centers gobble up at least 200 terawatt-hours (TWh) of electricity. That’s roughly 2% of all the electricity used on this planet annually.

What’s more, that figure is likely to increase as new, power-hungry systems are brought online and new data centers are opened. And the number of global data centers could grow from 700 in 2021 to as many as 1,200 by 2026, predicts Supermicro.

At that rate, data-center energy consumption could account for up to 8% of global energy usage by 2030. That’s why tech leaders including AMD and Supermicro are rewriting the book on green computing best practices.
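For a sense of scale, the projected growth in the number of data centers can be expressed as a compound annual growth rate. This sketch uses only the two data points quoted above and assumes smooth year-over-year growth between them, which is a simplification.

```python
def compound_annual_growth_rate(start_value, end_value, years):
    """Average yearly growth rate that turns start_value into end_value."""
    return (end_value / start_value) ** (1 / years) - 1


# 700 data centers in 2021, as many as 1,200 by 2026 (figures from the article)
cagr = compound_annual_growth_rate(700, 1200, 2026 - 2021)

print(f"Implied growth: {cagr:.1%} per year")
```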

A Supermicro white paper, Green Computing: Top 10 Best Practices For A Green Data Center, suggests specific actions you and your customers can take now to reduce the environmental impact of your data centers:

Right-size systems to match workload requirements
Share common scalable infrastructure
Operate at higher ambient temperature
Capture heat at the source via aisle containment and liquid cooling
Optimize key components (e.g., CPU, GPU, SSD) for workload performance per watt
Optimize hardware refresh cycle to maintain efficiency
Optimize power delivery
Utilize virtualization and power management
Source renewable energy and green manufacturing
Consider climate impact when making site selection

Green components

Rethinking data-center architectures is an excellent way to leverage green computing from a macro perspective. But to truly make a difference, the industry needs to consider green computing at the component level.

This is one area where AMD is leading the charge. Its mission: increase the energy efficiency of its CPUs and hardware accelerators. The rest of the industry should follow suit.

In 2021 AMD announced its goal to deliver a 30x increase in energy efficiency for both AMD EPYC CPUs and AMD Instinct accelerators for AI and HPC applications running on accelerated compute nodes—and to do so by 2025.
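A 30x efficiency gain translates directly into energy saved for the same amount of work. The sketch below shows that relationship. The 3,000 kWh baseline in the usage line is a hypothetical number chosen for easy arithmetic, not a measured workload.

```python
EFFICIENCY_GAIN = 30.0  # the 30x goal quoted in the article

# Energy needed for the same work, relative to the baseline
energy_fraction = 1.0 / EFFICIENCY_GAIN
# Share of energy saved for the same work
energy_reduction = 1.0 - energy_fraction


def energy_after_gain(baseline_kwh, gain=EFFICIENCY_GAIN):
    """Energy needed for the same work once efficiency improves by `gain` times."""
    return baseline_kwh / gain


print(f"Energy for the same work: {energy_fraction:.1%} of the baseline")
print(f"Reduction: {energy_reduction:.1%}")
print(energy_after_gain(3000.0), "kWh for a hypothetical 3,000 kWh job")
```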

Taming AI energy usage

The golden age of AI has begun. New machine learning algorithms will give life to a population of hyper-intelligent robots that will forever alter the nature of humanity. If AI’s most beneficent promises come to fruition, it could help us live, eat, travel, learn and heal far better than ever before.

But the news isn’t all good. AI has a dark side, too. Part of that dark side is its potential impact on our climate crisis.

Researchers at the University of Massachusetts, Amherst, illustrated this point by performing a life-cycle assessment for training several large AI models. Their findings, published by Supermicro, concluded that training a single AI model can emit more than 626,000 pounds of carbon dioxide. That’s approximately 5 times the lifetime emissions of your average American car.

A comparison like that helps put AMD’s environmental sustainability goals in perspective. Effecting a 30x energy efficiency increase in the components that power AI could bring some much-needed light to AI’s dark side.
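The emissions comparison above can be unpacked with a unit conversion. The sketch converts the quoted 626,000 pounds into metric tons and works backward to the per-car lifetime figure implied by the "5 times" comparison. The per-car value is derived from the article's own numbers, not taken from the underlying study.

```python
KG_PER_POUND = 0.45359237  # exact definition of the avoirdupois pound

training_emissions_lb = 626_000   # figure quoted in the article
car_multiple = 5                  # "approximately 5 times" a car's lifetime emissions

training_emissions_tonnes = training_emissions_lb * KG_PER_POUND / 1000

# Lifetime emissions of one car, as implied by the comparison
implied_car_lb = training_emissions_lb / car_multiple
implied_car_tonnes = implied_car_lb * KG_PER_POUND / 1000

print(f"Training: about {training_emissions_tonnes:.0f} metric tons of CO2")
print(f"Implied per car: about {implied_car_tonnes:.0f} metric tons of CO2")
```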

In fact, if the whole technology sector produces practical innovations similar to those from AMD and Supermicro, we might have a fighting chance in the battle against climate crisis.

Continued…

Part 2 of this 3-part series will take a closer look at the technology behind green computing—and the world-saving innovations we could see soon.

AMD intros CPUs, cache, AI accelerators for cloud, enterprise data centers

AMD strengthens its commitment to the cloud and enterprise data centers with new "Bergamo" CPUs, "Genoa-X" cache, Instinct accelerators.

This week AMD strengthened its already strong commitment to the cloud and enterprise markets. The company announced several new products and partnerships at its Data Center and AI Technology Premier event, which was held in San Francisco and simultaneously broadcast online.

“We’re focused on pushing the envelope in high-performance and adaptive computing,” AMD CEO Lisa Su told the audience, “creating solutions to the world’s most important challenges.”

Here’s what’s new:

Bergamo: That’s the former codename for the new 4th gen AMD EPYC 97X4 processors. AMD’s first processor designed specifically for cloud-native workloads, it packs up to 128 cores per socket using AMD’s new Zen 4c design to deliver high performance per watt. Each socket contains 8 chiplets, each with up to 16 Zen 4c cores; that’s twice as many cores per chiplet as AMD’s earlier Genoa processors (yet the two lines are compatible). The entire lineup is available now.

Genoa-X: Another codename, this one is for AMD’s new generation of AMD 3D V-Cache technology. This new product, designed specifically for technical computing such as engineering simulation, now supports over 1GB of L3 cache on a 96-core CPU. It’s paired with the new 4th gen AMD EPYC processor, including the high-performing Zen 4 core, to deliver high per-core performance.

“A larger cache feeds the CPU faster with complex data sets, and enables a new dimension of processor and workload optimization,” said Dan McNamara, an AMD senior VP and GM of its server business.

In all, there are 4 new Genoa-X SKUs, ranging from 16 to 96 cores, and all socket-compatible with AMD’s Genoa processors.

Genoa: Technically, not new, as this family of data-center CPUs was introduced last November. But what is new is AMD’s new focus for the processors on AI, data-center consolidation and energy efficiency.

AMD Instinct: Though AMD had already introduced its Instinct MI300 Series accelerator family, the company is now revealing more details.

This includes the introduction of the AMD Instinct MI300X, an advanced accelerator for generative AI based on AMD’s CDNA 3 accelerator architecture. It will support up to 192GB of HBM3 memory to provide the compute and memory efficiency needed for large language model (LLM) training and inference for generative AI workloads.

AMD also introduced the AMD Instinct Platform, which brings together eight MI300X accelerators into an industry-standard design for the ultimate solution for AI inference and training. The MI300X is sampling to key customers starting in Q3.

Finally, AMD also announced that the AMD Instinct MI300A, an APU accelerator for HPC and AI workloads, is now sampling to customers.

Partner news: Mark your calendar for June 20. That’s when Supermicro plans to explore key features and use cases for its Supermicro H13 systems based on AMD EPYC 9004 series processors. These Supermicro systems will feature AMD’s new Zen 4c architecture and 3D V-Cache tech.

This week Supermicro announced that its entire line of H13 AMD-based systems is now available with support for the 4th gen AMD EPYC processors with Zen 4c architecture and V-Cache technology.

That includes Supermicro’s new 1U and 2U Hyper-U servers designed for cloud-native workloads. Both are equipped with a single AMD EPYC processor with up to 128 cores.

Do more:

Watch the AMD “Data Center and AI Technology Premier” video

Learn more about 4th Gen AMD EPYC processors

Meet the new Supermicro Hyper-U servers

Explore Supermicro H13 servers with the new AMD CPUs

Why your AI systems can benefit from having both a GPU and CPU

Like a hockey team with players in different positions, an AI system with both a GPU and CPU is a necessary and winning combo. This mix of processors can bring you and your customers both the lower cost and greater energy efficiency of a CPU and the parallel processing power of a GPU. With this team approach, your customers should be able to handle any AI training and inference workloads that come their way.

Sports teams win with a range of skills and strengths. A hockey side can’t win if everyone’s playing goalie. The team also needs a center and wings to advance the puck and score goals, as well as defensive players to block the opposing team’s shots.

The same is true for artificial intelligence systems. Like a hockey team with players in different positions, an AI system with both a GPU and CPU is a necessary and winning combo.

This mix of processors can bring you and your customers both the lower cost and greater energy efficiency of a CPU and the parallel processing power of a GPU. With this team approach, your customers should be able to handle any AI training and inference workloads that come their way.

In the beginning

One issue: Neither CPUs nor GPUs were originally designed for AI. In fact, both designs predate AI by many years. Their origins still define how they’re best used, even for AI.

GPUs were initially designed for computer graphics, virtual reality and video. Getting pixels to the screen is a task where high levels of parallelization speed things up. And GPUs are good at parallel processing. This has allowed them to be adapted for HPC and AI workloads, which analyze and learn from large volumes of data. What’s more, GPUs are often used to run HPC and AI workloads simultaneously.

GPUs are also relatively expensive. For example, Nvidia’s new H100 has an estimated retail price of around $25,000 per GPU. Your customers may incur additional costs from cooling, because GPUs generate a lot of heat. GPUs also use a lot of power, which can further raise your customers’ operating costs.

CPUs, by contrast, were originally designed to handle general-purpose computing. A modern CPU can run just about any type of calculation, thanks to its encompassing instruction set.

A CPU processes data sequentially, rather than in parallel, and that’s good for linear and complex calculations. Compared with a GPU of a similar class, a CPU generally costs less, needs less power and runs cooler.

In today’s cost-conscious environment, every data center manager is trying to get the most performance per dollar. Even a high-performing CPU has a cost advantage over comparable GPUs that can be extremely important for your customers.

Team players

Just as a hockey team doesn’t rely on its goalie to score points, smart AI practitioners know they can’t rely on their GPUs to do all types of processing. For some jobs, CPUs are still better.

Because CPUs offer larger memory capacity, they’re ideal for machine learning training and inference, as long as the scale is relatively small. CPUs are also good for training small neural networks, data preparation and feature extraction.

CPUs offer other advantages, too. They’re generally less expensive than GPUs. In today’s cost-conscious environment, where every data center manager is trying to get the most performance per dollar, that’s extremely important. CPUs also run cooler than GPUs, requiring less (and less expensive) cooling.

GPUs excel in two main areas of AI: machine learning and deep learning (ML/DL). Both involve the analysis of gigabytes—or even terabytes—of data for image and video processing. For these jobs, the parallel processing capability of a GPU is a perfect match.

AI developers can also leverage a GPU’s parallel compute engines. They can do this by instructing the processor to partition complex problems into smaller, more manageable sub-problems. Then they can use libraries that are specially tuned to take advantage of high levels of parallelism.
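The partition-then-combine idea described above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python threads on the CPU as a stand-in for a GPU's parallel compute engines, and the function names (`partition`, `sum_of_squares`, `parallel_sum_of_squares`) are hypothetical, not part of any GPU library.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(data, parts):
    """Split data into `parts` nearly equal contiguous chunks."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """The small sub-problem each worker handles on its own chunk."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, workers=4):
    """Partition the problem, process the chunks concurrently, combine the results."""
    chunks = partition(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)


if __name__ == "__main__":
    values = list(range(1_000))
    print(parallel_sum_of_squares(values))
```

In practice, tuned GPU libraries do the partitioning and scheduling on the device itself. The sketch only shows the shape of the approach: independent sub-problems followed by a cheap combine step.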

Theory into practice

That’s the theory. Now let’s look at how some leading AI tech providers are putting the team approach of CPUs and GPUs into practice.

Supermicro offers its Universal GPU Systems, which combine Nvidia GPUs with CPUs from AMD, including the AMD EPYC 9004 Series.

An example is Supermicro’s H13 GPU server, with one model being the AS-8125GS-TNHR. It packs an Nvidia HGX H100 multi-GPU board, dual AMD EPYC 9004 Series CPUs, and up to 6TB of DDR5 DRAM.

For truly large-scale AI projects, Supermicro offers SuperBlade systems designed for distributed, midrange AI and ML training. Large AI and ML workloads can require coordination among multiple independent servers, and the Supermicro SuperBlades are designed to do just that. Supermicro also offers rack-scale, plug-and-play AI solutions powered by the company’s GPUs and turbocharged with liquid cooling.

The Supermicro SuperBlade is available with a single AMD EPYC 7003/7002 Series processor with up to 64 cores. You also get AMD 3D V-Cache, up to 2TB of system memory per node, and a 200Gbps InfiniBand HDR switch. Within a single 8U enclosure, you can install up to 20 blades.

Looking ahead, AMD plans to soon ship its Instinct MI300A, an integrated data-center accelerator that combines three key components: AMD Zen 4 CPUs, AMD CDNA3 GPUs, and high-bandwidth memory (HBM) chiplets. This new system is designed specifically for HPC and AI workloads.

Also, the AMD Instinct MI300A’s high data throughput lets the CPU and GPU work on the same data in memory simultaneously. AMD says this CPU-GPU partnership will help users save power, boost performance and simplify programming.

Truly, a team effort.

Do more:

Read a blog post: What is the AMD Instinct MI300A APU?

Read a solution brief: Supermicro SuperBlade powered by AMD EPYC processors excel at scaling distributed AI and ML training

Check out the tech specs on the Supermicro GPU A+ Server AS-8125GS-TNHR

Meet the AMD Instinct MI Series accelerators

Get training on AMD Arena: Supermicro SuperBlade Systems powered by AMD EPYC 7003 Series processors with AMD 3D V-CACHE technology

How Generative AI is rocking the tech business—in a good way

With ChatGPT the newest star of tech, generative AI has emerged as a major market opportunity for traditional hardware and software suppliers. Here’s some of what you can expect from AMD and Supermicro.

The seemingly overnight adoption of generative AI systems such as ChatGPT is transforming the tech industry.

A year ago, AI tech suppliers focused mainly on providing systems for training. For good reason: AI training is technically demanding.

But now the focus has shifted onto large language model (LLM) inferencing and generative AI.

Take ChatGPT, the AI chatbot built on a large language model. In just the first week after its launch, ChatGPT gained over a million users. Since then, it has attracted more than 100 million users who now generate some 10 million queries a day. OpenAI, ChatGPT’s developer, says the system has thus far processed approximately 300 billion words from over a million conversations.

It's not all fun and games, either. In a new Gartner poll of 2,500 executive leaders, nearly half the respondents said all the publicity around ChatGPT has prompted their organizations to increase their AI spending.

In the same survey, nearly 1 in 5 respondents already have generative AI in either pilot or production mode. And 7 in 10 are experimenting with or otherwise exploring the technology.

Top priority

This virtual explosion has gotten the attention of mainstream tech providers such as AMD. During the company’s recent first-quarter earnings call, CEO Lisa Su said, “We’re very excited about our opportunity in AI. This is our No. 1 strategic priority.”

And AMD is doing a lot more than just talking about AI. For one, the company has consolidated all its disparate AI activities into a single group that will be led by Victor Peng. He was previously general manager of AMD’s adaptive and embedded products group, which recently reported record first-quarter revenue of $1.6 billion, a year-on-year increase of 163%.

This new AI group will focus mainly on strengthening AMD’s AI software ecosystem. That will include optimized libraries, models and frameworks spanning all of the company’s compute engines.

Hardware for AI

AMD is also offering a wide range of AI hardware products for everything from mobile devices to powerful servers.

For data center customers, AMD’s most exciting hardware product is its Instinct MI300 Accelerator. Designed for both HPC and AI workloads, the device is unusual in that it contains both a CPU and GPU. The MI300 is now being sampled with selected large customers, and general shipments are set to begin in this year’s second half.

Other AMD hardware components for AI include its “Genoa” EPYC processors for servers, Alveo accelerators for inference-optimized solutions, and embedded Versal AI Core series.

Several of AMD’s key partners are offering important AI products, too. That includes Supermicro. It now offers Universal GPU systems powered by AMD Instinct MI250 accelerators and optional EPYC CPUs.

These systems include the Supermicro AS 4124GQ-TNMI server. It’s powered by dual AMD EPYC 7003 Series processors and up to four AMD Instinct MI250 accelerators.

Help for AI developers

AMD has also made important moves on the developer front. During the same Q1 earnings call, AMD announced expanded capabilities for developers to build robust AI solutions leveraging its products.

The moves include updates to PyTorch 2.0, an open-source framework that now offers native support for ROCm software. They also include the latest TensorFlow-ZenDNN plug-in, which enables neural-network inferencing on AMD EPYC CPUs.

ROCm is an open software platform allowing researchers to tap the power of AMD Instinct accelerators to drive scientific discoveries. The latest version, ROCm 5.0, supports major machine learning (ML) frameworks, including TensorFlow and PyTorch. This helps users accelerate AI workloads.

TensorFlow is an end-to-end platform designed to make it easy to build and deploy ML models. And ZenDNN is a deep neural network library that includes basic APIs optimized for AMD CPU architectures.
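To make the PyTorch-on-ROCm point concrete, here is a minimal sketch of how a script might pick its compute device. It assumes PyTorch may or may not be installed and falls back to the CPU either way. The helper name `describe_backend` and the labels it returns are hypothetical. The one PyTorch-specific detail it relies on is that ROCm builds expose accelerators through the same `torch.cuda` interface and set `torch.version.hip`.

```python
def describe_backend():
    """Return (device, backend) for the local PyTorch install, if there is one."""
    try:
        import torch  # optional dependency; the sketch still runs without it
    except ImportError:
        return "cpu", "pytorch-not-installed"

    if torch.cuda.is_available():
        # ROCm builds of PyTorch reuse the "cuda" device type and set torch.version.hip.
        backend = "rocm" if getattr(torch.version, "hip", None) else "cuda"
        return "cuda", backend
    return "cpu", "cpu-only"


if __name__ == "__main__":
    device, backend = describe_backend()
    print(f"device={device} backend={backend}")
```

Because the device string is the same on both kinds of build, model code written as `model.to(device)` should not need to change between them.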

Just the start

Busy as AMD and Supermicro have been with AI products, you should expect even more. As Gartner VP Francis Karamouzis says, “The generative AI frenzy shows no sign of abating.”

That sentiment gained support from AMD’s Su during the company’s Q1 earnings call.

“It’s a multiyear journey,” Su said in response to an analyst’s question about AI. “This is the beginning for what we think is a significant market opportunity for the next 3 to 5 years.”

Do more:

Learn how AI training works

Read up on the AMD Instinct MI300 Accelerator

Review AMD’s Q1:23 earnings (includes a link to the earnings webcast)

Check out Supermicro’s AI Universal GPU systems

Tech Explainer: How does Gaming as a Service work?

Gaming as a Service is a streaming platform that pushes content from the cloud to personal devices on demand. Though it’s been around for years, in some ways it’s just getting started.

The technology known as Gaming as a Service has been around for 20 years. But in many ways it’s just getting started.

The technology is already enjoyed by literally millions of gamers worldwide. But new advances in AI and edge computing are making a big difference. So are faster, more consistent internet connections.

And coming soon should be a mix of virtual and augmented reality (VR & AR) headsets. They could bring gaming to a whole new level.

But how does GaaS work? Let’s take a look.

Cloud + edge = GaaS

GaaS is to video games what Netflix is to movies. Like Netflix, GaaS is a streaming platform that pushes content from the cloud to PCs, smartphones and other personal devices (including gaming consoles with the appropriate updates) on demand.

GaaS originates in the cloud. There, data centers packed with powerful servers maintain the gaming environment, process user commands, determine interaction between players and the virtual world, and deliver real-time results to players.

If the cloud is GaaS’s brains, then edge computing networks are its arms. They reach out to a worldwide base of users, connecting their devices to the gaming cloud.

Edge devices also keep things speedy by amplifying or, if necessary, taking over various processing duties. This helps reduce latency, the time lag between when a command is issued and when it’s executed.

Latency is especially detrimental to gamers. They rely on split-second actions that can make the difference between winning and losing. For them, lower latency is always better.
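Latency as defined above, the gap between issuing a command and seeing it executed, is easy to measure in code. The sketch below is illustrative only: `fake_game_server` is a hypothetical stand-in for a real cloud or edge endpoint, and its 20 ms delay is an assumed value, not a measurement.

```python
import time


def measure_latency_ms(handler, command):
    """Time one command from issue to execution, in milliseconds."""
    issued = time.perf_counter()
    result = handler(command)
    executed = time.perf_counter()
    return result, (executed - issued) * 1000.0


def fake_game_server(command):
    """Hypothetical server: pretends each command takes about 20 ms to process."""
    time.sleep(0.02)
    return f"ack:{command}"


if __name__ == "__main__":
    result, latency_ms = measure_latency_ms(fake_game_server, "jump")
    print(result, f"{latency_ms:.1f} ms")
```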

Device choice

GaaS is innovative at the user end, too. GaaS can interface with a wide array of client devices. That offers gamers far more flexibility than they get with traditional gaming models.

With GaaS, users are no longer tied to a specific gaming PC or console such as the Microsoft Xbox or Sony PlayStation. Instead, gamers can use any supported device with a decent GPU and a stable internet connection speed of at least 10 to 15 Mbps.
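For a rough sense of what a 10 to 15 Mbps stream means in data volume, the arithmetic can be sketched as follows. It assumes the stream runs at the full rate for a whole hour and uses decimal units (1 GB = 1,000 MB). The function name is hypothetical.

```python
def gigabytes_per_hour(mbps):
    """Convert a sustained bit rate in megabits per second to gigabytes per hour."""
    megabits = mbps * 3600       # seconds in an hour
    megabytes = megabits / 8     # 8 bits per byte
    return megabytes / 1000      # decimal gigabytes


if __name__ == "__main__":
    for rate in (10, 15):
        print(f"{rate} Mbps is about {gigabytes_per_hour(rate)} GB per hour")
```

Under those assumptions, the two rates work out to 4.5 GB and 6.75 GB per hour of play.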

To be sure, some GaaS games—one example is the super-popular Fortnite—require a mobile or desktop app. But these apps are usually free.

Other cloud-based games are designed to work with any standard web browser. This lets a gamer pick up wherever they left off, using nearly any internet-connected device anywhere in the world.

Big business

If all this sounds attractive, it is. One of the first GaaS titles, World of Warcraft, is still active nearly 20 years after its initial launch. In 2015—the last time its publisher, Blizzard Entertainment, reported usage numbers—World of Warcraft had 5.5 million players.

Even more popular is Fortnite, introduced in 2017. Today it has more than 350 million registered users. In part, that’s because of the game’s flexible business model: Fortnite players can sign up and enjoy basic gameplay for free.

Instead of charging these users a fee, Fortnite’s developer, Epic Games, makes money from literally millions of micro-transactions. These include in-game purchases of weapons and accessories, access to tournaments and other gated experiences, and the purchase of a new “season,” released four times a year.

Super-popular games like Fortnite and World of Warcraft have helped create a lucrative and compelling business model. This, in turn, has given rise to a new breed of GaaS tech providers.

One such operation is Blacknut, a France-based cloud gaming platform. Together with Australian outfit Radian Arc, Blacknut provides a GaaS digital infrastructure powered by AMD-based GPU servers designed and distributed by Supermicro.

What could go wrong?

Does GaaS have a downside? Sure. No platform is without its flaws.

For one, cloud gamers are at the mercy of the cloud. If a cloud provider experiences a slowdown or outage, a game can disappear until the issue is resolved.

For another, unlike a collection of game titles on physical media, GaaS gamers never really own the games they play. For example, if Epic decided to shut down Fortnite tomorrow, 350+ million gamers would have no choice but to look for alternate entertainment.

Internet access can be an issue, too. Those of us in first-world cities tend to take our high-speed connections for granted. The rest of the world may not be so lucky.

Future of GaaS

Looking ahead, the future of GaaS appears bright.

Advances in AI-powered cloud and edge computing will encourage game developers to create more nuanced and immersive content than ever before.

Faster and more consistent internet connections will help. They’ll give more power to both the bandwidth-hungry devices we use today and the shiny, new objects of desire we’ll clamor for tomorrow.

Tomorrow’s devices will surely include a mixture of VR and AR headsets. These could attach to other smart devices that enhance gameplay, like the interactive bodysuits foretold by movies such as Ready Player One.

GaaS will get smaller, too, as new mobile devices come to the market. Cloud-gaming titles, already a mainstay of mobile gamers, should be further empowered by next-generation mobile processors and faster, more reliable wireless data connections like 5G.

We’re witnessing the evolution of gaming as multiple clients interact with low latencies and high-quality graphics. Welcome to the future.

What is the AMD Instinct MI300A APU?

Accelerate HPC and AI workloads with the combined power of CPU and GPU compute.

The AMD Instinct MI300A APU, set to ship in this year’s second half, combines the compute power of a CPU with the capabilities of a GPU. Your data-center customers should be interested if they run high-performance computing (HPC) or AI workloads.

More specifically, the AMD Instinct MI300A is an integrated data-center accelerator that combines AMD Zen 4 cores, AMD CDNA3 GPUs and high-bandwidth memory (HBM) chiplets. In all, it has more than 146 billion transistors.

This AMD component uses 3D die stacking to enable extremely high bandwidth among its parts. In fact, it has nine 5nm chiplets 3D-stacked on top of four 6nm chiplets, with significant HBM surrounding them.

And it’s coming soon. The AMD Instinct MI300A is currently in AMD’s labs. It will soon be sampled with customers. And AMD says it’s scheduled for shipments in the second half of this year.

‘Most complex chip’

The AMD Instinct MI300A was publicly displayed for the first time earlier this year, when AMD CEO Lisa Su held up a sample of the component during her CES 2023 keynote. “This is actually the most complex chip we’ve ever built,” Su told the audience.

A few tech blogs have gotten their hands on early samples. One of them, Tom’s Hardware, was impressed by the “incredible data throughput” among the Instinct MI300A’s CPU, GPU and memory dies.

The Tom’s Hardware reviewer added that this throughput will let the CPU and GPU work on the same data in memory simultaneously, saving power, boosting performance and simplifying programming.

Another blogger, Karl Freund, a former AMD engineer who now works as a market researcher, wrote in a recent Forbes blog post that the Instinct MI300 is a “monster device” (in a good way). He also congratulated AMD for “leading the entire industry in embracing chiplet-based architectures.”

Previous generation

The new AMD accelerator builds on a previous generation, the AMD Instinct MI200 Series. It’s now used in a variety of systems, including Supermicro’s A+ Server 4124GQ-TNMI. This completely assembled system supports the AMD Instinct MI250 OAM (OCP Acceleration Module) accelerator and AMD Infinity Fabric technology.

The AMD Instinct MI200 accelerators are designed with the company’s 2nd gen AMD CDNA Architecture, which encompasses the AMD Infinity Architecture and Infinity Fabric. Together, they offer an advanced platform for tightly connected GPU systems, empowering workloads to share data fast and efficiently.

The MI200 series offers P2P connectivity with up to 8 intelligent 3rd Gen AMD Infinity Fabric Links with up to 800 GB/sec. of peak total theoretical I/O bandwidth. That’s 2.4x the GPU P2P theoretical bandwidth of the previous generation.
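The figures above imply a couple of derived numbers, worked out in the short sketch below. Two assumptions are made here that the source does not state: that the 800 GB/sec. total is split evenly across the 8 links, and that the 2.4x comparison applies to the same total figure. The variable and function names are hypothetical.

```python
PEAK_TOTAL_GB_S = 800   # peak total theoretical I/O bandwidth quoted above
LINKS = 8               # maximum number of Infinity Fabric links quoted above
GENERATION_GAIN = 2.4   # stated multiple over the previous generation


def per_link_bandwidth(total_gb_s, links):
    """Bandwidth per link, assuming an even split across links (an assumption)."""
    return total_gb_s / links


def implied_previous_generation(total_gb_s, gain):
    """Previous-generation bandwidth implied by the stated multiple."""
    return total_gb_s / gain


if __name__ == "__main__":
    print(per_link_bandwidth(PEAK_TOTAL_GB_S, LINKS))
    print(round(implied_previous_generation(PEAK_TOTAL_GB_S, GENERATION_GAIN), 1))
```

Under those assumptions, each link carries up to 100 GB/sec., and the previous generation's figure comes out at roughly 333 GB/sec.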

Supercomputing power

The same kind of performance now available to commercial users of the AMD-Supermicro system is also being applied to scientific supercomputers.

The AMD Instinct MI250X accelerator is now used in the Frontier supercomputer built by the U.S. Dept. of Energy. That system’s peak performance is rated at 1.6 exaflops, or over a billion billion floating-point operations per second.

The AMD Instinct MI250X accelerator provides Frontier with flexible, high-performance compute engines, high-bandwidth memory, and scalable fabric and communications technologies.

Looking ahead, the AMD Instinct MI300A APU will be used in Frontier’s successor, known as El Capitan. Scheduled for installation late this year, this supercomputer is expected to deliver at least 2 exaflops of peak performance.

AMD and Supermicro Sponsor Two Fastest Linpack Scores at SC22’s Student Cluster Competition

The Student Cluster Competition made its 16th appearance at the Supercomputing 2022 (SC22) event in Dallas. The two student teams running AMD EPYC™ CPUs and AMD Instinct™ GPUs were the two that aced the Linpack benchmark. That's the test used to determine the TOP500 supercomputers in the world.

Last month, the annual Supercomputing Conference 2022 (SC22) was held in Dallas. The Student Cluster Competition (SCC), which began in 2007, was held again as well. The SCC offers an immersive high-performance computing (HPC) experience to undergraduate and high school students.

According to the SC22 website: Student teams design and build small clusters, learn scientific applications, apply optimization techniques for their chosen architectures and compete in a non-stop, 48-hour challenge at the SC conference to complete real-world scientific workloads, showing off their HPC knowledge for conference attendees and judges.

Each team has six students, at least one faculty advisor and a student team leader, and is associated with vendor sponsors, which provide the equipment. AMD and Supermicro jointly sponsored both the Massachusetts Green Team from MIT, Boston University and Northeastern University and the 2MuchCache team from UC San Diego (UCSD) and the San Diego Supercomputer Center (SDSC). Running AMD EPYC™ CPUs and AMD Instinct™-based GPUs supplied by AMD and Supermicro, the two teams came in first and second in the SCC Linpack test.

The Linpack benchmarks measure a system's floating-point computing power, according to Wikipedia. The latest version of these benchmarks is used to determine the TOP500 list, which ranks the world's most powerful supercomputers.
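To give a feel for what measuring floating-point computing power means, here is a toy sketch in the spirit of Linpack. It times the solution of a dense linear system and divides an approximate operation count by the elapsed time. It is not the actual benchmark code: it is plain single-threaded Python, the function names are hypothetical, and the operation count uses only the leading-order (2/3)n³ term for Gaussian elimination.

```python
import random
import time


def solve_dense(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    a = [row[:] for row in a]   # work on copies so the inputs stay intact
    b = b[:]
    for k in range(n):
        # Pick the row with the largest entry in column k as the pivot row.
        pivot = max(range(k, n), key=lambda i: abs(a[i][k]))
        a[k], a[pivot] = a[pivot], a[k]
        b[k], b[pivot] = b[pivot], b[k]
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= factor * a[k][j]
            b[i] -= factor * b[k]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x


def estimate_flops(n=80, seed=0):
    """Return (approximate FLOP/s, max residual) for one random n-by-n solve."""
    rng = random.Random(seed)
    a = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
    b = [rng.uniform(-1, 1) for _ in range(n)]
    start = time.perf_counter()
    x = solve_dense(a, b)
    elapsed = time.perf_counter() - start
    flop_count = (2.0 / 3.0) * n ** 3   # leading-order estimate only
    residual = max(
        abs(sum(a[i][j] * x[j] for j in range(n)) - b[i]) for i in range(n)
    )
    return flop_count / elapsed, residual


if __name__ == "__main__":
    flops, residual = estimate_flops()
    print(f"about {flops:.3e} FLOP/s, residual {residual:.2e}")
```

The residual check matters as much as the timing, because a fast but wrong solve would not count as a valid run.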

In addition to chasing high scores on benchmarks, the teams must operate their systems without exceeding a power limit. For 2022, the competition used a variable power limit: at times, the power available to each team for its competition hardware was as high as 4,000 watts (but was usually lower) and at times it was as low as 1,500 watts (but was usually higher).
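A team monitoring its own cluster against a variable limit like this could use a check along the following lines. This is a hypothetical sketch, not the competition's actual tooling. The sample readings and limit schedule are invented for illustration and only reuse the 1,500 and 4,000 watt bounds mentioned above.

```python
def find_violations(power_watts, limit_watts):
    """Return (index, draw, limit) for every sample where draw exceeds its limit."""
    if len(power_watts) != len(limit_watts):
        raise ValueError("need one limit per power sample")
    return [
        (i, draw, cap)
        for i, (draw, cap) in enumerate(zip(power_watts, limit_watts))
        if draw > cap
    ]


if __name__ == "__main__":
    draws = [1400, 2900, 3900, 1600]    # hypothetical readings, in watts
    limits = [1500, 3000, 4000, 1500]   # hypothetical variable limit schedule
    print(find_violations(draws, limits))
```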

The “2MuchCache” team offers a poster page with extensive detail about their competition hardware. They used two third-generation AMD EPYC™ 7773X CPUs with 64 cores, 128 threads and 768MB of stacked-die cache. Team 2MuchCache used one AS-4124GQ-TNMI system with four AMD Instinct™ MI250 GPUs with 53 simultaneous threads.

The “Green Team’s” poster page also boasts two third-generation AMD EPYC™ 7003 Series processors and AMD Instinct™ MI210 GPUs with AMD Infinity Fabric. The Green Team utilized two Supermicro AS-4124GS-TNR GPU systems.

The Students of 2MuchCache:

Longtian Bao, role: Lead for Data Centric Python, Co-lead for HPCG

Stefanie Dao, role: Lead for PHASTA, Co-lead for HPL

Michael Granado, role: Lead for HPCG, Co-lead for PHASTA

Yuchen Jing, role: Lead for IO500, Co-lead for Data Centric Python

Davit Margarian, role: Lead for HPL, Co-lead for LAMMPS

Matthew Mikhailov Major, role: Team Lead, Lead for LAMMPS, Co-lead for IO500

The Students of Green Team:

Po Hao Chen, roles: Team leader, theory & HPC, benchmarks, reproducibility

Carlton Knox, roles: Computer Arch., Benchmarks, Hardware

Andrew Nguyen, roles: Compilers & OS, GPUs, LAMMPS, Hardware

Vance Raiti, roles: Mathematics, Computer Arch., PHASTA

Yida Wang, roles: ML & HPC, Reproducibility

Yiran Yin, roles: Mathematics, HPC, PHASTA

Congratulations to both teams!
