Sponsored by:

Visit AMD Visit Supermicro

Performance Intensive Computing

Capture the full potential of IT

Interview: How NEC Germany keeps up with the changing HPC market

In an interview, Oliver Tennert, director of HPC marketing and post-sales at NEC Germany, explains how the company keeps pace with a fast-developing market.

The market for high performance computing (HPC) is changing, meaning system integrators that serve HPC customers need to change too.

To learn more, PIC managing editor Peter Krass spoke recently with Oliver Tennert, NEC Germany’s director of HPC marketing and post-sales. NEC Germany works with hardware vendors that include AMD processors and Supermicro servers. This interview has been lightly edited for clarity.

First, please tell me about NEC Germany and its relationship with parent company NEC Corp.?

I work for NEC Germany, which is a subsidiary of NEC Europe. Our parent company, NEC Corp., is a Japanese company with a focus on telecommunications, which is still a major part of our business. Today NEC has about 100,000 employees around the world.

HPC as a business within NEC is done primarily by NEC Germany and our counterparts at NEC Corp. in Japan. The Japanese operation covers HPC in Asia, and we cover EMEA, mainly Europe.

What kinds of HPC workloads and applications do your customers run?

It’s probably 60:40 — that is, about 60% of our customers are in academia, including universities, research facilities, and even DWD, Germany’s weather-forecasting service. The remaining 40% are industrial, including automotive and engineering companies. 

The typical HPC use cases of our customers come in two categories. The most important HPC category of course is simulation. That can mean simulating physical processes. For example, what does a car crash look like under certain parameters? These simulations are done in great detail.

Our other important HPC category is data analytics. For example, that could mean genomic analysis.

How do you work with AMD and Supermicro?

To understand this, you first have to understand how NEC’s HPC business works. For us, there are two aspects to the business.

One, we’ve got our own vector technology. Our NEC vector engine is a PCIe card designed and produced in Japan. The latest incarnation of our vector supercomputer is the NEC SX-Aurora TSUBASA. It was designed to run applications that are both vectorizable and profit from high bandwidth to main memory. One of our big customers in this area is the German weather service, DWD.

The other part of the business is what we call “pizza boxes,” the x86 architecture. For this, we need industry-standard servers, including processors from AMD and servers from Supermicro.

For that second part of the business, what is NEC’s role?

The answer has to do with how the HPC business works operationally. If a customer intends to purchase a new HPC cluster, typically they need expert advice on designing an optimized HPC environment. What they do know is the application they run. And what they want to know is, ‘How do we get the best, most optimized system for this application?’

This implies doing a lot of configuration. Essentially, we optimize the design based on many different components. Even if we know that an AMD processor is the best for a particular task, still, there are dozens of combinations of processor SKUs and server model types which offer different price/performance ratios. The same applies to certain data-storage solutions. For HPC, storage is more than just picking an SSD. What’s needed is a completely different kind of technology.

Configuring and setting up such a complex solution takes a lot of expertise. We’re being asked to run benchmarks. That means the customer says, ‘Here’s my application, please run it on some specific configurations, and tell me which one offers the best price/performance ratio.’ This takes a lot of time and resources. For example, you need the systems on hand to just try it out. And the complete tender process—from pre-sales discussions to actual ordering and delivery—can take anywhere from weeks to months.
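The benchmark-driven selection described above can be sketched in a few lines of Python. This is only an illustration, not NEC's actual process: the configuration names, prices and runtimes are hypothetical placeholders, and "price/performance" is assumed here to mean system price divided by throughput (lower is better).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkResult:
    config_name: str   # hypothetical label for a CPU SKU + server model combination
    price_eur: float   # hypothetical system price
    runtime_s: float   # measured wall-clock time of the customer's application

    @property
    def cost_per_throughput(self) -> float:
        # Throughput is runs per second (1 / runtime), so price / throughput
        # simplifies to price * runtime. Lower is better.
        return self.price_eur * self.runtime_s


def best_price_performance(results):
    """Return the benchmarked configuration with the lowest cost per throughput."""
    if not results:
        raise ValueError("at least one benchmark result is required")
    return min(results, key=lambda r: r.cost_per_throughput)


# Hypothetical placeholder numbers, for illustration only.
results = [
    BenchmarkResult("config-a", price_eur=10_000.0, runtime_s=120.0),
    BenchmarkResult("config-b", price_eur=14_000.0, runtime_s=80.0),
    BenchmarkResult("config-c", price_eur=20_000.0, runtime_s=70.0),
]
best = best_price_performance(results)
print(best.config_name)
```

In a real tender the metric would likely also weigh power consumption and other factors, which this sketch leaves out.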

And this is just to bid, right? After all this work, you still might not get the order?

Yes, that can happen. There are lots of factors that influence your chances. In general, if you have a good working relationship with a private customer, it’s easier. They have more discretion than academic or public customers. For public bids, everything must be more transparent, because it’s more strictly regulated. Normally, that means you have more work, because you have to test more setups. Your competition will be doing the same.

When working with the second group, the private industry customers, do customers specify parts from specific vendors, such as AMD and Supermicro?

It depends on the factors that will influence the customer’s final selection. Price and performance, that’s one thing. Power consumption is another. Then, sometimes, it’s the vendors. Also, certain projects are more attractive to certain vendors because of market visibility—so-called lighthouse projects. That can have an influence on the conditions we get from vendors. Vendors also honor the amount of effort we have put in to getting the customer in the first place. So there are all sorts of external factors that can influence the final system design.

Also, today, the majority of HPC solutions are similar from an architectural point of view. So the difference between competing vendors is to take all the standard components and optimize from these, instead of providing a competing architecture. As a result, the soft skills—such as the ability to implement HPC solutions in an efficient and professional way—also have a large influence on the final order.

How about power consumption and cooling? Are these important considerations for your HPC customers?

It’s become absolutely vital. As a rule of thumb, we can say that the larger an HPC project is going to be, the more likely that it is going to be cooled by liquid.

In the past, you had a server room that you cooled with air conditioning. But those times are nearly gone. Today, when you think of a larger HPC installation—say, 1,000 or 2,000 nodes—you’re talking about a megawatt of power being consumed, or even more. And that also needs to be cooled.

The challenge in cooling a large environment is to get the heat away from the server and out of the room to somewhere else, whether outside or to a larger cooling system. This cannot be done by traditional cooling with air. Air is too inefficient for transporting heat. Water is much better. It’s a more efficient means for moving heat from Point A to Point B.

How are you cooling HPC systems with liquid?

There are a few ways to do this. There’s cold-water cooling, mainly indirect. You bring in water with what’s known as an “inlet temperature” of about 10 C and it cools down the air inside the server racks, with the heat getting carried away with the water now at about 15 or 20 C. The issue is, first you need energy just to cool the water down to 10 C. Also, there’s not much you can do with water at 15 or 20 C. It’s too warm for cooling anything else, but too cool for heating a room.

That’s why the new approach is to use hot-water cooling, mainly direct. It sounds like a paradox. But what might seem hot to a human being is in fact pretty cool for a CPU. For a CPU, an ambient temperature of 50 or 60 C is fine; it would be absolutely not fine for a human being. So if you have an inlet temperature for water of, say, 40 or 45 C, that will cool the CPU, which runs at an internal temperature of 80 or 90 C. The outbound temperature of the water is then maybe 50 C. Then it becomes interesting. At that temperature, you can heat a building. You can reuse the heat, rather than just throwing it away. So this kind of infrastructure is becoming more important and more interesting.
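To put rough numbers on this, the standard heat-transport relation Q = m * c * dT gives the water flow needed to carry away a given heat load. The sketch below uses figures mentioned in the interview (a 1-megawatt installation, inlet water at 40 or 45 C, outlet water at 50 C). The specific heat of water is a textbook value, and the assumption that all of the heat ends up in the water is a simplification.

```python
WATER_SPECIFIC_HEAT_J_PER_KG_K = 4186.0  # textbook value for liquid water


def water_flow_kg_per_s(heat_load_w, inlet_c, outlet_c):
    """Mass flow of water needed to absorb heat_load_w with the given temperature rise."""
    delta_t = outlet_c - inlet_c
    if delta_t <= 0:
        raise ValueError("outlet temperature must be higher than inlet temperature")
    # Q = m_dot * c * delta_t, solved for m_dot
    return heat_load_w / (WATER_SPECIFIC_HEAT_J_PER_KG_K * delta_t)


HEAT_LOAD_W = 1_000_000.0  # "a megawatt of power", taken from the interview

flow_40_to_50 = water_flow_kg_per_s(HEAT_LOAD_W, inlet_c=40.0, outlet_c=50.0)
flow_45_to_50 = water_flow_kg_per_s(HEAT_LOAD_W, inlet_c=45.0, outlet_c=50.0)

print(f"40 C -> 50 C: {flow_40_to_50:.1f} kg/s")
print(f"45 C -> 50 C: {flow_45_to_50:.1f} kg/s")
```

Under these assumptions, a megawatt needs roughly 24 kg of water per second with a 10-degree rise, and twice that with a 5-degree rise.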

Looking ahead, what are some of your top projects for the future?

Public customers such as research universities have to replace their HPC systems every three to five years. That’s the normal cycle. In that time the hardware becomes obsolete, especially as the vendors optimize their power consumption to performance ratio more and more. So it’s a steady flow of new projects. For our industrial customers, the same applies, though the procurement cycle may vary.

We’re also starting to see the use of computational HPC capacity from the cloud. Normally, when people think of cloud, they think of public clouds from Amazon, Microsoft, etc. But for HPC, there are interim approaches as well. A decade ago, there was the idea of a dedicated public cloud. Essentially, this meant a dedicated capacity that was for the customer’s exclusive use, but was owned by someone other than the customer. Now, between the dedicated cloud and public cloud, there are all these shades of grey. In the past two years, we’ve implemented several larger installations of this “grey-shaded” cloud approach. So more and more, we’re entering the service-oriented market.

There is a larger trend away from customers wanting to own a system, and toward customers just wanting to utilize capacity. For vendors with expertise in HPC, they have to change as well. Which means a change in the business and the way they have to work with customers. It boils down to, Who owns the hardware? And what does the customer buy, hardware or just services? That doesn’t make you a public-cloud provider. It just means you take over responsibility for this particular customer environment. You have a different business model, contract type, and set of responsibilities.

 

Interview: How German system integrator SVA serves high performance computing with AMD and Supermicro

In an interview, Bernhard Homoelle, head of the HPC competence center at German system integrator SVA, explains how his company serves customers with help from AMD and Supermicro. 

SVA System Vertrieb Alexander GmbH, better known as SVA, is among the leading IT system integrators of Germany. Headquartered in Wiesbaden, the company employs more than 2,700 people in 27 branch offices. SVA’s customers include organizations in automotive, financial services and healthcare.

To learn more about how SVA works jointly with Supermicro and AMD on advanced technologies, PIC managing editor Peter Krass spoke recently with Bernhard Homoelle, head of SVA’s high performance computing (HPC) competence center (pictured above). Their interview has been lightly edited.

For readers outside of Germany, please tell us about SVA?

First of all, SVA is an owner-operated system integrator. We offer high-quality products, we sell infrastructure, we support certain types of implementations, and we offer operational support to help our customers achieve optimum solutions.

We work with partners to figure out what might be the best solution for our customers, rather than just picking one vendor and trying to convince the customer they should use them. Instead, we figure out what is really needed. Then we go in the direction where the customer can really have their requirements met. The result is a good relationship with the customer, even after a particular deal has been closed.

Does SVA focus on specific industries?

While we do support almost all the big industries—automotive, transportation, public sector, healthcare and more—we are not restricted to any specific vertical. Our main business is helping customers solve their daily IT problems, deal with the complexity of new IT systems, and implement new things like AI and even quantum computing. So we’re open to new solutions. We also offer training with some of our partners.

Germany has a robust auto industry. How do you work with these clients?

In general, they need huge HPC clusters and machine learning. For example, autonomous driving demands not only more computing power, but also more storage. We’re talking about petabytes of data, rather than terabytes. And this huge amount of data needs to be stored somewhere and finally processed. That puts pressure on the infrastructure: not just on storage, but also on the network infrastructure as well as on the compute side. On their way into the cloud, some of these customers are saying, “Okay, offer me HPC as a Service.”

How do you work with AMD and Supermicro?

It’s a really good relationship. We like working with them because Supermicro has all these various types of servers for individual needs. Customers are different, and therefore they have their own requirements. Figuring out what might be the best server for them is difficult if you have limited types of servers available. But with Supermicro, you can get what you have in mind. You don’t have to look for special implementations because they have these already at hand.

We’re also partnering with AMD, and we have access to their benchmark labs, so we can get very helpful information. We start with discussions with the customer to figure out their needs. Typically, we pick up an application from the customer and then use it as a kind of benchmark. Next, we put it on a cluster with different memory, different CPUs, and look for the best solution in terms of performance for their particular application. Based on the findings, we can recommend a specific CPU, number of cores, memory type and size, and more.

With HPC applications, core memory bandwidth is almost as important as the number of cores. AMD’s new Genoa-X processors should help to overcome some of these limitations. And looking ahead, I’m keen to see what AMD will offer with the Instinct MI300.

Are there special customer challenges you’re solving with Supermicro and AMD solutions?

With HPC workloads, our academic customers say, “This is the amount of money available, so how many servers can you really give us for this budget?” Supermicro and AMD really help here with reasonable prices. They’re a good choice for price/performance.

With AI and machine learning, the real issue is software tools. It really depends what kinds of models you can use and how easy it is to use the hardware with those models.

This discussion is not easy, because for many of our customers today, AI means Nvidia. But I really recommend alternatives, and AMD is bringing some alternatives that are great. They offer a fast time to solution, but they also need to be easy to switch to.

How about "green" computing? Is this an important issue for your customers now?

Yes, more and more we’re seeing customers ask for this green computing approach. Typically, a customer has a thermal budget and a power-price budget. They may say, “In five years, the expenses paid for power should not exceed a certain limit.”

In Europe, we also have a supply-chain discussion. Vendors must increasingly provide proof that they’re taking care in their supply chain with issues including child labor and working conditions. This is almost mandatory, especially in government calls. If you’re unable to answer these questions, you’re out of the bid.

With green computing, we see that the power needed for CPUs and GPUs is going up and up. Five years ago, the maximum a CPU could burn was 200W, but now even 400W might not be enough. Some GPUs are as high as 700W, and there are super-chips beyond even that.

All this makes it difficult to use air-cooled systems. Customers can use air conditioning to a certain extent, but there’s only so much air you can press through the rack. Then you need either on-chip water cooling or some kind of immersion cooling. This can help in two dimensions: saving energy and getting density — you can put the components closer together, and you don’t need the big heat sink anymore.

One issue now is that each vendor offers a different cooling infrastructure. Some of our customers run multi-vendor data centers, so this could create a compatibility issue. That’s one reason we’re looking into immersion cooling. We think we could do some of our first customer implementations in 2024.

Looking ahead, what do you see as a big challenge?

One area is that we want to help customers get easier access to their HPC clusters. That’s done on the software side.

In contrast to classic HPC users, machine learning and AI engineers are not that interested in Linux stuff, compiler options or any other infrastructure details. Instead, they’d like to work on their frameworks. The challenge is getting them to their work as easily as possible—so that they can just log in, and they’re in their development environment. That way, they won’t have to care about what sort of operating system is underneath or what kind of scheduler, etc., is running.

 

How AMD and Supermicro are working together to help you deliver AI

AMD and Supermicro are jointly offering high-performance AI alternatives with superior price and performance.

When it comes to building AI systems for your customers, a certain GPU provider with a trillion-dollar valuation isn’t the only game in town. You should also consider the dynamic duo of AMD and Supermicro, which are jointly offering high-performance AI alternatives with superior price and performance.

Supermicro’s Universal GPU systems are designed specifically for large-scale AI and high-performance computing (HPC) applications. Some of these modular designs come equipped with AMD’s Instinct MI250 Accelerator and have the option of being powered by dual AMD EPYC processors.

AMD, with a newly formed AI group led by Victor Peng, is working hard to enable AI across many environments. The company has developed an open software stack for AI, and it has also expanded its partnerships with AI software and framework suppliers that now include the PyTorch Foundation and Hugging Face.

AI accelerators

In addition, AMD’s Instinct MI300A data-center accelerator is due to ship in this year’s fourth quarter. It’s the successor to AMD’s MI200 series, which is based on the company’s CDNA 2 architecture, was its first multi-die GPU, and powers some of today’s fastest supercomputers.

The forthcoming Instinct MI300A is based on AMD’s CDNA 3 architecture for AI and HPC workloads, which uses 5nm and 6nm process tech and advanced chiplet packaging. Under the MI300A’s hood, you’ll find 24 processor cores with Zen 4 tech, as well as 128GB of HBM3 memory that’s shared by the CPU and GPU. And it supports AMD ROCm 5, a production-ready, open source HPC and AI software stack.

Earlier this month, AMD introduced another member of the series, the AMD Instinct MI300X. It replaces three Zen 4 CPU chiplets with two CDNA 3 chiplets to create a GPU-only system. Announced at AMD’s recent Data Center and AI Technology Premiere event, the MI300X is optimized for large language models (LLMs) and other forms of AI.

To accommodate the demanding memory needs of generative AI workloads, the new AMD Instinct MI300X also adds 64GB of HBM3 memory, for a new total of 192GB. This means the system can run large models directly in memory, reducing the number of GPUs needed, speeding performance, and reducing the user’s total cost of ownership (TCO).

AMD also recently introduced the AMD Instinct Platform, which puts eight MI300X accelerators and 1.5TB of memory in a standard Open Compute Project (OCP) infrastructure. It’s designed to drop into an end user’s current IT infrastructure with only minimal changes.
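The memory figures quoted here fit together with simple arithmetic. The short sketch below uses only numbers from this article; the variable names are illustrative, and the conversion assumes 1TB = 1,024GB.

```python
MI300A_HBM3_GB = 128           # shared CPU/GPU memory on the MI300A
MI300X_ADDED_HBM3_GB = 64      # extra HBM3 the MI300X adds
ACCELERATORS_PER_PLATFORM = 8  # MI300X accelerators in one AMD Instinct Platform

# Per-accelerator memory on the MI300X
mi300x_hbm3_gb = MI300A_HBM3_GB + MI300X_ADDED_HBM3_GB

# Total memory across the eight-accelerator platform
platform_hbm3_gb = ACCELERATORS_PER_PLATFORM * mi300x_hbm3_gb
platform_hbm3_tb = platform_hbm3_gb / 1024  # assumes 1 TB = 1,024 GB

print(f"MI300X: {mi300x_hbm3_gb} GB per accelerator")
print(f"Platform: {platform_hbm3_gb} GB, or {platform_hbm3_tb} TB")
```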

All this is coming soon. The AMD MI300A started sampling with select customers earlier this quarter. The MI300X and Instinct Platform are both set to begin sampling in the third quarter. Production of the hardware products is expected to ramp in the fourth quarter.

KT’s cloud

All that may sound good in theory, but how does the AMD + Supermicro combination work in the real world of AI?

Just ask KT Cloud, a South Korea-based provider of cloud services that include infrastructure, platform and software as a service (IaaS, PaaS, SaaS). With the rise of customer interest in AI, KT Cloud set out to develop new XaaS customer offerings around AI, while also developing its own in-house AI models.

However, as KT embarked on this AI journey, the company quickly encountered three major challenges:

  • The high cost of AI GPU accelerators: KT Cloud would need hundreds or thousands of new GPU servers.
  • Inefficient use of GPU resources in the cloud: Few cloud providers offer GPU virtualization due to overhead. As a result, most cloud-based GPUs are visible to only 1 virtual machine, meaning they cannot be shared by multiple users.
  • Difficulty using large GPU clusters: KT is training Korean-language models using literally billions of parameters, requiring more than 1,000 GPUs. But this is complex: Users would need to manually apply parallelization strategies and optimization techniques.

The solution: KT worked with Moreh Inc., a South Korean developer of AI software, and AMD to design a novel platform architecture powered by AMD’s Instinct MI250 Accelerators and Moreh’s software.

Moreh developed the entire AI software stack, from the PyTorch and TensorFlow APIs down to GPU-accelerated primitive operations. This overcomes the limitations of cloud services and large AI model training.

Users do not need to insert or modify even a single line of existing source code for the MoAI platform. They also do not need to change the method of running a PyTorch/TensorFlow program.

Did it work?

In a word, yes. To test the setup, KT developed a Korean language model with 11 billion parameters. Training was then done on two machines: one using Nvidia GPUs, the other being the AMD/Moreh cluster equipped with AMD Instinct MI250 accelerators, Supermicro Universal GPU systems, and the Moreh AI platform software.

Compared with the Nvidia system, the Moreh solution with AMD Instinct accelerators showed 116% throughput (as measured by tokens trained per second), and 2.05x higher cost-effectiveness (measured as throughput per dollar).
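These two ratios also imply how the cost of the two systems compared. Since cost-effectiveness is defined as throughput per dollar, relative cost is relative throughput divided by relative cost-effectiveness. The sketch below works that out; it assumes that "2.05x higher" means 2.05 times the Nvidia system's throughput per dollar, which the source does not spell out.

```python
# Figures reported for the AMD/Moreh system, relative to the Nvidia system (= 1.0).
RELATIVE_THROUGHPUT = 1.16          # "116% throughput"
RELATIVE_COST_EFFECTIVENESS = 2.05  # throughput per dollar; assumed to mean 2.05 times


def implied_relative_cost(relative_throughput, relative_cost_effectiveness):
    """Cost-effectiveness = throughput / cost, so cost = throughput / cost-effectiveness."""
    if relative_cost_effectiveness <= 0:
        raise ValueError("relative cost-effectiveness must be positive")
    return relative_throughput / relative_cost_effectiveness


relative_cost = implied_relative_cost(RELATIVE_THROUGHPUT, RELATIVE_COST_EFFECTIVENESS)
print(f"Implied cost of the AMD/Moreh system relative to the Nvidia system: {relative_cost:.2f}")
```

Under that reading, the AMD/Moreh system would have cost a little under 60% of the Nvidia system.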

Other gains are expected, too. “With cost-effective AMD Instinct accelerators and a pay-as-you-go pricing model, KT Cloud expects to be able to reduce the effective price of its GPU cloud service by 70%,” says JooSung Kim, VP of KT Cloud.

Based on this test, KT built a larger AMD/Moreh cluster of 300 nodes—with a total of 1,200 AMD MI250 GPUs—to train the next version of the Korean language model with 200 billion parameters.

It delivers a theoretical peak performance of 434.5 petaflops for fp16/bf16 (a native 16-bit format for mixed-precision training) matrix operations. That should make it one of the top-tier GPU supercomputers in the world.
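As a quick sanity check, the cluster figures above can be broken down per node and per GPU. The sketch uses only the numbers quoted in this article; the per-GPU result is simply what those numbers imply, not an official specification.

```python
NODES = 300
TOTAL_GPUS = 1_200
CLUSTER_PEAK_PFLOPS = 434.5  # theoretical fp16/bf16 matrix peak quoted above

gpus_per_node = TOTAL_GPUS // NODES

# 1 petaflops = 1,000 teraflops
implied_peak_tflops_per_gpu = CLUSTER_PEAK_PFLOPS * 1_000 / TOTAL_GPUS

print(f"GPUs per node: {gpus_per_node}")
print(f"Implied peak per GPU: {implied_peak_tflops_per_gpu:.1f} teraflops")
```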

Genoa-X: a deeper dive into AMD’s new EPYC processors optimized for technical computing

AMD has introduced its EPYC 9x84X series processors, formerly codenamed Genoa-X. The new CPUs are designed specifically for technical workloads, and they support up to 1.1GB of L3 Cache.

AMD is responding to greater specialization in the data center by creating workload-optimized versions of its 4th gen EPYC server processors.

That now includes the AMD EPYC 9x84X series processors, formerly codenamed Genoa-X.

These new CPUs are optimized for technical computing workloads. Those include engineering simulation, product design, structural design, aerodynamics modeling and electronic design automation (EDA).

Big cache

A key feature of the new AMD EPYC 9x84X processors is the new 2nd generation of AMD’s 3D V-Cache technology. It supports more than 1GB of L3 Cache on a 96-core CPU. The larger cache can feed the CPU faster with data needed for large and complex simulations.

Speaking at AMD’s Data Center and AI Technology Premiere earlier this month, Dan McNamara, GM of AMD’s server business, said this will deliver a “new dimension” of workload optimization. This will help users get to market faster with higher-quality products while also reducing their OpEx budgets, he added.

The new AMD EPYC 9x84X processors are built on AMD’s Zen 4 cores; the Zen 4c core is used in the company’s new EPYC processors optimized for cloud-native workloads. The 9x84X CPUs are also socket-compatible with earlier Genoa processors. And they offer security protection with AMD Infinity Guard, the company’s suite of hardware-level security features.

It’s worth noting that AMD last year introduced a similar optimization for its Milan series processors. Those processors were code-named Milan-X.

Total ecosystem

To create a complete technical-computing environment, AMD has been working closely with developers of highly technical software. These partners include Altair, Ansys, Cadence, Dassault Systemes, Siemens and Synopsys.

Hardware partners are jumping in, too. Supermicro recently announced that its entire line of Supermicro H13 AMD-based systems now supports 4th gen AMD EPYC processors with AMD 3D V-Cache technology.

As this table shows, courtesy of AMD, the AMD EPYC 9x84X series now comes in 3 SKUs:

In addition, all 3 SKUs support both DDR5 memory and PCIe 5.0 connectivity.

The new AMD EPYC 9x84X processors are available now. OEM systems based on these processors are expected to start shipping in the third quarter.

AMD intros CPUs, cache, AI accelerators for cloud, enterprise data centers

AMD strengthens its commitment to the cloud and enterprise data centers with new "Bergamo" CPUs, "Genoa-X" cache, Instinct accelerators.

This week AMD strengthened its already strong commitment to the cloud and enterprise markets. The company announced several new products and partnerships at its Data Center and AI Technology Premiere event, which was held in San Francisco and simultaneously broadcast online.

“We’re focused on pushing the envelope in high-performance and adaptive computing,” AMD CEO Lisa Su told the audience, “creating solutions to the world’s most important challenges.”

Here’s what’s new:

Bergamo: That’s the former codename for the new 4th gen AMD EPYC 97X4 processors. AMD’s first processor designed specifically for cloud-native workloads, it packs up to 128 cores per socket using AMD’s new Zen 4c design to deliver lots of performance/watt. Each socket contains 8 chiplets, each with up to 16 Zen 4c cores; that’s twice as many cores per chiplet as AMD’s earlier Genoa processors (yet the two lines are compatible). The entire lineup is available now.

Genoa-X: Another codename, this one is for AMD’s new generation of AMD 3D V-Cache technology. This new product, designed specifically for technical computing such as engineering simulation, now supports over 1GB of L3 cache on a 96-core CPU. It’s paired with the new 4th gen AMD EPYC processor, including the high-performing Zen 4 core, to deliver high performance/core.

“A larger cache feeds the CPU faster with complex data sets, and enables a new dimension of processor and workload optimization,” said Dan McNamara, an AMD senior VP and GM of its server business.

In all, there are 4 new Genoa-X SKUs, ranging from 16 to 96 cores, and all socket-compatible with AMD’s Genoa processors.

Genoa: Technically, not new, as this family of data-center CPUs was introduced last November. But what is new is AMD’s new focus for the processors on AI, data-center consolidation and energy efficiency.

AMD Instinct: Though AMD had already introduced its Instinct MI300 Series accelerator family, the company is now revealing more details.

This includes the introduction of the AMD Instinct MI300X, an advanced accelerator for generative AI based on AMD’s CDNA 3 accelerator architecture. It will support up to 192GB of HBM3 memory to provide the compute and memory efficiency needed for large language model (LLM) training and inference for generative AI workloads.

AMD also introduced the AMD Instinct Platform, which brings together eight MI300X accelerators into an industry-standard design for the ultimate solution for AI inference and training. The MI300X is sampling to key customers starting in Q3.

Finally, AMD also announced that the AMD Instinct MI300A, an APU accelerator for HPC and AI workloads, is now sampling to customers.

Partner news: Mark your calendar for June 20. That’s when Supermicro plans to explore key features and use cases for its Supermicro H13 systems based on AMD EPYC 9004 series processors. These Supermicro systems will feature AMD’s new Zen 4c architecture and 3D V-Cache tech.

This week Supermicro announced that its entire line of H13 AMD-based systems is now available with support for the 4th gen AMD EPYC processors with Zen 4c architecture and V-Cache technology.

That includes Supermicro’s new 1U and 2U Hyper-U servers designed for cloud-native workloads. Both are equipped with a single AMD EPYC processor with up to 128 cores.

Why your AI systems can benefit from having both a GPU and CPU

Like a hockey team with players in different positions, an AI system with both a GPU and CPU is a necessary and winning combo. This mix of processors can bring you and your customers both the lower cost and greater energy efficiency of a CPU and the parallel processing power of a GPU. With this team approach, your customers should be able to handle any AI training and inference workloads that come their way.

Sports teams win with a range of skills and strengths. A hockey side can’t win if everyone’s playing goalie. The team also needs a center and wings to advance the puck and score goals, as well as defensive players to block the opposing team’s shots.

The same is true for artificial intelligence systems. Like a hockey team with players in different positions, an AI system with both a GPU and CPU is a necessary and winning combo.

This mix of processors can bring you and your customers both the lower cost and greater energy efficiency of a CPU and the parallel processing power of a GPU. With this team approach, your customers should be able to handle any AI training and inference workloads that come their way.

In the beginning

One issue: Neither CPUs nor GPUs were originally designed for AI. In fact, both designs predate AI by many years. Their origins still define how they’re best used, even for AI.

GPUs were initially designed for computer graphics, virtual reality and video. Getting pixels to the screen is a task where high levels of parallelization speed things up. And GPUs are good at parallel processing. This has allowed them to be adapted for HPC and AI workloads, which analyze and learn from large volumes of data. What’s more, GPUs are often used to run HPC and AI workloads simultaneously.

GPUs are also relatively expensive. For example, Nvidia’s new H100 has an estimated retail price of around $25,000 per GPU. Your customers may incur additional costs from cooling—GPUs generate a lot of heat. GPUs also use a lot of power, which can further raise your customer’s operating costs.

CPUs, by contrast, were originally designed to handle general-purpose computing. A modern CPU can run just about any type of calculation, thanks to its encompassing instruction set.

A CPU processes data sequentially, rather than in parallel, and that’s good for linear and complex calculations. A CPU is also generally less expensive than a comparable GPU, needs less power and runs cooler.

In today’s cost-conscious environment, every data center manager is trying to get the most performance per dollar. Even a high-performing CPU has a cost advantage over comparable GPUs that can be extremely important for your customers.

Team players

Just as a hockey team doesn’t rely on its goalie to score goals, smart AI practitioners know they can’t rely on their GPUs to do all types of processing. For some jobs, CPUs are still better.

Because CPUs offer larger memory capacity, they’re ideal for machine learning training and inference, as long as the scale is relatively small. CPUs are also good for training small neural networks, data preparation and feature extraction.

CPUs offer other advantages, too. They’re generally less expensive than GPUs. In today’s cost-conscious environment, where every data center manager is trying to get the most performance per dollar, that’s extremely important. CPUs also run cooler than GPUs, requiring less (and less expensive) cooling.

GPUs excel in two main areas of AI: machine learning and deep learning (ML/DL). Both involve the analysis of gigabytes—or even terabytes—of data for image and video processing. For these jobs, the parallel processing capability of a GPU is a perfect match.

AI developers can also leverage a GPU’s parallel compute engines. They can do this by instructing the processor to partition complex problems into smaller, more manageable sub-problems. Then they can use libraries that are specially tuned to take advantage of high levels of parallelism.
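The partitioning idea can be sketched in a few lines of Python. This is a CPU-only illustration of the structure (split, solve the pieces independently, combine), not GPU code; the function names are hypothetical, and a real GPU workload would hand the sub-problems to a tuned library instead.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(values, num_chunks):
    """Partition a list into at most num_chunks contiguous, near-equal slices."""
    chunk_size = max(1, -(-len(values) // num_chunks))  # ceiling division
    return [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]


def sum_of_squares(chunk):
    """The small sub-problem each worker solves independently."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(values, num_workers=4):
    """Solve the sub-problems concurrently, then combine the partial results."""
    chunks = split_into_chunks(values, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_sums = list(pool.map(sum_of_squares, chunks))
    return sum(partial_sums)


data = list(range(1_000))
total = parallel_sum_of_squares(data)
print(total)
```

Python threads will not speed up this particular calculation. The sketch only shows the shape of the work that parallel hardware exploits: independent sub-problems whose results are cheap to combine.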

Theory into practice

That’s the theory. Now let’s look at how some leading AI tech providers are putting the team approach of CPUs and GPUs into practice.

Supermicro offers its Universal GPU Systems, which combine Nvidia GPUs with CPUs from AMD, including the AMD EPYC 9004 Series.

One example is Supermicro’s H13 GPU server line, which includes the AS-8125GS-TNHR. It packs an Nvidia HGX H100 multi-GPU board, dual AMD EPYC 9004 Series CPUs, and up to 6TB of DDR5 memory.

For truly large-scale AI projects, Supermicro offers SuperBlade systems designed for distributed, midrange AI and ML training. Large AI and ML workloads can require coordination among multiple independent servers, and the Supermicro SuperBlades are designed to do just that. Supermicro also offers rack-scale, plug-and-play AI solutions powered by the company’s GPU systems and turbocharged with liquid cooling.

The Supermicro SuperBlade is available with a single AMD EPYC 7003/7002 series processor with up to 64 cores. You also get AMD 3D V-Cache, up to 2TB of system memory per node, and a 200Gbps InfiniBand HDR switch. Within a single 8U enclosure, you can install up to 20 blades.

Looking ahead, AMD plans to soon ship its Instinct MI300A, an integrated data-center accelerator that combines three key components: AMD Zen 4 CPUs, AMD CDNA3 GPUs, and high-bandwidth memory (HBM) chiplets. This new system is designed specifically for HPC and AI workloads.

Also, the AMD Instinct MI300A’s high data throughput lets the CPU and GPU work on the same data in memory simultaneously. AMD says this CPU-GPU partnership will help users save power, boost performance and simplify programming.

Truly, a team effort.

How ILM creates visual effects faster & cheaper with AMD-powered Supermicro hardware

ILM, the visual-effects company founded by George Lucas, is using AMD-powered Supermicro servers and workstations to create the next generation of special effects for movies and TV.

AMD and Supermicro are helping Industrial Light & Magic (ILM) create the future of visual movie and TV production.

ILM is the visual-effects company founded by George Lucas in 1975. Today it’s still on the lookout for better, faster tech. And to get it, ILM leans on Supermicro for its rackmount servers and workstations, and AMD for its processors.

The servers help ILM reduce render times. And the workstations enable better collaboration and storage solutions that move data faster and more efficiently.

All that high-tech gear comes together to help ILM create some of the world’s most popular TV series and movies. That includes “Obi-Wan Kenobi,” “Transformers” and “The Book of Boba Fett.”

It’s a huge task. But hey, someone’s got to create all those new universes, right?

Power hungry—and proud of it

No one gobbles up compute power quite like ILM. Sure, it may have all started with George Lucas dropping an automotive spring on a concrete floor to create the sound of the first lightsaber. But these days, it’s all about the 1s and 0s—a lot of them.

An enormous amount of compute power goes into rendering computer-generated imagery (CGI) like special effects and alien characters. So much power, in fact, that it can take weeks or even months to render an entire movie’s worth of eye candy.

Rendering takes not only time, but also money and energy. Those are the three resources that production companies like ILM must ration. They’re under pressure to manage cash flow and keep to tight production schedules.

By deploying Supermicro’s high-performance and multinode servers powered by AMD’s EPYC processors, ILM gains high core counts and maximum throughput, two crucial components of faster rendering.

Modern filmmakers are also obliged to manage data. Storing and moving terabytes of rendering and composition information is a constant challenge, especially when you’re trying to do it quickly and securely.

The solution to this problem comes in the form of high-performance storage and networking devices. They can shift vast swaths of information from here to there without bottlenecks, overheating or (worst-case scenario) total failure.

EPYC stories

This is the part of the story where CPUs take back some of the spotlight. GPUs have been stealing the show ever since data scientists discovered that graphics processors are the keys to unlocking the power of AI. But producing the next chapter of the “Star Wars” franchise means playing by different rules.

AMD EPYC processors play a starring role in ILM’s render farms. Render farms are big collections of networked server-class computers that work as a team to crunch a metric ton of data.

A typical ILM render farm might contain dozens of high-performance computers like the Supermicro BigTwin. This dual-node processing behemoth can house two 3rd gen AMD EPYC processors, 4TB of DDR4 memory per node and a dozen 2.5-inch hot-swappable solid-state drives (SSDs). In case the specs don’t speak for themselves, that’s an insane amount of power and storage.

For ILM, lighting and rendering happen inside an application by Isotropix called Clarisse. Our hero, Clarisse, relies on CPU rather than GPU power. Unlike most 3D apps, which are single-threaded, Clarisse also features unusually efficient multi-threading.

This lets the application take advantage of the parallel-processing power in AMD’s EPYC CPUs to complete more tasks simultaneously. The results: shorter production times and lower costs.

Coming soon: StageCraft

ILM is taking its tech show on the road with an end-to-end virtual production solution called StageCraft. It exists as a series of Los Angeles- and Vancouver-based sites (ILM calls them “volumes”), as well as mobile pop-up volumes waiting to happen anywhere in the United States and Europe.

The introduction of StageCraft is interesting for a couple of reasons. For one, this new production environment makes ILM’s AMD-powered magic wand accessible to a wider range of directors, producers and studios.

For another, StageCraft could catalyze the proliferation of cutting-edge creative tech. This, in turn, could lead to the same kind of competition, efficiency increases and miniaturization that made 4K filmmaking a feature of everyone’s mobile phones.

StageCraft could also usher in a new visual language. The more people with access to high-tech visualization technology, the more likely it is that some unknown aspiring auteur will pop up, seemingly out of nowhere, to change the nature of entertainment forever.

Kinda like how George Lucas did it back in the day.

A hospital’s diagnosis: Professional AI workloads require professional hardware

A Taiwanese hospital’s initial use of AI to interpret medical images with consumer graphics cards fell short. The prescription? Supermicro workstations powered by AMD components. 

A Taiwanese hospital has learned that professional AI workloads are too much to handle for consumer-level hardware—and that pro-level workloads require pro-level systems.

When Shuang-Ho Hospital first used AI to interpret medical images, it relied on consumer graphics cards installed in desktop PCs. But staff found that for diagnostic imaging, the graphics cards performed poorly. Plus, the memory capacity of the PCs was insufficient. The result: image resolution too low to be useful.

The hospital, affiliated with Taipei Medical University, offers a wide range of services, including reproductive medicine, a sleep center, and treatment for cancer, dementia and strokes. It opened in 2008 and is located in New Taipei City.

In its quest to use AI for healthcare, Shuang-Ho Hospital is far from alone. Last year, global sales of healthcare AI totaled $15.4 billion, estimates Grand View Research. Looking ahead, the market watcher expects healthcare AI sales through 2030 to enjoy a compound annual growth rate (CAGR) of nearly 38%.

A subset of that market, AI for diagnostic imaging, is a big and fast-growing field. The U.S. government has approved nearly 400 AI algorithms for radiology, according to the American Hospital Association. And the need is great. The World Economic Forum estimates that of all the data produced by hospitals each year—roughly 50 petabytes—97% goes unused.

‘Just right’

Shuang-Ho Hospital knew it needed an AI system that was more robust. But initially it wasn’t sure where to turn. A Supermicro demo changed all that. “The workstation presented by Supermicro was just right for our needs,” says Dr. Yen-Ting Chen, an attending physician in the hospital’s medical imaging department.

Supermicro’s solution for the hospital was its AS-5014A-TT SuperWorkstation, powered by AMD’s Ryzen Threadripper Pro 3995WX processor and equipped with a pair of AMD Radeon Pro W6800 professional graphics cards. This tower workstation is optimized for applications that include AI and deep learning.

For the hospital, one especially appealing feature is the Supermicro workstation’s use of a multicore processor that can be paired with multiple GPU cards. The AMD Threadripper Pro has 64 cores, and each of the hospital’s Supermicro workstations was configured with two GPUs.

Another attractive feature had nothing to do with tech specs. “The price was very reasonable,” says Dr. Yen-Ting Chen. “It naturally became our best choice.”

Smart tech, healthier brains

Now that Shuang-Ho Hospital has the AMD-powered Supermicro workstations installed, the advantages of a professional system over consumer products have become even clearer. For one, AI training is much better than it was with the consumer cards.

Even more important, the images from brain tomography, which with the consumer cards had to be degraded, can now be used at full resolution. (Tomography is an approach to imaging that combines scans taken from different angles to create cross-sectional “slices.”)

For now, the hospital is using the Supermicro workstations to help interpret scans for cerebral thrombosis, a serious health condition involving a blood clot in a vein of the brain. Learnings from this first AI workload are being shared with other departments.

Long-term, the hospital plans to use AI wherever the technology can help. And this time, with strictly professional hardware.

Do you know why 64 cores really matters?

In a recent test, Supermicro workstations and servers powered by 3rd gen AMD Ryzen Threadripper PRO processors ran engineering simulations nearly as fast as a dual-processor system, but needed only two-thirds as much power.

More cores per CPU sounds good, but what does it actually mean for your customers?

In the case of certain Supermicro workstations and servers powered by 3rd gen AMD Ryzen Threadripper PRO processors, it means running engineering simulations with dual-processor performance from a single-socket system. And with further cost savings due to two-thirds lower power consumption.

That’s according to tests recently conducted by MVConcept, a consulting firm that provides hardware and software optimizations. The firm tested two Supermicro systems, the AS-5014A-TT SuperWorkstation and AS-2114GT-DPNR server.

A solution brief based on MVConcept’s testing is now available from Supermicro.

Test setup

For these tests, the Supermicro server and workstation were both tested in two AMD configurations:

  • One with the AMD Ryzen Threadripper PRO 5995WX processor
  • The other with an older, 2nd gen AMD Ryzen Threadripper PRO 3995WX processor

In the tests, both AMD processors were used to run 32-core as well as 64-core operations.

The Supermicro systems were tested running Ansys Fluent, fluid simulation software from Ansys Inc. Fluent models fluid flow, heat, mass transfer and chemical reactions. Benchmarks for the testing included aircraft wing, oil rig and pump.

The results

Among the results: The Supermicro systems delivered nearly dual-CPU performance with a single processor, while also consuming less electricity.

What’s more, the 3rd generation AMD 5995WX CPU delivered significantly better performance than the 2nd generation AMD 3995WX.

Systems with larger cache saw performance improve the most. So a system with L3 cache of 256MB outperformed one with just 128MB.

BIOS settings proved to be especially important for realizing the optimal performance from the AMD Ryzen Threadripper PRO when running the tested applications. Specifically, Supermicro recommends using NPS=4 and SMT=OFF when running Ansys Fluent with AMD Ryzen Threadripper PRO. (NPS = non-uniform memory access (NUMA) nodes per socket; and SMT = simultaneous multithreading.)
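After changing BIOS settings like these, it can help to confirm what the operating system actually sees. The following is a minimal sketch for Linux only, assuming the kernel exposes the standard sysfs entries used below. The helper names are hypothetical, and the expectation that NPS=4 on one socket shows up as four NUMA nodes is an assumption to verify on the actual system.

```python
from pathlib import Path


def smt_is_active(smt_file="/sys/devices/system/cpu/smt/active"):
    """Return True/False from the Linux sysfs SMT flag, or None if unavailable."""
    try:
        return Path(smt_file).read_text().strip() == "1"
    except OSError:
        return None


def count_numa_nodes(node_dir="/sys/devices/system/node"):
    """Count the nodeN directories the kernel exposes; 0 if none are visible."""
    root = Path(node_dir)
    if not root.is_dir():
        return 0
    return sum(
        1 for p in root.iterdir()
        if p.name.startswith("node") and p.name[4:].isdigit()
    )


def matches_recommendation(smt_active, numa_nodes, sockets=1, nps=4):
    """True when SMT is off and the node count equals sockets * NPS (assumed mapping)."""
    return smt_active is False and numa_nodes == sockets * nps


print("SMT active:", smt_is_active())
print("NUMA nodes:", count_numa_nodes())
```

On systems without these sysfs entries, the first function returns None and the second returns 0, so the check fails safe rather than reporting a match.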

Another cool factor involves taking advantage of the Supermicro AS-2114GT-DPNR server’s two hot-pluggable nodes. First, one node can be used to pre-process the data. Then the other node can be used to run Ansys Fluent.

Put it all together, and you get a powerful takeaway for your customers: These AMD-powered Supermicro systems offer data-center power on both the desktop and server rack, making them ideal for SMBs and enterprises alike.

Research roundup: PICaaS rising, IT spending stays strong, new data-center components emerge

Do you know how the latest IT market research could help you and your business?

It’s time to consider performance intensive computing as a service. Get ready for a modest spending surge. And be on the lookout for new data-center components.

Those are takeaways from the latest in IT market research and analysis. And here’s your tech partner’s roundup.

Performance intensive computing: now as a service

If you don’t offer cloud-based performance intensive computing as a service, you might want to consider doing so. The market, already big, is growing fast.

Sales of performance intensive computing as a service (PICaaS) will rise from $22.3 billion worldwide in 2021 to $103 billion by 2027, predicts market watcher IDC. That’s a compound annual growth rate (CAGR) of nearly 28%.
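For readers who want to check growth figures like these, here is a small sketch of the standard CAGR formula. Treating 2021 to 2027 as six full years of compounding is an assumption; forecasters often compute the rate from a different base year, so the result may not match a published figure exactly.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted above: $22.3 billion (2021) and $103 billion (2027).
# Six years of compounding is an assumption, not a figure from IDC.
rate = cagr(22.3, 103.0, years=6)
print(f"{rate:.1%}")
```

The same function works in reverse as a sanity check: multiplying the starting value by (1 + rate) once per year should land on the ending value.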

With PICaaS, customers use public cloud services to run the mathematically intensive computations needed for AI, HPC, big data analytics, and engineering and technical applications.

Driving the market are two factors, IDC says. One, performance intensive computing is going mainstream and is increasingly mission critical. And two, a growing number of businesses define themselves as digital.

What can you do to get ready for this market? Among other tactics, IDC recommends that suppliers formulate an end-to-end bundled PICaaS offering, demonstrate a secure cloud infrastructure, and become trusted advisors of hybrid development models.

Strong IT spending — this year and next

What kind of year will 2023 shape up to be? If your customers are like most, pretty good. Overall IT spending will rise this year by 5.5%, reaching a grand total of $4.6 trillion, predicts analyst firm Gartner, and some segments will rise by much more.

But what about sales dips, tech layoffs and other financial issues? “Macroeconomic headwinds are not slowing digital transformation,” insists Gartner analyst John-David Lovelock. “IT spending will remain strong.”

On the hardware front, Gartner expects data center systems sales worldwide this year to rise by less than 4%. Next year looks better with a projected rise of about 6%.

IT services are in demand. Sales will rise by just over 9% this year, Gartner forecasts, and by about 10% next year.

Devices such as PCs and smartphones are a weak point, with sales projected to drop by nearly 5% this year after tumbling nearly 11% last year. Next year, sales should pick up, Gartner expects, rising an impressive 11%.

New components coming to customer data centers

Have you and your data-center customers spoken yet about three components—SmartNICs, data processing units (DPUs) and infrastructure processing units (IPUs)?

If not, you probably will soon, according to ABI Research. Demand for these components is being driven by two factors: specialized workloads such as AI, IoT and 5G; and the rise of cloud hyperscalers such as AWS, Azure and Google Cloud.

“Organizations are exploring the feasibility of running specific applications that require high processing power on public-cloud data centers to ensure business continuity,” says ABI analyst Yih-Khai Wong.

Big opportunities include networks, cloud platforms and security. For example, AMD’s Xilinx Alveo line of adaptable accelerator cards includes the industry’s first software-defined, hardware-accelerated SmartNIC.

To be sure, the shift is still in its early stages. But Wong says servers equipped by default with SmartNICs, DPUs or IPUs are coming “sooner rather than later.”

 
