Sponsored by AMD & Supermicro

Performance Intensive Computing: Capture the full potential of IT

Tech Explainer: Why embedded systems for retail?

You’ll find embedded systems in thousands of retail locations—but only if you know where to look. Find out how these specialized servers work, why they make sense, and how your retail customers can get started using them.


While dedicated high-performance servers and multiuse cloud platforms command the biggest headlines in tech news, that doesn’t mean they’re the perfect fit for every use case.

Retail organizations have unique requirements. Sometimes those requirements are best served by the diminutive, unsung heroes of the server world: embedded retail servers.

Embedded systems are usually smaller and less powerful than their general-purpose cousins. So where giant AI servers may offer the brute-force power of a freight train, smaller embedded systems are more like a ski lift. They do only one thing, but they do it very well.

You can find embedded systems in thousands of retail sites, but you’ll have to do some hunting—their location is not always obvious. Some embedded retail servers sit under counters or in small, out-of-the-way closets. Others are attached to the backs of large color displays that offer patrons dynamic menus, ads and special deals.

High-Tech Sales

One of the most common retail embedded systems is the humble point-of-sale (POS) terminal. A quick survey of your favorite retail stores is likely to reveal a variety of versions, ranging from smart cash registers to fully autonomous self-checkout kiosks.

But POS devices are designed to do far more than just add prices and calculate tax. In a modern retail setting, these servers may also read barcodes, weigh items, process mobile payments, update inventory, schedule deliveries, and detect fraud.

These processes can become even more demanding when the embedded system must complete them without the aid of cloud services.

Why? Because without the processing power and storage of remote cloud and core servers, the embedded system has to rely on its own internal components to complete what can often be a series of very demanding tasks.
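To make that concrete, here's a minimal sketch of the offline-first pattern many POS terminals follow: journal every transaction locally so checkout never blocks on the network, then flush the journal when a link is available. The class and field names are hypothetical, not any vendor's actual API.

```python
import time
from collections import deque

class OfflinePOSQueue:
    """Minimal sketch: journal transactions locally, flush when a link is up."""

    def __init__(self):
        self.pending = deque()

    def record_sale(self, sku, qty, price):
        # Always write locally first, so checkout never waits on the network.
        txn = {"sku": sku, "qty": qty, "total": round(qty * price, 2), "ts": time.time()}
        self.pending.append(txn)
        return txn

    def sync(self, link_up):
        # Flush the journal only when the store's uplink is available.
        if not link_up:
            return 0
        sent = len(self.pending)
        self.pending.clear()  # stand-in for an upload to a cloud endpoint
        return sent

pos = OfflinePOSQueue()
pos.record_sale("SKU-123", 2, 4.99)
pos.record_sale("SKU-456", 1, 12.00)
print(pos.sync(link_up=False))  # 0: still offline, journal retained
print(pos.sync(link_up=True))   # 2: both transactions flushed
```

The same journal-then-sync idea extends to inventory updates and fraud logs: local state is the source of truth, and the cloud catches up later.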

Other Use Cases

Deploying embedded retail systems becomes even more complex when a retail location doubles as a warehouse. Such is the case with supermarkets and big-box retailers like Walmart. They must be able to quickly restock their shelves whenever supplies are depleted by shoppers.

In these locations, you can often find embedded retail servers keeping track of real-time stock levels. This can be accomplished using a number of methods, including radio frequency identification (RFID) tags, shelf-based weight sensors, and AI-enabled cameras.
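As one illustration, the shelf-based weight-sensor approach boils down to dividing the net shelf weight by the per-item weight. The figures below are invented for the example:

```python
def estimate_stock(shelf_weight_g, unit_weight_g, tare_g=0.0):
    """Estimate on-shelf item count from a weight-sensor reading (illustrative)."""
    if unit_weight_g <= 0:
        raise ValueError("unit weight must be positive")
    net = max(shelf_weight_g - tare_g, 0.0)  # subtract the empty tray's weight
    return round(net / unit_weight_g)

# A shelf reading of 6,450 g, with 430 g cans on a 500 g tray:
print(estimate_stock(6450, 430, tare_g=500))  # 14 cans remaining
```

A real deployment would smooth noisy sensor readings and cross-check against RFID or camera counts, but the arithmetic at the core is this simple.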

Another task best handled by small, embedded systems is building and energy management. Retail operations often use embedded servers connected to distributed sensors to control HVAC, lighting and security. Here, again, it’s vitally important that these systems be able to operate without an internet connection when necessary.

In this case, an embedded server’s ability to operate on its own can actually prevent physical disasters. Even deprived of remote cloud services, it may need to keep control over a store’s climate to prevent damaging stock. Likewise, store managers often rely on an embedded server’s ability to maintain 100% security system uptime to avoid theft, damage or fire.

Power to Get the Job Done

Designers of embedded retail servers have a tricky job. They need to create systems that meet a long list of disparate requirements. That’s because the most effective embedded retail servers are:

  • Compact enough to fit in small retail outlets
  • Cost-effective enough that enterprises can outfit each location with multiple servers
  • Powerful enough to handle multiple complex tasks and run AI applications locally
  • Outfitted with enough storage to collect terabytes of data
  • Dependable enough to run security services that store managers count on 24x7
  • Able to perform reliably with or without an internet connection

To address these concerns, systems designers like Supermicro are tasked with creating the perfect balance of power, pricing, and reliability.

One such well-balanced embedded server is the upcoming Supermicro IoT A+ Server (AS-E300-14GR). It’s a mini-1U server powered by AMD EPYC 4004/4005 series processors with up to 16 cores and a 64MB cache.

Despite its small size, Supermicro’s embedded system still manages to offer some real expansion. For example, users can have Supermicro populate the server with up to 960GB of SSD storage and 192GB of DDR5 RAM. They can also opt for additional storage via the system’s dual M.2 PCIe 5.0 x4 NVMe slots.

In addition, there’s a single PCIe 5.0 x16 LP slot for an expansion card. Common options to fill that slot include PCIe-based networking cards and dedicated AI accelerators like AMD’s Instinct GPUs.

Coming Soon

What kind of features can we expect from future generations of embedded retail servers? The answer will have much to do with consumer shopping habits, economic and market shifts, and new tech that becomes available in the near future.

While we can’t make infallible predictions about those forces, we can make some assumptions.

One is that AI will become deeply integrated in embedded systems. In fact, we could soon see more systems with AI fully on-device—no cloud connection necessary.

Connectivity-wise, future embedded systems could feature not only Wi-Fi 7 integration, but also 5G cellular connections.

Embedded systems’ footprints should also shrink, even as they become more powerful. Ultra-low-power chips should enable them to operate silently with passive cooling systems and improved thermal management, which will allow designers to shrink the server’s overall size.

Bottom line: Expect embedded servers for retail to become smaller, faster and better. Isn’t that always the way when it comes to new technology?


To supercharge AI clusters, check out a newly validated solution from AMD, Supermicro & Mirantis


Validating Supermicro hardware with Mirantis k0rdent AI represents a shift from building clusters to composing them.


Full-stack AI infrastructure solutions are having a moment. And why not? Organizations choose these solutions to speed GPU operations, ensure efficient GPU utilization, and enforce security and compliance at scale.

One such solution is k0rdent AI, a turnkey, production-ready “super control plane” for managing complex AI environments. It automates provisioning, lifecycle management, and orchestration of infrastructure and core services.

The company behind k0rdent is Mirantis Inc. It’s privately held and based in Campbell, Calif. Founded in 2011, Mirantis today has over 800 employees.

Importantly, Mirantis is also a contributor to Kubernetes, the open-source system for automating the deployment, scaling, and management of containerized applications. Containerization is a software-deployment process that creates a single software package, known as a container, that can run on all types of devices and operating systems.

Mirantis helps organizations achieve digital self-determination by giving them complete control over their strategic infrastructure. The company’s customers include such well-known brands as Adobe, DocuSign and PayPal.

Could Supermicro servers benefit from the solution’s capabilities? To find out, Supermicro recently validated its modular server architecture with k0rdent.

Testing, Testing

For the validation, Supermicro used two of its own systems:

  • A Supermicro 8U GPU server (model AS-8126GS-TNMR) powered by dual AMD EPYC 9005 CPUs and up to eight AMD Instinct MI325X GPUs.
  • A Supermicro 2U BigTwin server (model AS-2124BT-HNTR) powered by dual AMD EPYC 7003 processors.

Validation began at the physical level, where the k0rdent bare-metal operator acts as a bridge between the Kubernetes API and the Supermicro servers. This delivered automated BIOS configuration, firmware updates, RAID orchestration, and deployment of a hardened host OS.

Next, the testing team deployed the AMD GPU Operator via the k0rdent catalog. GPU Operator simplifies the deployment and management of AMD Instinct GPUs with Kubernetes clusters, enabling seamless configuration and operation of GPU-accelerated workloads.

The AMD Network Operator was deployed, too. It's a control component that enables GPU-to-GPU communications in an AI cluster, managing AMD NICs in Kubernetes clusters.

Here was the test configuration:

  • Scope: Single GPU unit performance

The testers used a custom PyTorch script to measure raw compute throughput across different precisions. (PyTorch is an open-source deep learning library.)
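The Supermicro/Mirantis script itself isn’t public, but the general shape of such a throughput test is easy to sketch: time a kernel with a known operation count and report GFLOP/s. The plain-Python toy below illustrates the method only; the actual validation ran a custom PyTorch script on the Instinct GPUs at multiple precisions.

```python
import time

def measure_throughput(flops_per_iter, work_fn, iters=5):
    """Time a compute kernel and report throughput in GFLOP/s (illustrative)."""
    start = time.perf_counter()
    for _ in range(iters):
        work_fn()
    elapsed = time.perf_counter() - start
    return (flops_per_iter * iters) / elapsed / 1e9

# Toy kernel: an n x n matrix multiply in pure Python (~2*n^3 FLOPs).
n = 64
a = [[1.0] * n for _ in range(n)]
b = [[1.0] * n for _ in range(n)]

def matmul():
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

print(f"{measure_throughput(2 * n**3, matmul):.3f} GFLOP/s")
```

A GPU version would swap in `torch.matmul` on device tensors, synchronize before and after timing, and repeat the run at fp32, fp16, bf16 and fp8 to compare precisions.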

Results Delivered

The validation successfully demonstrated the automated provisioning of production-grade Kubernetes clusters on Supermicro bare-metal hardware using k0rdent’s declarative orchestration engine and the Bare Metal Operator (BMO).

k0rdent managed the entire lifecycle of the Supermicro nodes. That went from out-of-band discovery via BMC/IPMI (Baseboard Management Controller/Intelligent Platform Management Interface) and hardware introspection…all the way to automated OS imaging and Kubernetes bootstrapping.

This eliminated manual configuration and hypervisor overhead. It also provided a high-performance, consistent, and repeatable deployment model that adheres to Cluster API (CAPI) standards.

As Supermicro explains, the validation confirms that k0rdent effectively bridges the gap between physical server management and cloud-native agility. That makes it an ideal solution for resource-intensive workloads requiring direct hardware access and deterministic performance on Supermicro infrastructure.

Conclusions

Validating Supermicro hardware with Mirantis k0rdent AI represents a shift from building clusters to composing them.

Enterprises can run their entire portfolios—from legacy apps to cutting-edge LLMs—on a single, unified, bare-metal platform with automatic deployment and comprehensive platform management from the bare metal up.

If you have customers eager to eliminate human error and inconsistencies from the AI deployment and management processes, tell them to check out this solution.


Tech Explainer: What’s an AI Factory?


Discover how AI factories work—and how your clients might benefit from building an AI factory of their own.


How can you tell that the AI Era is here? One way is by noticing that large enterprises are increasingly focused on mass producing AI models.

It’s no longer enough to have a decent set of working AI models to power Spotify’s suggestion engine or Accenture’s Big Data analytics.

To keep up with—and surpass—the Joneses, Spotify and Accenture will need dedicated systems that work every day to create, evaluate and iterate their AI models.

These systems are called AI factories. Somewhat like a factory that creates physical widgets, an AI factory churns out new and updated AI models. This continual AI production process helps enterprises react quickly to market demands and competition.

Make no mistake: The development of AI factories represents a turning point in the evolution of AI-powered business.

No. 2 with a Bullet

This theory is supported by some of IT’s top thinkers. They include Tom Davenport, a professor, speaker and author; and Randy Bean, a corporate advisor.

Davenport and Bean co-wrote an article that appeared earlier this month in the MIT Sloan Management Review: “Five trends in AI and data science for 2026.” In their article, the authors place AI factories in the No. 2 spot. AI factories, they say, will be adopted by users and “all-in” AI adopters that include consumer products makers, banks and software companies.

As Davenport and Bean explain, an AI factory combines technology platforms, methods, data and previously developed algorithms to make building AI systems easy and fast. The authors’ all-important message: Watch this space.

How AI Factories Work

To fully understand the concept of an AI factory, it can help to think of the traditional smoke-belching, brick-and-mortar factories it’s named for.

Of course, there are some differences. A physical factory takes in raw materials, uses machines to process them, and produces physical products.

By contrast, an AI factory takes in data (such as text, audio, images and logs), runs that data through massive compute engines, and outputs AI models for recommendations, predictions, automation and generative content.

Another difference: Unlike the static products that emerge from traditional factories, the products of AI factories are virtual. They learn and grow as new data, infrastructure and techniques become available. In this way, AI factories help their organizations keep up with rapid changes and market shifts.

For instance, a new AI model produced by an enterprise’s AI factory can be continuously retrained as new data becomes available. While each new iteration deployed in the field busily suggests which Netflix movie to watch next, a newer version is constantly being developed in the background. When the new suggestion engine is ready, Netflix can seamlessly slide it into place.

Why Your Clients Probably Need an AI Factory

It’s good to understand the abstract benefits of an AI factory. But your clients will also want to know how building one can translate into business results.

Here’s the bottom line. An AI factory can:

  • Dramatically reduce the cost of business intelligence. Once an AI factory is built and a given AI model is trained, that model can run continuously, serving millions of decisions, predictions, etc., for a fraction of its initial cost. In other words, the cost per additional decision rapidly collapses toward zero.
  • Help organizations maintain a decisive competitive advantage. This happens on two levels. First, maintaining a constant production stream of AI models and iterations helps your clients meet market demands as quickly as possible. And second, having that ability to react faster to customer needs and economic conditions can help create and sustain an advantage over competitors.
  • Turn data into capital. Many organizations are ill-equipped to analyze and monetize all the data they collect. All that piled-up data can seem like an albatross around their neck. But by building an AI factory, the organization can harness that otherwise squandered data and put it to work.
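The cost dynamic in the first bullet is easy to see with a toy amortization model: the fixed build-and-train cost gets spread over decision volume, so the average cost per decision falls toward the tiny marginal inference cost. All dollar figures below are hypothetical.

```python
def cost_per_decision(build_cost, marginal_cost, n_decisions):
    """Average cost per decision: fixed cost amortized over volume, plus a
    small marginal inference cost. All figures are hypothetical."""
    return build_cost / n_decisions + marginal_cost

build = 1_000_000.0   # one-time AI factory build plus model training
marginal = 0.0001     # per-inference compute cost

for n in (1_000, 1_000_000, 1_000_000_000):
    print(f"{n:>13,} decisions: ${cost_per_decision(build, marginal, n):,.4f} each")
```

At a thousand decisions the fixed cost dominates; at a billion, each decision costs barely more than the marginal inference itself.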

Further, companies that don’t build an AI factory could find themselves at a competitive disadvantage. Davenport and Bean, in their Sloan Management Review article, say companies that lack an AI factory will find building AI at scale both expensive and time-consuming.

Stumbling Blocks? A Couple

Building an AI factory isn’t always easy. Enterprises can run into serious roadblocks.

For one, siloed, inconsistent or low-trust data can make for a messy AI production process. As programmers say, “garbage in, garbage out.” In other words, if the data is messy, the analysis will be, too.

Talent bottlenecks can also wreak havoc on the virtual factory floor. There are only so many data scientists to go around, and they’re in high demand. Finding the right employees is a key component here—even in an age of super-smart robots.

Bureaucratic hold-ups are another trap your clients need to watch out for. Legal, compliance and trust issues can cause AI projects to grind to a halt.

The AI Factory Future

Like everything else in the fast-moving AI world, AI factories are changing. In the near future, AI factories will likely focus on the immediacy of real-time, always-on learning.

As AI factories shift to nearly continuous adaptation, enterprises will use their AI model updates to keep pace with rapidly changing market conditions and customer demands.

Another likely future is inferencing at the edge. For “edge,” think vehicles, devices and brick-and-mortar factories. Organizations that move inferencing closer to where data is created can lower system latency (that is, increase speed) and reduce cloud costs.

Another factor that could make a big impact on AI factories is new software and hardware integrations. A recent Supermicro webinar on AI factories and related technology showed how enterprises can benefit from integrating software platforms such as Supermicro’s SuperCloud Composer (SCC) and Power Asset Orchestrator (PAO).

Supermicro says this potent combination allows operators to gain total visibility into AI Factories. It can also optimize everything from GPU telemetry to real-time grid pricing.

Overall, it’s safe to assume that when these and other updates are deployed, AI factories will quickly become part of the common AI infrastructure. In so doing, they’ll touch nearly every aspect of our daily lives.


Faster is better. Supermicro with 5th Gen AMD is faster


Supermicro servers powered by the latest AMD processors are up to 9 times faster than a previous generation, according to a recent benchmark.


When it comes to servers, faster is just about always better.

With faster processors, workloads get completed in less time. End users get their questions answered sooner. Demanding high-performance computing (HPC) and AI applications run more smoothly. And multiple servers get all their jobs done more rapidly.

And if you’ve installed, set up or managed one of these faster systems, you’ll look pretty smart.

That’s why the latest benchmark results from Supermicro are so impressive, and also so important.

The tests show that Supermicro servers powered by the latest AMD processors are up to 9 times faster than a previous generation. These systems can make your customers happy—and make you look good.

SPEC Check

The benchmarks in question are those of the Standard Performance Evaluation Corp., better known as SPEC. It’s a nonprofit consortium that sets benchmarks for running complete applications.

Supermicro ran its servers on SPEC’s CPU 2017 benchmark, a suite of 43 benchmarks that measure and compare compute-intensive performance. All of them stress a system’s CPU, memory subsystem and compiler—emphasizing all three of these components working together, not just the processor.

To provide a comparative measure of integer and floating-point compute-intensive performance, the benchmark uses two main metrics. The first is speed, or how much time a server needs to complete a single task. The second is throughput, in which the server runs multiple concurrent copies of a task.

The results are given as comparative scores. In general, higher is better.

Super Server

The server tested was the Supermicro H14 Hyper server, model number AS-2126HS-TN. It’s powered by dual AMD EPYC 9965 processors and loaded with 1.5TB of memory.

This server has been designed for applications that include HPC, cloud computing, AI inferencing and machine learning.

In the floating-point measure, the new server was 8x faster than a Supermicro server powered by an earlier-gen AMD EPYC 7601.

In the integer rate measure, compared with a circa-2018 Supermicro server, it’s almost 9x faster.

Impressive results. And remember, when it comes to servers, faster is better.


Supermicro JumpStart remote test site adds latest 5th Gen AMD EPYC processors


Register now to test the Supermicro H14 2U Hyper with dual AMD EPYC 9965 processors from the comfort and convenience of your office.


Supermicro’s JumpStart remote test site will soon let you try out a server powered by the new 5th Gen AMD EPYC processors from any location you choose.

The server is the Supermicro H14 2U Hyper with dual AMD EPYC 9965 processors. It will be available for remote testing on the Supermicro JumpStart site starting on Dec. 2. Registration is open now.

The JumpStart site lets you use a Supermicro server solution online to validate, test and benchmark your own workloads, or those of your customers. And using JumpStart is free.

All test systems on JumpStart are fully configured with SSH (the Secure Socket Shell network protocol); VNC (Virtual Network Computing remote-access software); and Web IPMI (the Intelligent Platform Management Interface). During your test, you can open one session of each.

Using the Supermicro JumpStart remote testing site is simple:

Step 1: Select the system you want to test, and the time slot when you want to test it.

Step 2: At the scheduled time, log in to the JumpStart site using your Supermicro single sign-on (SSO) account. If you don’t have an account yet, create one and then use it to log in to JumpStart. (Creating an account is free.)

Step 3: Use the JumpStart site to validate, test and benchmark your workloads!

Rest assured, Supermicro will protect your privacy. Once you’re done testing a system on JumpStart, Supermicro will manually erase the server, reflash the BIOS and firmware, and re-install the OS with new credentials.

Hyper Power

The AMD-powered server recently added to JumpStart is the Supermicro H14 2U Hyper, model number AS-2126HS-TN. It’s powered by dual AMD EPYC 9965 processors. Each of these CPUs offers 192 cores and a maximum boost clock of 3.7 GHz.

This Supermicro server also features 3.8TB of storage and 1.5TB of memory. The system is built in the 2U rackmount form factor.

Are you eager to test this Supermicro server powered by the latest AMD EPYC CPUs? JumpStart is here to help you.


Supermicro FlexTwin now supports 5th gen AMD EPYC CPUs


FlexTwin, part of Supermicro’s H14 server line, now supports the latest AMD EPYC processors — and keeps things chill with liquid cooling.

 


Wondering about the server of the future? It’s available for order now from Supermicro.

The company recently added support for the latest 5th Gen AMD EPYC 9005 Series processors on its 2U 4-node FlexTwin server with liquid cooling.

This server is part of Supermicro’s H14 line and bears the model number AS-2126FT-HE-LCC. It’s a high-performance, hot-swappable and high-density compute system.

Intended users include oil & gas companies, climate and weather modelers, manufacturers, scientific researchers and research labs. In short, anyone who requires high-performance computing (HPC).

Each 2U system comprises four nodes. And each node, in turn, is powered by a pair of 5th Gen AMD EPYC 9005 processors. (The previous-gen AMD EPYC 9004 processors are supported, too.)

Memory on this Supermicro FlexTwin maxes out at 9TB of DDR5, courtesy of up to 24 DIMM slots. Expansion connects via PCIe 5.0, with one slot per node standard and more available as options.

The 5th Gen AMD EPYC processors, introduced last month, are designed for data center, AI and cloud customers. The series launched with over 25 SKUs offering up to 192 cores and all using AMD’s new “Zen 5” or “Zen 5c” architectures.

Keeping Cool

To keep things chill, the Supermicro FlexTwin server is available with liquid cooling only. This allows the server to be used for HPC, electronic design automation (EDA) and other demanding workloads.

More specifically, the FlexTwin server uses a direct-to-chip (D2C) cold plate liquid cooling setup, and each system also runs 16 counter-rotating fans. Supermicro says this cooling arrangement can remove up to 90% of server-generated heat.

The server’s liquid cooling also covers the 5th Gen AMD processors’ more demanding cooling requirements; they’re rated at up to 500W of thermal design power (TDP). By comparison, some members of the previous 4th Gen AMD EPYC line have a default TDP as low as 200W.
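Using the figures above, a quick back-of-envelope calculation shows why air cooling alone struggles with this class of system. This counts CPU heat only; memory, drives and NICs add more.

```python
# Heat load for one FlexTwin 2U chassis, from figures in the text:
# 4 nodes x 2 CPUs x 500 W TDP, with liquid cooling removing ~90% of heat.
nodes, cpus_per_node, tdp_w = 4, 2, 500
cpu_heat_w = nodes * cpus_per_node * tdp_w
liquid_removed_w = 0.90 * cpu_heat_w
air_residual_w = cpu_heat_w - liquid_removed_w

print(cpu_heat_w)        # 4000 W of CPU heat per 2U system
print(liquid_removed_w)  # 3600.0 W handled by the direct-to-chip loop
print(air_residual_w)    # 400.0 W left for the 16 counter-rotating fans
```

Roughly 4kW of CPU heat in just 2U of rack space is what pushes these dense HPC systems to liquid-only designs.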

Build & Recycle

The Supermicro FlexTwin server also adheres to the company’s “Building Block Solutions” approach. Essentially, this means end users purchase these servers by the rack.

Supermicro says its Building Blocks let users optimize for their exact workload. Users also gain efficient upgrading and scaling.

Looking even further into the future, once these servers are ready for an upgrade, they can be recycled through the Supermicro recycling program.

In Europe, Supermicro follows the EU’s Waste Electrical and Electronic Equipment (WEEE) Directive. In the U.S., recycling is free in California; users in other states may have to pay a shipping charge.

Put it all together, and you’ve got a server of the future, available to order today.


AMD intros CPUs, accelerators, networking for end-to-end AI infrastructure -- and Supermicro supports


AMD expanded its end-to-end AI infrastructure products for data centers with new CPUs, accelerators and network controllers. And Supermicro is already offering supporting servers. 


AMD today held a roughly two-hour conference in San Francisco during which CEO Lisa Su and other executives introduced a new generation of server processors, the next model in the Instinct MI300 Accelerator family, and new data-center networking devices.

As CEO Su told the live and online audience, AMD is committed to offering end-to-end AI infrastructure products and solutions in an open, partner-driven ecosystem.

Su further explained that AMD’s new AI strategy has four main goals:

  • Become the leader in end-to-end AI
  • Create an open AI software platform of libraries and models
  • Co-innovate with partners including cloud providers, OEMs and software creators
  • Offer all the pieces needed for a total AI solution, all the way from chips to racks to clusters and even entire data centers.

And here’s a look at the new data-center hardware AMD announced today.

5th Gen AMD EPYC CPUs

The EPYC line, originally launched in 2017, has become a big success for AMD. As Su told the event audience, there are now more than 950 EPYC instances at the largest cloud providers; also, AMD hardware partners now offer EPYC processors on more than 350 platforms. Market share is up, too: More than one in three servers worldwide (34%) now run on EPYC, Su said.

The new EPYC processors, formerly codenamed Turin and now known as the AMD EPYC 9005 Series, are now available for data center, AI and cloud customers.

The new CPUs also have a new core architecture known as Zen 5. AMD says Zen 5 outperforms the previous Zen 4 generation by 17% on enterprise instructions-per-clock and up to 37% on AI and HPC workloads.

The new 5th Gen line has over 25 SKUs, and core count ranges widely, from as few as 8 to as many as 192. For example, the new AMD EPYC 9575F is a 64-core, 5GHz CPU designed specifically for GPU-powered AI solutions.

AMD Instinct MI325X Accelerator

About a year ago, AMD introduced the Instinct MI300 Accelerators, and since then the company has committed to introducing new models on a yearly cadence. Sure enough, today Lisa Su introduced the newest model, the AMD Instinct MI325X Accelerator.

Designed for Generative AI performance and built on the AMD CDNA3 architecture, the new accelerator offers up to 256GB of HBM3E memory, and bandwidth up to 6TB/sec.

Shipments of the MI325X are set to begin in this year’s fourth quarter. Partner systems with the new AMD accelerator are expected to start shipping in next year’s first quarter.

Su also mentioned the next model in the line, the AMD Instinct MI350, which will offer up to 288GB of HBM3E memory. It’s set to be formally announced in the second half of next year.

Networking Devices

Forrest Norrod, AMD’s head of data-center solutions, introduced two networking devices designed for data centers running AI workloads.

The AMD Pensando Salina DPU is designed for front-end connectivity. It supports throughput of up to 400 Gbps.

The AMD Pensando Pollara 400, designed for back-end networks connecting multiple GPUs, is the industry’s first Ultra-Ethernet Consortium-ready AI NIC.

Both parts are sampling with customers now, and AMD expects to start general shipments in next year’s first half.

Both devices are needed, Norrod said, because AI dramatically raises networking demands. He cited studies showing that connectivity currently accounts for 40% to 75% of the time needed to run certain AI training and inference models.
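Norrod’s point can be framed with a simple Amdahl’s Law estimate: if communication takes a fixed share of runtime, only that share benefits from a faster NIC. The 60% share and 2x NIC speedup below are illustrative, chosen from within the 40% to 75% range he cited.

```python
def end_to_end_speedup(comm_fraction, comm_speedup):
    """Amdahl-style estimate: overall job speedup when only the communication
    share of an AI workload gets faster. Figures below are hypothetical."""
    return 1.0 / ((1.0 - comm_fraction) + comm_fraction / comm_speedup)

# If connectivity is 60% of runtime and a faster NIC doubles network throughput:
print(round(end_to_end_speedup(0.60, 2.0), 2))  # → 1.43
```

The larger the communication share, the bigger the payoff, which is why back-end GPU-to-GPU fabrics are where NIC upgrades matter most.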

Supermicro Support

Supermicro is among the AMD partners ready today with systems based on the new AMD processors and accelerator.

Wasting no time, Supermicro today announced new H14 series servers, including both Hyper and FlexTwin systems, that support the 5th Gen AMD EPYC 9005 processors and AMD Instinct MI325X Accelerators.

The Supermicro H14 family includes three systems for AI training and inference workloads. Supermicro says the systems can also accommodate the higher thermal requirements of the new AMD EPYC processors, which are rated at up to 500W. Liquid cooling is an option, too.


Developing AI and HPC solutions? Check out the new AMD ROCm 6.2 release


The latest release of AMD’s free and open software stack for developing AI and HPC solutions delivers 5 important enhancements. 


If you develop AI and HPC solutions, you’ll want to know about the most recent release of AMD ROCm software, version 6.2.

ROCm, in case you’re unfamiliar with it, is AMD’s free and open software stack. It’s aimed at developers of artificial intelligence and high-performance computing (HPC) solutions on AMD Instinct accelerators. It's also great for developing AI and HPC solutions on AMD Instinct-powered servers from Supermicro. 

First introduced in 2016, ROCm open software now includes programming models, tools, compilers, libraries, runtimes and APIs for GPU programming.

ROCm version 6.2, announced recently by AMD, delivers 5 key enhancements:

  • Improved vLLM support 
  • Boosted memory efficiency & performance with Bitsandbytes
  • New Offline Installer Creator
  • New Omnitrace & Omniperf Profiler Tools (beta)
  • Broader FP8 support

Let’s look at each separately and in more detail.

vLLM Support

To enhance the efficiency and scalability of its Instinct accelerators, AMD is expanding vLLM support. vLLM is an easy-to-use library for inference and serving of the large language models (LLMs) that power Generative AI.

ROCm 6.2 lets AMD Instinct developers integrate vLLM into their AI pipelines. The benefits include improved performance and efficiency.

Bitsandbytes

Developers can now integrate Bitsandbytes with ROCm for AI model training and inference, reducing their memory and hardware requirements on AMD Instinct accelerators. 

Bitsandbytes is an open-source Python library that brings low-bit quantization to LLMs, boosting memory efficiency and performance. AMD says this will let AI developers work with larger models on limited hardware, broadening access, saving costs and expanding opportunities for innovation.
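To see why low-bit quantization shrinks memory requirements, consider the basic absmax int8 scheme sketched below. This is a conceptual, pure-Python illustration, not Bitsandbytes code, and the helper names are invented.

```python
# Conceptual sketch of the absmax 8-bit quantization idea that libraries
# such as Bitsandbytes build on. Pure Python for illustration only;
# not Bitsandbytes code, and the helper names are invented.

def quantize_int8(weights):
    """Map float weights to int8 values plus one per-tensor scale."""
    scale = max(abs(w) for w in weights) / 127.0
    return [round(w / scale) for w in weights], scale

def dequantize_int8(quantized, scale):
    """Recover approximate float weights from int8 values and the scale."""
    return [q * scale for q in quantized]

weights = [0.5, -1.27, 0.03, 1.0]
quantized, scale = quantize_int8(weights)   # each value now fits in 1 byte
restored = dequantize_int8(quantized, scale)
# vs. 4 bytes per FP32 weight: a 4x memory saving, at a small precision cost.
```

The real library applies far more sophisticated schemes (per-block scales, outlier handling, 4-bit formats), but the memory arithmetic is the same: fewer bits per weight means larger models fit on the same accelerator.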

Offline Installer Creator

The new ROCm Offline Installer Creator aims to simplify the installation process. This tool creates a single installer file that includes all necessary dependencies.

That makes deployment straightforward with a user-friendly GUI that allows easy selection of ROCm components and versions.

As the name implies, the Offline Installer Creator can be used on developer systems that lack internet access.

Omnitrace and Omniperf Profiler

The new Omnitrace and Omniperf Profiler Tools, both now in beta release, provide comprehensive performance analysis and a streamlined development workflow.

Omnitrace offers a holistic view of system performance across CPUs, GPUs, NICs and network fabrics, helping developers identify and address bottlenecks.

Omniperf delivers detailed GPU kernel analysis for fine-tuning.

Together, these tools help to ensure efficient use of developer resources, leading to faster AI training, AI inference and HPC simulations.

FP8 Support

Broader FP8 support can improve the performance of AI inferencing.

FP8 is an 8-bit floating point format that provides a common, interchangeable format for both AI training and inference. It lets AI models operate and perform consistently across hardware platforms.

In ROCm, FP8 support streamlines the running of AI models, particularly in inferencing. It does this by addressing key challenges such as the memory bottlenecks and high latency associated with higher-precision formats. In addition, FP8's reduced-precision calculations can decrease the latency involved in data transfers and computations, with little to no loss of accuracy.
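For a concrete feel of the precision trade-off, the sketch below rounds a float to the nearest value an E4M3-style FP8 format (1 sign bit, 4 exponent bits, 3 mantissa bits) can represent. It is illustrative only and skips the real format's exponent-range, subnormal, NaN and saturation rules.

```python
import math

# Sketch of rounding a float to an FP8 E4M3-style value: 1 sign bit,
# 4 exponent bits, 3 mantissa bits. Illustrative only; it ignores the
# real format's exponent range, subnormals, NaN and saturation rules.

def to_fp8_e4m3(x):
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    m, e = math.frexp(abs(x))     # abs(x) == m * 2**e, with 0.5 <= m < 1
    mant = round(m * 16) / 16     # keep 4 significant bits (1 implicit + 3)
    return sign * math.ldexp(mant, e)

# 3.3 is not representable with 3 mantissa bits; it rounds to 3.25.
print(to_fp8_e4m3(3.3))   # 3.25
print(to_fp8_e4m3(1.0))   # 1.0
```

Each value occupies only 8 bits instead of 16 or 32, which is where the memory and bandwidth savings for inference come from.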

ROCm 6.2 expands FP8 support across its ecosystem, from frameworks to libraries and more, enhancing performance and efficiency.

Do More:

Watch the related video podcast:


Which media server should you use when you absolutely can’t lose data?

Featured content

Which media server should you use when you absolutely can’t lose data?

A new Linus Tech Tip video shows a real-world implementation of Supermicro storage servers powered by AMD EPYC processors to provide super-high reliability.


Are your customers looking for a top-performing media server? And are you looking for a surprisingly entertaining video review of the best one? Then look no further. You’ll find both in the latest Linus Tech Tip video.

This episode, sponsored by Supermicro, is entitled “This Server CANNOT Lose Data.” That gives you an idea of its primary focus: high reliability.

And that reliability is delivered courtesy of a sophisticated server/storage cluster featuring Supermicro GrandTwin A+ multinode servers.

Myriad redundancies

What makes the GrandTwin so reliable? Redundancy. As video host Linus Sebastian exclaims, “Inside this 2U are 4 independent computers!”

Each computer, or node, is powered by a 2.45GHz AMD EPYC processor with up to 128 cores and a 256MB L3 cache. Each node also has 4 front hot-swap 2.5-inch drive bays that can hold petabytes of either NVMe or SATA storage.

The GrandTwin’s nodes can handle up to 3TB of DDR5 ECC server memory. They also have dual M.2 slots for boot drives and 6 PCIe Gen 5 x16 slots for networking, graphics and other expansion cards.

GrandTwin’s high-availability design extends all the way down to its dual power supplies. To ensure the system always has a reliable flow of power to all its vital components, Supermicro added two redundant 2200-watt titanium-level PSUs.

Handling the heat generated by this monster machine is paramount. The GrandTwin takes care of all that hot air via 4 high-speed fans—one fan in each PSU, plus 2 dedicated heavy-duty 8-cm fans spinning at more than 17,000 RPM.

Prime processing

At the core of each of the GrandTwin’s 4 nodes is an AMD EPYC 9004-series processor. Linus’ prized media server, known as “Whonnock 10,” sports an AMD EPYC 9534 CPU in each node.

The EPYC 9534’s cores—there are 64 of them—operate at 2.45GHz and can boost up to 3.7GHz. And because each EPYC processor boasts 12 memory channels, the GrandTwin can address up to 12TB of memory systemwide.

Don’t call it overkill

As Linus says with unbridled enthusiasm, when it comes to redundancy, the name of the game is avoiding “split brain.”

The dreaded split brain can occur when redundant servers have their own object storage. The failure of even a single system can lead to a situation in which each server believes it has the correct data.

If there are only 2 servers, proving which system is correct is impossible. On the other hand, operating 3 or more servers allows the system to resolve the argument and determine the correct data.
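That majority-vote principle can be sketched in a few lines of Python. This is a generic illustration of quorum, not WEKA's actual protocol.

```python
from collections import Counter

# Generic majority-vote sketch of why an odd-sized cluster can resolve a
# split brain. Illustration only; not WEKA's actual protocol.

def resolve(replica_values):
    """Return the strict-majority value, or None when no majority exists."""
    value, votes = Counter(replica_values).most_common(1)[0]
    return value if votes > len(replica_values) / 2 else None

print(resolve(["v1", "v2"]))          # None: 2 replicas can't break a tie
print(resolve(["v1", "v2", "v1"]))    # v1: the faulty replica is outvoted
```

With only two replicas a disagreement is a dead heat; a third replica always yields a strict majority when at most one node holds bad data.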

Linus and company installed 2 GrandTwin A+ servers. That gives them the 8 redundant systems recommended by their preferred NVMe file system, WEKA.

A multitude of use cases

Your customers may have to contend with thousands of hours of high-resolution videos, like Linus and his cohorts. Or they may develop AI-enabled applications, provide cloud gaming, or host mission-critical web applications.

Whatever the use case, they can benefit from high-reliability servers designed with built-in redundancies. When failure is not an option, your customers need a server that, as the video puts it, “CANNOT lose data.”

That means helping your customers deploy Supermicro GrandTwin A+ servers powered by AMD EPYC processors. It’s the ultimate high-reliability system.

After all, as Linus says, “You only server once.”

Do more:

 

 


Supermicro expands rack capacity so you get servers faster & greener

Featured content

Supermicro expands rack capacity so you get servers faster & greener

Supermicro recently announced that it has expanded its capacity and can now provide 5,000 fully integrated, liquid-cooled racks per month. 


Would you and your customers like to get faster delivery of Supermicro rackmount systems while also helping the environment?

Now you can.

Supermicro recently announced that it has expanded its capacity and can now provide 5,000 fully integrated, liquid-cooled racks per month. That’s because Supermicro now has integration facilities in four countries: the United States, Taiwan, the Netherlands and Malaysia.

Supermicro also keeps in stock a certain number of commonly ordered rack configurations, what the company calls “golden SKUs.”

Between those systems and the company’s global locations, Supermicro can now deliver its rackmount systems both faster and over shorter distances. For example, Supermicro could ship a system to a customer in, say, Michigan from its Silicon Valley facility rather than from Taiwan, halfway around the world.

That shorter shipping distance also means less fuel needed and less polluting greenhouse gas produced. That’s an environmental win-win.

Get rolling with a rack

You can rely on Supermicro for data center IT solutions including on-site delivery, deployment, integration and benchmarking to achieve optimal operational efficiency.

Here’s how Supermicro’s rack delivery works in 3 steps:

Step 1: You start with proven reference designs for rapid installation while considering your clients' unique business objectives.

Step 2: You then work collaboratively with Supermicro-qualified experts to design optimized solutions for specific workloads. A prototype is designed and created for small-scale testing.

Step 3: Upon delivery, the racks need only be connected to power, networking and the liquid-cooling infrastructure. In other words, it’s a nearly seamless plug-and-play methodology.

Two areas of special interest for Supermicro are AI and liquid cooling. For AI, Supermicro plans to support AMD’s forthcoming Instinct MI300X accelerator, expected to be formally announced later this year. As for liquid cooling, it’s a technology Supermicro expects will soon be adopted by as many as 1 in 5 data centers worldwide as CPUs and GPUs continue to get hotter.

Do more:

 

