Sponsored by:

Visit AMD Visit Supermicro

Capture the full potential of IT

Tech Explainer: What’s a short-depth server?

Featured content

Tech Explainer: What’s a short-depth server?

Do your customer have locations that need server compute power, but lack data centers? Short-depth servers to the rescue!

Learn More about this topic
  • Applications:
  • Featured Technologies:

There are times when a standard-sized server just won’t do. Maybe your customer’s branch office or retail store has space constraints. Maybe they have concerns over portability. Or maybe their sustainability goals demand a solution that requires low power and efficient cooling.

For these and other related situations, short-depth servers can fit the bill. These relatively diminutive boxes are designed for use in less-than-ideal physical spaces that nevertheless demand high-performance IT infrastructure.

What kinds of organizations could benefit from short-depth server? Consider your local retail store. It’s likely been laid out using a calculus that prioritizes profit per square inch. This means the store’s best spots are dedicated to attracting buyers and generating revenue.

While that’s smart in terms of retail finance, it may not leave much room for vital infrastructure. That includes the servers that power the store’s point of sale (POS), security, advertising and data-collection systems.

This is a case where short-depth servers can help. These systems provide high levels of compute, storage and networking—without needing tall data center racks, elaborate cooling systems or other supporting infrastructure.

Other good candidates for using short-depth servers include remote branch offices, telco edge installations and industrial environments. In other words, any location that needs enterprise-level servers, but is short on space.

Small but Mighty

What’s more, today’s short-depth servers can handle some serious workloads.

Consider, for instance, the Supermicro WIO A+ Server (AS -1115SV-WTNRT), powered by AMD EPYC 8004 series processors. This short-depth server is engineered to tackle a variety of workloads, including virtualization, firewall applications, database, storage, edge and cloud computing.

The WIO A+ ships as a 1U form factor with a depth of just 23.5 inches. Compared with one of Supermicro’s big 8U multi-GPU servers, which has a depth of more than 33 inches, the short-depth server is short indeed.

Yet despite its diminutive size, this Supermicro server is packed with a ton of power—and room to grow. A single AMD EPYC processor sits at the center of the action, aided by either one double-width or two single-width GPUs.

This server also has room for up to 768GB of ECC DDR5 memory. And it can accommodate up to 10 hot-swap drives for NVMe, SAS or SATA storage.

As if that weren’t enough, Supermicro also includes room in this server cabinet for two PCIe 5.0 x16 full-height, full-length (FHFL) expansion cards. There’s also space for a single PCIe 5.0 x16 low-profile (LP) card.

More Power for Smaller Space

Fitting enough tech into a short-depth server can be a challenge. To do this, Supermicro’s designers had a few tricks up their sleeves.

For one, they used a custom motherboard instead of the more common ATX or EEB types. This creates more space in the smaller chassis. It also lets the designers employ a high-density component layout. The processors, GPUs, drives and other elements are placed closer to each other than they could be in a standard server.

Supermicro’s designers also deployed low-profile heat sinks. These use pipes that direct the heat toward fans. To save space, the fans are smaller than usual, but make up the difference by running faster. Sure, faster fans can create more noise. But it’s a worthy trade-off to avoid system failure due to overheating.

Are there downsides to the smaller form factor? There can be. For one, constrained airflow could force a system to throttle both processor and GPU performance in an effort to prevent heat-related issues. This could be an issue when running highly resource-intensive VM workloads.

For another, the smaller power supply units (PSUs) used in many short-depth servers may necessitate a less-powerful configuration than a user might prefer. For example, Supermicro’s short-depth server includes two 860-watt power supplies. That’s far less available power than the company’s multi-GPU powerhouse, which comes with six 5,250-watt PSUs. Of course, from another perspective, the need for less power can be seen as a benefit, especially at remote edge locations.

Short-depth servers represent a useful trade-off. While they give up some power and expandability, their reduced sizes can help IT pros make the most of tight spaces.

Do More:

 

Featured videos


Events


Find AMD & Supermicro Elsewhere

Related Content

How Supermicro/AMD servers boost AI boost performance with MangoBoost

Featured content

How Supermicro/AMD servers boost AI boost performance with MangoBoost

Supermicro and MangoBoost are together delivering an optimized end-to-end GenAI stack. It’s based on Supermicro servers powered by AMD Instinct GPUs and running MangoBoost’s LLMBoost software.

Learn More about this topic
  • Applications:
  • Featured Technologies:

While many organizations are implementing AI for business, many are also discovering that deploying and operating large language models (LLMs) at scale isn’t easy.

They’re finding that the hardware demands are intense. And so are the performance and cost trade-offs. Also, with AI workloads increasingly demanding multi-node GPU clusters, orchestration and tuning can be complex.

To address these challenges, Supermicro and MangoBoost Inc. are working together to deliver an optimized end-to-end GenAI stack. They’ve combined Supermicro’s robust AMD Instinct GPU server portfolio with MangoBoost’s LLMBoost software.

Meet MangoBoost

If you’re unfamiliar with MangoBoost, the company offers programmable solutions that improve data-center application performance while lowering CPU overhead. MangoBoost was founded three years ago; today it operates in the United States, Canada and South Korea.

MangoBoost’s core product is called the Data Processing Unit. It ensures full compatibility with general-purpose GPUs, accelerators and storage devices, enabling cost-efficient and standardized AI infrastructures.

MangoBoost also offers a ready-to-deploy, full-stack AI inference server. Known as Mango LLMBoost, it’s available from the Big Three cloud providers—AWS, Microsoft Azure and Google Cloud.

LLMBoost helps organizations accelerate both the training and deploying LLM at scale. Why is this so challenging? Because once a model is ready for inference, developers face what’s known as a “productization tax.”

Integrating the machine-learning processing pipeline into the rest of the application often requires additional time and engineering effort. And this can lead to delays.

Mango LLMBoost addresses these challenges by creating an easy-to-use container. This lets LLM experts optimize their models, then select suitable GPUs on demand.

MangoBoost’s inference engine uses three forms of GPU parallelism, allowing GPUs to balance their compute, memory and network-resource usage. In addition, the software’s intelligent job scheduling optimizes cluster-wide GPU resources, ensuring that the load is balanced equally across GPU nodes.

LLMBoost also ensures the effective use of low-latency GPU caches and high-bandwidth memory through quantization. This reduces the data footprint, but without lowering accuracy.

Complementing Hardware

MangoBoost’s LLMBoost software complements the powerful hardware with a full-stack, production-ready AI MLOps platform. It includes:

  • Plug-and-play deployment: Pre-built Docker images and an intuitive command-line interface (CLI) both help developers to launch LLM workloads quickly.
  • OpenAI-compatible API: Lets developers integrate LLM endpoints with minimal code changes.
  • Kubernetes-native orchestration: Provides automated deployment and management of autoscaling, load balancing and job scheduling for seamless operation across both single- and multi-node clusters.
  • Full-stack performance auto-tuning: Unlike conventional auto-tuners that handle model hyper-parameters only, LLMBoost optimizes every layer from the inference and training back-ends to network configurations and GPU runtime parameters. This ensures maximum hardware utilization, yet without requiring any manual tuning.

Proof of Performance

Supermicro and MangoBoost collaborating to deliver an optimized end-to-end Generative AI stack sounds good. But how does the combined solution actually perform?

To find out, Supermicro, AMD and MangoBoost recently tested their combined solution using real-world GenAI workloads. Here are the results:

  • LLMBoost reduced training time by 40% for two-node training, down to 13.3 minutes on a server built around a dual-node AMD Instinct MI325X. The training was done running Llama 2 70B, an LLM with 70 billion parameters, with LoRA (low-rank adaptation).
  • LLMBoost achieved a 1.96X higher throughput for multiple-node inference on Supermicro AMD servers. That was up to over 61,000 tokens/sec. on a dual-node AMD Instinct MI325X configuration.
  • In-house LLM inference with Llama 4 Maverick and Scout models achieved near-linear scaling on AMD Instinct MI325X nodes. (Maverick is designed for fast responses at low cost; Scout, for long-document analysis.) This shows that Supermicro systems are ready for real-time GenAI deployment.
  • Load balancing: The researchers used LLaVA, an image-capturing model, on three setups. The heterogeneous dual-node configuration—eight AMD Instinct MI300X GPUs and eight AMD Instinct MI325X GPUs—achieved 96% of the sum of individual single-node runs. This demonstrates minimal overhead and high efficiency.

Are your customers looking for a turnkey GenAI cluster solution that’s high-performance, flexible and easy to operate? Then tell them that Supermicro, AMD and MangoBoost have their solution—and the proof that it works.

Do More:

 

Featured videos


Events


Find AMD & Supermicro Elsewhere

Related Content

Need AI for financial services? Supermicro and AMD have your solution

Featured content

Need AI for financial services? Supermicro and AMD have your solution

Financial services companies are making big investments in AI. To speed their time to leadership, Supermicro and AMD are partnering to deliver advanced computing systems.

Learn More about this topic
  • Applications:
  • Featured Technologies:

Financial services companies earn their keep by investing in stocks, bonds and other financial instruments. Now these companies are also making big investments in artificial intelligence technology.

To help these financial services industry (FSI) players adopt AI, Supermicro and AMD are working together. The two are partnering to offer advanced computing solutions designed to empower and speed the finance industry’s move to technology and business leadership.

FSI companies can use these systems to:

  • Detect risks faster, uncovering patterns and anomalies by ingesting ever-larger data sets
  • Supercharge trading with AI in both the front- and back-office
  • Modernize core processes to lower costs while boosting resilience
  • Engage and delight customers by meeting—even exceeding—their expectations

Big Spenders

Already, FSI spending on AI technology is substantial. Last year, when management consulting firm Bain & Co. surveyed nearly 110 U.S. FSI firms, it found that those respondents with annual revenue of at least $5 billion were spending an average of $221 million on AI.

The companies were getting a good return on AI, too. Bain found that 75% of financial services companies said their generative AI initiatives were either achieving or exceeding their expected value. In addition, the GenAI users reported an average productivity gain across all uses of an impressive 20%.

Based on those findings, Bain estimates that by embracing AI, FSI firms can reduce their customer-service costs by 20% to 30% while increasing their revenue by about 5%. 

Electric Companies

One big issue facing all users of AI is meeting the technology’s energy needs. Power consumption is a big-ticket item, accounting for about 40% of all data center costs, according to professional services firm Deloitte.

Greater AI adoption could push that even higher. Deloitte believes global data center electric consumption could double by as soon as 2030, driven by big increases in GenAI training and inference.

As Deloitte points out, some of that will be the result of new hardware requirements. While general-purpose data center CPUs typically run at 150 to 200 watts per chip, the GPUs used for AI run at up to 1,200 watts per chip.

This can also increase the power demand per rack. As of early 2024, data centers typically supported rack power requirements of at least 20 kilowatts, Deloitte says. But with growth of GenAI, that’s expected to reach 50 kilowatts per rack by 2027.

That growth is almost sure to come. Market watcher Grand View Research expects the global market for GPUs in data centers of all industries to rise over the next eight years at a compound annual growth rate (CAGR) of nearly 36%. That translates into data-center GPU sales leaping from $14.48 billion worldwide last year to $190.1 billion in 2033, Grand View predicts.

Partner Power

FSI companies don’t have to meet these challenges alone. Supermicro and AMD have partnered to deliver advanced computing systems that deliver high levels of compute performance and flexibility, yet with a comparatively low total cost of ownership (TCO).

They’re boosting performance with high-performing, dense 4U servers using the latest AMD EPYC CPUs and AMD Instinct GPUs. Some of these servers offer up to 60 storage drive bays, 9TB of DDR5 RAM and 192 CPU cores.

For AI workloads, AMD offers the AMD EPYC 9575F AI host node. It has 64 cores and a maximum boost frequency of up to 5 GHz.

Flexibility is another benefit. Supermicro offers modular Datacenter Building Block Solutions. These include system-level units that have been pre-validated to ease the task of data-center design, among other offerings.

AMD and Supermicro are also offering efficiencies that lower the cost of transforming with AI. Supermicro’s liquid cooling slashes the total cost of ownership (TCO). AMD processors are designed for power efficiency. And SMC’s multi-mode design gives you more processing capability per rack.

Are you working with FSI customers looking to lead the way with AI investments? The latest Supermicro servers powered by AMD CPUs and GPUs have your back.

Do More:

 

Featured videos


Events


Find AMD & Supermicro Elsewhere

Related Content

Validate, test and benchmark the latest AMD-powered servers with Supermicro JumpStart

Featured content

Validate, test and benchmark the latest AMD-powered servers with Supermicro JumpStart

Get a free test drive on cutting-edge Supermicro servers powered by the latest AMD CPUs and GPUs.

Learn More about this topic
  • Applications:
  • Featured Technologies:

How would you like free access to Supermicro’s first-to-market, high-end H14 servers powered by the latest AMD EPYC CPUs and Instinct GPUs?

Now it’s yours via your browser—and the Supermicro JumpStart program.

JumpStart offers you remote access to Supermicro servers. There, you can validate, test and benchmark your workloads. And assuming you qualify, using JumpStart is absolutely free.

While JumpStart has been around for some time, Supermicro has recently refreshed the program by including some of its latest H14 servers:

  • 8U server with eight AMD Instinct MI325X GPUs, dual AMD EPYC 9005 Series CPUs, 2TB of HBM3 memory (Supermicro model AS -8126GS-TNMR)
  • 2U server with dual AMD EPYC 9005 Series processors and up to 1.5TB of DDR5 memory (AS -2126HS-TN).
  • 1U cloud server with a single AMD EPYC 9005 Series processor (AS -1116CS-TN)

Supermicro has also updated JumpStart systems with its 1U E3.S all-Flash storage systems powered by a single AMD EPYC processor, so you can also test-drive the latest PCIe drives. Also, several of Supermicro’s H13 AMD-powered are available for remote access on JumpStart, as well.

How It Works

Getting started with JumpStart is easy:

Step 1: On the main JumpStart page, browse the available systems, then click the “get access” or “request access” button for the system you want to try. Then select your preferred system and time slot.

Step 2: Sign in. You can either login with your Supermicro single sign-on (SSO) account or create a new free account. Supermicro will then qualify your account and reach out with further instructions.

Step 3: When your chosen time arrives, secure access to your system. Most JumpStart sessions last for one week. If you need more time, that can often be negotiated with your Supermicro sales reps.

It's that simple.

Once you’re connected to a server via JumpStart, you can have up to three sessions open: one VNC (virtual network computing), one SSH (secure shell), and one IPMI (intelligent platform management interface).

JumpStart also protects your privacy. After your JumpStart trial is completed, the server and storage devices are manually erased. In addition, the BIOS and firmware are reflashed, and the operating system is re-installed with new credentials.

More protection is offered, too. A jump server is used as a proxy. This means that the server you’re testing can use the internet to get files, but it is not directly addressable via the internet.

That said, it’s recommended that you do not use the test servers for processing sensitive or confidential data. Instead, Supermicro advises the use of anonymized data only—mainly because the servers may follow security policies that differ from your own.

So what are you waiting for? Try out JumpStart and get free remote access to Supermicro’s cutting-edge servers powered by the latest AMD CPUs and GPUs.

Do More:

 

Featured videos


Events


Find AMD & Supermicro Elsewhere

Related Content

Tech Explainer: What is agentic AI?

Featured content

Tech Explainer: What is agentic AI?

Find out how new artificial intelligence systems can make decisions and take actions autonomously—that is, without human intervention.

Learn More about this topic
  • Applications:
  • Featured Technologies:

We’re on the precipice of a major AI evolution. Welcome to the era of agentic AI.

The official definition of agentic AI is artificial intelligence capable of making autonomous decisions. That is, without human oversight or intervention.

You can imagine agentic AI as a robot on a mission. This robot has been designed to think like a human. Give it a goal, and the robot can then evaluate the ongoing situation, reacting intelligently in pursuit of that defined goal.

For example, imagine you’re planning a visit to wineries in California’s Napa Valley. A standard AI chatbot like ChatGPT could help you find the closest airport with car-rental agencies, identify which airlines fly there, and locate nearby hotels. But it would still be up to you to compare prices and actually make the reservations.

But what if instead, your robot could autonomously plan—and book!—the entire trip based on your preferences? For example, you might engage an agentic AI like AutoGPT by telling it something like this:

“I want to go to Napa Valley and visit wineries. I don’t want to spend more than $3,000. I prefer Chardonnay and Syrah wines. I once had a bad experience with American Airlines. It would be fun to drive a convertible. A 3-star hotel is fine as long as it’s got good reviews.”

The promise of agentic AI is that it would use that information to plan and book your trip. The agentic AI would find you the best flight, car and hotel by interacting with each company’s APIs or even their own agentic AI—here referred to as “other agents.” This is also known as machine-to-machine (M2M) communications.

Your robot agent could also make your reservations at vineyards with critically acclaimed Chardonnay and Syrah wines. And it might even plan your route using details as granular as the range of the discounted rag-top Ford Mustang it found near the airport.

Agentic AI for Organizations

This personal Napa Valley scenario is one of those nice-to-have kinds of things. But for organizations, agentic AI has far more potential. This technology could eventually transform every major industry and vertical market.

For example, a retailer might use agentic AI to autonomously adjust a product’s price based on the current inventory level, availability and competitive brands.

A manufacturer could use an AI agent to manage procurement and create dynamic forecasting, saving the company time and money.

And in the public sector, agentic AI could help a government agency better respond to public-health emergencies like the next global pandemic. The AI could model viral transmission patterns, then send additional resources to the areas that need them the most.

In each case, we’re talking about the potential for a tireless virtual robot workforce. Once you give an agentic AI a mission, it can proceed without any further human intervention, saving you countless hours and dollars.

Training: Standard AI vs. Agentic

For all types of AI, one big issue is training. That’s because an AI system on its own doesn’t really know anything. To be useful, it first has to be trained.

And with training, there’s a huge difference between the way you train a standard AI and the way you train an AI that’s agentic. It’s as dramatic as the difference between programming a calculator and onboarding a new (human) intern.

With a standard AI chatbot, the system is trained to answer questions based on a relatively narrow set of parameters. To accomplish this, engineers provide massive amounts of data via large language models (LLMs). They then train the bot through supervised learning. Eventually, inferencing enables the AI to make predictions based on user input and available data.

By contrast, training an agentic AI focuses on memory, autonomy, planning and using available tools. Here, LLMs are paired with prompt engineering, long-term memory systems and feedback loops. These elements work together to create a type of intelligent thought process—the kind you hope your new intern is capable of!

Then, at the inferencing stage, the AI does far more than just answer questions. Instead, agentic AI inferencing enables the system to interpret goals, create plans, ask for help and, ultimately, execute tasks autonomously.

Nuts and Bolts

The IT infrastructure that powers agentic AI is no different from the horsepower behind your average chatbot. There’s just a lot more of it.

That’s because agentic AI, in comparison with standard AI, makes more inference calls, reads and writes more files, and queries more APIs. It also engages a persistent memory. That way, the AI can continuously access collected information as it works towards its goals.

However, having a slew of GPUs and endless solid-state storage won’t be enough to sustain what will likely be the meteoric growth of this cutting-edge technology. As agentic AI becomes more vital, IT managers will need a way to feed the fast-growing beast.

Supermicro’s current H14 systems—they include the GPU A+ Server—are powered by AMD EPYC 9005-series processors and fitted with up to 8 AMD Instinct MI325X Accelerators. Supermicro has designed these high-performance solutions to tackle the most challenging AI workloads.

Looking ahead, at AMD’s recent “Advancing AI” event, CEO Lisa Su introduced Helios, AMD’s vision for agentic AI infrastructure. Su said Helios will deliver the compute density, memory bandwidth, performance and scale-out bandwidth needed for the most demanding AI workloads. What’s more, Helios will come packaged as a ready-to-deploy AI rack solution that accelerates users’ time to market.

Helios, planned for release in 2026, will use several forthcoming products: AMD Instinct MI400 GPUs, AMD 6th Gen EPYC CPUs, and AMD Pensando “Vulcano” network interface cards (NICs). All will be integrated in an OCP-compliant rack that supports both UALink and Ultra Ethernet. And eventually, Helios will appear in turnkey systems such as the Supermicro H14 series.

What’s Next?

What else does agentic AI have in store for us? While no one has a crystal ball, it’s reasonable to assume we’ll see increasingly sophisticated agents infiltrating nearly every aspect of our lives.

For instance, agentic AI could eventually develop the ability to work autonomously on long-term, multifaceted projects—everything from advertising campaigns to biomedical research.

Agentic AI is also likely to learn how to debug its own logic and develop new tools. These capabilities are referred to by the pros as self-reflection and self-improvement, respectively.

One day in the not-too-distant future, we could even see massive teams of specialized AI agents working together under a single robotic project manager.

Think this is starting to sound like “The Matrix”? You ain’t seen nothin’ yet.

 

Featured videos


Events


Find AMD & Supermicro Elsewhere

Related Content

Deploy GenAI with confidence: Validated Server Designs from Supermicro and AMD

Featured content

Deploy GenAI with confidence: Validated Server Designs from Supermicro and AMD

Learn about the new Validated Design for AI clusters from Supermicro and AMD. It can save you time, reduce complexity and improve your ROI.

Learn More about this topic
  • Applications:
  • Featured Technologies:

The task of designing, building and connecting a server system that can run today’s artificial intelligence workloads is daunting.

Mainly, because there are a lot of moving parts. Assembling and connecting them all correctly is not only complicated, but also time-consuming.

Supermicro and AMD are here to help. They’ve recently co-published a Verified Design document that explains how to build an AI cluster. The PDF also tells you how you can acquire an AMD-powered Supermicro cluster for AI pre-built, with all elements connected, configured and burned in before shipping.

Full-Stack for GenAI

Supermicro and AMD are offering a fully validated, full-stack solution for today’s Generative AI workloads. The system’s scale can be easily adjusted from as few as 16 nodes to as many as 1,024—and points in between.

This Supermicro solution is based on three AMD elements: the AMD Instinct MI325X GPU, AMD Pensando Pollara 400 AI network interface card (NIC), and AMD EPYC CPU.

These three AMD parts are all integrated with Supermicro’s optimized servers. That includes network cabling and switching.

The new Validated Design document is designed to help potential buyers understand the joint AMD-Supermicro solution’s key elements. To shorten your implementation time, the document also provides an organized plan from start to finish.

Under the Cover

This comprehensive report—22 pages plus a lengthy appendix—goes into a lot of technical detail. That includes the traffic characteristics of AI training, impact of large “elephant” flows on the network fabric, and dynamic load balancing. Here’s a summary:

  • Foundations of AI Fabrics: Remote Direct Memory Access (RDMA), PCIe switching, Ethernet, IP and Border Gateway Protocol (BGP).
  • Validated Design Equipment and Configuration: Server options that optimize RDMA traffic with minimal distance, latency and silicon between the RDMA-capable NIC (RNIC) and accelerator.
  • Scaling Out the Accelerators with an Optimized Ethernet Fabric: Components and configurations including the AMD Pensando Pollara 400 Ethernet NIC and Supermicro’s own SSE-T8196 Ethernet switch.
  • Design of the Scale Unit—Scaling Out the Cluster: Designs are included for both air-cooled and liquid-cooled setups.
  • Resource Management and Adding Locality into Work Placement: Covering the Simple Linux Utility for Resource Management (SLURM) and topology optimization including the concept of rails.
  • Supermicro Validated AMD Instinct MI325 Design: Shows how you can scale the validated design all the way to 8,000 AMD MI325X GPUs in a cluster.
  • Storage Network Validated Design: Multiple alternatives are offered.
  • Importance of Automation: Human errors are, well, human. Automation can help with tasks including the production of detailed architectural drawings, output of cabling maps, and management of device firmware.
  • How to Minimize Deployment Time: Supermicro’s Rack Scale Solution Stack offers a fully integrated, end-to-end solution. And by offering a system that’s pre-validated, this also eases the complexity of multi-vendor integration.

Total Rack Solution

Looking to minimize implementation times? Supermicro offers a total rack scale solution that’s fully integrated and end-to-end.

This frees the user from having to integrate and validate a multi-vendor solution. Basically, Supermicro does it for you.

By leveraging industry-leading energy efficiency, liquid and air-cooled designs, and global logistics capabilities, Supermicro delivers a cost-effective and future-proof solution designed to meet the most demanding IT requirements.

The benefits to the customer include reduced operational overhead, a single point of accountability, streamlined procurement and deployment, and maximum return on investment.

For onsite deployment, Supermicro provides a turnkey, fully optimized rack solution that is ready to run. This helps organizations maximize efficiency, lower costs and ensure long-term reliability. It includes a dedicated on-site project manager.

Do More:

 

Featured videos


Events


Find AMD & Supermicro Elsewhere

Related Content

Research Roundup: Cloud infrastructure, smart supply chains, augmented reality, AI tools

Featured content

Research Roundup: Cloud infrastructure, smart supply chains, augmented reality, AI tools

Get briefed on the latest IT market surveys, forecasts and analysis. 

Learn More about this topic
  • Applications:

Cloud infrastructure sales are booming. Most supply-chain managers don’t have an AI strategy yet. VR/AR is making a surprising comeback. And nearly half of U.S. adults use GenAI tools.

That’s some of the latest from leading IT market watchers, survey organizations and analysts. And here’s your research roundup. 

Cloud Infrastructure Booming

The market for cloud infrastructure services is robust, with global sales hitting $90.9 billion in this year’s first quarter, a year-on-year rise of 21%, finds market watcher Canalys.

What’s behind the boom? AI, mostly. Canalys says enterprises realize that to deploy AI applications, they first need to strengthen their cloud power.

Also, cloud providers are working to lower the cost of AI usage, in part by investing in infrastructure. In the year’s first quarter, the big three cloud-service providers—AWS, Microsoft Azure and Google Cloud—collectively increased their spending on cloud infrastructure by 24%, according to Canalys.

Few Supply Chains Have AI Strategies

While AI has the potential to transform supply chains, fewer than one in four supply-chain leaders (23%) have a formal AI strategy in place. So finds a new survey by research firm Gartner.

And that’s a problem, says Gartner researcher Benjamin Jury. “Without a structured approach,” he warns, “organizations risk creating inefficient systems that struggle to scale and adapt to evolving business demands.”

The Gartner survey was conducted earlier this year. It reached 120 supply-chain leaders who have deployed AI in their organizations within the last year.

How can supply-chain leaders do better with AI? Gartner recommends three moves:

  • Develop a formal supply-chain AI strategy. It should be both defined and documented.
  • Adopt a Run-Grow-Transform framework. By implementing projects in all three states, organizations can better allocate resources and deliver quick results.
  • Invest in AI-ready infrastructure. Do this in collaboration with the CIO and other executives.

Virtual Reality’s Comeback

Remember all the excitement about virtual and augmented reality? It’s back.

The global market for AR/VR headsets rebounded in this year’s first quarter, with unit shipments rising 18% year-on-year, according to research firm IDC.

Meta, which changed its name from Facebook in 2021 to reflect the shift, now leads the AR/VR business with a 51% market share, IDC finds.

What’s behind the VR comeback? “The market is clearly shifting toward more immersive and versatile experiences,” offers Jitesh Ubrani, an IDC research manager.

Ubrani and colleagues expect even bigger gains ahead. IDC predicts global sales of AR/VR headsets will more than double by 2026, rising from about 5 million units this year to more than 10 million units next year.

IDC also expects the market to shift away from AR and VR and instead toward mixed reality (MR) and extended reality (ER). MR appeals mainly to gamers and consumers. ER will be used for gaming, too, but it should also power smart glasses, enabling AI to assist tasks such as identifying objects in photos and providing instant language translations.

IDC predicts smart glasses will enjoy wide appeal among consumers and businesses alike. Just last week, Meta and sunglasses maker Oakley announced what they call Performance AI glasses, featuring a built-in camera and open-ear speakers.

Do You Use GenAI?

The chances either way are almost even. More than one in four U.S. adults (44%) do use Generative AI tools such as ChatGPT at least sometimes. But over half (56%) never use these tools or only rarely.

Similarly, U.S. adults are split on whether AI will make life better or worse: 42% believe AI will make their lives somewhat or much worse, while a very close 44% think AI will make their lives somewhat or much better.

These findings come from a new NBC News poll. Powered by Survey Monkey, the poll was conducted from May 30 to June 10, and it received responses from more than 19,400 U.S. adults.

Respondents were also evenly split when asked about the role of AI in schools. Slightly over half the respondents (53%) said integrating AI tools in the classroom would prepare students for the future. Conversely, nearly as many (47%) said they favor prohibiting AI in the classroom.

The NBC survey found that attitudes toward AI were unaffected by political leanings. The pollsters asked respondents whether they were Republicans, Democrats or Independents. Differences in responses by political leaning were mostly within the poll’s overall margin of error, which NBC News put at plus or minus 2.1%.

 

Featured videos


Events


Find AMD & Supermicro Elsewhere

Related Content

Tech Explainer: What’s special about an AI server?

Featured content

Tech Explainer: What’s special about an AI server?

What’s in an AI server that a general-purpose system lacks?

Learn More about this topic
  • Applications:
  • Featured Technologies:

The Era of Artificial Intelligence requires its own class of servers, and rightly so. The AI tech that increasingly powers our businesses, finance, entertainment and scientific research is some of the most resource-intensive in history. Without AI servers, all this would grind to a halt.

But why? What’s so special about AI servers? And how are they able to power successive evolutions of large language models, generative AI, machine learning, and all the other AI-based workloads we’ve come to rely on day in and day out?

Put another way: What do AI servers have that standard servers don’t?

The answer can be summed up in a single word: More.

When it comes to AI servers, it’s all about managing a symphony. The musical instruments include multiple processors, GPUs, memory modules, networking hardware and expansion options.

Sure, your average general-purpose server has many similar components. But both the quantity and performance of each component is considerably lower than those of an AI server. That helps keep the price affordable, heat low, and workload options open. But it certainly doesn’t have the integrated GPU needed to run AI workloads.

Best of the Beasts

Supermicro specializes in the deployment of jaw-dropping power. The company’s newest 8U GPU Server (AS -8126GS-TNMR) is engineered to chew through the world’s toughest AI workloads. It’s powered by dual AMD EPYC processors and eight AMD Instinct MI350X or Instinct MI325X accelerators. This server can tackle AI workloads while staying cool and scaling up to meet increasing demand.

Keeping AI servers from overheating can be a tough job. Even a lowly, multipurpose business server kicks off a lot of heat. Temperatures build up around vital components like the CPU, GPU and storage devices. If that heat hangs around too long, it can lead to performance issues and, eventually, system failure.

Preventing heat-related issues in a single general-purpose server can be accomplished with a few heatsinks and small-diameter fans. But when it comes to high-performance, multi-GPU servers like Supermicro’s new 4U GPU A+ Server (AS -4126GS-NMR-LCC), liquid cooling becomes a must-have.

It’s also vital that AI servers be designed with expansion in mind. When an AI-powered app becomes successful, IT managers must be able to scale up quickly and without interruption.

Supermicro’s H14 8U 8-GPU System sets the standard for scalability. The H14 offers up to 20 storage drives and up to 12 PCI Express 5.0 (PCIe) x16 expansion slots.

Users can fill these high-bandwidth slots with a dizzying array of optional hardware, including:

  • Network Interface Cards (NICs) like the new AI-focused AMD AI NIC for high-speed networking.
  • NVMe storage to provide fast disk access.
  • Field Programmable Gate Array (FPGA) modules, which can be set up for custom computation and reconfigured after deployment.
  • Monitoring and control management cards. These enable IT staff to power servers on and off remotely, and also access BIOS settings.
  • Additional GPUs to aid in AI training and inferencing.
  • AI Accelerators. The AMD Instinct series is designed to tackle computing for AI, both training and inference.

A Different Class of Silicon

Hardware like the Supermicro GPU Server epitomizes what it means to be an AI server. That’s due in part to the components it’s designed to house. We’re talking about some of the most advanced processing tech available today.

As mentioned above, that tech comes courtesy of AMD, whose 5th Gen AMD EPYC 9005 series processors and recently announced AMD Instinct MI350 Series GPUs are powerful enough to tackle any AI workload.

AMD’s Instinct MI350 accelerators deliver a 4x generation-on-generation AI compute increase and a 35x generational leap in inferencing.

Say the word, and Supermicro will pack your AI Server with dual AMD EPYC processors containing up to 192 cores. They’ll install the latest AMD Instinct M1350X platform with 8 GPUs, fill all 24 DIMM slots with 6TB of DDR5 memory, and add an astonishing 16 NVMe U.2 drives. 

Advances Just Around the Corner

It seems like each new day brings stories about bold advances in AI. Apparently, our new robot friends may have the answer to some very human questions like, how can we cure our most insidious diseases? And how do we deal with the looming threat of climate crisis?

The AI models that could answer those questions—not to mention the ones that will help us find even better movies on Netflix—will require more power as they grow.

To meet those demands, AI server engineers are already experimenting with the next generation of advanced cooling for dense GPU clusters, enhanced hardware-based security, and new, more scalable modular infrastructure.

In fact, AI server designers have begun using their own AI models to create bigger and better AI servers. How very meta.

Do More:

 

Featured videos


Events


Find AMD & Supermicro Elsewhere

Related Content

Meet Supermicro’s newest AI servers, powered by AMD Instinct MI350 Series GPUs

Featured content

Meet Supermicro’s newest AI servers, powered by AMD Instinct MI350 Series GPUs

Supermicro’s new AI servers are powered by a combination of AMD EPYC CPUs and AMD Instinct GPUs.

Learn More about this topic
  • Applications:
  • Featured Technologies:

Supermicro didn’t waste any time supporting AMD’s new Instinct MI350 Series GPUs. The same day AMD formally introduced the new GPUs, Supermicro announced two rack-mount servers that support them.

The new servers, members of Supermicro’s H14 generation of GPU optimized solutions, feature dual AMD EPYC 9005 CPUs along with the AMD Instinct MI350 series GPUs. They’re aimed at organizations looking to achieve a formerly tough combination: maximum performance at scale in their AI-driven data centers, but also a lower total cost of ownership (TCO).

To make the new servers easy to upgrade and scale, Supermicro has designed the new servers around its proven building-block architecture.

Here’s a quick look at the two new Supermicro servers:

4U liquid-cooled system with AMD Instinct MI355X GPU

This system, model number AS -4126GS-NMR-LCC, comes with a choice of dual AMD EPYC 9005 or 9004 Series CPUs, both with liquid cooling.

On the GPU front, users also have a choice of the AMD Instinct MI325X or brand-new AMD Instinct MI355X. Either way, this server can handle up to 8 GPUs.

Liquid cooling is provided by a single direct-to-chip cold plate. Further cooling comes from 5 heavy-duty fans and an air shroud.

8U air-cooled system with AMD Instinct MI350X GPU

This system, model number AS -8126GS-TNMR, comes with a choice of dual AMD EPYC 9005 or 9004 Series CPUs, both with air cooling.

This system also supports both the AMD Instinct MI325X and AMD Instinct MI350X GPUs. Also like the 4U server, this system supports up to 8 GPUs.

Air cooling is provided by 10 heavy-duty fans and an air shroud.

The two systems also share some features in common. These include PCIe 5.0 connectivity, large memory capacities (up to 2.3TB), and support for both AMD’s ROCm open-source software and AMD Infinity Fabric Link connections for GPUs.

“Supermicro continues to lead the industry with the most experience in delivering high-performance systems designed for AI and HPC applications,” says Charles Liang, president and CEO of Supermicro. “The addition of the new AMD Instinct MI350 series GPUs to our GPU server lineup strengthens and expands our industry-leading AI solutions and gives customers greater choice and better performance as they design and build the next generation of data centers.”

Do More:

 

Featured videos


Events


Find AMD & Supermicro Elsewhere

Related Content

AMD presents its vision for the AI future: open, collaborative, for everyone

Featured content

AMD presents its vision for the AI future: open, collaborative, for everyone

Check out the highlights of AMD’s Advancing AI event—including new GPUs, software and developer resources.

Learn More about this topic
  • Applications:
  • Featured Technologies:

AMD advanced its AI vision at the “Advancing AI” event on June 12. The event, held live in the Silicon Valley city of San Jose, Calif., as well as online, featured presentations by top AMD executives and partners.

As many of the speakers made clear, AMD’s vision for AI is that it be open, developer-friendly, collaborative and useful to all.

AMD certainly believes the market opportunity is huge. During the day’s keynote, CEO Lisa Su said AMD now believes the total addressable market (TAM) for data-center AI will exceed $500 billion by as soon as 2028.

And that’s not all. Su also said she expects AI to move beyond the data center, finding new uses in edge computers, PCs, smartphone and other devices.

To deliver on this vision, Su explained, AMD is taking a three-pronged approach to AI:

  • Offer a broad portfolio of compute solutions.
  • Invest in an open development ecosystem.
  • Deliver full-stack solutions via investments and acquisitions.

The event, lasting over two hours, was also filled with announcements. Here are the highlights.

New: AMD Instinct MI350 Series

At the Advancing AI event, CEO Su formally announced the company’s AMD Instinct MI350 Series GPUs.

There are two models, the MI350X and MI355X. Though both are based on the same silicon, the MI355X supports higher thermals.

These GPUs, Su explained, are based on AMD’s 4th gen Instinct architecture, and each GPU comprises 10 chiplets containing a total of 185 billion transistors. The new Instinct solutions can be used for both AI training and AI inference, and they can also be configured in either liquid- or air-cooled systems.

Su said the MI355X delivers a massive 35x general increase in AI performance over the previous-generation Instinct MI300. For AI training, the Instinct MI355X offers up to 3x more throughput than the Instinct MI300. And in comparison with a leading competitive GPU, the new AMD GPU can create up to 40% more tokens per dollar.

AMD’s event also featured several representatives of companies already using AMD Instinct MI300 GPUs. They included Microsoft, Meta and Oracle.

Introducing ROCm 7 and AMD Developer Cloud

Vamsi Boppana, AMD’s senior VP of AI, announced ROCm 7, the latest version of AMD’s open-source AI software stack. ROCm 7 features improved support for industry-standard frameworks; expanded hardware compatibility; and new development tools, drivers, APIs and libraries to accelerate AI development and deployment.

Earlier in the day, CEO Su said AMD’s software efforts “are all about the developer experience.” To that end, Boppana introduced the AMD Developer Cloud, a new service designed for rapid, high-performance AI development.

He also said AMD is giving developers a 25-hour credit on the Developer Cloud with “no strings.” The new AMD Developer Cloud is generally available now.

Road Map: Instinct MI400, Helios rack, Venice CPU, Vulcano NIC

During the last segment of the AMD event, Su gave attendees a sneak peek at several forthcoming products:

  • Instinct MI400 Series: This GPU is being designed for both large-scale AI inference and training. It will be the heart of the Helios rack solution (see below) and provide what Su described as “the engine for the next generation of AI.” Expect performance of up to 40 petaflops, 432GB of HBM4 memory, and bandwidth of 19.6TB/sec.
  • Helios: The code name for a unified AI rack solution coming in 2026. As Su explained it, Helios will be a rack configuration that functions like a single AI engine, incorporating AMD’s EPYC CPU, Instinct GPU, Pensando Pollara network interface card (NIC) and ROCm software. Specs include up to 72 GPUs in a rack and 31TB of HBM3 memory.
  • Venice: This is the code name for the next generation of AMD EPYC server CPUs, Su said. They’ll be based on a 2nm form, feature up to 256 cores, and offer a 1.7x performance boost over the current generation.
  • Vulcano: A future NIC, it will be built using a 3nm form and feature speeds of up to 800Gb/sec.

Do More:

 

 

Featured videos


Events


Find AMD & Supermicro Elsewhere

Related Content

Pages