Supermicro FlexTwin now supports 5th gen AMD EPYC CPUs

FlexTwin, part of Supermicro’s H14 server line, now supports the latest AMD EPYC processors — and keeps things chill with liquid cooling.

 


Wondering about the server of the future? It’s available for order now from Supermicro.

The company recently added support for the latest 5th Gen AMD EPYC 9005 Series processors on its 2U 4-node FlexTwin server with liquid cooling.

This server is part of Supermicro’s H14 line and bears the model number AS -2126FT-HE-LCC. It’s a high-performance, hot-swappable and high-density compute system.

Intended users include oil & gas companies, climate and weather modelers, manufacturers, scientific researchers and research labs. In short, anyone who requires high-performance computing (HPC).

Each 2U system comprises four nodes. And each node, in turn, is powered by a pair of 5th Gen AMD EPYC 9005 processors. (The previous-gen AMD EPYC 9004 processors are supported, too.)

Memory on this Supermicro FlexTwin maxes out at 9TB of DDR5, courtesy of up to 24 DIMM slots. Expansion cards connect via PCIe 5.0, with one slot per node standard and more available as an option.

The 5th Gen AMD EPYC processors, introduced last month, are designed for data center, AI and cloud customers. The series launched with over 25 SKUs offering up to 192 cores and all using AMD’s new “Zen 5” or “Zen 5c” architectures.

Keeping Cool

To keep things chill, the Supermicro FlexTwin server is available with liquid cooling only. This allows the server to be used for HPC, electronic design automation (EDA) and other demanding workloads.

More specifically, the FlexTwin server uses a direct-to-chip (D2C) cold plate liquid cooling setup, and each system also runs 16 counter-rotating fans. Supermicro says this cooling arrangement can remove up to 90% of server-generated heat.

The server’s liquid cooling also covers the 5th Gen AMD EPYC processors’ more demanding cooling requirements; they’re rated at up to 500W of thermal design power (TDP). By comparison, some members of the previous 4th Gen AMD EPYC series have default TDPs as low as 200W.

Build & Recycle

The Supermicro FlexTwin server also adheres to the company’s “Building Block Solutions” approach. Essentially, this means end users purchase these servers by the rack.

Supermicro says its Building Blocks let users optimize for their exact workload. Users also gain efficient upgrading and scaling.

Looking further ahead, once these servers are ready for replacement, they can be recycled through the Supermicro recycling program.

In Europe, Supermicro follows the EU’s Waste Electrical and Electronic Equipment (WEEE) Directive. In the U.S., recycling is free in California; users in other states may have to pay a shipping charge.

Put it all together, and you’ve got a server of the future, available to order today.

Tech Explainer: What is the AMD “Zen” core architecture?

Originally launched in 2017, this CPU architecture now delivers high performance and efficiency with ever-thinner processes.


The recent release of AMD’s 5th generation processors—formerly codenamed Turin—also heralded the introduction of the company’s “Zen 5” core architecture.

“Zen” is AMD’s name for a design ethos that prioritizes performance, scalability and efficiency. As any CTO will tell you, these 3 aspects are crucial for success in today’s AI era.

AMD originally introduced its “Zen” architecture in 2017 as part of a broader campaign to steal market share and establish dominance in the all-important enterprise IT space.

Subsequent generations of the “Zen” design have markedly increased performance and efficiency while delivering ever-thinner manufacturing processes.

Now and Zen

Since the “Zen” core’s original appearance in AMD Ryzen 1000-series processors, the architecture’s design philosophy has maintained its focus on a handful of vital aspects. They include:

  • A modular chiplet design. Its Infinity Fabric interconnect facilitates efficient connectivity among multiple CPU cores and other components. This modular architecture enhances scalability and performance, both of which are vital for modern enterprise IT infrastructure.
  • High core counts and multithreading. Both are common to EPYC and Ryzen CPUs built using the AMD “Zen” core architecture. Simultaneous multithreading (SMT) enables each core to process 2 threads; a quick sketch follows this list. In the case of EPYC processors, this makes AMD’s CPUs ideal for multithreaded workloads that include Generative AI, machine learning, HPC and Big Data.
  • Advanced manufacturing processes. These allow faster, more efficient communication among individual CPU components, including multithreaded cores and multilevel caches. Back in 2017, the original “Zen” architecture was manufactured using a 14-nanometer (nm) process. Today’s new “Zen 5” and “Zen 5c” architectures (more on these below) reduce the lithography to just 4nm and 3nm, respectively.
  • Enhanced efficiency. This enables IT staff to better manage complex enterprise IT infrastructure. Reducing heat and power consumption is crucial, too, both in data centers and at the edge. The AMD “Zen” architecture makes this possible with enterprise-grade EPYC processors that pack up to 192 cores, yet require a maximum thermal design power (TDP) of only 500W.
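
To make the multithreading point concrete, here’s a minimal Python sketch that reports how many hardware threads the operating system sees per physical core. It assumes the third-party psutil package is installed; on an SMT-enabled EPYC system, the logical CPU count should come out at twice the physical core count.

```python
import psutil

# Physical cores vs. hardware threads (logical CPUs) visible to the OS.
physical = psutil.cpu_count(logical=False)
logical = psutil.cpu_count(logical=True)

print(f"Physical cores:   {physical}")
print(f"Logical CPUs:     {logical}")
# With SMT enabled, each core presents 2 threads. A dual-socket server
# with two 192-core EPYC CPUs would therefore report 768 logical CPUs.
print(f"Threads per core: {logical // physical}")
```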

The Two-Fold Path

The latest, fifth generation “Zen” architecture is divided into two segments: “Zen 5” and “Zen 5c.”

“Zen 5” employs a 4-nanometer (nm) manufacturing process to deliver up to 128 cores operating at up to 4.1GHz. It’s optimized for high per-core performance.

“Zen 5c,” by contrast, offers a 3nm lithography that’s reserved for AMD EPYC 96xx, 97xx, 98xx, and 99xx series processors. It’s optimized for high density and power efficiency.

The most powerful of these CPUs—the AMD EPYC 9965—includes an astonishing 192 cores, a maximum boost clock speed of 3.7GHz, and an L3 cache of 384MB.

Both “Zen 5” and “Zen 5c” are key components of the 5th gen AMD EPYC processors introduced earlier this month. Both have also been designed to achieve double-digit increases in instructions per clock cycle (IPC) and equip the core with the kinds of data handling and processing power required by new AI workloads.

Supermicro’s Satori

AMD isn’t the only brand offering bold, new tech to harried enterprise IT managers.

Supermicro recently introduced its new H14 servers, GPU-accelerated systems and storage servers powered by the AMD EPYC 9005 Series processors (the CPUs formerly codenamed “Turin”) and AMD Instinct MI325X Accelerators.

The new product line features updated versions of Supermicro’s vaunted Hyper system, Twin multinode servers, and AI-inferencing GPU systems. All are now available with the user’s choice of either air or liquid cooling.

Supermicro says its collection of purpose-built powerhouses represents one of the industry’s most extensive server families. That should be welcome news for organizations intent on building a fleet of machines to meet the highly resource-intensive demands of modern AI workloads.

By designing its next-generation infrastructure around AMD 5th Generation components, Supermicro says it can dramatically increase efficiency by reducing customers’ total data-center footprints by at least two-thirds.

Enlightened IT for the AI Era

While AMD and Supermicro’s advances represent today’s cutting-edge technology, tomorrow is another story entirely.

Keeping up with customer demand and the dizzying pace of AI-based innovation means these tech giants will soon return with more announcements, tools and design methodologies. AMD has already promised that a new accelerator, the AMD Instinct MI350, will be formally announced in the second half of 2025.

As far as enterprise CTOs are concerned, the sooner, the better. To survive and thrive amid heavy competition, they’ll need an evolving array of next-generation technology. That will help them cut costs even as they expand their product offerings—a kind of technological nirvana.

Do your customers need more room for AI? AMD has an answer

If your customers are looking to add AI to already-crowded, power-strapped data centers, AMD is here to help. 


How can your customers make room for AI in data centers that are already full?

It’s a question that’s far from academic. Nine in 10 tech vendors surveyed recently by the Uptime Institute expect AI to be widely used in data centers in the next 5 years.

Yet data center space is both hard to find and costly to rent. Vacancy rates have hit new lows, according to real-estate services firm CBRE Group.

Worse, this combination of supply shortages and high demand is driving up data center pricing and rents. Across North America, CBRE says, pricing is up by 20% year-on-year.

Getting enough electric power is an issue, too. Some utilities have told prospective data-center customers they won’t get the power they requested until the next decade, reports The Wall Street Journal. In other cases, strapped utilities are simply giving customers less power than they asked for.

So how to help your customers get their data centers ready for AI? AMD has some answers. And a free software tool to help.

The AMD Solution

AMD’s solution is simple, with just 2 points:

  • Make the most of existing data-center real estate and power by consolidating existing workloads.
  • Replace the low-density compute of older, inefficient and out-of-warranty systems with compute that’s newer, denser and more efficient.

AMD is making the case that your customers can do both by moving from older Intel-based systems to newer ones that are AMD-based.

For example, the company says, replacing servers based on Intel Xeon Gold 6143 “Skylake” processors with those based on AMD EPYC 9334 CPUs can result in the need for 73% fewer servers, 70% fewer racks and 69% less power.
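
Those percentages are easy to translate into fleet sizes. Here’s a back-of-envelope sketch; the 1,000-server starting fleet is a hypothetical example, with the reduction factors taken from AMD’s claim.

```python
import math

# AMD's claimed reductions for this migration scenario.
fewer_servers, fewer_racks, less_power = 0.73, 0.70, 0.69

old_servers = 1_000  # hypothetical legacy fleet
new_servers = math.ceil(old_servers * (1 - fewer_servers))

print(f"Servers: {old_servers} -> {new_servers}")  # 1,000 -> 270
print(f"Racks shrink by {fewer_racks:.0%}, power by {less_power:.0%}")
```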

That could include Supermicro servers powered by AMD EPYC processors. Supermicro H13 servers using AMD EPYC 9004 Series processors offer capabilities for high-performance data centers.

AMD hasn’t yet done comparisons with either its new 5th gen EPYC processors (introduced last week) or Intel’s 86xx CPUs. But the company says the results should be similar.

Consolidating processor-based servers can also make room in your customers’ racks for AMD Instinct MI300 Series accelerators designed specifically for AI and HPC workloads.

For example, if your customer has older servers based on Intel Xeon Cascade Lake processors, migrating them to servers based on AMD EPYC 9754 processors instead can gain them as much as a 5-to-1 consolidation.

The result? Enough power and room to accommodate a new AI platform.

Questions Answered

Simple doesn’t always mean easy. And you and your customers may have concerns.

For example, isn’t switching from one vendor to another difficult?

No, says AMD. The company cross-licenses the x86 instruction set, so on its processors, most workloads and applications will just work.

What about all those cores on AMD processors? Won’t they raise a customer’s failure domain too high?

No, says AMD. Its CPUs are scalable enough to handle any failure domain from 8 to 256 cores per server.

Wouldn’t moving require a cold migration? And if so, wouldn’t that disrupt the customer’s business?

Again, AMD says no. While moving virtual machines (VMs) to a new architecture does require a cold migration, the job can be done without any application downtime.

That’s especially true if you use AMD’s free open-source tool known as VAMT, short for VMware Architecture Migration Tool. VAMT automates cold migration. In one AMD test, it migrated hundreds of VMs in just an hour.

So if your customers are among those struggling to find room for AI systems in their already-crowded and power-strapped data centers, tell them to consider a move to AMD.

AMD intros CPUs, accelerators, networking for end-to-end AI infrastructure -- and Supermicro supports

AMD expanded its end-to-end AI infrastructure products for data centers with new CPUs, accelerators and network controllers. And Supermicro is already offering supporting servers. 


AMD today held a roughly two-hour conference in San Francisco during which CEO Lisa Su and other executives introduced a new generation of server processors, the next model in the Instinct MI300 Accelerator family, and new data-center networking devices.

As Su told the live and online audience, AMD is committed to offering end-to-end AI infrastructure products and solutions in an open, partner-driven ecosystem.

Su further explained that AMD’s new AI strategy has 4 main goals:

  • Become the leader in end-to-end AI
  • Create an open AI software platform of libraries and models
  • Co-innovate with partners including cloud providers, OEMs and software creators
  • Offer all the pieces needed for a total AI solution, all the way from chips to racks to clusters and even entire data centers.

And here’s a look at the new data-center hardware AMD announced today.

5th Gen AMD EPYC CPUs

The EPYC line, originally launched in 2017, has become a big success for AMD. As Su told the event audience, there are now more than 950 EPYC instances at the largest cloud providers, and AMD hardware partners now offer EPYC processors on more than 350 platforms. Market share is up, too: More than one in three servers worldwide (34%) now runs on EPYC, Su said.

The new EPYC processors, formerly codenamed Turin and now known as the AMD EPYC 9005 Series, are now available for data center, AI and cloud customers.

The new CPUs also feature a new core architecture known as “Zen 5.” AMD says “Zen 5” delivers 17% higher instructions per clock (IPC) than the previous “Zen 4” generation on enterprise workloads, and up to 37% higher IPC on AI and HPC workloads.

The new 5th Gen line has over 25 SKUs, and core counts range widely, from as few as 8 to as many as 192. For example, the new AMD EPYC 9575F is a 64-core, 5GHz CPU designed specifically for GPU-powered AI solutions.

AMD Instinct MI325X Accelerator

About a year ago, AMD introduced the Instinct MI300 Accelerators, and since then the company has committed itself to introducing new models on a yearly cadence. Sure enough, today Lisa Su introduced the newest model, the AMD Instinct MI325X Accelerator.

Designed for Generative AI performance and built on the AMD CDNA3 architecture, the new accelerator offers up to 256GB of HBM3E memory, and bandwidth up to 6TB/sec.

Shipments of the MI325X are set to begin in this year’s fourth quarter. Partner systems with the new AMD accelerator are expected to start shipping in next year’s first quarter.

Su also mentioned the next model in the line, the AMD Instinct MI350, which will offer up to 288GB of HBM3E memory. It’s set to be formally announced in the second half of next year.

Networking Devices

Forrest Norrod, AMD’s head of data-center solutions, introduced two networking devices designed for data centers running AI workloads.

The AMD Pensando Salina DPU is designed for front-end connectivity. It supports throughput of up to 400 Gbps.

The AMD Pensando Pollara 400, designed for back-end networks connecting multiple GPUs, is the industry’s first Ultra-Ethernet Consortium-ready AI NIC.

Both parts are sampling with customers now, and AMD expects to start general shipments in next year’s first half.

Both devices are needed, Norrod said, because AI dramatically raises networking demands. He cited studies showing that connectivity currently accounts for 40% to 75% of the time needed to run certain AI training and inference models.
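
A quick Amdahl’s Law sketch shows why that matters. The 60% connectivity share below sits within the 40% to 75% range Norrod cited; the 2x network speedup is a hypothetical illustration, not an AMD figure.

```python
# Fraction of AI job time spent on connectivity (assumed, within cited range).
network_share = 0.60
network_speedup = 2.0  # hypothetical improvement from faster NICs/fabric

# Amdahl's Law: only the networked portion of the job gets faster.
new_time = (1 - network_share) + network_share / network_speedup
print(f"Overall job speedup: {1 / new_time:.2f}x")  # ~1.43x
```

Even doubling network throughput yields well under a 2x overall gain, which is why both the front-end and back-end networks are targets.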

Supermicro Support

Supermicro is among the AMD partners already ready with systems based on the new AMD processors and accelerator.

Wasting no time, Supermicro today announced new H14 series servers, including both Hyper and FlexTwin systems, that support the 5th Gen AMD EPYC 9005 Series processors and AMD Instinct MI325X Accelerators.

The Supermicro H14 family includes three systems for AI training and inference workloads. Supermicro says the systems can also accommodate the higher thermal requirements of the new AMD EPYC processors, which are rated at up to 500W. Liquid cooling is an option, too.

Research Roundup: AI and data centers, cybersec spending, AI for competitive advantage & sales

Catch up on the latest IT industry market research and surveys. 


The rapid adoption of artificial intelligence is putting new stress on data centers. Cybersecurity spending is growing faster than expected. Business leaders say AI is a competitive advantage. And AI could even help salespeople meet their quotas.

That’s some of the latest from top IT research and polling organizations. And here’s your roundup.

AI Needs More Juice

So much AI, so few data centers. That’s one of the more surprising side effects of the AI explosion.

Demand for data centers is rising. Also rising are data centers’ electric bills, says market watcher IDC.

All data centers use a lot of electric power. Add AI to the mix, and demand for juice rockets even higher.

That’s important because electricity already accounts for nearly half (46%) of the average enterprise data center’s total operational cost, and even more (60%) for the average service-provider’s data center, IDC says.

IDC now predicts that AI data center energy consumption will rise at a compound annual growth rate (CAGR) of nearly 45% from now through 2028, when it will reach a global total of 146.2 terawatt-hours.

Further, IDC expects overall global data center electricity consumption to more than double between 2023 and 2028, reaching 847 terawatt hours. That’s equivalent to a five-year CAGR of 19.5%.
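
Those two figures hang together. A quick sanity check in Python, working backward from IDC’s 2028 total and the 19.5% CAGR:

```python
target_twh = 847.0  # IDC's 2028 projection
cagr = 0.195        # five-year compound annual growth rate
years = 5

baseline_twh = target_twh / (1 + cagr) ** years
print(f"Implied 2023 consumption: {baseline_twh:.0f} TWh")   # ~348 TWh
print(f"Growth multiple: {target_twh / baseline_twh:.2f}x")  # ~2.44x, "more than double"
```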

Cybersec Spending: Up!

Cybersecurity spending rose nearly 10% in the second quarter of this year, reaching a worldwide total of $21.1 billion, according to industry analysts Canalys.

That fast rate of growth left Canalys surprised. It had expected both closer scrutiny of cyber budgets and slower contract signings due to uncertainty about the economy.

Instead, vendors focused on cross-selling their platforms. Canalys says the top 12 cyber providers collectively accounted for more than half (53.2%) of total spending in Q2.

“Vendors are positioning their cybersecurity platforms to reduce customers’ complexity by consolidating redundant and legacy point products,” says Canalys chief analyst Matthew Ball. “But this also reduces organizations’ resilience by increasing dependency on fewer vendors.”

Looking ahead, Canalys expects even bigger growth in spending on cyber services (as opposed to cyber technology). For the full year 2024, Canalys predicts cyber-services spending will grow by nearly 13% year-on-year, reaching a global total of $163.3 billion.

AI: The New Competitive Advantage

Nearly 7 in 10 business leaders (68%) say their organizations’ competitive advantage now depends on making the best use of artificial intelligence. So finds a new poll conducted by Forrester on behalf of credit-reporting firm Experian.

In the survey, roughly 6 in 10 respondents (62%) also said their top AI use case is analyzing alternative data sources with Generative AI.

But business leaders are also looking for faster results. More than half the respondents (55%) said developing and deploying AI and machine-learning models takes them too much time.

The survey, conducted earlier this year, reached 1,320 business leaders in 10 countries across the EMEA and Asia-Pacific regions.

AI for Sales? Yes, Please

Add sales to the list of jobs that can be enhanced with AI. A new forecast from researchers at Gartner posits that sellers who partner effectively with AI tools are 3.7 times more likely to meet their quotas than are those who don’t use AI.

The forecast is based on Gartner’s recent survey of more than 1,025 B2B sellers.

Gartner also says that in response, senior sales officers will need to prepare their staff for a world with AI. That could include training salespeople with new AI skills, setting new sales priorities, and refining compensation and even career paths.

One possible snag: In Gartner’s survey, nearly three-quarters of the salespeople (72%) said they’re already overwhelmed by the number of skills required for their job. And fully half (50%) said they’re similarly overwhelmed by the amount of technology needed.

The AMD Instinct MI300X Accelerator draws top marks from leading AI benchmark

In the latest MLPerf testing, the AMD Instinct MI300X Accelerator with ROCm software stack beat the competition with strong GenAI inference performance. 


New benchmarks using the AMD Instinct MI300X Accelerator show impressive performance that surpasses the competition.

This is great news for customers operating demanding AI workloads, especially those underpinned by large language models (LLMs) that require super-low latency.

Initial platform tests using MLPerf Inference v4.1 measured AMD’s flagship accelerator against the Llama 2 70B benchmark. This test serves as a proxy for real-world applications, including natural language processing (NLP) and large-scale inferencing.

MLPerf is the industry’s leading benchmarking suite for measuring the performance of machine learning and AI workloads from domains that include vision, speech and NLP. It offers a set of open-source AI benchmarks, including rigorous tests focused on Generative AI and LLMs.

Gaining high marks from the MLPerf Inference benchmarking suite represents a significant milestone for AMD. It positions the AMD Instinct MI300X accelerator as a go-to solution for enterprise-level AI workloads.

Superior Instincts

The results of the Llama 2 70B test are particularly significant. That’s due to the benchmark’s ability to produce an apples-to-apples comparison of competing solutions.

In this benchmark, the AMD Instinct MI300X was compared with NVIDIA’s H100 Tensor Core GPU. The test concluded that AMD’s full-stack inference platform was better than the H100 at delivering high-performance LLM inference, a workload that requires both robust parallel computing and a well-optimized software stack.

The testing also showed that because the AMD Instinct MI300X offers the largest GPU memory available—192GB of HBM3—it was able to fit the entire Llama 2 70B model into the memory of a single GPU. That avoided the network overhead of splitting the model across GPUs, which in turn maximized inference throughput.
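
The arithmetic behind that claim is straightforward. Here’s a rough sketch of the weight footprint of a 70-billion-parameter model at 16-bit precision; it ignores the KV cache and activations, which add runtime overhead, so treat it as a lower bound.

```python
params = 70e9        # Llama 2 70B parameter count
bytes_per_param = 2  # FP16/BF16 weights

weights_gb = params * bytes_per_param / 1e9
print(f"Weights: {weights_gb:.0f} GB vs. 192 GB of HBM3")  # 140 GB -> fits on one GPU
```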

Software also played a big part in the success of the AMD Instinct series. The AMD ROCm software platform accompanies the AMD Instinct MI300X. This open software stack includes programming models, tools, compilers, libraries and runtimes for AI solution development on the AMD Instinct MI300 accelerator series and other AMD GPUs.

The testing showed that scaling from a single AMD Instinct MI300X, running the ROCm software stack, to a full complement of eight AMD Instinct accelerators was nearly linear. In other words, the system’s performance improved proportionally as more GPUs were added.

That test demonstrated the AMD Instinct MI300X’s ability to handle the largest MLPerf inference models to date, containing over 70 billion parameters.
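
Scaling efficiency here is simply measured speedup divided by GPU count. The 7.6x speedup below is a hypothetical illustration of “nearly linear,” not a published result.

```python
n_gpus = 8
speedup = 7.6  # assumed measured speedup vs. a single MI300X

efficiency = speedup / n_gpus
print(f"Scaling efficiency: {efficiency:.0%}")  # 95% -> nearly linear
```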

Thinking Inside the Box

Benchmarking the AMD Instinct MI300X required AMD to create a complete hardware platform capable of addressing strenuous AI workloads. For this task, AMD engineers chose as their testbed the Supermicro AS -8125GS-TNMR2, a massive 8U complete system.

Supermicro’s GPU A+ Server systems are designed for both versatility and redundancy. Designers can outfit the system with an impressive array of hardware, starting with two AMD EPYC 9004 Series processors and up to 6TB of ECC DDR5 main memory.

Because AI workloads consume massive amounts of storage, Supermicro has also outfitted this 8U server with 12 front hot-swap 2.5-inch NVMe drive bays. There’s also the option to add four more drives via an additional storage controller.

The Supermicro AS -8125GS-TNMR2 also includes room for two hot-swap 2.5-inch SATA bays and two M.2 drives, each with a capacity of up to 3.84TB.

Power for all those components is delivered courtesy of six 3,000-watt redundant titanium-level power supplies.

Coming Soon: Even More AI Power

AMD engineers continually push the limits of silicon and human ingenuity to expand the capabilities of their hardware. So it should come as little surprise that new iterations of the AMD Instinct series are expected to be released in the coming months. This past May, AMD officials said they plan to introduce AMD Instinct MI325, MI350 and MI400 accelerators.

Forthcoming Instinct accelerators, AMD says, will deliver advances including additional memory, support for lower-precision data types, and increased compute power.

New features are also coming to the AMD ROCm software stack. Those changes should include kernel improvements and advanced quantization support.

Are your customers looking for a high-powered, low-latency system to run their most demanding HPC and AI workloads? Tell them about these benchmarks and the AMD Instinct MI300X accelerators.

Developing AI and HPC solutions? Check out the new AMD ROCm 6.2 release

The latest release of AMD’s free and open software stack for developing AI and HPC solutions delivers 5 important enhancements. 


If you develop AI and HPC solutions, you’ll want to know about the most recent release of AMD ROCm software, version 6.2.

ROCm, in case you’re unfamiliar with it, is AMD’s free and open software stack. It’s aimed at developers of artificial intelligence and high-performance computing (HPC) solutions on AMD Instinct accelerators. It's also great for developing AI and HPC solutions on AMD Instinct-powered servers from Supermicro. 

First introduced in 2016, ROCm open software now includes programming models, tools, compilers, libraries, runtimes and APIs for GPU programming.

ROCm version 6.2, announced recently by AMD, delivers 5 key enhancements:

  • Improved vLLM support 
  • Boosted memory efficiency & performance with Bitsandbytes
  • New Offline Installer Creator
  • New Omnitrace & Omniperf Profiler Tools (beta)
  • Broader FP8 support

Let’s look at each separately and in more detail.

vLLM Support

To enhance the efficiency and scalability of its Instinct accelerators, AMD is expanding vLLM support. vLLM is an easy-to-use library for inference and serving of the large language models (LLMs) that power Generative AI.

ROCm 6.2 lets AMD Instinct developers integrate vLLM into their AI pipelines. The benefits include improved performance and efficiency.
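
For a sense of what that integration looks like, here’s a minimal vLLM sketch. It assumes a ROCm-compatible build of vLLM and PyTorch on the Instinct system, and the model name is just an example.

```python
from vllm import LLM, SamplingParams

# Load an example model and define sampling settings.
llm = LLM(model="meta-llama/Llama-2-7b-chat-hf")
params = SamplingParams(temperature=0.8, max_tokens=64)

# Generate a completion; vLLM batches and schedules requests internally.
outputs = llm.generate(["Explain what ROCm is in one sentence."], params)
print(outputs[0].outputs[0].text)
```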

Bitsandbytes

Developers can now integrate Bitsandbytes with ROCm for AI model training and inference, reducing their memory and hardware requirements on AMD Instinct accelerators. 

Bitsandbytes is an open-source Python library that quantizes LLMs to boost memory efficiency and performance. AMD says this will let AI developers work with larger models on limited hardware, broadening access, saving costs and expanding opportunities for innovation.
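
In practice, developers typically reach Bitsandbytes through its Hugging Face Transformers integration. A minimal sketch, assuming a ROCm-enabled bitsandbytes build is installed and using an example model name:

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 8-bit weights cut memory roughly in half vs. FP16.
config = BitsAndBytesConfig(load_in_8bit=True)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",   # example model
    quantization_config=config,
    device_map="auto",            # spread layers across available accelerators
)
```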

Offline Installer Creator

The new ROCm Offline Installer Creator aims to simplify the installation process. This tool creates a single installer file that includes all necessary dependencies.

That makes deployment straightforward with a user-friendly GUI that allows easy selection of ROCm components and versions.

As the name implies, the Offline Installer Creator can be used on developer systems that lack internet access.

Omnitrace and Omniperf Profiler

The new Omnitrace and Omniperf Profiler Tools, both now in beta release, provide comprehensive performance analysis and a streamlined development workflow.

Omnitrace offers a holistic view of system performance across CPUs, GPUs, NICs and network fabrics. This helps developers identify and address bottlenecks.

Omniperf delivers detailed GPU kernel analysis for fine-tuning.

Together, these tools help to ensure efficient use of developer resources, leading to faster AI training, AI inference and HPC simulations.

FP8 Support

Broader FP8 support can improve the performance of AI inferencing.

FP8 is an 8-bit floating point format that provides a common, interchangeable format for both AI training and inference. It lets AI models operate and perform consistently across hardware platforms.

In ROCm, FP8 support improves the process of running AI models, particularly in inferencing. It does this by addressing key challenges such as the memory bottlenecks and high latency associated with higher-precision formats. In addition, FP8's reduced precision calculations can decrease the latency involved in data transfers and computations, losing little to no accuracy.  
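
The precision trade-off is easy to see with PyTorch’s float8_e4m3fn dtype, available in recent PyTorch releases; this sketch is generic rather than ROCm-specific.

```python
import torch

x = torch.randn(1024)
x_fp8 = x.to(torch.float8_e4m3fn)   # 1 byte per value vs. 4 for FP32
roundtrip = x_fp8.to(torch.float32)

# Quantization error is small but nonzero; inference usually tolerates it.
print(f"Mean round-trip error: {(x - roundtrip).abs().mean():.4f}")
```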

ROCm 6.2 expands FP8 support across its ecosystem, from frameworks to libraries and more, enhancing performance and efficiency.

Research Roundup, AI Edition: platform power, mixed signals on GenAI, smarter PCs

Catch the latest AI insights from leading researchers and market analysts.


Sales of artificial intelligence platform software show no sign of a slowdown. The road to true Generative AI disruption could be bumpy. And PCs with built-in AI capabilities are starting to sell.

That’s some of the latest AI insights from leading market researchers, analysts and pollsters. And here’s your research roundup.

AI Platforms Maintain Momentum

Is the excitement around AI overblown? Not at all, says market watcher IDC.

“The AI platforms market shows no sign of slowing down,” says IDC VP Ritu Jyoti.

IDC now believes that the market for AI platform software will maintain its momentum through at least 2028.

By that year, IDC expects, worldwide revenue for AI software will reach $153 billion. If so, that would mark a five-year compound annual growth rate (CAGR) of nearly 41%.

The market really got underway last year. That’s when worldwide AI platform software revenue hit $27.9 billion, an annual increase of 44%, IDC says.

Since then, lots of progress has been made. Fully half the organizations now deploying GenAI in production have already selected an AI platform. And IDC says most of the rest will do so in the next six months.

All that has AI software suppliers looking pretty smart.

Mixed Signals on GenAI

There’s no question that GenAI is having a huge impact. The question is how difficult it will be for GenAI-using organizations to achieve their desired results.

GenAI use is already widespread. In a global survey conducted earlier this year by management consultants McKinsey & Co., 65% of respondents said they use GenAI on a regular basis.

That was nearly double the percentage from McKinsey’s previous survey, conducted just 10 months earlier.

Also, three quarters of McKinsey’s respondents said they expect GenAI will lead their industries to significant or disruptive changes.

However, the road to GenAI could be bumpy. Separately, researchers at Gartner are predicting that by the end of 2025, at least 30% of all GenAI projects will be abandoned after their proof-of-concept (PoC). 

The reason? Gartner points to several factors: poor data quality, inadequate risk controls, unclear business value, and escalating costs.

“Executives are impatient to see returns on GenAI investments,” says Gartner VP Rita Sallam. “Yet organizations are struggling to prove and realize value.”

One big challenge: Many organizations investing in GenAI want productivity enhancements. But as Gartner points out, those gains can be difficult to quantify.

Further, implementing GenAI is far from cheap. Gartner’s research finds that a typical GenAI deployment costs anywhere from $5 million to $20 million.

That wide range of costs is due to several factors. These include the use cases involved, the deployment approaches used, and whether an organization seeks to be a market disruptor.

Clearly, an intelligent approach to GenAI can be a money-saver.

PCs with AI? Yes, Please

Leading PC makers hope to boost their hardware sales by offering new, built-in AI capabilities. It seems to be working.

In the second quarter of this year, 8.8 million PCs—that’s 14% of all PCs shipped globally in the quarter—were AI-capable, says market analyst firm Canalys.

Canalys defines “AI-capable” pretty simply: It’s any desktop or notebook system that includes a chipset or block for one or more dedicated AI workloads.

By operating system, nearly 40% of the AI-capable PCs shipped in Q2 were Windows systems, 60% were Apple macOS systems, and just 1% ran ChromeOS, Canalys says.

For the full year 2024, Canalys expects some 44 million AI-capable PCs to be shipped worldwide. In 2025, the market watcher predicts, these shipments should more than double, rising to 103 million units worldwide. There's nothing artificial about that boost.

Why Lamini offers LLM tuning software on Supermicro servers powered by AMD processors

Lamini, provider of an LLM platform for developers, turns to Supermicro’s high-performance servers powered by AMD CPUs and GPUs to run its new Memory Tuning stack.


Generative AI systems powered by large language models (LLMs) have a serious problem: Their answers can be inaccurate—and sometimes, in the case of AI “hallucinations,” even fictional.

For users, the challenge is equally serious: How do you get precise factual accuracy—that is, correct answers with zero hallucinations—while upholding the generalization capabilities that make LLMs so valuable?

A California-based company, Lamini, has come up with an innovative solution. And its software stack runs on Supermicro servers powered by AMD CPUs and GPUs.

Why Hallucinations Happen

Here’s the premise underlying Lamini’s solution: Hallucinations happen because the right answer is clustered with other, incorrect answers. As a result, the model doesn’t know that a nearly right answer is in fact wrong.

To address this issue, Lamini’s Memory Tuning solution teaches the model that getting the answer nearly right is the same as getting it completely wrong. Its software does this by tuning literally millions of expert adapters with precise facts on top of any open-source LLM, such as Llama 3 or Mistral 3.

The Lamini model retrieves only the most relevant experts from an index at inference time. The goal is high accuracy, high speed and low cost.
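
Conceptually, that retrieval step resembles a nearest-neighbor lookup over an index of fact-tuned experts. The toy Python sketch below illustrates the idea only; all names and vectors are hypothetical, and it is not Lamini’s actual implementation.

```python
import numpy as np

# Hypothetical index: expert adapter name -> embedding of the facts it holds.
experts = {
    "product_specs":    np.array([0.9, 0.1, 0.0]),
    "pricing_facts":    np.array([0.1, 0.9, 0.1]),
    "support_policies": np.array([0.0, 0.2, 0.9]),
}

def retrieve_expert(query: np.ndarray) -> str:
    """Pick the expert whose embedding is most similar to the query (cosine)."""
    def cosine(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return max(experts, key=lambda name: cosine(experts[name], query))

print(retrieve_expert(np.array([0.8, 0.2, 0.1])))  # -> product_specs
```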

More than Fine-Tuning

Isn’t this just LLM fine-tuning? Lamini says no, its Memory Tuning is fundamentally different.

Fine-tuning can’t ensure that a model’s answers are faithful to the facts in its training data. By contrast, Lamini says, its solution has been designed to deliver output probabilities that are not just close, but exactly right.

More specifically, Lamini promises its solution can deliver 95% LLM accuracy with 10x fewer hallucinations.

In the real world, Lamini says one large customer used its solution to raise LLM accuracy from 50% to 95% and cut the rate of AI hallucinations from an unreliable 50% to just 5%.

Investors are certainly impressed. Earlier this year Lamini raised $25 million from an investment group that included Amplify Partners, Bernard Arnault and AMD Ventures. Lamini plans to use the funding to accelerate its expert AI development and expand its cloud infrastructure.

Supermicro Solution

As part of its push to offer superior LLM tuning, Lamini chose Supermicro’s GPU server — model number AS -8125S-TNMR2 — to train LLM models in a reasonable time.

This Supermicro 8U system is powered by dual AMD EPYC 9000 series CPUs and eight AMD Instinct MI300X GPUs.

The GPUs connect to the CPUs via a standard PCIe 5.0 bus. This gives fast access when the CPU issues commands or sends data from host memory to the GPUs.

Lamini has also benefited from Supermicro’s capacity and quick delivery schedule. With other GPU makers facing serious capacity issues, that’s an important benefit for both Lamini and its customers.

“We’re thrilled to be working with Supermicro,” says Lamini co-founder and CEO Sharon Zhou.

Could your customers be thrilled by Lamini, too? Check out the “do more” links below.

Research Roundup: AI boosts project management & supply chains, HR woes, SMB supplier overload

Catch up on the latest IT market intelligence from leading researchers.


Artificial intelligence is boosting both project management and supply chains. Cybersecurity spending is on a tear. And small and midsize businesses are struggling with more suppliers than employees.

That’s some of the latest IT intelligence from leading industry watchers. And here’s your research roundup.

AI for PM 

What’s artificial intelligence good for? One area is project management.

In a new survey, nearly two-thirds of project managers (63%) reported improved productivity and efficiency with AI integration.

The survey was conducted by Capterra, an online marketplace for software and services. As part of a larger survey, the company polled 2,500 project managers in 12 countries.

Nearly half the respondents (46%) said they use AI in their project management tools. Capterra then dug in deeper with this second group—totaling 1,153 project managers—to learn what kinds of benefits they’re enjoying with AI.

Among the findings:

  • Over half the AI-using project managers (54%) said they use the technology for risk management. That’s the top use case reported.
  • Project managers plan to increase their AI spending by an average of 36%.
  • Nine in 10 project managers (90%) said their AI investments earned a positive return in the last 12 months.
  • Improved productivity as a result of using AI was reported by nearly two-thirds of the respondents (63%).
  • Looking ahead, respondents expect the areas of greatest impact from AI to be task automation, predictive analytics and project planning.

AI for Supply Chains, Too

A new report from consulting firm Accenture finds that the most mature supply chains are 23% more profitable than others. These supply-chain leaders are also six times more likely than others to use AI and Generative AI widely.

To figure this out, Accenture analyzed nearly 1,150 companies in 15 countries and 10 industries. Accenture then identified the 10% of companies that scored highest on its supply-chain maturity scale.

This scale was based on the degree to which an organization uses GenAI, advanced machine learning and other new technologies for autonomous decision-making, advanced simulations and continuous improvement. The more an organization does this, the higher its score.

Accenture also found that supply-chain leaders achieved an average profit margin of 11.8%, compared with an average margin of 9.6% among the others. (That’s the 23% profit gain mentioned earlier.) The leaders also delivered 15% better returns to shareholders: 8.5% vs. 7.4% for others.

HR: Help Wanted 

If solving customer pain points is high on your agenda—and it should be—then here’s a new pain point to consider: Fewer than 1 in 4 human resources (HR) functions say they’re getting full business value from their HR technology.

In other words, something like 75% of HR executives could use some IT help. That’s a lot of business.

The assessment comes from research and analysis firm Gartner, based on its survey of 85 HR leaders conducted earlier this year. Among Gartner’s findings:

  • Only about 1 in 3 HR executives (35%) feel confident that their approach to HR technology helps to achieve their organization’s business objectives.
  • Two out of three HR executives believe their HR function’s effectiveness will be hurt if they don’t improve their technology.

Employees are unhappy with HR technology, too. Earlier this year, Gartner also surveyed more than 1,200 employees. Nearly 7 in 10 reported experiencing at least one barrier when interacting with HR technology over the previous 12 months.

Cybersecurity’s Big Spend

Looking for a growth market? Don’t overlook cybersecurity.

Last year, worldwide spending on cybersecurity products totaled $106.8 billion. That’s a lot of money. But even better, it marked a 15% increase over the previous year’s spending, according to market watcher IDC.

Looking ahead, IDC expects this double-digit growth rate to continue for at least the next five years. By 2028, IDC predicts, worldwide spending on cybersecurity products will reach $200 billion—nearly double what was spent in 2023.

By category, the biggest cybersecurity spending last year went to network security: $27.4 billion. After that came endpoint security ($21.6 billion last year) and security analytics ($20 billion), IDC says.

Why such strong spending? In part because cybersecurity is now a board-level topic.

“Cyber risk,” says Frank Dickson, head of IDC’s security and trust research, “is business risk.”

SMBs: Too Many Suppliers

It’s not easy standing out as a supplier to small and midsize business customers. A new survey finds the average SMB has nine times more suppliers than it does employees—and actually uses only about 1 in 4 of those suppliers.

The survey, conducted by spend-management system supplier Spendesk, focused on customers in Europe. (Which makes sense, as Spendesk is headquartered in Paris.) Spendesk examined 4.7 million suppliers used by a sample of its 5,000 customers in the UK, France, Germany and Spain.

Keeping many suppliers while using only a few of them? That’s not only inefficient, but also costly. Spendesk estimates that its SMB customers could be collectively losing some $1.24 billion in wasted time and management costs.

And there’s more at stake, too. A recent study by management consultants McKinsey & Co. finds that small and midsize organizations—those with anywhere from 1 to 200 employees—are actually big business.

By McKinsey’s reckoning, SMBs account for more than 90% of all businesses by number … roughly half the global GDP … and more than two-thirds of all business jobs.

Fun fact: Nearly 1 in 5 of the largest businesses originally started as small businesses.
