Sponsored by:

Visit AMD Visit Supermicro

Performance Intensive Computing

Capture the full potential of IT

Research Roundup, AI Edition: platform power, mixed signals on GenAI, smarter PCs

Featured content

Research Roundup, AI Edition: platform power, mixed signals on GenAI, smarter PCs

Catch the latest AI insights from leading researchers and market analysts.

Learn More about this topic
  • Applications:
  • Featured Technologies:

Sales of artificial intelligence platform software show no sign of a slowdown. The road to true Generative AI disruption could be bumpy. And PCs with built-in AI capabilities are starting to sell.

That’s some of the latest AI insights from leading market researchers, analysts and pollsters. And here’s your research roundup.

AI Platforms Maintain Momentum

Is the excitement around AI overblown? Not at all, says market watcher IDC.

“The AI platforms market shows no sign of slowing down,” says IDC VP Ritu Jyoti.

IDC now believes that the market for AI platform software will maintain its momentum through at least 2028.

By that year, IDC expects, worldwide revenue for AI software will reach $153 billion. If so, that would mark a five-year compound annual growth rate (CAGR) of nearly 41%.

The market really got underway last year. That’s when worldwide AI platform software revenue hit $27.9 billion, an annual increase of 44%, IDC says.

Since then, lots of progress has been made. Fully half the organizations now deploying GenAI in production have already selected an AI platform. And IDC says most of the rest will do so in the next six months.

All that has AI software suppliers looking pretty smart.

Mixed Signals on GenAI

There’s no question that GenAI is having a huge impact. The question is how difficult it will be for GenAI-using organizations to achieve their desired results.

GenAI use is already widespread. In a global survey conducted earlier this year by management consultants McKinsey & Co., 65% of respondents said they use GenAI on a regular basis.

That was nearly double the percentage from McKinsey’s previous survey, conducted just 10 months earlier.

Also, three quarters of McKinsey’s respondents said they expect GenAI will lead their industries to significant or disruptive changes.

However, the road to GenAI could be bumpy. Separately, researchers at Gartner are predicting that by the end of 2025, at least 30% of all GenAI projects will be abandoned after their proof-of-concept (PoC). 

The reason? Gartner points to several factors: poor data quality, inadequate risk controls, unclear business value, and escalating costs.

“Executives are impatient to see returns on GenAI investments,” says Gartner VP Rita Sallam. “Yet organizations are struggling to prove and realize value.”

One big challenge: Many organizations investing in GenAI want productivity enhancements. But as Gartner points out, those gains can be difficult to quantify.

Further, implementing GenAI is far from cheap. Gartner’s research finds that a typical GenAI deployment costs anywhere from $5 million to $20 million.

That wide range of costs is due to several factors. These include the use cases involved, the deployment approaches used, and whether an organization seeks to be a market disruptor.

Clearly, an intelligent approach to GenAI can be a money-saver.

PCs with AI? Yes, Please

Leading PC makers hope to boost their hardware sales by offering new, built-in AI capabilities. It seems to be working.

In the second quarter of this year, 8.8 million PCs—that’s 14% of all shipped globally in the quarter—were AI-capable, says market analysts Canalys.

Canalys defines “AI-capable” pretty simply: It’s any desktop or notebook system that includes a chipset or block for one or more dedicated AI workloads.

By operating system, nearly 40% of the AI-capable PC shipped in Q2 were Windows systems, 60% were Apple macOS systems, and just 1% ran ChromeOS, Canalys says.

For the full year 2024, Canalys expects some 44 million AI-capable PCs to be shipped worldwide. In 2025, the market watcher predicts, these shipments should more than double, rising to 103 million units worldwide. There's nothing artificial about that boost.

Do more:

 

Featured videos


Events


Find AMD & Supermicro Elsewhere

Why Lamini offers LLM tuning software on Supermicro servers powered by AMD processors

Featured content

Why Lamini offers LLM tuning software on Supermicro servers powered by AMD processors

Lamini, provider of an LLM platform for developers, turns to Supermicro’s high-performance servers powered by AMD CPUs and GPUs to run its new Memory Tuning stack.

Learn More about this topic
  • Applications:
  • Featured Technologies:

Generative AI systems powered by large language models (LLMs) have a serious problem: Their answers can be inaccurate—and sometimes, in the case of AI “hallucinations,” even fictional.

For users, the challenge is equally serious: How do you get precise factual accuracy—that is, correct answers with zero hallucinations—while upholding the generalization capabilities that make LLMs so valuable?

A California-based company, Lamini, has come up with an innovative solution. And its software stack runs on Supermicro servers powered by AMD CPUs and GPUs.

Why Hallucinations Happen

Here’s the premise underlying Lamini’s solution: Hallucinations happen because the right answer is clustered with other, incorrect answers. As a result, the model doesn’t know that a nearly right answer is in fact wrong.

To address this issue, Lamini’s Memory Tuning solution teaches the model that getting the answer nearly right is the same as getting it completely wrong. Its software does this by tuning literally millions of expert adapters with precise facts on top of any open-source LLM, such as Llama 3 or Mistral 3.

The Lamini model retrieves only the most relevant experts from an index at inference time. The goal is high accuracy, high speed and low cost.

More than Fine-Tuning

Isn’t this just LLM fine-tuning? Lamini says no, its Memory Tuning is fundamentally different.

Fine-tuning can’t ensure that a model’s answers are faithful to the facts in its training data. By contrast, Lamini says, its solution has been designed to deliver output probabilities that are not just close, but exactly right.

More specifically, Lamini promises its solution can deliver 95% LLM accuracy with 10x fewer hallucinations.

In the real world, Lamini says one large customer used its solution and raised LLM accuracy from 50% to 95%, and reduced the rate of AI hallucinations from an unreliable 50% to just 5%.

Investors are certainly impressed. Earlier this year Lamini raised $25 million from an investment group that included Amplify Partners, Bernard Arnault and AMD Ventures. Lamini plans to use the funding to accelerate its expert AI development and expand its cloud infrastructure.

Supermicro Solution

As part of its push to offer superior LLM tuning, Lamini chose Supermicro’s GPU server — model number AS -8125S-TNMR2 — to train LLM models in a reasonable time.

This Supermicro 8U system is powered by dual AMD EPYC 9000 series CPUs and eight AMD Instinct MI300X GPUs.

The GPUs connect with CPUs via a standard PCIe 5 bus. This gives fast access when the CPU issues commands or sends data from host memory to the GPUs.

Lamini has also benefited from Supermicro’s capacity and quick delivery schedule. With other GPUs makers facing serious capacity issues, that’s an important benefit for both Lamini and its customers.

“We’re thrilled to be working with Supermicro,” says Lamini co-founder and CEO Sharon Zhou.

Could your customers be thrilled by Lamini, too? Check out the “do more” links below.

Do More:

 

Featured videos


Events


Find AMD & Supermicro Elsewhere

Research Roundup: AI boosts project management & supply chains, HR woes, SMB supplier overload

Featured content

Research Roundup: AI boosts project management & supply chains, HR woes, SMB supplier overload

Catch up on the latest IT market intelligence from leading researchers.

Learn More about this topic
  • Applications:
  • Featured Technologies:

Artificial intelligence is boosting both project management and supply chains. Cybersecurity spending is on a tear. And small and midsize businesses are struggling with more suppliers than employees.

That’s some of the latest IT intelligence from leading industry watchers. And here’s your research roundup.

AI for PM 

What’s artificial intelligence good for? One area is project management.

In a new survey, nearly two-thirds of project managers (63%) reported improved productivity and efficiency with AI integration.

The survey was conducted by Capterra, an online marketplace for software and services. As part of a larger survey, the company polled 2,500 project managers in 12 countries.

Nearly half the respondents (46%) said they use AI in their project management tools. Capterra then dug in deeper with this second group—totaling 1,153 project managers—to learn what kinds of benefits they’re enjoying with AI.

Among the findings:

  • Over half the AI-using project managers (54%) said they use the technology for risk management. That’s the top use case reported.
  • Project managers plan to increase their AI spending by an average of 36%.
  • Nine in 10 project managers (90%) said their AI investments earned a positive return in the last 12 months.
  • Improved productivity as a result of using AI was reported by nearly two-thirds of the respondents (63%).
  • Looking ahead, respondents expect the areas of greatest impact from AI to be task automation, predictive analytics and project planning.

AI for Supply Chains, Too

A new report from consulting firm Accenture finds that the most mature supply chains are 23% more profitable than others. These supply-chain leaders are also six times more likely than others to use AI and Generative AI widely.

To figure this out, Accenture analyzed nearly 1,150 companies in 15 countries and 10 industries. Accenture then identified the 10% of companies that scored highest on its supply-chain maturity scale.

This scale was based on the degree to which an organization uses GenAI, advanced machine learning and other new technologies for autonomous decision-making, advanced simulations and continuous improvement. The more an organization does this, the higher was their score.

Accenture also found that supply-chain leaders achieved an average profit margin of 11.8%, compared with an average margin of 9.6% among the others. (That’s the 23% profit gain mentioned earlier.) The leaders also delivered 15% better returns to shareholders: 8.5% vs. 7.4% for others.

HR: Help Wanted 

If solving customer pain points is high on your agenda—and it should be—then here’s a new pain point to consider: Fewer than 1 in 4 human relations functions say they’re getting full business value from their HR technology.

In other words, something like 75% of HR executives could use some IT help. That’s a lot of business.

The assessment comes from research and analysis firm Gartner, based on its survey of 85 HR leaders conducted earlier this year. Among Gartner’s findings:

  • Only about 1 in 3 HR executives (35%) feel confident that their approach to HR technology helps to achieve their organization’s business objectives.
  • Two out of three HR executives believe their HR function’s effectiveness will be hurt if they don’t improve their technology.

Employees are unhappy with HR technology, too. Earlier this year, Gartner also surveyed more than 1,200 employees. Nearly 7 in 10 reported experiencing at least one barrier when interacting with HR technology over the previous 12 months.

Cybersecurity’s Big Spend

Looking for a growth market? Don’t overlook cybersecurity.

Last year, worldwide spending on cybersecurity products totaled $106.8 billion. That’s a lot of money. But event better, it marked a 15% increase over the previous year’s spending, according to market watcher IDC.

Looking ahead, IDC expects this double-digit growth rate to continue for at least the next five years. By 2028, IDC predicts, worldwide spending on cybersecurity products will reach $200 billion—nearly double what was spent in 2023.

By category, the biggest cybersecurity spending last year went to network security: $27.4 billion. After that came endpoint security ($21.6 billion last year) and security analytics ($20 billion), IDC says.

Why such strong spending? In part because cybersecurity is now a board-level topic.

“Cyber risk,” says Frank Dickson, head of IDC’s security and trust research, “is business risk.”

SMBs: Too Many Suppliers

It’s not easy standing out as a supplier of small and midsize business customers. A new survey finds the average SMB has nine times more suppliers than it does employees—and actually uses only about 1 in 4 of those suppliers.

The survey, conducted by spend-management system supplier Spendesk, focused on customers in Europe. (Which makes sense, as Spendesk is headquartered in Paris.) Spendesk examined 4.7 million suppliers used by a sample of its 5,000 customers in the UK, France, Germany and Spain.

Keeping many suppliers while using only a few of them? That’s not only inefficient, but also costly. Spendesk estimates that its SMB customers could be collectively losing some $1.24 billion in wasted time and management costs.

And there’s more at stake, too. A recent study by management consultants McKinsey & Co. finds that small and midsize organizations—those with anywhere from 1 to 200 employees—are actually big business.

By McKinsey’s reckoning, SMBs account for more than 90% of all businesses by number … roughly half the global GDP … and more than two-thirds of all business jobs.

Fun fact: Nearly 1 in 5 of the largest businesses originally started as small businesses.

Do More:

 

Featured videos


Events


Find AMD & Supermicro Elsewhere

HBM: Your memory solution for AI & HPC

Featured content

HBM: Your memory solution for AI & HPC

High-bandwidth memory shortens the information commute to keep pace with today’s powerful GPUs.

Learn More about this topic
  • Applications:
  • Featured Technologies:

As AI powered by GPUs transforms computing, conventional DDR memory can’t keep up.

The solution? High-bandwidth memory (HBM).

HBM is memory chip technology that essentially shortens the information commute. It does this using ultra-wide communication lanes.

An HBM device contains vertically stacked memory chips. They’re interconnected by microscopic wires known as through-silicon vias, or TSVs for short.

HBM also provides more bandwidth per watt. And, with a smaller footprint, the technology can also save valuable data-center space.

Here’s how: A single HBM stack can contain up to eight DRAM modules, with each module connected by two channels. This makes an HBM implementation of just four chips roughly equivalent to 30 DDR modules, and in a fraction of the space.

All this makes HBM ideal for workloads that utilize AI and machine learning, HPC, advanced graphics and data analytics.

Latest & Greatest

The latest iteration, HBM3, was introduced in 2022, and it’s now finding wide application in market-ready systems.

Compared with the previous version, HBM3 adds several enhancements:

  • Higher bandwidth: Up to 819 GB/sec., up from HBM2’s max of 460 GB/sec.
  • More memory capacity: 24GB per stack, up from HBM2’s 8GB
  • Improved power efficiency: Delivering more data throughput per watt
  • Reduced form factor: Thanks to a more compact design

However, it’s not all sunshine and rainbows. For one, HBM-equipped systems are more expensive than those fitted out with traditional memory solutions.

Also, HBM stacks generate considerable heat. Advanced cooling systems are often needed, adding further complexity and cost.

Compatibility is yet another challenge. Systems must be designed or adapted to HBM3’s unique interface and form factor.

In the Market

As mentioned above, HBM3 is showing up in new products. That very definitely includes both the AMD Instinct MI300A and MI300X series accelerators.

The AMD Instinct MI300A accelerator combines a CPU and GPU for running HPC/AI workloads. It offers HBM3 as the dedicated memory with a unified capacity of up to 128GB.

Similarly, the AMD Instinct MI300X is a GPU-only accelerator designed for low-latency AI processing. It contains HBM3 as the dedicated memory, but with a higher capacity of up to 192GB.

For both of these AMD Instinct MI300 accelerators, the peak theoretical memory bandwidth is a speedy 5.3TB/sec.

The AMD Instinct MI300X is also the main processor in Supermicro’s AS -8125GS-TNMR2, an H13 8U 8-GPU system. This system offers a huge 1.5TB of HBM3 memory in single-server mode, and an even huger 6.144TB at rack scale.

Are your customers running AI with fast GPUs, only to have their systems held back by conventional memory? Tell them to check out HBM.

Do More:

 

Featured videos


Events


Find AMD & Supermicro Elsewhere

At Computex, AMD & Supermicro CEOs describe AI advances you’ll be adopting soon

Featured content

At Computex, AMD & Supermicro CEOs describe AI advances you’ll be adopting soon

At Computex Taiwan, Lisa Su of AMD and Charles Liang of Supermicro delivered keynotes that focused on AI, liquid cooling and energy efficiency.

Learn More about this topic
  • Applications:
  • Featured Technologies:

The chief executives of both AMD and Supermicro used their Computex keynote addresses to describe their companies’ AI products and, in the case of AMD, pre-announce important forthcoming products.

Computex 2024 was held this past week in Taipei, Taiwan, with the conference theme of “connecting AI.” Exhibitors featured some 1,500 companies from around the world, and keynotes were delivered by some of the IT industry’s top executives.

That included Lisa Su, chairman and CEO of AMD, and Charles Liang, founder and CEO of Supermicro. Here's some of what they previewed at Computex 2024

Lisa Su, AMD: Top priority is AI

Su of AMD presented one of this Computex’s first keynotes. Anyone who thought she might discuss topics other than AI was quickly set straight.

“AI is our number one priority,” Su told the crowd. “We’re at the beginning of an incredibly exciting time for the industry as AI transforms virtually every business, improves our quality of life, and reshapes every part of the computing market.”

AMD intends to lead in AI solutions by focusing on three priorities, she added: delivering a broad portfolio of high-performance, energy-efficient compute engines (including CPUs, GPUs and NPUs); enabling an open and developer-friendly ecosystem; and co-innovating with partners.

The latter point was supported during Su’s keynote by brief visits from several partner leaders. They included Pavan Dhavulari, corporate VP of Windows devices at Microsoft; Christian Laforte, CTO of Stability AI; and (via a video link) Microsoft CEO Satya Nadella.

Fairly late in Su’s hour-plus keynote, she held up AMD’s forthcoming 5th gen EPYC server processor, codenamed Turin. It’s scheduled to ship by year’s end.

As Su explained, Turin will feature up to 192 cores and 384 threads, up from the current generation’s max of 128 cores and 256 threads. Turin will contain 13 chiplets built in both 3-nm and 6-nm processor technology. Yet it will be available as a drop-in replacement for existing EPYC platforms, Su said.

Turin processors will use AMD’s new ‘Zen5’ cores, which Su also announced at Computex. She described AMD’s ‘Zen5’ as “the highest performance and most energy-efficient core we’ve ever built.”

Su also discussed AMD’s MI3xx family of accelerators. The MI300, introduced this past December, has become the fastest ramping product in AMD’s history, she said. Microsoft’s Nadella, during his short presentation, bragged that his company’s cloud was the first to deliver general availability of virtual machines using the AMD MI300X accelerator.

Looking ahead, Su discussed three forthcoming Instinct accelerators on AMD’s road map: The MI325, MI350 and MI400 series.

The AMD Instinct MI325, set to launch later this year, will feature more memory (up to 288GB) and higher memory bandwidth (6TB/sec.) than the MI300. But the new component will still use the same infrastructure as the MI300, making it easy for customers to upgrade.

The next series, MI350, is set for launch next year, Su said. It will then use AMD’s new CDNA4 architecture, which Su said “will deliver the biggest generational AI leap in our history.” The MI350 will be built on 3nm process technology, but will still offer a drop-in upgrade from both the MI300 and MI325.

The last of the three, the MI400 series, is set to start shipping in 2026. That’s also when AMD will deliver a new generation of CDNA, according to Su.

Both the MI325 and MI350 series will leverage the same industry standard universal baseboard OCP server design used by MI300. Su added: “What that means is, our customers can adopt this new technology very quickly.”

Charles Liang, Supermicro: Liquid cooling is the AI future

Liang dedicated his Computex keynote to the topics of liquid cooling and “green” computing.

“Together with our partners,” he said, “we are on a mission to build the most sustainable data centers.”

Liang predicted a big change from the present, where direct liquid cooling (DLC) has a less-than-1% share of the data center market. Supermicro is targeting 15% of new data center deployments in the next year, and Liang hopes that will hit 30% in the next two years.

Driving this shift, he added, are several trends. One, of course, is the huge uptake of AI, which requires high-capacity computing.

Another is the improvement of DLC technology itself. Where DLC system installations used to take 4 to 12 months, Supermicro is now doing them in just 2 to 4 weeks, Liang said. Where liquid cooling used to be quite expensive, now—when TCO and energy savings are factored in—“DLC can be free, with a big bonus,” he said. And where DLC systems used to be unreliable, now they are high performing with excellent uptime.

Supermicro now has capacity to ship 1,000 rack scale solutions with liquid cooling per month, Liang said. In fact, the company is shipping over 50 liquid-cooled racks per day, with installations typically completed within just 2 weeks.

“DLC,” Liang said, “is the wave of the future.”

Do more:

 

Featured videos


Events


Find AMD & Supermicro Elsewhere

Research Roundup: AI edition

Featured content

Research Roundup: AI edition

Catch up on the latest research and analysis around artificial intelligence.

Learn More about this topic
  • Applications:
  • Featured Technologies:

Generative AI is the No. 1 AI solution being deployed. Three in 4 knowledge workers are already using AI. The supply of workers with AI skills can’t meet the demand. And supply chains can be helped by AI, too.

Here’s your roundup of the latest in AI research and analysis.

GenAI is No. 1

Generative AI isn’t just a good idea, it’s now the No. 1 type of AI solution being deployed.

In a survey recently conducted by research and analysis firm Gartner, more than a quarter of respondents (29%) said they’ve deployed and are now using GenAI.

That was a higher percentage than any other type of AI in the survey, including natural language processing, machine learning and rule-based systems.

The most common way of using GenAI, the survey found, is embedding it in existing applications. For example, using Microsoft Copilot for 365. This was cited by about 1 in 3 respondents (34%).

Other approaches mentioned by respondents included prompt engineering (cited by 25%), fine-tuning (21%) and using standalone tools such as ChatGPT (19%).

Yet respondents said only about half of their AI projects (48%) make it into production. Even when that happens, it’s slow. Moving an AI project from prototype to production took respondents an average of 8 months.

Other challenges loom, too. Nearly half the respondents (49%) said it’s difficult to estimate and demonstrate an AI project’s value. They also cited a lack of talent and skills (42%), lack of confidence in AI technology (40%) and lack of data (39%).

Gartner conducted the survey in last year’s fourth quarter and released the results earlier this month. In all, valid responses were culled from 644 executives working for organizations in the United States, the UK and Germany.

AI ‘gets real’ at work

Three in 4 knowledge workers (75%) now use AI at work, according to the 2024 Work Trend Index, a joint project of Microsoft and LinkedIn.

Among these users, nearly 8 in 10 (78%) are bringing their own AI tools to work. That’s inspired a new acronym: BYOAI, short for Bring Your Own AI.

“2024 is the year AI at work gets real,” the Work Trend report says.

2024 is also a year of real challenges. Like the Gartner survey, the Work Trend report finds that demonstrating AI’s value can be tough.

In the Microsoft/LinkedIn survey, nearly 8 in 10 leaders agreed that adopting AI is critical to staying competitive. Yet nearly 6 in 10 said they worry about quantifying the technology’s productivity gains. About the same percentage also said their organization lacks an AI vision and plan.

The Work Trend report also highlights the mismatch between AI skills demand and supply. Over half the leaders surveyed (55%) say they’re concerned about having enough AI talent. And nearly two-thirds (65%) say they wouldn’t hire someone who lacked AI skills.

Yet fewer than 4 in 10 users (39%) have received AI training from their company. And only 1 in 4 companies plan to offer AI training this year.

The Work Trend report is based on a mix of sources: a survey of 31,000 people in 31 countries; labor and hiring trends on the LinkedIn site; Microsoft 365 productivity signals; and research with Fortune 500 customers.

AI skills: supply-demand mismatch

The mismatch between AI skills supply and demand was also examined recently by market watcher IDC. It expects that by 2026, 9 of every 10 organizations will be hurt by an overall IT skills shortage. This will lead to delays, quality issues and revenue loss that IDC predicts will collectively cost these organizations $5.5 trillion.

To be sure, AI skills are currently the most in-demand skill for most organizations. The good news, IDC finds, is that more than half of organizations are now using or piloting training for GenAI.

“Getting the right people with the right skills into the right roles has never been more difficult,” says IDC researcher Gina Smith. Her prescription for success: Develop a “culture of learning.”

AI helps supply chains, too

Did you know AI is being used to solve supply-chain problems?

It’s a big issue. Over 8 in 10 global businesses (84%) said they’ve experienced supply-chain disruptions in the last year, finds a survey commissioned by Blue Yonder, a vendor of supply-chain solutions.

In response, supply-chain executives are making strategic investments in AI and sustainability, Blue Yonder finds. Nearly 8 in 10 organizations (79%) said they’ve increased their investments in supply-chain operations. Their 2 top areas of investment were sustainability (cited by 48%) and AI (41%).

The survey also identified the top supply-chain areas for AI investment. They are planning (cited by 56% of those investing in AI), transportation (53%) and order management (50%).

In addition, 8 in 10 respondents to the survey said they’ve implemented GenAI in their supply chains at some level. And more than 90% said GenAI has been effective in optimizing their supply chains and related decisions.

The survey, conducted by an independent research firm with sponsorship by Blue Yonder, was fielded in March, with the results released earlier this month. The survey received responses from more than 600 C-suite and senior executives, all of them employed by businesses or government agencies in the United States, UK and Europe.

Do more:

 

Featured videos


Events


Find AMD & Supermicro Elsewhere

Supermicro, Vast collaborate to deliver turnkey AI storage at rack scale

Featured content

Supermicro, Vast collaborate to deliver turnkey AI storage at rack scale

Supermicro and Vast Data are jointly offering an AMD-based turnkey solution that promises to simplify and accelerate AI and data pipelines.

Learn More about this topic
  • Applications:
  • Featured Technologies:

Supermicro and Vast Data are collaborating to deliver a turnkey, full-stack solution for creating and expanding AI deployments.

This joint solution is aimed at hyperscalers, cloud service providers (CSPs) and large, data-centric enterprises in fintech, adtech, media and entertainment, chip design and high-performance computing (HPC).

Applications that can benefit from the new joint offering include enterprise NAS and object storage; high-performance data ingestion; supercomputer data access; scalable data analysis; and scalable data processing.

Vast, founded in 2016, offers a software data platform that enterprises and CSPs use for data-intensive computing. The platform is based on a distributed systems architecture, called DASE, that allows a system to run read and write operations at any scale. Vast’s customers include Pixar, Verizon and Zoom.

By collaborating with Supermicro, Vast hopes to extend its market. Currently, Vast sells to infrastructure providers at a variety of scales. Some of its largest customers have built 400 petabyte storage systems, and a few are even discussing systems that would store up to 2 exabytes, according to John Mao, Vast’s VP of technology alliances.

Supermicro and Vast have engaged with many of the same CSPs separately, supporting various parts of the solution. By formalizing this collaboration, they hope to extend their reach to new customers while increasing their sell-through to current customers.

Vast is also looking to the Supermicro alliance to expand its global reach. While most of Vast’s customers today are U.S.-based, Supermicro operates in over 100 countries worldwide. Supermicro also has the infrastructure to integrate, test and ship 5,000 fully populated racks per month from its manufacturing plants in California, Netherlands, Malaysia and Taiwan.

There’s also a big difference in size. Where privately held Vast has about 800 employees, publicly traded Supermicro has more than 5,100.

Rack solution

Now Vast and Supermicro have developed a new converged system using Supermicro’s Hyper A+ servers with AMD EPYC 9004 processors. The solution combines 2 separate Vast servers. 

This converged system is well suited to large service providers, where the typical Supermicro-powered Vast rack configuration will start at about 2PB, Mao adds.

Rack-scale configurations can cut costs by eliminating the need for single-box redundancy. This converged design makes the system more scalable and more cost-efficient.

Under the hood

One highlight of the joint project: It puts Vast’s DASE architecture on Supermicro’s  industry-standard servers. Each server will have both the compute and storage functions of a Vast cluster.

At the same time, the architecture is disaggregated via a high-speed Ethernet NVMe fabric. This allows each node to access all drives in the cluster.

The Vast platform architecture uses a series of what the company calls an EBox. Each EBox, in turn, contains 2 kinds of storage servers in a container environment: CNode (short for Compute Node) and DNode (short for Data Node). In a typical EBox, one CNode interfaces with client applications and writes directly to two DNode containers.

In this configuration, Supermicro’s storage servers can act as a hardware building block to scale Vast to hundreds of petabytes. It supports Vast’s requirement for multiple tiers of solid-state storage media, an approach that’s unique in the industry.

CPU to GPU

At the NAB Show, held recently in Las Vegas, Supermicro’s demos included storage servers, each powered by a single-socket AMD EPYC 9004 Series processor.

With up to 128 PCIe Gen 5 lanes, the AMD processor empowers the server to connect more SSDs via NVMe with a single CPU. The Supermicro storage server also lets users move data directly from storage to GPU memory supporting Nvidia’s GPU Direct storage protocol, essentially bypassing a GPU cluster’s CPU using RDMA.

If you or your customers are interested in the new Vast solution, get in touch with your local Supermicro sales rep or channel partner. Under the terms of the new partnership, Supermicro is acting as a Vast integrator and OEM. It’s also Vast’s only rack-scale partner.

Do more:

 

Featured videos


Events


Find AMD & Supermicro Elsewhere

Tech Explainer: What’s the difference between AI training and AI inference?

Featured content

Tech Explainer: What’s the difference between AI training and AI inference?

AI training and inference may be two sides of the same coin, but their compute needs can be quite different. 

Learn More about this topic
  • Applications:
  • Featured Technologies:

Artificial Intelligence (AI) training and inference are two sides of the same coin. Training is the process of teaching an AI model how to perform a given task. Inference is the AI model in action, drawing its own conclusions without human intervention.

Take a theoretical machine learning (ML) model designed to detect counterfeit one-dollar bills. During the training process, AI engineers would feed the model large data sets containing thousands, or even millions, of pictures. And tell the training application which are real and which are counterfeit.

Then inference could kick in. The AI model could be uploaded to retail locations, then run to detect bogus bills.

A deeper look at training

That’s the high level. Let’s dig in a bit deeper.

Continuing with our bogus-bill detecting workload, during training, the pictures fed to the AI model would include annotations telling the AI how to think about each piece of data.

For instance, the AI might see a picture of a dollar bill with an embedded annotation that essentially tells the model “this is legal tender.” The annotation could also identify characteristics of a genuine dollar, such as the minute details of the printed iconography and the correct number of characters in the bill’s serial number.

Engineers might also feed the AI model pictures of counterfeit bills. That way, the model could learn the tell-tale signs of a fake. These might include examples of incomplete printing, color discrepancies and missing watermarks.

On to inference

One the training is complete, inference can take over.

Still with our example of counterfeit detection, the AI model could now be uploaded to the cloud, then connected with thousands of point-of-sale (POS) devices in retail locations worldwide.

Retail workers would scan any bill they suspect might be fake. The machine learning model, in turn, would then assess the bill’s legitimacy.

This process of AI inference is autonomous. In other words, once the AI enters inference, it’s no longer getting help from engineers and app developers.

Using our example, during inference the AI system has reached the point where it can reliably discern both legal and counterfeit bills. And it can do so with a high enough success percentage to satisfy its human controllers.

Different needs

AI training and inference also have different technology requirements. Basically, training is far more resource-intensive. The focus is on achieving low-latency operation and brute force.

Training a large language model (LLM) chatbot like the popular ChatGPT often forces its underlying technology to contend with more than a trillion parameters. An AI parameter is a variable learned by the LLM during training. These parameters include configuration settings and components that define the LLM’s behavior.)

To meet these requirements, IT operations must deploy a system that can bring to bear raw computational power in a vast cluster.

By contrast, inference applications have different compute requirements. “Essentially, it’s, ‘I’ve trained my model, now I want to organize it,’” explained AMD executive VP and CTO Mark Papermaster in a recent virtual presentation.

AMD’s dual-processor solution

Inferencing workloads are both more concise and less demanding than those for training. Therefore, it makes sense to run them on more affordable GPU-CPU combination technology like the AMD Instinct MI300A.

The AMD Instinct MI300A is an accelerated processing unit (APU) that combines the facility of a standard AI accelerator with the efficiency of AMD EPYC processors. Both the CPU and GPU elements can share memory, dramatically enhancing efficiency, flexibility and programmability.

A single AMD MI300A APU packs 228 GPU compute units, 24 of AMD’s ‘Zen 4’ CPU cores, and 128GB of unified HBM3 memory. Compared with the previous-generation AMD MI250X accelerators, this translates to approximately 2.6x the workload performance per watt using FP32.

That’s a significant increase in performance. It’s likely to be repeated as AI infrastructure evolves along with the proliferation of AI applications that now power our world.

Do more:

 

 

Featured videos


Events


Find AMD & Supermicro Elsewhere

Research Roundup: Tech leaders’ time, GenAI for HR, network security in the cloud, dangerous dating sites

Featured content

Research Roundup: Tech leaders’ time, GenAI for HR, network security in the cloud, dangerous dating sites

Catch up on the latest IT market research from MIT, Gartner, Dell’Oro Group and others. 

Learn More about this topic
  • Applications:

Tech leaders are spending more time with the channel. HR execs are getting serious about GenAI. Network security is moving to the cloud. And online dating sites can be dangerous.

That’s some of the latest and greatest from leading IT researchers. And here’s your Performance Intensive Computing roundup.

Tech leaders spend more time with the channel

C-level technology executives in 2022 spent 17% of their time working with external customers and channel partners, up from 10% of their time in 2007, according to a recent report from the MIT Center for Information Systems Research (CISR).

Good news, right? Well, conversely, these same tech leaders spent less time collaborating with their coworkers and working on their organizations’ technology stacks. Guess something had to give.

The report, published earlier this year, is the work of three MIT researchers. To compile the data, they reviewed surveys of CIOs, CTOs and CDOs conducted in 2007, 2016 and 2022.

Why the 7% increase in time spent with external customers and channel partners? According to the MIT researchers, it’s the “growing number of digital touchpoints.”

HR execs using GenAI

Nearly 4 in 10 HR executives (38%) are now piloting, planning to implement, or have already implemented generative AI, finds research firm Gartner. That’s up sharply from just 19% of HR execs as recently as last June.

The results come from a quick Gartner survey. This past Jan. 31, the firm polled nearly 180 HR execs.

One of the survey’s key findings: “More organizations are moving from exploring how GenAI might be used…to implementing solutions,” says Dion Love, a VP in Gartner’s HR practice.

Gartner’s January survey also found 3 top use cases for GenAI in HR:

  • HR service delivery: Of those working with GenAI, over 4 in 10 (43%) are using the technology for employee-facing chatbots.
  • HR operations: Nearly as many (42%) are working with GenAI for administrative tasks, policies and generating documents.
  • Recruiting: About the same percentage (41%) are working with GenAI for job descriptions and skills data.

Yet all this work is not leading to many new GenAI-related job roles. Over two-thirds of the respondents (67%) said they do not plan to add any GenAI-related roles to the HR function over the next 12 months.

Network security moving to the cloud

Sales of SaaS-based and virtual network-security solutions surged last year by 26%, reaching a global total of $9.6 billion. By contrast, the overall network-security market shrank by 1%.

That’s according to a report from Dell’Oro Group. It calls the move to network-security solutions in the cloud a “pivotal shift.”

Dell’Oro senior director Mauricio Sanchez goes even further. He calls the industry’s gravitation toward SaaS and virtual solutions “nothing short of revolutionary.”

Also, nearly $5 billion of that $9.6 billion market was due to a 30% rise in spending on SSE networks, Dell’Oro says. SSE, short for Security Service Edge, incorporates various services—including network service brokering, identity service brokering, and security as a service—in a single package.

Looking for love online? Be careful

Nearly 7 out of 10 online daters have been scammed while using dating sites. Some of the victims lost money; others risked their personal security.

That’s according to a survey conducted for ID Shield. The survey was limited, reaching only about 270 people. But all respondents had at least used a dating app in the last 3 years.

The survey’s key findings:

  • Financial loss: Six in 10 scam victims on dating sites lost more than $10,000 to the crooks. And slightly more (64%) disclosed personal and finance information that was later used against them.
  • ID theft: Nearly 7 in 10 respondents were asked to verify their identity to someone on the dating app. And nearly two-thirds (65%) divulged their Social Security numbers. 
  • Repeat users: You might think the victims would learn. But 93% of users who were scammed once on a dating app say they continue to use the same app. Let’s hope they’re at least being careful.

 

Featured videos


Events


Find AMD & Supermicro Elsewhere

AMD and Supermicro: Pioneering AI Solutions

Featured content

AMD and Supermicro: Pioneering AI Solutions

In the constantly evolving landscape of AI and machine learning, the synergy between hardware and software is paramount. Enter AMD and Supermicro, two industry titans who have joined forces to empower organizations in the new world of AI with cutting-edge solutions.

Learn More about this topic
  • Applications:
  • Featured Technologies:

Bringing AMD Instinct to the Forefront

In the constantly evolving landscape of AI and machine learning, the synergy between hardware and software is paramount. Enter AMD and Supermicro, two industry titans who have joined forces to empower organizations in the new world of AI with cutting-edge solutions. Their shared vision? To enable organizations to unlock the full potential of AI workloads, from training massive language models to accelerating complex simulations.

The AMD Instinct MI300 Series: Changing The AI Acceleration Paradigm

At the heart of this collaboration lies the AMD Instinct MI300 Series—a family of accelerators designed to redefine performance boundaries. These accelerators combine high-performance AMD EPYC™ 9004 series CPUs with the powerful AMD InstinctTM MI300X GPU accelerators and 192GB of HBM3 memory, creating a formidable force for AI, HPC, and technical computing.

Supermicro’s H13 Generation of GPU Servers

Supermicro’s H13 generation of GPU Servers serves as the canvas for this technological masterpiece. Optimized for leading-edge performance and efficiency, these servers integrate seamlessly with the AMD Instinct MI300 Series. Let’s explore the highlights:

8-GPU Systems for Large-Scale AI Training:

  • Supermicro’s 8-GPU servers, equipped with the AMD Instinct MI300X OAM accelerator, offer raw acceleration power. The AMD Infinity Fabric™ Links enable up to 896GB/s of peak theoretical P2P I/O bandwidth, while the 1.5TB HBM3 GPU memory fuels large-scale AI models.
  • These servers are ideal for LLM Inference and training language models with trillions of parameters, minimizing training time and inference latency, lowering the TCO and maximizing throughput.

Benchmarking Excellence

But what about real-world performance? Fear not! Supermicro’s ongoing testing and benchmarking efforts have yielded remarkable results. The continued engagement between AMD and Supermicro performance teams enabled Supermicro to test pre-release ROCm versions with the latest performance optimizations and publicly released optimization like Flash Attention 2 and vLLM. The Supermicro AMD-based system AS -8125GS-TNMR2 showcases AI inference prowess, especially on models like Llama-2 70B, Llama-2 13B, and Bloom 176B. The performance? Equal to or better than AMD’s published results from the Dec. 6 Advancing AI event.

Image - Blog - AMD and Supermicro Pioneering AI Solutions

Charles Liang’s Vision

In the words of Charles Liang, President and CEO of Supermicro:

“We are very excited to expand our rack scale Total IT Solutions for AI training with the latest generation of AMD Instinct accelerators. Our proven architecture allows for fully integrated liquid cooling solutions, giving customers a competitive advantage.”

Conclusion

The AMD-Supermicro partnership isn’t just about hardware and software stacks; it’s about pushing boundaries, accelerating breakthroughs, and shaping the future of AI. So, as we raise our virtual glasses, let’s toast to innovation, collaboration, and the relentless pursuit of performance and excellence.

Featured videos


Events


Find AMD & Supermicro Elsewhere

Pages