Sponsored by AMD and Supermicro

Performance Intensive Computing: Capture the full potential of IT

AMD and Supermicro Sponsor Two Fastest Linpack Scores at SC22’s Student Cluster Competition


The Student Cluster Competition made its 16th appearance at the SuperComputing 22 (SC22) event in Dallas. The two student teams running AMD EPYC™ CPUs and AMD Instinct™ GPUs posted the two fastest scores on the Linpack benchmark, the test used to determine the TOP500 ranking of the world's fastest supercomputers.


Last month, the annual Supercomputing Conference 2022 (SC22) was held in Dallas, and with it the Student Cluster Competition (SCC), which began in 2007. The SCC offers an immersive high-performance computing (HPC) experience to undergraduate and high school students.

 

According to the SC22 website: “Student teams design and build small clusters, learn scientific applications, apply optimization techniques for their chosen architectures and compete in a non-stop, 48-hour challenge at the SC conference to complete real-world scientific workloads, showing off their HPC knowledge for conference attendees and judges.”

 

Each team consists of six students, at least one faculty advisor and a student team leader, and is paired with vendor sponsors, which provide the equipment. AMD and Supermicro jointly sponsored both the Massachusetts Green Team from MIT, Boston University and Northeastern University and the 2MuchCache team from UC San Diego (UCSD) and the San Diego Supercomputer Center (SDSC). Running AMD EPYC™ CPUs and AMD Instinct™-based GPUs supplied by AMD and Supermicro, the two teams finished first and second on the SCC Linpack test.

 

The Linpack benchmarks measure a system's floating-point computing power, according to Wikipedia. The latest version of these benchmarks is used to determine the TOP500 list, which ranks the world's most powerful supercomputers.
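To make the benchmark concrete, here is a hedged toy sketch in Python/NumPy of the idea behind a Linpack score. The real benchmark (HPL) is a tuned, distributed C/Fortran code; this only reproduces the basic arithmetic: time a dense LU-based solve and divide the nominal floating-point operation count by the elapsed time.

```python
# Toy Linpack-style measurement (illustrative only, not HPL itself).
# HPL solves a dense linear system via LU factorization and credits
# roughly (2/3)*n^3 + 2*n^2 floating-point operations to the run.
import time
import numpy as np

def toy_linpack_gflops(n: int = 2000, seed: int = 0) -> float:
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((n, n))
    b = rng.standard_normal(n)
    start = time.perf_counter()
    x = np.linalg.solve(a, b)              # LU-based dense solve
    elapsed = time.perf_counter() - start
    # Verify the answer before crediting the FLOPs, as HPL does
    assert np.allclose(a @ x, b, atol=1e-6)
    flops = (2.0 / 3.0) * n**3 + 2.0 * n**2
    return flops / elapsed / 1e9           # GFLOP/s

print(f"{toy_linpack_gflops():.1f} GFLOP/s")
```

The competition version of this idea runs across a whole cluster, with the matrix distributed over many nodes and GPUs.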

 

In addition to chasing high scores on benchmarks, the teams must operate their systems without exceeding a power limit. For 2022, the competition used a variable power limit: the power available to each team's competition hardware ranged from as low as 1,500 watts to as high as 4,000 watts, and usually sat somewhere between those extremes.
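The power cap turns the contest into a performance-per-watt problem. A minimal sketch, with hypothetical numbers chosen only for illustration:

```python
# Hypothetical figures for illustration; real scores depend on the
# hardware and tuning. Under a power cap, teams optimize GFLOP/s per
# watt, and chips usually get less efficient as they are pushed harder.
def perf_per_watt(gflops: float, watts: float) -> float:
    return gflops / watts

low_cap  = perf_per_watt(gflops=30_000.0, watts=1_500.0)  # 20.0 GFLOP/s per W
high_cap = perf_per_watt(gflops=70_000.0, watts=4_000.0)  # 17.5 GFLOP/s per W
assert low_cap > high_cap  # efficiency typically drops near max power
```

This is why teams spend much of the 48 hours tuning clock speeds and job placement rather than simply running everything flat out.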

 

The “2MuchCache” team offers a poster page with extensive detail about its competition hardware. The team used two third-generation AMD EPYC™ 7773X CPUs, each with 64 cores, 128 threads and 768MB of stacked-die cache, in one Supermicro AS-4124GQ-TNMI system with four AMD Instinct™ MI250 GPUs.

 

The “Green Team’s” poster page lists two third-generation AMD EPYC™ 7003-series processors and AMD Instinct™ MI210 GPUs connected with AMD Infinity Fabric. The Green Team utilized two Supermicro AS-4124GS-TNR GPU systems.

 

The Students of 2MuchCache:

Longtian Bao, role: Lead for Data Centric Python, Co-lead for HPCG

Stefanie Dao, role: Lead for PHASTA, Co-lead for HPL

Michael Granado, role: Lead for HPCG, Co-lead for PHASTA

Yuchen Jing, role: Lead for IO500, Co-lead for Data Centric Python

Davit Margarian, role: Lead for HPL, Co-lead for LAMMPS

Matthew Mikhailov Major, role: Team Lead, Lead for LAMMPS, Co-lead for IO500

 

The Students of Green Team:

Po Hao Chen, roles: Team leader, theory & HPC, benchmarks, reproducibility

Carlton Knox, roles: Computer Arch., Benchmarks, Hardware

Andrew Nguyen, roles: Compilers & OS, GPUs, LAMMPS, Hardware

Vance Raiti, roles: Mathematics, Computer Arch., PHASTA

Yida Wang, roles: ML & HPC, Reproducibility

Yiran Yin, roles: Mathematics, HPC, PHASTA

 

Congratulations to both teams!

Featured videos


Events




Find AMD & Supermicro Elsewhere

Related Content

Perspective: Don’t Back into Performance-Intensive Computing


To compete in the marketplace, enterprises are increasingly employing performance-intensive tools and applications like machine learning, artificial intelligence, data-driven insights and automation to differentiate their products and services. In doing so, they may be unintentionally backing into performance-intensive computing because these technologies are computationally and/or data intensive.


To compete in the marketplace, enterprises are increasingly employing performance-intensive tools and applications like machine learning, artificial intelligence, data-driven insights and decision-support analytics, technical computing, big data, modeling and simulation, cryptocurrency and other blockchain applications, automation and high-performance computing to differentiate their products and services.

 

In doing so, they may be unintentionally backing into performance-intensive computing because these technologies are computationally and/or data intensive. Without thinking through the compute performance you need as measured against your most demanding workloads – now and at least two years from now – you’re setting yourself up for failure or unnecessary expense. When it comes to performance-intensive computing: plan, don’t dabble.

 

There are questions you should ask before jumping in, too. In the cloud or on-premises? There are pluses and minuses to each. Is your data highly distributed? If so, you’ll need network services that won’t become a bottleneck. There’s a long list of environmental and technology needs that are required to make performance-intensive computing pay off. Among them is making it possible to scale. And, of course, planning and building out your environment in advance of your need is vastly preferable to stumbling into it.

 

The requirement that sometimes gets short shrift is organizational. Ultimately, this is about revealing data with which your company can make strategic decisions. There’s no longer anything mundane about enterprise technology and especially the data it manages. It has become so important that virtually every department in your company affects and is affected by it. If you double down on computational performance, the C-suite needs to be fully represented in how you use that power, not just the approval process. Leaving top leadership, marketing, finance, tax, design, manufacturing, HR or IT out of the picture would be a mistake. And those are just sample company building blocks. You also need measurable, meaningful metrics that will help your people determine the ROI of your efforts. Even so, it’s people who make the leap of faith that turns data into ideas.

 

Finally, if you don’t already have the expertise on staff to learn the ins and outs of this endeavor, hire or contract or enter into a consulting arrangement with smart people who clearly have the chops to do this right. You don’t want to be the company with a rocket ship that no one can fly.

 

So, don’t back into performance-intensive computing. But don’t back out of it either. Being able to take full advantage of your data at scale can play an important role in ensuring the viability of your company going forward.

 



Some Key Drivers behind AMD’s Plans for Future EPYC™ CPUs


A video discussion between Charles Liang, Supermicro CEO, and Dr. Lisa Su, AMD CEO.

 


Higher clock rates, more cores and larger onboard memory caches are some of the traditional areas of improvement for generational CPU upgrades. Performance improvements are almost a given with a new-generation CPU. Increasingly, however, the more difficult challenges for data centers and performance-intensive computing are energy efficiency and managing heat. Energy costs have spiked in many parts of the world, and “performance per watt” is what many companies are looking for. AMD’s 4th-gen EPYC™ CPU runs a little hotter than its predecessor, but its performance gains far outpace the thermal rise, making for much greater performance per watt. It’s a trade-off that makes sense, especially for performance-intensive computing such as HPC and technical computing applications.

In addition to the energy efficiency and heat dissipation concerns, Dr. Su and Mr. Liang discuss the importance of the AMD EPYC™ roadmap. You’ll learn one or two nuances about AMD’s plans. Supermicro is ready with 15 products that leverage Genoa, AMD’s fourth-generation EPYC™ CPU. This under-15-minute video, recorded on November 15, 2022, will bring you up to date on all things AMD EPYC™. Click the link to see the video:

Supermicro & AMD CEOs Video – The Future of Data Center Computing

 

 

 

 


Locating Where to Drill for Oil in Deep Waters with Supermicro SuperServers® and AMD EPYC™ CPUs


Energy company Petrobras, based in Brazil, is using high-performance computing techniques to aid its oil and gas exploration, especially in deep-water situations. Petrobras engaged system integrator Atos to provide more than 250 Supermicro SuperServers. The cluster, named Pegaso, is ranked number 33 on the current TOP500 list.


Brazilian energy company Petrobras is using high-performance computing techniques to aid its oil and gas exploration, especially in deep-water situations. These techniques can help reduce costs and make finding and extracting new hydrocarbon deposits quicker. Petrobras' geoscientists and software engineers quickly modify algorithms to take advantage of new capabilities as new CPU and GPU technologies become available.

 

The energy company engaged system integrator Atos to provide more than 250 Supermicro SuperServer AS-4124GO-NART+ servers running dual AMD EPYC™ 7512 processors. The cluster goes by the name Pegaso (Portuguese for Pegasus, the mythological winged horse) and is currently listed at number 33 on the TOP500 list of the world's fastest computing systems. Atos is a global leader in digital transformation with 112,000 employees worldwide. The company has built other systems that appear on the TOP500 list, 38 of them powered by AMD.

 

Petrobras has had three other systems listed on previous iterations of the TOP500 list, using other processors. Pegaso is now the largest supercomputer in South America and is expected to become fully operational next month. Each of its servers runs CentOS and has 2TB of memory, for a total of 678TB. The cluster contains more than 230,000 processor cores, runs more than 2,000 GPUs and is connected via an InfiniBand HDR network running at 400Gb/s. To give you an idea of how much gear is involved, Pegaso took more than 30 truckloads to deliver and consists of over 30 tons of hardware.
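As a quick sanity check on the figures above (my arithmetic, not an official spec), the memory totals and the server count are consistent with each other:

```python
# Cross-checking the article's numbers: total memory / memory per server.
total_memory_tb = 678
memory_per_server_tb = 2
implied_servers = total_memory_tb // memory_per_server_tb
print(implied_servers)  # 339 servers, consistent with "more than 250"
```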

 

The geophysics team has a series of applications that require all this computing power, including seismic acquisition apps that collect data, which is then processed to deliver high-resolution subsurface imaging that precisely locates oil and gas deposits. The GPU accelerators in the cluster help reduce processing time, so the drilling teams can position their rigs more precisely.

 

For more information, see this case study about Pegaso.


Choosing the Right AI Infrastructure for Your Needs


AI architecture must scale effectively without sacrificing cost efficiency. One size does not fit all.


Building an agile, cost-effective environment that delivers on a company’s present and long-term AI strategies can be a challenge, and the impact of decisions made around that architecture will have an outsized effect on performance.

 

“AI capabilities are probably going to be 10%-15% of the entire infrastructure,” says Ashish Nadkarni, IDC group vice president and general manager, infrastructure systems, platforms and technologies. “But the amount the business relies on that infrastructure, the dependence on it, will be much higher. If that 15% doesn’t behave in the way that is expected, the business will suffer.”

 

Experts like Nadkarni note that companies can, and should, avail themselves of cloud-based options to test and ramp up AI capabilities. But as workloads scale and enterprise usage expands over time, the costs associated with cloud computing can rise significantly, making on-premises architecture a valid alternative worth considering.

 

No matter the industry, to build a robust and effective AI infrastructure, companies must first accurately diagnose their AI needs. What business challenges are they trying to solve? What forms of high-performance computing power can deliver solutions? What type of training is required to deliver the right insights from data? And what’s the most cost-effective way for a company to support AI workloads at scale and over time? Cloud may be the answer to get started, but for many companies on-prem solutions are viable alternatives.

 

“It’s a matter of finding the right configuration that delivers optimal performance for [your] workloads,” says Michael McNerney, vice president of marketing and network security at Supermicro, a leading provider of AI-capable, high-performance servers, management software and storage systems. “How big is your natural language processing or computer vision model, for example? Do you need a massive cluster for AI training? How critical is it to have the lowest latency possible for your AI inferencing? If the enterprise does not have massive models, does it move down the stack into smaller models to optimize infrastructure and cost on the AI side as well as in compute, storage and networking?”

 

Get perspective on these and other questions about selecting the right AI infrastructure for your business in the Nov. 20, 2022, Wall Street Journal paid program article:

 

Investing in Infrastructure

 


Supermicro H13 Servers Maximize Your High-Performance Data Center



The modern data center must be both highly performant and energy efficient. Massive amounts of data are generated at the edge and then analyzed in the data center. New CPU technologies are constantly being developed that can analyze data, determine the best course of action, and speed up the time to understand the world around us and make better decisions.

With digital transformation continuing, a wide range of data acquisition, storage and computing systems continues to evolve with each new CPU generation. The latest CPU generations continue to innovate within their core computational units and in the technology used to communicate with memory, storage devices, networking and accelerators.

Servers, and by extension the CPUs within them, form a continuum of computing and I/O power. The combination of cores, clock rates, memory access, path width and performance suits specific servers to specific workloads. In addition, the server that houses the CPUs may take different form factors for environments with airflow or power restrictions. The key for a server manufacturer to address a wide range of applications is a building-block approach to designing new systems. In this way, a range of systems can be released simultaneously in many form factors, each tailored to its operating environment.

The new H13 Supermicro product line, based on 4th Generation AMD EPYC™ CPUs, supports a broad spectrum of workloads and excels at helping a business achieve its goals.

Get speeds, feeds and other specs on Supermicro’s latest line-up of servers


Manage Your HPC Resources with Supermicro's SuperCloud Composer



Today’s data center has numerous challenges: provisioning hardware and cloud workloads, balancing the needs of performance-intensive applications across compute, storage and network resources, and having a consistent monitoring and analytics framework to feed intelligent systems management. Plus, you may have the need to deploy or re-deploy all these resources as needs shift, moment to moment.

Supermicro has created its own tool, SuperCloud Composer (SCC), to monitor and manage this broad IT portfolio and assist with these decisions. It combines a standardized web-based interface, built on the Open Distributed Infrastructure Management (ODIM) framework, with a unified dashboard based on the Redfish message bus and service agents.

SCC can track the various resources and assign them to different pools with its own predictive analytics and telemetry. It delivers a single intelligent management solution that covers both existing on-premises IT equipment as well as a more software-defined cloud collection. Additional details can be found in this SuperCloud Composer white paper.

SuperCloud Composer makes use of a cluster-level PCIe network using the FabreX software from GigaIO Networks. It can flexibly scale storage systems up and out while using the lowest-latency paths available.

It also supports Weka.IO cluster members, which can be deployed across multiple systems simultaneously. See our story The Perfect Combination: The Weka Next-Gen File System, Supermicro A+ Servers and AMD EPYC™ CPUs.

SCC can create automated installation playbooks in Ansible, including a software boot image repository that can quickly deploy new images across the server infrastructure. It has a fast-deploy feature that allows a new image to be deployed within seconds.

SuperCloud Composer offers a robust analytics engine that collects historical and up-to-date analytics stored in an indexed database within its framework. This data can produce a variety of charts, graphs and tables so that users can better visualize what is happening with their server resources. Each user gets analytics-capable charting covering IOPS, network, telemetry, thermal, power, composed-node status, storage allocation and system status.

Last but not least, SCC also provides network provisioning and storage-fabric provisioning: build plans are pushed to data or fabric switches as either single-threaded or multithreaded operations, so multiple switches can be updated simultaneously from shared or unique build-plan templates.

For more information, watch this short SCC explainer video. Or schedule an online demo of SCC and request a free 90-day trial of the software.


Supermicro Debuts New H13 Server Solutions Using AMD’s 4th-Gen EPYC™ CPUs



Last week, Supermicro announced its new H13 A+ server solutions, featuring the latest fourth-generation AMD EPYC™ processors. The new AMD “Genoa”-class Supermicro A+ configurations will be able to handle up to 96 Zen 4 CPU cores and up to 6TB of 12-channel DDR5 memory, with one DIMM per memory channel.

The various systems are designed to support the most demanding performance-intensive computing workloads over a wide range of storage, networking and I/O configuration options. They also feature tool-less chassis and hot-swappable modules for easier access to internal parts, as well as I/O drive trays on both front and rear panels. All the new equipment can handle a range of power conditions, including 120- to 480-volt AC operation and 48-volt DC power attachments.

The new H13 systems have been optimized for AI, machine learning and complex calculation tasks for data analytics and other kinds of HPC applications. Supermicro’s 4th-Gen AMD EPYC™ systems employ the latest PCIe 5.0 connectivity throughout their layouts to speed data flows and provide high network and cluster internetworking performance. At the heart of these systems is the AMD EPYC™ 9004 series CPUs, which were also announced last week.

The Supermicro H13 GrandTwin® systems can handle up to six SATA3 or NVMe drive bays, which are hot-pluggable. The H13 CloudDC systems come in 1U and 2U chassis that are designed for cloud-based workloads and data centers that can handle up to 12 hot-swappable drive bays and support the Open Compute Platform I/O modules. Supermicro has also announced its H13 Hyper configuration for dual-socketed systems. All of the twin-socket server configurations support 160 PCIe 5.0 data lanes.

There are several GPU-intensive configurations for another series of both 4U and 8U sized servers that can support up to 10 GPU PCIe accelerator cards, including the latest graphic processors from AMD and Nvidia. The 4U family of servers support both AMD Infinity Fabric Link and NVIDIA NVLink Bridge technologies so users can choose the right balance of computation, acceleration, I/O and local storage specifications.

To get a deep dive on H13 products, including speeds, feeds and specs, download this whitepaper from the Supermicro site: Supermicro H13 Servers Enable High-Performance Data Centers.


How the New EPYC CPUs Deliver System-on-Chip Electronics


CPU chipsets are not normally considered systems-on-chip (SoC), but the fourth generation of AMD EPYC processors incorporates numerous I/O functions at a high level of integration.

CPU chipsets are not normally considered systems-on-chip (SoC), but the fourth generation of AMD EPYC processors incorporates numerous I/O functions at a high level of integration. Previous generations delivered this functionality on external chipsets. The SoC design helps reduce power consumption and packaging costs, and improves data throughput by reducing interconnect latencies.
 
The new EPYC processors have 12 DDR5 memory controllers, 50 percent more than any other x86 CPU, which keeps up with the higher memory demands of performance-intensive computing applications. As we mentioned in an earlier blog, these controllers also include inline encryption engines supporting AMD’s Infinity Guard features, including an integrated security processor that establishes a secure root of trust and handles other security tasks.
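A hedged back-of-envelope shows what 12 controllers buy you, assuming DDR5-4800 (the transfer rate commonly cited for this CPU generation; actual sustained bandwidth is lower than this theoretical peak):

```python
# Theoretical peak memory bandwidth: channels * transfer rate * bus width.
# DDR5-4800 = 4,800 MT/s; each channel moves 8 bytes (64 bits) per transfer.
def ddr5_peak_gbs(channels: int = 12, mts: int = 4800,
                  bytes_per_transfer: int = 8) -> float:
    return channels * mts * bytes_per_transfer / 1000.0  # GB/s

print(ddr5_peak_gbs())  # 460.8 GB/s theoretical peak per socket
```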
 
They also include 128 or 160 lanes of PCIe Gen 5 connectivity, which helps with the higher I/O throughput these more demanding applications require. Some of the same physical interfaces can operate as Infinity Fabric connections, providing remote memory access among CPU sockets at up to 36 GB/s. The new Zen 4 CPU cores can make use of one or two of these interfaces.
 
The PCIe Gen 5 I/O is supported in the I/O die by eight serializer/deserializer (SerDes) controllers, each with an independent set of traces supporting a port of 16 PCIe lanes.
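For scale, here is a sketch of the raw signaling arithmetic behind those lanes (delivered throughput is lower once protocol overhead is accounted for): PCIe Gen 5 runs at 32 GT/s per lane with 128b/130b encoding.

```python
# Raw PCIe Gen 5 bandwidth per direction: 32 GT/s per lane, 128b/130b encoding.
def pcie5_gbs(lanes: int) -> float:
    return 32 * lanes * (128 / 130) / 8  # GB/s per direction

x16_port    = pcie5_gbs(16)   # ~63 GB/s for one x16 port
full_socket = pcie5_gbs(128)  # ~504 GB/s across all 128 lanes
print(f"{x16_port:.1f} GB/s per x16 port, {full_socket:.0f} GB/s per socket")
```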
 
 


AMD’s Infinity Guard Selected by Google Cloud for Confidential Computing


Google Cloud has been working with AMD over the past several years on developing new on-chip security protocols. More on the release of the AMD EPYC™ 9004 series processors in part three of this four-part series.


 
 
Google Cloud has been working with AMD over the past several years on developing new on-chip security protocols, which have seen further innovation with the release of the AMD EPYC™ 9004 series processors. These have a direct benefit for performance-intensive computing applications, particularly in supporting higher-density virtual machines (VMs), protecting data from leaving the confines of what Google calls confidential VMs, and further isolating VM hypervisors. Google Cloud offers a collection of N2D and C2D instances that support these confidential VMs.
 
“Product security is always our top focus,” said AMD CTO Mark Papermaster. “We are continuously investing and collaborating in the security of these technologies.” 
 
Royal Hansen, VP of engineering for Google Cloud, said: “Our customers expect the most trustworthy computing experience on the planet. Google and AMD have a long history and a variety of relationships with the deepest experts on security and chip development. This was at the core of our going to market with AMD’s security solutions for datacenters.”
 
The two companies also worked together on this security analysis.
 
Collectively called Infinity Guard, the security technologies they've been working on involve four initiatives:
 
1. Secure encrypted virtualization provides each VM with its own unique encryption key known only to the processor.
 
2. Secure nested paging complements this virtualization to protect each VM from any malicious hypervisor attacks and provide for an isolated and trusted environment.
 
3. AMD’s secure boot, along with Trusted Platform Module attestation of the confidential VMs, happens every time a VM boots, ensuring the VM's integrity and mitigating any persistent threats.
 
4. AMD’s secure memory encryption and integration into the memory channels speed performance.
 
These technologies are combined and communicate using the AMD Infinity Fabric pathways to deliver breakthrough performance along with better secure communications.
 
