We’ve come a long way in the development of high-performance computing. Back in 2004, I attended an event held in the gym at the University of San Francisco. The goal was to crowd-source computing power by connecting the PCs of volunteers who were participating in the first “Flash Mob Computing” cluster computing event. Several hundred PCs were networked together in the hope that they would create one of the largest supercomputers, albeit only for a few hours.
I brought two laptops for the cause. The participation rules stated that the data on our hard drives would remain intact. Each computer would boot from a specially crafted CD that ran Linux and Linpack, a benchmark built on a software library for numerical linear algebra. The benchmark was used to measure the cluster’s collective computing power.
The event attracted people with water-cooled overclocked PCs, naked PCs (no cases, just the boards and other components) and custom-made rigs with fancy cases. After a few hours, we had roughly 650 PCs on the floor of the gym. Each PC was connected to a bunch of Foundry BigIron super-switches that were located around the room.
The 2004 experiment brought out several industry luminaries, such as Gordon Bell, who was the father of the Digital Equipment Corporation VAX minicomputer, and Jim Gray, who was one of the original designers behind the TPC benchmark while he was at Tandem. Both men at the time were Microsoft fellows. Bell was carrying his own laptop but had forgotten to bring his CD drive, so he couldn’t connect to the mob.
What was most interesting to me, and what gave rise to the mob’s eventual undoing, were the networking issues involved with assembling and running such a huge collection of gear. The mob used ordinary 100BaseT Ethernet, which was a double-edged sword. While easy to set up, it was difficult to debug when network problems arose. The Linpack benchmark requires all the component machines to be running concurrently during the test, and the organizers had trouble getting all 600-plus PCs to operate online flawlessly. The best benchmark accomplished was a peak rate of 180 gigaflops using 256 computers, but that wasn’t an official score as one node failed during the test.
To give you an idea of where this stood in terms of overall supercomputing prowess, it was better than the Cray supercomputers of the early 1990s, which delivered around 16 gigaflops.
If you look at the website top500.org (which tracks the fastest supercomputers around the globe), you can see that all the current top 500 machines are measured in petaflops (one petaflop is 1 million gigaflops). The Oak Ridge National Laboratory’s Frontier machine, which has occupied the number one spot this year, weighs in at more than 1,000 petaflops and uses 8 million cores. To make the fastest 500 list back in 2004, the mob would have had to achieve a benchmark of over 600 gigaflops. Because of the networking problems, we’ll never know for sure whether it could have.
Still, it was an impressive achievement, given the motley mix of machines. All of the world’s top 500 supercomputers are custom built and carefully curated and assembled to attain that level of computing performance.
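For a sense of scale, the figures quoted above can be put on a common footing with a few lines of Python. This is only a back-of-the-envelope sketch: the variable names are my own, and the numbers are the rounded values from this article, not official benchmark results.

```python
# Rounded figures from the article, all expressed in gigaflops.
GIGAFLOPS_PER_PETAFLOP = 1_000_000

flash_mob_peak = 180        # best (unofficial) run, using 256 PCs
flash_mob_nodes = 256
cray_early_1990s = 16       # approximate figure quoted in the article
top500_cutoff_2004 = 600    # approximate entry threshold quoted in the article
frontier = 1_000 * GIGAFLOPS_PER_PETAFLOP  # "more than 1,000 petaflops"

# Average contribution of each PC in the best run.
per_pc = flash_mob_peak / flash_mob_nodes

print(f"Average per PC in the best run: {per_pc:.2f} gigaflops")
print(f"Mob vs. early-1990s Cray: {flash_mob_peak / cray_early_1990s:.2f}x")
print(f"Share of the 2004 cutoff reached: {flash_mob_peak / top500_cutoff_2004:.0%}")
print(f"Frontier vs. mob: {frontier / flash_mob_peak:,.0f}x")
```

In other words, each PC in the best run contributed well under one gigaflop on average, the mob reached less than a third of what it needed to make the 2004 list, and Frontier is millions of times faster.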
Another historical note: back in 2004, one of the more interesting entries came in third on the top500.org list: a collection of several thousand Apple Macintoshes running at Virginia Tech (Virginia Polytechnic Institute and State University). Back in the present, as you might imagine, almost all the fastest 500 supercomputers are based on a combination of CPU and GPU chip architectures.
Today, you can buy your own supercomputer on the retail market, such as the Supermicro SuperBlade® models. And of course, you can routinely run much faster networking protocols than 100-megabit Ethernet.