A (Red) Rising Star in the Latest Top500 Supercomputer List

The 35th Top500 supercomputer list has just been released at ISC2010 in Germany, and it’s got some very interesting things in it.

Firstly, China has just got the #2 system on the Top500 with an nVidia GPU based cluster called Nebulae. At 1.271PF measured (Rmax) it’s just over 70% of the performance of the current #1 system Jaguar, but (if you believe that’s worth anything) its theoretical peak of 2.9PF beats Jaguar’s 2.3PF – this means that if they can optimise Linpack some more for the architecture then perhaps they have a shot at overtaking Jaguar and taking #1 (assuming nothing larger comes along in the next 6 months).

Secondly, China has also taken the #7 spot with an AMD GPU based cluster (notice a pattern here?), now has 24 systems in the Top500, and has overtaken Germany to take the #2 spot in terms of total performance by country at 2.9PF, though that’s a long way behind the US with over 17PF total Rmax. I think the Chinese have arrived with a vengeance and I suspect they’re going to carry on boosting their capacity, especially as both of their Top 10 systems were built by Chinese organisations.

Linux continues its domination of the Top500, increasing its share of systems from 446 (89.2%) in November 2009 to 455 (91%) today. Windows has just 5 systems in total, unchanged from the November list. Tellingly, they appear to be the same 5 as in November as the stats are identical – there may be stagnation in the uptake of Windows HPC at the high end.

Australia has just one system in the Top500, the Bureau of Meteorology / HPCCC Sun^WOracle cluster in Melbourne. It’s ranked at #113 with 49.5TF, which is pretty impressive, though I’m puzzled why its much bigger sibling at NCI/ANU in Canberra didn’t get a mention – perhaps they chose to just get it into production ASAP without faffing around with Linpack? Based on its estimated Rpeak and the efficiency of the BoM machine I reckon it would get an Rmax of about 128TF and would place at about #35.

But without the NCI machine Australia ranks behind such well-known HPC countries as Austria and Denmark, and well behind the likes of New Zealand and India!

VLSCI Mid Year Call for Applications from Victorian Life Science Researchers

A quick work-related blog post…

Today VLSCI announced its mid-year Call For Applications for use of the Peak Computing Facility at the University of Melbourne by life science researchers in Victoria. This includes time on our forthcoming IBM Blue Gene/P HPC system as well as the existing SGI Altix XE HPC cluster and a forthcoming IBM iDataPlex HPC cluster (both Intel Nehalem systems).

Pass it on!

Portable Hardware Locality (hwloc) Library v1.0 Released

One of the things that we HPC folks tend to get hot under the collar about is hardware locality – basically making sure that your memory accesses are as fast as possible by optimising where on the system your memory gets allocated, and making sure your process doesn’t get moved further away from it. Just binding your processes to the cores they are already on can make for a significant speed-up, so it’s well worth doing. If you’ve got a single socket, or a pre-Nehalem Intel x86 system, then your path to RAM is pretty much identical wherever you are, so the only benefit is from not moving away from your CPU cache lines; but on AMD Opteron, Nehalem, Itanic, Alpha, etc. you really should care a lot more about locality for best performance.
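To make that concrete, here is a minimal, Linux-specific sketch of pinning the current process to a single core with sched_setaffinity() – just an illustration, not anything a real batch system would ship; a real launcher would pick the core from the job’s allocation rather than hard-coding core 0:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    cpu_set_t mask;

    /* Build a CPU mask containing just core 0 (hard-coded for the example). */
    CPU_ZERO(&mask);
    CPU_SET(0, &mask);

    /* Pin the calling process (pid 0 == ourselves) to that mask. */
    if (sched_setaffinity(0, sizeof(mask), &mask) != 0) {
        perror("sched_setaffinity");
        return 1;
    }

    printf("pid %d is now pinned to core 0\n", (int)getpid());
    return 0;
}

Once pinned, the scheduler won’t migrate the process to another core, so it keeps its cache lines (and, on NUMA systems, stays close to the memory it has already touched).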

The open source Torque queuing system (which I help out with) does some of this already: if you compile it with --enable-cpuset and have the /dev/cpuset virtual filesystem mounted, then before it starts a job on a node it will create a cpuset for that job (based on which cores have been allocated on the node) and then put the HPC processes into that cpuset. If you’re using Open-MPI 1.4.x and have the environment variable OMPI_MCA_orte_process_binding set to core, then each of the MPI ranks will bind itself to one of the cores within that cpuset.
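For those who haven’t poked at /dev/cpuset before, it’s just directories and plain files. The following sketch is not Torque’s actual code – the job name, core list and memory node are made-up examples, and you’d need to be root – but it shows the basic dance of creating a cpuset, restricting it to the allocated cores and NUMA memory nodes, and moving a process into it:

#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Helper: write a single value into one of the cpuset control files. */
static void write_file(const char *path, const char *value)
{
    FILE *f = fopen(path, "w");
    if (f != NULL) {
        fputs(value, f);
        fclose(f);
    }
}

int main(void)
{
    char pid[32];

    /* Create a cpuset for the job (directory name is a made-up example). */
    mkdir("/dev/cpuset/torque-job-123", 0755);

    /* Restrict it to the cores and the NUMA memory node allocated to the job. */
    write_file("/dev/cpuset/torque-job-123/cpus", "0-3");
    write_file("/dev/cpuset/torque-job-123/mems", "0");

    /* Attach ourselves to the cpuset; any children we spawn inherit it. */
    snprintf(pid, sizeof(pid), "%d", (int)getpid());
    write_file("/dev/cpuset/torque-job-123/tasks", pid);

    return 0;
}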

All good? Well, not quite: Torque is reliant on /dev/cpuset being there and on being able to parse its contents, and Open-MPI 1.4.x uses the Portable Linux Processor Affinity (PLPA) library which, as its name suggests, is only for Linux. So the good Open-MPI people looked at their PLPA library and decided it needed extending, teamed up with the INRIA libtopology team who were working on how you discover the topology of various architectures, and merged the two projects together under the banner of the Portable Hardware Locality (hwloc) library.

The Portable Hardware Locality (hwloc) software package provides a portable abstraction (across OS, versions, architectures, …) of the hierarchical topology of modern architectures, including NUMA memory nodes, sockets, shared caches, cores and simultaneous multithreading. It also gathers various system attributes such as cache and memory information. It primarily aims at helping applications with gathering information about modern computing hardware so as to exploit it accordingly and efficiently.

The portable bit of the name comes from the fact that it works on Linux, Solaris, AIX, Darwin, FreeBSD, Tru64, HP-UX and Windows (though with limitations on some platforms – e.g. Windows – which don’t expose all the info it needs) and can be extended for other OSes if people feel they need to scratch that itch (OpenVMS anyone?). This release is also embeddable into other projects (such as Open-MPI 1.5) and I have an interest in Torque picking it up to improve and extend its cpuset support.
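To give a feel for what the library looks like from an application’s point of view, here is a minimal sketch written against the hwloc 1.x C API as I understand it (error checking omitted; binding to core 0 is purely illustrative – an MPI rank would pick a core based on its rank or its cpuset):

#include <hwloc.h>
#include <stdio.h>

int main(void)
{
    hwloc_topology_t topology;
    hwloc_obj_t core;
    int ncores;

    /* Discover the topology of the machine we are running on. */
    hwloc_topology_init(&topology);
    hwloc_topology_load(topology);

    ncores = hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_CORE);
    printf("Found %d cores\n", ncores);

    /* Bind the current process to the first core (illustrative only). */
    core = hwloc_get_obj_by_type(topology, HWLOC_OBJ_CORE, 0);
    if (core != NULL && hwloc_set_cpubind(topology, core->cpuset, 0) == 0)
        printf("Bound to core with OS index %u\n", core->os_index);

    hwloc_topology_destroy(topology);
    return 0;
}

The same code runs unchanged whether the OS underneath is Linux, Solaris or one of the others listed above, which is exactly the point of the library.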

Patents, MPEG-LA and Not-So-Professional Video Cameras

So you’ve bought a nice new professional video camera and you want to shoot a video of a friend’s band so they can sell a couple of copies to buy a new guitar – simple, eh? Well, not quite: you’ll probably want to check the license for the camera, according to this article by Eugenia Loli-Queru:

You see, there is something very important, that the vast majority of both consumers and video professionals don’t know: ALL modern video cameras and camcorders that shoot in h.264 or mpeg2, come with a license agreement that says that you can only use that camera to shoot video for “personal use and non-commercial” purposes (go on, read your manuals).

Now, you may ask, this can’t be right, can it? Surely a “professional” video camera should be able to be used for professional purposes? Well yes, it should, but it can’t. The reason is (of course) software patents, according to Eugenia:

Apparently, MPEG-LA makes it difficult for camera manufacturers, or video editor software houses, to obtain a cheap-enough license that allows their users to use their codec any way they want!

So the camera manufacturers pass that on to the purchaser: if you buy one and want to use it professionally then you will have to get your own license from MPEG-LA and then pay them a royalty on every copy sold. Sadly you can’t even get away from this by transcoding your MPEG2 or H.264 video into a free format, for two reasons: firstly, the camera most likely uses the patented codec internally first (and that’s apparently enough), and secondly, MPEG-LA claim their patent portfolio is so broad that you cannot create a video codec these days without infringing one of their patents. So theoretically you’d need to pay no matter what you did.

Eugenia does offer one possible way out, the ancient MJPEG format:

Let me make one thing clear. MJPEG **sucks** as a codec. It’s very old and inefficient. OGV Theora looks like alien technology compared to it. But (all, if not most of) its patents have expired. And JPEG is old enough to predate MPEG-LA. Thankfully, there are still some MJPEG HD cameras in the market, although they are getting fewer and fewer: Nikon’s dSLRs, Pentax’s new dSLRs, and the previous generation of Panasonic’s HD digicams. Other cameras that might be more acceptable to use codec-wise are the Panasonic HVX-200 (DVCPro HD codec, $6000), the SILICON IMAGING SI-2K (using the intermediate format Cineform to record, costs $12,000), and the RED One (using the R3D intermediate format, costs $16,000+). Almost every other HD camera in the market is unsuitable, if you want to be in the clear 100%

Yet another reason why software patents need to be defeated: they stifle what we can do with the technology we have paid for.