Last August PGI announced an update to its “PGI OpenCL Compiler for ARM” (PGCL 12.7), but if you go looking for that on the PGI news page you won’t find it. In fact, if you go to their products page and follow the link for the “PGI OpenCL Compiler for ARM”, you’ll find it’s gone too.
For the record that part of the products page currently looks like:
PGI Compilers and Tools for Mobile and Embedded Platforms
PGI OpenCL Compiler for ARM
PGCL™ is an OpenCL™ framework for compiling and running OpenCL 1.1 embedded profile
applications on the ST-Ericsson NovaThor™ U8500 and follow-on platforms using a single
ARM core as the OpenCL host and multiple ARM cores as an OpenCL computing device
The interesting thing is that this happened recently. Google’s cache of the news page (dated July 25th 2013) still has the announcement listed:
The Portland Group Updates its OpenCL Compiler for Multi-core ARM
August 21, 2012
Latest PGCL includes automatic generation of NEON/SIMD instructions
The Portland Group® (PGI), a wholly-owned subsidiary of STMicroelectronics and the leading
independent supplier of compilers and tools for high-performance computing, today announced
the release of PGCL 12.7. PGCL™ is the PGI OpenCL framework for multi-core ARM-based
Systems-on-Chips (SoCs), currently available on ST-Ericsson NovaThor™ platforms. PGCL includes
a PGI OpenCL compiler for multi-core ARM CPUs as a compute device and complements OpenCL
So something changed in the last week, oddly around the same time that nVidia announced it was buying PGI.
The topic for the December 2012 Equal Writes group at Belgrave was “The Earth Moved”, for which I wrote this poem:
The Earth Moved
Short, brief shakes of quakes
Bring people onto the streets
Waves to the shore
Fear, panic, sadness
Slow, imperceptible movement
Plates grind against each other
One pushing the other down
To be recycled
Magma from melted rock reborn
pushes up through the crust
Building new land
Elsewhere two plates collide head on
Neither able to give way
The pressure becomes too much
Land buckles into mountains
Rain falls on far hills
Each drop excavating tiny holes
On rock that seems immutable
Wearing down over ages
In a few billion years
The Sun will grow large and red
The Earth will boil away
and move out into the Universe
Our small pale blue dot gone
and all it ever contained
Star stuff returning to space
eventually, maybe, to be reborn
If you’ve been using SpamAssassin and have been reporting to SpamCop then you’ll have found overnight that you got a heap of bounces back saying things like:
<firstname.lastname@example.org> (expanded from
<email@example.com>): unknown user: "devnull"
It turns out that the firstname.lastname@example.org address appears to be something that the SpamAssassin developers set without consulting SpamCop, and SpamCop have just been blackholing those reports for an unknown amount of time. Last night it went away, so now IronPort are rejecting them, which is how I learnt of this. I’m not impressed by what the SA developers did here; it should have required you to put in a registered SpamCop address and not reported if that wasn’t set.
I’ve disabled my SpamCop reporting by commenting out the line that loads the SpamCop plugin in /etc/mail/spamassassin/v310.pre on my Debian mailserver.
If you use SpamAssassin and don’t have a registered SpamCop account you’ll want to do the same.
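The exact line is missing from this copy of the post. As a rough sketch, assuming it was the stock `loadplugin Mail::SpamAssassin::Plugin::SpamCop` entry, something like the following would comment it out. The `disable_spamcop` helper name is made up for illustration, and the path is the one mentioned above.

```shell
# Hypothetical helper: comment out the SpamCop plugin line in a SpamAssassin .pre file.
# Assumption: the line starts with "loadplugin Mail::SpamAssassin::Plugin::SpamCop".
disable_spamcop() {
    # $1 is the path to the .pre file; sed keeps a backup copy as "$1.bak".
    sed -i.bak 's/^[[:space:]]*loadplugin[[:space:]]\{1,\}Mail::SpamAssassin::Plugin::SpamCop/#&/' "$1"
}

# Usage (as root), then restart SpamAssassin so the change takes effect:
#   disable_spamcop /etc/mail/spamassassin/v310.pre
```

Lines that are already commented out are left alone, since the pattern only matches an active `loadplugin` entry.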
It looks like our kitten (who is rapidly turning into a cat) has discovered that our clothes airer is perfect for practising gymnastics on.
I wonder how long it will last!
It’d been a while since I’d last told Digikam to scan my collection for faces, and having just upgraded to 3.2.0 I thought it was about time to have another shot at it. However, I’d noticed it was taking an awful long time and seemed to only be using one of the eight cores on this system (Ivy Bridge i7-3770K running Kubuntu 13.04) so I thought I’d see if simply taking advantage of OpenMP could improve things with multithreading.
To do that I just started a new konsole and (as a first step) told OpenMP to use all the cores with:
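The command itself did not survive in this copy. Presumably it set the standard `OMP_NUM_THREADS` environment variable; a minimal sketch, assuming one thread per online logical CPU (8 on this machine), would be:

```shell
# Assumption: one OpenMP thread per online logical CPU, as reported by getconf.
OMP_NUM_THREADS="$(getconf _NPROCESSORS_ONLN)"
export OMP_NUM_THREADS
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
# Anything started from this shell afterwards (e.g. digikam) inherits the setting.
```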
Running digikam from that session and starting a face scan showed that yes, it was using all 8 cores, but not really to a great amount. Running iotop showed it doing about 5MB/s in reads and latencytop showed that it was spending most of its time in fsync(). Now that’s good, because it’s making sure that the data has really hit the rust to ensure everything is consistent.
However, in this case I can rebuild the entire face database should I need to, and I have about 66GB of photos to scan, plus I wanted to see just how fast this could go. 😉 So now it’s time to get a little dangerous and try Stewart Smith’s wonderfully named “libeatmydata”, which gives you a library (surprise surprise) and a helper program that let you preload an fsync() function that really only does return(0); (which, you may be interested to know, is still POSIX compliant).
So to test that out I just needed to do:
and suddenly I had 8 cores running flat out. iotop showed that Digikam was now doing about 25-30MB/s reads and latencytop showed most of its time waiting for things was now for user space lock contention, i.e. locks protecting shared data structures to stop threads from stomping on each other and going off into the weeds. Interestingly the disks are a lot quieter than before too. Oh, and it’s screaming through the photos now. 🙂
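The command that followed “I just needed to do:” is missing from this copy. It was most likely just the `eatmydata` helper run in front of `digikam`. A hedged sketch of that, wrapped in a made-up `run_unsafe` function that falls back to a normal run when the helper is not installed:

```shell
# Hypothetical wrapper: run a command under libeatmydata's helper if it is available.
# With eatmydata, fsync() and friends become no-ops for that process only.
run_unsafe() {
    if command -v eatmydata >/dev/null 2>&1; then
        eatmydata "$@"
    else
        echo "eatmydata not found; running $1 with normal fsync()" >&2
        "$@"
    fi
}

# Usage, from the session where OMP_NUM_THREADS was set:
#   run_unsafe digikam
```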
WARNING: Do not use eatmydata for anything you care about; it will do just what it says in the name should your power die, system hang, universe end, etc.