Research Challenges in Astronomy

I’ve been at the first APAC All Hands Meeting this week, generally hearing what all the other people in the APAC Grid project are up to and meeting folks from around the country that I only otherwise get to see via Access Grid.

Today was the turn of some of the science areas to tell us what they are up to and what they see their big challenges being, and the scariest (from an HPC perspective) was the session on Astronomy and Astrophysics by Peter Quinn (formerly of the ESO and now a Premier’s Fellow at UWA).

The most intimidating points I picked up from his presentation were:

  • Data explosion – the doubling time (T2) has been < 12 months; with new big survey projects such as VST and VISTA it will become T2 < 6 months!
  • Disk technology T2 is 10 years at present (according to Peter), and slowing.
  • The Large Synoptic Survey Telescope is reckoned to be capable of producing 7 PetaBytes per annum of data.
  • The ESO’s data archive is currently (2006) 100TB in 10 racks and using 70kW of power. By 2012 it is forecast to be 10PB in 100 racks and consuming 1MW of electricity.
  • A recent Epoch of Reionisation simulation of 5,000³ particles on a 1,000 CPU Opteron cluster used 2 months of CPU time and 10TB physical RAM (about 10GB per core) and produced about 100TB of output data.
  • Catalogue sizes are exploding: in 2000 there were about 100,000 galaxies in a catalogue; by 2010 that will be 1 billion.
  • Algorithms are not scaling with these data sizes – an algorithm that took 1 second in 2000 will take 3 years in 2010!
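
As a rough sanity check on a couple of those figures, here is a small Python sketch. It rests on my own assumptions rather than anything stated in the talk: that the archive grows as a steady exponential, and that the algorithm in question scales quadratically with catalogue size on unchanged hardware. The function names are made up for illustration.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def doubling_time_years(start, end, years):
    """Doubling time implied by growing from start to end over `years`,
    assuming steady exponential growth."""
    return years / math.log2(end / start)


def scaled_runtime_seconds(base_seconds, n_old, n_new, exponent=2):
    """Runtime at n_new items for an algorithm costing O(N**exponent),
    given it took base_seconds at n_old items (hardware held fixed)."""
    return base_seconds * (n_new / n_old) ** exponent


# ESO archive: 100 TB (2006) -> 10 PB = 10,000 TB (2012 forecast)
archive_t2 = doubling_time_years(100, 10_000, 6)
print(f"Archive doubling time: {archive_t2 * 12:.1f} months")

# Catalogue: 100,000 galaxies (2000) -> 1 billion (2010), 1 s baseline
runtime = scaled_runtime_seconds(1.0, 1e5, 1e9)
print(f"Quadratic algorithm: {runtime / SECONDS_PER_YEAR:.2f} years")
```

Under those assumptions the archive doubles roughly every 11 months, which fits the T2 < 12 months claim, and a quadratic algorithm comes out at a little over 3 years, which matches the figure quoted. That doesn't prove the algorithm Peter had in mind is quadratic, only that the numbers are consistent with one.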

But these problems pale into insignificance when you consider the massive Square Kilometre Array (SKA) radio telescope, which is forecast to produce 100 ExaBytes (that’s one hundred million TeraBytes) of data annually!

This raises a number of very fundamental issues:

  • The terabit speed network technologies needed to get the data off the detectors do not exist (yet).
  • There is no storage technology to cope with the volumes of data.
  • This means they will need to process the data on the fly in a highly parallel manner.
  • This is a radio telescope, so there is no time when it cannot take data, unlike an optical ‘scope. This means you cannot somehow buffer the night time data and then process it during the day.
  • If the ESO estimate of 1 megawatt of power for 7 PB is correct, and assuming that power per PB stays roughly constant and they do store all 100 EB of data, then the storage of one year’s data will need about 14GW of generating capacity.
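
The back-of-envelope arithmetic behind those last points can be sketched in a few lines of Python. This assumes decimal units (1 EB = 1,000 PB = 10^18 bytes), a perfectly even data rate across the year and a fixed PB-per-megawatt ratio; the function names are hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
PB_PER_EB = 1000  # decimal units assumed throughout


def sustained_rate_tbit_per_s(exabytes_per_year):
    """Average data rate in terabits per second for a yearly volume."""
    bytes_per_year = exabytes_per_year * 1e18
    return bytes_per_year * 8 / SECONDS_PER_YEAR / 1e12


def storage_power_gw(exabytes, pb_per_megawatt):
    """Power in GW to keep `exabytes` online at a fixed PB-per-MW ratio."""
    megawatts = exabytes * PB_PER_EB / pb_per_megawatt
    return megawatts / 1000


ska_eb_per_year = 100
print(f"Average rate: {sustained_rate_tbit_per_s(ska_eb_per_year):.1f} Tbit/s")
print(f"Power at 7 PB/MW:  {storage_power_gw(ska_eb_per_year, 7):.1f} GW")
print(f"Power at 10 PB/MW: {storage_power_gw(ska_eb_per_year, 10):.1f} GW")
```

That works out at roughly 25 Tbit/s sustained, and about 14 GW on the 7 PB per megawatt figure. If you instead take the 10 PB per megawatt from the ESO 2012 forecast listed earlier, it comes to 10 GW. Either way it is a lot of power stations.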

Fortunately construction of the SKA isn’t due to start until 2013, so we’ve got a bit of time to solve all these.. 🙂

Google Co-Op – Annotating The Web

Looks like Google is working on a new service to allow users to add labels to topics that they (hopefully) know something about. The idea is that other people then subscribe to your labels if they feel you are accurate, and that in turn influences their search results. Sort of like routing by rumour protocols in computer networks.

So their intention is to get around the fact that webmasters don’t yet put explicit semantic markup in their pages, by exploiting the fact that it’s much easier to get people who know about a topic to annotate existing pages through a third party site, which (many) others can then use in their normal searches.

I guess the first thing that springs to mind for me is “what an opportunity for guerrilla marketing” – PR companies subscribe as “ordinary people”, but skew their recommendations towards the people paying them. If that sounds far-fetched then don’t forget that techniques like this have been around for over 2 decades – consider it the marketeers’ version of computer security’s “social engineering”.

Twinings Tea – 300 Years Old This Year

I was making myself a nice cup of Twinings Irish Breakfast when I noticed on the side of the packet that they were founded in 1706 and have been at the same address in The Strand ever since. It also said that they had been “making tea for over 290 years”, so obviously the packaging predated 2006. 🙂

They have a (Flash based) history of Twinings and Tea and, for those who can’t stand or read Flash, there is an HTML history on their US site.

Apparently they had the Governor of Boston as a customer in 1773, though they claim that a writer of the time (unattributed unfortunately) recorded:

“…it was not Twinings tea the Boston rebels tossed into the sea.”

Obviously they too agreed that they do make a rather nice brew. 🙂

At one stage they even had their own bank, though they eventually amalgamated with Lloyds bank in 1892.

Bin Laden Dead ?

The (Australian) ABC is publishing a Reuters story quoting a report in L’Est Republicain that the DGSE (the French intelligence agency) has briefed the French President and PM on Saudi Arabia’s belief that Bin Laden died of typhoid in August.

“The information gathered by the Saudis indicates that the head of Al Qaeda was a victim while he was in Pakistan on August 23, 2006, of a very serious case of typhoid, which led to a partial paralysis of his internal organs,” the document said.

The original article is in French, and the Google translation of the full paragraph says:

“According to a usually reliable source, the Saoudi services from now on would have acquired the conviction that Usama Bin Laden died. The elements collected by the Saoudis indicate that the chief of Al-Qaïda would have been victim, whereas it was in Pakistan on August 23, 2006, of a very strong crisis of typhoid having involved a paralysis partial of his lower limbs. Its geographical insulation, caused by a permanent escape, would have made impossible any medical care. On September 4, 2006, the Saoudi services of safety collected the first information making state of its death. They would wait, to obtain more details, and in particular the exact place of its burial, to announce the news officially”.

If it is true (and remember this is a report of a report of a report of a suspicion) then whilst being a significant moment it is unlikely to materially change the situation anywhere in the world. These groups tend to operate as lots of independent small cells with no central leadership, infrastructure or coordination and so whilst this might mean something philosophically to them it is unlikely to cause them any operational problems. 🙁

Update: The ABC is reporting that the Saudis are denying the basis of the French DGSE report, saying:

“The Kingdom of Saudi Arabia has no evidence to support recent media reports that Osama bin Laden is dead,” the Saudi Embassy in the US said.

California Sues Car Companies & Exxon Secrets

Before I turn in for the night – the State of California has launched lawsuits against 6 car companies (GM, Toyota, Ford, Honda, Chrysler & Nissan) under the Federal Common Law of Public Nuisance. The complaint contains this rather enlightening quote:

Defendants’ motor vehicle emissions in the United States account for approximately nine percent of the world’s carbon dioxide emissions

I don’t suppose I should be amazed by that, but it’s still a staggering statement – vehicle use in the US alone accounts for ~ 9% of global CO2 output.

You can read the actual lawsuit (PDF) mirrored at The Age. Thanks to my lovely wife for forwarding an email about the suit on to me..

On a related note, a friend and colleague (also called Chris) sent me a link to a site called Exxon Secrets, where you can find out about the web of anti-climate-change organisations that get funding from Exxon and how they are connected.

First Experiences with KUbuntu Edgy Eft

Now got the current development release of KUbuntu (codenamed Edgy Eft) running on 3 machines: my desktop here at home, my own laptop and a work laptop. All seem to be working just fine and for the first time my home laptop has working 3D with its ATI mobile graphics chipset – I can even run Google Earth on it!

My desktop is now rid of the annoying rendering bugs of Google Earth on its ATI Radeon 9250 PRO and generally seems to be a better experience. No real change in boot time though: Bootchart still shows a boot time of about 38 seconds on this 2.6GHz P4 with a pair of software RAID-1 SATA drives. My guess is that I can probably shave a few seconds off that by telling it to boot without the splashscreen..

The only problem I’ve hit was on the work laptop with an Intel graphics chipset: KDM won’t start unless I disable the “splash” option, which is a bit odd.

Update: Yup, taking out the “splash” boot option on my home desktop box took me down to 35 seconds according to bootchart.

