Research Challenges in Astronomy

I’ve been at the first APAC All Hands Meeting this week, generally hearing what all the other people in the APAC Grid project are up to and meeting folks from around the country that I only otherwise get to see via Access Grid.

Today was the turn of some of the science areas to tell us what they are up to and what they see as their big challenges, and the scariest (from an HPC perspective) was the session on Astronomy and Astrophysics by Peter Quinn (formerly of the ESO and now a Premier’s Fellow at UWA).

The most intimidating points I picked up from his presentation were:

  • Data explosion – the data doubling time T2 is currently < 12 months, and new big survey projects such as VST and VISTA will push that to T2 < 6 months!
  • Disk technology T2 is 10 years at present (according to Peter), and slowing.
  • The Large Synoptic Survey Telescope is reckoned to be capable of producing 7 PetaBytes of data per annum.
  • The ESO’s data archive is currently (2006) 100TB in 10 racks and using 70kW of power. By 2012 it is forecast to be 10PB in 100 racks and consuming 1MW of electricity.
  • A recent Epoch of Reionisation simulation of 5,000³ particles on a 1,000 CPU Opteron cluster used 2 months of CPU time and 10TB of physical RAM (about 10GB per CPU) and produced about 100TB of output data.
  • Catalogue sizes are exploding: in 2000 a catalogue held about 100,000 galaxies; by 2010 that will be 1 billion.
  • Algorithms are not scaling with these data sizes – an algorithm that took 1 second in 2000 will take 3 years in 2010! (That’s what an O(N²) algorithm does when the catalogue grows 10,000-fold: a 10⁸-fold slowdown, and 10⁸ seconds is about 3 years – see the sketch after this list.)
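
To make those last points concrete, here’s a minimal back-of-the-envelope sketch in Python. The doubling times and the O(N²) assumption are taken from the bullet points above, not from Peter’s slides, so treat it as illustrative arithmetic rather than anything authoritative:

```python
# Back-of-the-envelope check of the growth figures above.
# Assumes simple exponential growth and a quadratic algorithm --
# both are illustrative assumptions, not figures from the talk itself.

def growth_factor(years: float, doubling_time: float) -> float:
    """How much a quantity grows over `years`, given its doubling time."""
    return 2 ** (years / doubling_time)

# Survey data doubling every 6 months versus disks doubling every 10 years:
data_growth = growth_factor(10, 0.5)    # 2**20, about a million-fold per decade
disk_growth = growth_factor(10, 10.0)   # 2**1, a mere doubling per decade
print(f"Over a decade: data x{data_growth:,.0f}, disk x{disk_growth:.0f}")

# Catalogue growth: ~100,000 galaxies in 2000 to ~1 billion in 2010.
# If the algorithm is O(N^2), run time scales with the square of that ratio:
n_2000, n_2010 = 1e5, 1e9
slowdown = (n_2010 / n_2000) ** 2        # 10**8
runtime_s = 1.0 * slowdown               # 1 second in 2000 becomes...
years = runtime_s / (365.25 * 24 * 3600)
print(f"...{runtime_s:.0e} seconds, i.e. about {years:.1f} years, in 2010")
```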

But these problems pale into insignificance when you consider the massive Square Kilometre Array (SKA) radio telescope – it is forecast to produce 100 ExaBytes (that’s one hundred million TeraBytes) of data annually!

This raises a number of very fundamental issues:

  • The terabit-speed network technologies needed to get the data off the detectors do not exist (yet).
  • There is no storage technology to cope with the volumes of data.
  • This means they will need to process the data on the fly in a highly parallel manner.
  • This is a radio telescope, so there is no time when it cannot take data, unlike an optical ‘scope. This means you cannot somehow buffer the night time data and then process it during the day.
  • If the ESO estimate of 1 megawatt of power per 7PB is correct, and assuming that power per PB stays roughly constant, then if they do store all 100EB of data the storage of one year’s data will need about 14GW of generating capacity (see the quick sum below).
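
For the curious, the quick sum behind that last figure – my arithmetic, assuming the 1MW-per-7PB ratio quoted above simply scales linearly with data volume:

```python
# Rough sanity check of the SKA storage-power estimate.
ska_year_pb = 100_000      # 100 ExaBytes expressed in PetaBytes
mw_per_pb = 1.0 / 7.0      # the 1 MW per 7 PB ratio from the bullet above
power_gw = ska_year_pb * mw_per_pb / 1000
print(f"Storing one year of SKA data: roughly {power_gw:.0f} GW")  # ~14 GW
```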

Fortunately construction of the SKA isn’t due to start until 2013, so we’ve got a bit of time to solve all these… 🙂