Comparing Fortran Compilers

I’ve been testing the Fortran 90 compilers on Tango, our AMD quad-core cluster, using some code that Joe Landman wrote as a test case in January 2008 and the same input file he used. The compilers I’m using are GCC 4.3.3, Intel 11.0.81 and PGI 8.0-3.
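For anyone wanting to repeat this sort of comparison, here is a minimal sketch of a timing harness in Python. It is not how the numbers below were gathered. The function name best_wall_time, the best-of-three policy and the stand-in command are all assumptions made for illustration; swap in the path to your own compiled test binary.

```python
import subprocess
import sys
import time


def best_wall_time(cmd, repeats=3):
    """Run cmd `repeats` times and return the fastest wall-clock time in seconds.

    Hypothetical helper: taking the best of several runs is an assumption,
    chosen here to damp noise from other activity on the node.
    """
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True raises CalledProcessError if the program exits non-zero
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return min(timings)


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; replace it with the
    # binary built by each compiler (the real binary name is not given here).
    demo_cmd = [sys.executable, "-c", "pass"]
    print(f"{best_wall_time(demo_cmd):.3f}s")
```

Wall-clock time includes process start-up, so for a test that only runs for a second or two it pays to keep the node otherwise idle.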

For the unoptimised (-O0) version I get:

GCC: 1.884s
Intel: 3.891s
PGI: 1.170s

For basic optimisation (-O) I get:

GCC: 1.617s
Intel: 3.515s
PGI: 0.954s

Cranking up the optimisation to -O2 makes essentially no difference:

GCC: 1.610s
Intel: 3.514s
PGI: 0.954s

Now we add compiler specific flags:

GCC (-march=amdfam10 -O3): 0.956s
Intel (-fast): 3.507s
PGI (-fast -tp shanghai-64): 0.997s
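To put those numbers in perspective, here is a small Python sketch that works out how much each compiler gains going from -O0 to its compiler-specific flags. The timings are copied from the results above; the variable names are just for illustration.

```python
# Wall-clock times in seconds, copied from the results above.
unoptimised = {"GCC": 1.884, "Intel": 3.891, "PGI": 1.170}  # -O0
tuned = {"GCC": 0.956, "Intel": 3.507, "PGI": 0.997}        # compiler-specific flags

# Speedup factor: how many times faster the tuned build is than the -O0 build.
speedup = {name: unoptimised[name] / tuned[name] for name in unoptimised}

for name, factor in speedup.items():
    print(f"{name}: {factor:.2f}x faster than -O0")
```

GCC comes out at roughly 2x, while Intel and PGI only manage about 1.1x and 1.2x respectively. Note that PGI's -fast run (0.997s) is actually slightly slower than its plain -O2 run (0.954s).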

That got me wondering which had the greater impact, -O3 or -march=amdfam10, and the result was surprising:

GCC (-O3): 1.611s
GCC (-march=amdfam10 -O0): 1.238s

So that’s pretty conclusive: just targeting the AMD family 10h (K10) CPUs (i.e. Barcelona/Shanghai processors) with no optimisation gives a better speedup than the highest level of optimisation on its own! Of course it’s better with both, as you can see from the previous set of results.
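The arithmetic behind that comparison can be sketched in a few lines of Python, using the GCC timings already quoted and the -O0 run as the baseline. The names baseline, gcc_runs and reduction are made up for this example.

```python
baseline = 1.884  # GCC -O0, in seconds

# GCC timings in seconds for each flag combination, copied from the results above.
gcc_runs = {
    "-O3": 1.611,
    "-march=amdfam10 -O0": 1.238,
    "-march=amdfam10 -O3": 0.956,
}

# Percentage of the baseline run time saved by each flag combination.
reduction = {
    flags: 100 * (baseline - seconds) / baseline
    for flags, seconds in gcc_runs.items()
}

for flags, percent in reduction.items():
    print(f"{flags}: {percent:.1f}% less run time than -O0")
```

On these figures -O3 alone trims about 14% off the run time, -march=amdfam10 alone about 34%, and the two together about 49%.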

I’m *really* impressed by GCC’s performance there, as well as the PGI unoptimised speed, and disappointed by the Intel compiler’s general lack of performance. I suspect Intel’s answer would be (not unreasonably) that they don’t necessarily target performance for their competitors’ CPUs.

The Musings of Chris Samuel
