
Jul 24, 2008

ISC08: Back to complex programming?

Supercomputing Europe in Dresden, Germany (ISC08) is over and brought its share of announcements in High Performance Computing. This year, two major symbolic milestones were announced: the first machine to exceed one PFLOPS (10^15 floating-point operations per second), the IBM Roadrunner, and the first mainstream chips to reach one TFLOPS (10^12 floating-point operations per second), the GPUs from nVidia and AMD/ATI.

Well, symbols are nothing more than abstract concepts, but these ones remind us that, as the computing world evolves faster and faster, both applications and their programmers need to evolve to exploit these new sources of efficiency.

Compared with the last Supercomputing show in Reno, there were fewer application and tool vendors at the exhibition. However, major hardware vendors and many research institutes attended.

The pre-conference and the Exhibitor Forum were the occasion for interesting technological presentations and announcements. From the hot-chip point of view, here's what we've noticed:

  • a special session on the IBM Roadrunner, which broke the PFLOPS barrier by using both AMD Opteron and Cell CBE processors;
  • an AMD presentation on its new 45-nm 4-core Shanghai for 2008 and the upcoming 6-core Istanbul for 2009. By the way, these chips will come out of the fab right there in Dresden;
  • an Intel presentation on its enterprise-class processor lines:
    • the Itanium 9000;
    • the Xeon MP 7000 and 5000, respectively the expandable and the energy-efficient versions;
    • the Nehalem, their second-generation 4-core processor with 2-way SMT, due at the end of 2008;
  • an AMD/ATI presentation on the FireStream 9250, reaching the TFLOPS at 8 GFLOPS/W; many programming environments are available (Brook+, ACML, RapidMind, CAL, OpenCL, HMPP...);
  • an nVidia presentation on the 240-core T10P with 1.4 billion transistors, also reaching the TFLOPS, at around 6.3 GFLOPS/W.
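
To get a feel for what the GFLOPS/W figures quoted for the two GPUs mean, here is a small back-of-the-envelope sketch. It assumes that "reaching the TFLOPS" means a peak of exactly 1000 GFLOPS, which is our simplification and not a vendor figure; the function name `implied_power_watts` is purely illustrative.

```python
def implied_power_watts(peak_gflops, gflops_per_watt):
    """Power draw implied by a peak rate and an energy-efficiency figure."""
    return peak_gflops / gflops_per_watt

# Assumption: both chips are taken at exactly 1 TFLOPS = 1000 GFLOPS peak.
ASSUMED_PEAK_GFLOPS = 1000.0

# Efficiency figures as quoted in the presentations listed above.
firestream_watts = implied_power_watts(ASSUMED_PEAK_GFLOPS, 8.0)
t10p_watts = implied_power_watts(ASSUMED_PEAK_GFLOPS, 6.3)

print(f"FireStream 9250: about {firestream_watts:.0f} W")
print(f"T10P:            about {t10p_watts:.0f} W")
```

Under that assumption, both boards land in the range of one to two hundred watts for a peak TFLOPS. The real peak rates differ somewhat from the round number used here, so these are orders of magnitude only.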

Hardware is only one (big) part of HPC, and many presentations focused on how to program such "beasts".

Parallelism has been announced as the inescapable path to higher performance for more than 50 years (« A Computer Oriented Toward Spatial Problems », S. H. Unger, 1958 for a SIMD computer, or the « Gamma 60 », Bull, 1957 for an SMT computer). However, up to now, hardware architects have been kind enough to squeeze Moore's law and keep the classical Von Neumann & Co fast enough for most people.

But now, Moore's law no longer means faster processors but rather more cores at the same speed... So programs must be parallelized to run faster.

As usual, Thomas Sterling from Louisiana State University gave an interesting presentation, with a humorous slide to sum up the situation:

  Core Trek (the next generation?)
  In Exascale Computing: Space is the "final frontier"
  To Boldly Code, where no thread has gone before...

So, what can we imagine for the future of programming?

One option is to express more and more things by hand, using old programming languages. This requires programmers to be experts in deep computer architecture, whereas they are often experts in their application domains, which are themselves more and more complex. Therefore, we will need teams of experts in both domains: application and HPC (!)...
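
As a rough illustration of what "by hand" means, here is a minimal Python sketch of a trivial computation (a sum of squares) parallelized explicitly. The names `partial_sum_of_squares` and `sum_of_squares_by_hand` are hypothetical, and the work split is just one simple choice. Note that in standard CPython, threads do not actually speed up pure-Python arithmetic; the point is the amount of bookkeeping the programmer has to write.

```python
import threading

def partial_sum_of_squares(start, stop, results, slot):
    """Worker: sum i*i for i in [start, stop) and store it in its own slot."""
    total = 0
    for i in range(start, stop):
        total += i * i
    results[slot] = total  # each worker writes to a distinct slot, so no lock

def sum_of_squares_by_hand(n, num_workers=4):
    """Sum i*i for i in [0, n), with the work split explicitly across threads."""
    # The programmer chooses the decomposition: contiguous chunks here.
    chunk = (n + num_workers - 1) // num_workers  # ceiling division
    results = [0] * num_workers
    threads = []
    for w in range(num_workers):
        start = min(w * chunk, n)
        stop = min(start + chunk, n)
        t = threading.Thread(
            target=partial_sum_of_squares, args=(start, stop, results, w)
        )
        threads.append(t)
        t.start()
    # The programmer also handles synchronization and the final reduction.
    for t in threads:
        t.join()
    return sum(results)

print(sum_of_squares_by_hand(10))
```

Chunk boundaries, result storage, thread start-up and the final join are all the programmer's responsibility here, and each is a place where a bug or a performance problem can hide.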

Another solution would be to rewrite applications in higher-level languages that hide the hardware complexity from the programmer while exposing more parallelism. Of course, this generally requires a major, and costly, reengineering of the original code.
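
To give an idea of the higher-level style, here is a minimal Python sketch in which the program only states what to compute for each element and how to combine the results, and leaves scheduling to whatever executor is passed in. The function names are hypothetical, and Python's standard `concurrent.futures` is used only as a stand-in for a genuinely parallel language or runtime.

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):
    """Per-element work, independent of every other element."""
    return x * x

def sum_of_squares(values, executor=None):
    """Map `square` over `values`, then reduce with a sum.

    No threads, chunks or locks appear here: with an executor the map
    may run concurrently, without one it runs sequentially.
    """
    mapper = executor.map if executor is not None else map
    return sum(mapper(square, values))

# Sequential run.
print(sum_of_squares(range(10)))

# Same code, with scheduling delegated to a pool of workers.
with ThreadPoolExecutor(max_workers=4) as pool:
    print(sum_of_squares(range(10), executor=pool))
```

The application code is the same in both runs; only the execution resource changes. That separation is what makes this approach attractive, and also why existing code usually has to be restructured to benefit from it.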

Last possibility: go on programming in the classical way... and wait for other solutions to appear. We may have some ideas here, at HPC Project...