Virtual Institute — High Productivity Supercomputing

Performance Analysis of Parallel Applications


March 14-16, 2012


National Laboratory for HPC
Center for Mathematical Modeling
Universidad de Chile

Santiago, Chile

Organizing Institutions

This course was held within the framework of the ALECHILE exchange program between the Technical University of Munich (TUM, Munich, Germany) and the Universidad de La Frontera (UFRO, Temuco, Chile).



Photographs from the workshop with 16 participants: poster, presenters, exercise classwork.



While today's supercomputers offer unprecedented levels of hardware performance, using them in a productive manner remains a major challenge. To write correct and efficient code, application developers typically have to be both experts in their specific field of science to find novel approaches to the problem they want to solve and computer scientists to understand and exploit the intricacies of the system for which their code is being designed. Moreover, access to parallelism is mostly offered via low-level interfaces that are hard to learn and whose performance behavior is hard to predict. Whereas in the business world much of the complexity of application development is hidden behind advanced programming frameworks, stagnating progress in programming techniques for high-performance computing is limiting developer productivity and, thus, often delaying scientific results.

With this course, we want to enable all participants to improve the quality and accelerate the development process of complex simulation programs in science and engineering that are being designed for the most advanced parallel computer systems. For this purpose, we are presenting several state-of-the-art tools for high-performance computing that assist domain scientists in analyzing and optimizing the performance of their applications. In these efforts, we place special emphasis on scalability and ease of use.


The course concept consists of three building blocks: first, talks on the tools themselves; second, hands-on training exercises using typical scientific benchmark codes; and third, most importantly, the opportunity for participants to bring along their own codes and analyse them with the presented tools. The course covers the following tools (in alphabetical order): PAPI, Periscope, Scalasca and Vampir.

The target audience is both scientific personnel supervising and administering HPC computers, and researchers and students developing parallel applications.

The course covers both serial single-core/node performance issues and communication/synchronization performance on highly parallel systems. While the tools will be available on the Levque NLHPC cluster, we encourage participants to bring along their notebook computers too. The course material comprises not only exercises with benchmark codes for getting familiar with the tools but also an x86-compatible Linux Live-DVD with both the commercial and free tools, so that each participant can try out the software before switching to the cluster.


Day 1 Wednesday 14th March
08:30 (registration & set-up of notebook computers)
09:00 Welcome and Introduction to VI-HPS course [Ávila, Gerndt, Wylie]
09:30 Introduction to parallel application engineering [Gerndt]
10:00 Introduction to parallel performance analysis [Oleynik]
11:00 Building and running the NPB-MZ-MPI-BT example OpenMP+MPI code
11:45 (lunch)
14:30 Periscope introduction & overview [Gerndt, Oleynik]
  • Periscope hands-on tutorial exercises
15:45 (break)
16:00 Scalasca introduction & overview [Wylie]
  • Scalasca hands-on tutorial exercises
17:15 (adjourn)

Day 2 Thursday 15th March
09:00 Vampir introduction & overview [Petkov]
10:30 PAPI introduction & overview [Ávila]
12:00 (lunch)
13:30 Hands-on coaching with participants' codes [all]
17:00 (adjourn)

Day 3 Friday 16th March
09:00 Tools presentations & coaching:
  • Tools presentations by representatives of IBM, Intel & SGI (OmegaSystem)
  • Alternatively, each participant can sign up for a coaching slot with a tool specialist to apply a specific tool to the participant's own code, interpret the performance analyses and discuss potential optimization opportunities.
12:00 (adjourn)

Due to classroom size and resource constraints, the number of participants was limited, and priority was given to those bringing running code to analyze on Levque.


Andrés Ávila Barrera (UFRO), phone +56 (45) 325921
Yury Oleynik (TUM), phone +49 (89) 289-17680