Performance Analysis of Parallel Applications

Date

March 14-16, 2012

Location

National Laboratory for HPC
Center for Mathematical Modeling
Universidad de Chile
Santiago, Chile

Organizing Institutions

This course was held within the framework of the ALECHILE exchange program between the Technical University of Munich (TUM, Munich, Germany) and the Universidad de La Frontera (UFRO, Temuco, Chile).

Impressions

Photographs from workshop with 16 participants

Overview

While today's supercomputers offer unprecedented levels of hardware performance, using them productively remains a major challenge. To write correct and efficient code, application developers typically have to be both experts in their specific field of science, to find novel approaches to the problem they want to solve, and computer scientists, to understand and exploit the intricacies of the system for which their code is being designed. Moreover, access to parallelism is mostly offered via low-level interfaces that are hard to learn and whose performance behavior is hard to predict. Whereas in the business world much of the complexity of application development is hidden behind advanced programming frameworks, stagnating progress in programming techniques for high-performance computing is limiting developer productivity and thus often delaying scientific results.

With this course, we want to enable all participants to improve the quality and accelerate the development process of complex simulation programs in science and engineering that are being designed for the most advanced parallel computer systems. For this purpose, we are presenting several state-of-the-art tools for high-performance computing that assist domain scientists in analyzing and optimizing the performance of their applications. In these efforts, we place special emphasis on scalability and ease of use.

Concept

The course concept consists of three building blocks: first, talks on the tools themselves; second, hands-on training exercises using typical scientific benchmark codes; and third, most importantly, work on participants' own codes, which we invite them to bring along and analyze with the presented tools. The course covers the following tools (in alphabetical order): PAPI, Periscope, Scalasca and Vampir.

The target audience comprises both scientific personnel who supervise and administer HPC systems, and researchers and students who develop parallel applications.

The course covers both serial single-core/node performance issues and communication/synchronization performance on highly parallel systems. While the tools will be available on the Levque NLHPC cluster, we encourage participants to bring along their notebook computers too. The course material comprises not only exercises with benchmark codes for getting familiar with the tools but also an x86-compatible Linux Live-DVD with both the commercial and free tools, so that each participant can try out the software before switching to the cluster.

Schedule

Day 1	*Wednesday 14th March*
08:30	(registration & set-up of notebook computers)
09:00	Welcome and Introduction to VI-HPS course [Ávila, Gerndt, Wylie]
09:30	Introduction to parallel application engineering [Gerndt]
10:00	Introduction to parallel performance analysis [Oleynik]
11:00	Building and running the NPB-MZ-MPI-BT example OpenMP+MPI code
11:45	(lunch)
14:30	Periscope introduction & overview [Gerndt, Oleynik]; Periscope hands-on tutorial exercises
15:45	(break)
16:00	Scalasca introduction & overview [Wylie]; Scalasca hands-on tutorial exercises
17:15	(adjourn)
Day 2	*Thursday 15th March*
09:00	Vampir introduction & overview [Petkov]
10:30	PAPI introduction & overview [Ávila]
12:00	(lunch)
13:30	Hands-on coaching with participants' codes [all]
17:00	(adjourn)
Day 3	*Friday 16th March*
09:00	Tools presentations & coaching: Tools presentations by representatives of IBM, Intel & SGI (OmegaSystem). Alternatively, each participant can sign up for a coaching slot with a tool specialist to apply a specific tool to the participant's own code, interpret the performance analyses and discuss potential optimization opportunities.
12:00	(adjourn)

Due to classroom size and resource constraints, the number of participants was limited, and priority was given to those bringing running code to analyze on Levque.

Contact

Andrés Ávila Barrera (UFRO), phone +56 (45) 325921
Yury Oleynik (TUM), phone +49 (89) 289-17680