Performance Analysis of Parallel Applications
March 14-16, 2012
This course was held within the framework of the ALECHILE exchange program between the Technical University of Munich (TUM, Munich, Germany) and the Universidad de La Frontera (UFRO, Temuco, Chile).
Photographs from workshop with 16 participants
While today's supercomputers offer unprecedented levels of hardware performance, using them in a productive manner remains a major challenge. To write correct and efficient code, application developers typically have to be both experts in their specific field of science to find novel approaches to the problem they want to solve and computer scientists to understand and exploit the intricacies of the system for which their code is being designed. Moreover, access to parallelism is mostly offered via low-level interfaces that are hard to learn and whose performance behavior is hard to predict. Whereas in the business world much of the complexity of application development is hidden behind advanced programming frameworks, stagnating progress in programming techniques for high-performance computing is limiting developer productivity and, thus, often delaying scientific results.
With this course, we want to enable all participants to improve the quality and accelerate the development process of complex simulation programs in science and engineering that are being designed for the most advanced parallel computer systems. For this purpose, we are presenting several state-of-the-art tools for high-performance computing that assist domain scientists in analyzing and optimizing the performance of their applications. In these efforts, we place special emphasis on scalability and ease of use.
The course concept consists of three building blocks: first, talks on the tools themselves; second, hands-on training exercises using typical scientific benchmark codes; and third, most importantly, work on participants' own codes, which we invite them to bring along and analyze with the presented tools. The course covers the following tools (in alphabetical order): PAPI, Periscope, Scalasca and Vampir.
The target audience comprises both scientific personnel who supervise and administer HPC computers and researchers and students who develop parallel applications.
The course covers both serial single-core/node performance issues and communication/synchronization performance on highly parallel systems. While the tools will be available on the Levque NLHPC cluster, we encourage participants to bring along their notebook computers as well. In addition to exercises with benchmark codes for getting familiar with the tools, the course material includes an x86-compatible Linux Live-DVD with both the commercial and free tools, so that each participant can try out the software before switching to the cluster.
|Day 1||Wednesday 14th March|
|08:30||(registration & set-up of notebook computers)|
|09:00||Welcome and Introduction to VI-HPS course [Ávila, Gerndt, Wylie]|
|09:30||Introduction to parallel application engineering [Gerndt]|
|10:00||Introduction to parallel performance analysis [Oleynik]|
|11:00||Building and running the NPB-MZ-MPI-BT example OpenMP+MPI code|
|14:30||Periscope introduction & overview|
|16:00||Scalasca introduction & overview|
|Day 2||Thursday 15th March|
|09:00||Vampir introduction & overview [Petkov]|
|10:30||PAPI introduction & overview [Ávila]|
|13:30||Hands-on coaching with participants' codes [all]|
|Day 3||Friday 16th March|
|09:00||Tools presentations & coaching:|
Due to classroom size and resource constraints, the number of participants was limited, and priority was given to those bringing running code to analyze on Levque.