Virtual Institute — High Productivity Supercomputing

38th VI-HPS Tuning Workshop (NHR@FAU, Erlangen, Germany) - Online


Monday 1st - Wednesday 3rd March 2021


The workshop will be held online, using the Zoom videoconference platform.

This workshop, organised by VI-HPS and the Erlangen National High Performance Computing Center, will:

  • give an overview of the VI-HPS programming tools suite
  • explain the functionality of individual tools, and how to use them effectively
  • offer hands-on experience and expert assistance using the tools

On completion, participants should be familiar with common performance analysis and diagnosis techniques and how they can be employed in practice (on a range of HPC systems). Those who prepared their own application test cases will have been coached in the tuning of their measurement and analysis, and provided with optimization suggestions.

Programme Overview

Presentations and hands-on sessions are planned on the following topics:

  • Setting up, welcome and introduction
  • TAU performance system
  • MAQAO performance analysis & optimisation
  • Score-P instrumentation and measurement
  • Paraver/Extrae/Dimemas trace analysis and performance prediction
  • Extra-P automated performance modeling
  • ... and potentially others to be added

A brief overview of the capabilities of these and associated tools is provided in the VI-HPS Tools Guide.

The workshop will be held in English and run from 09:00 to not later than 18:00 each day, with breaks.

Participants are encouraged to prepare their own MPI, OpenMP and hybrid MPI+OpenMP parallel application codes for analysis. Codes using multiple GPUs via OpenACC, OpenCL or CUDA may also be analysed.

Programme in Detail (provisional) - all times given as CET (UTC+1)

Day 1: Monday 01 March
09:00 Welcome [Georg Hager, NHR@FAU]
  • Workshop agenda [Cédric Valensi, UVSQ]
  • Introduction to Zoom
  • Introduction to NHR and the Meggie system [Thomas Gruber, NHR@FAU]
  • Introduction to VI-HPS & overview of tools [Cédric Valensi, UVSQ]
  • Introduction to parallel performance engineering
  • Building and running NPB/BT-MZ on Meggie [Michael Knobloch, JSC]
10:30 (break)
11:00 MAQAO performance analysis tools [Jäsper Ibnamar & Emmanuel Oseret, UVSQ]
  • MAQAO hands-on exercises (MAQAO quick reference)
12:30 (lunch)
13:45 Hands-on coaching to apply MAQAO to analyze participants' own code(s).
15:15 (break)
15:30 TAU performance system [Sameer Shende, UOregon]
  • TAU hands-on exercises
16:30 Schedule for remainder of workshop
17:00 (adjourn)

Day 2: Tuesday 02 March
09:00 Score-P instrumentation & measurement toolset [Michael Knobloch, JSC]
  • Score-P analysis scoring & measurement filtering
  • Score-P specialized instrumentation and measurement
  • Score-P hands-on exercises
  • CUBE profile explorer hands-on exercises
10:30 (break)
11:00 Extra-P automated performance modeling [Marcus Ritter, TUDarmstadt]
  • Extra-P hands-on exercises
12:30 (lunch)
13:45 Hands-on coaching to apply Score-P/CUBE to analyze participants' own code(s).
15:15 (break)
15:30 Hands-on coaching to apply TAU to analyze participants' own code(s).
17:30 Review of day and schedule for remainder of workshop
18:00 (adjourn)

Day 3: Wednesday 03 March
09:00 BSC performance tools [Judit Giménez & Lau Mercadal, BSC]
  • Tools installation
  • BSC tools hands-on exercises
10:30 (break)
11:00 Hands-on coaching to apply BSC tools to analyze participants' own code(s).
12:30 (lunch)
14:00 Hands-on coaching to apply tools to analyze participants' own code(s).
17:00 (adjourn)

Hardware and Software Platforms

  • 728 compute nodes, each with two Intel Xeon E5-2630v4 "Broadwell" chips (10 cores per chip + SMT) running at 2.2 GHz, with 25 MB of shared cache per chip and 64 GB of RAM.
  • 2 front end nodes with the same CPUs as the compute nodes but 128 GB of RAM.
  • Lustre-based parallel filesystem with a capacity of almost 1 PB and an aggregated parallel I/O bandwidth of > 9000 MB/s.
  • Intel OmniPath interconnect with up to 100 GBit/s bandwidth per link and direction.
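
As a rough sanity check when sizing test cases, the per-node figures listed above can be multiplied out to system-wide totals. The short Python sketch below does this. The function name `aggregate_resources` is hypothetical, and the factor of two hardware threads per core is an assumption about what "SMT" means on these Intel chips, not something stated in the listing.

```python
# Per-node figures taken from the compute node description above.
NODES = 728
CHIPS_PER_NODE = 2
CORES_PER_CHIP = 10
RAM_PER_NODE_GB = 64

# Assumption (not stated in the listing): SMT provides two hardware
# threads per physical core.
THREADS_PER_CORE = 2


def aggregate_resources(nodes=NODES):
    """Return total cores, hardware threads and RAM (GB) for `nodes` compute nodes."""
    cores = nodes * CHIPS_PER_NODE * CORES_PER_CHIP
    return {
        "cores": cores,
        "hardware_threads": cores * THREADS_PER_CORE,
        "ram_gb": nodes * RAM_PER_NODE_GB,
    }


if __name__ == "__main__":
    totals = aggregate_resources()
    print(f"Physical cores:   {totals['cores']}")
    print(f"Hardware threads: {totals['hardware_threads']}")
    print(f"Total RAM (GB):   {totals['ram_gb']}")
```

With the listed figures this gives 14,560 physical cores and 46,592 GB of RAM across the compute nodes; the front end nodes are not counted.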

The local HPC system Meggie is the primary platform for the workshop and will be used for the hands-on exercises. Course accounts will be provided during the workshop to participants without existing accounts. Other systems where up-to-date versions of the tools are installed can also be used when preferred, though support may be limited and participants are expected to already possess user accounts on non-local systems. Regardless of which systems they intend to use, participants should be familiar with the relevant procedures for compiling and running their parallel applications (via batch queues where appropriate).


To register for the course, please send an e-mail to Georg Hager (georg.hager[at]) with the following content:

  • Subject: VI-HPS Tuning Workshop 38 registration
  • Body: Please state:
    • Full name
    • Affiliation
    • Country


Tuning Workshop Series

Cédric Valensi
Université de Versailles Saint-Quentin-en-Yvelines
Phone: +33 1 77 57 59 36
Email: cedric.valensi[at]

Local Arrangements

Georg Hager
Regionales Rechenzentrum Erlangen
Phone: +49 9131 85-28973
Email: georg.hager[at]