Virtual Institute — High Productivity Supercomputing

27th VI-HPS Tuning Workshop (LRZ, Garching, Germany)

Date

Monday 23rd - Friday 27th April 2018.

Location

The workshop will take place in Kursraum 2 (H.U.010) at the Leibniz Supercomputing Centre (LRZ) on the university campus in Garching bei München, Germany.

Co-organising Institutions

LRZ, IT4I and PRACE

Goals

This workshop organised by VI-HPS, LRZ & IT4Innovations as a PRACE training event will:

  • give an overview of the VI-HPS programming tools suite
  • explain the functionality of individual tools, and how to use them effectively
  • offer hands-on experience and expert assistance using the tools

Programme Overview

Presentations and hands-on sessions are planned on the following topics:

  • Setting up, welcome and introduction
  • mpiP lightweight MPI profiling
  • MAQAO performance analysis & optimisation
  • Score-P instrumentation and measurement
  • Extra-P automated performance modeling
  • Scalasca automated trace analysis
  • Vampir interactive trace analysis
  • Paraver/Extrae/Dimemas trace analysis and performance prediction
  • JUBE script-based workflow execution environment
  • Periscope/PTF automated performance analysis and optimisation
  • TAU performance system
  • MUST runtime error detection for MPI
  • ARCHER runtime error detection for OpenMP
  • [k]cachegrind cache utilisation analysis

A brief overview of the capabilities of these and associated tools is provided in the VI-HPS Tools Guide.

The workshop will be held in English and will run from 09:00 until no later than 18:00 each day, with breaks for lunch and refreshments. For participants from public research institutions in PRACE countries, the participation fee is sponsored through the PRACE training programme. All participants are responsible for their own travel and accommodation.

A social event for participant and instructor networking is planned for the evening of Tuesday 24 April, consisting of a guided tour of the Weihenstephan Brewery sponsored by MEGWARE, followed by a self-paid dinner at a local restaurant.

Classroom capacity is limited, so priority may be given to applicants with parallel codes already running on the workshop computer system (CooLMUC3) and to those bringing codes from similar Xeon Phi x86 cluster systems to work on. Participants are therefore encouraged to prepare their own MPI, OpenMP and hybrid MPI+OpenMP parallel application codes for analysis.
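
A prepared test case does not need to be elaborate: any MPI, OpenMP or hybrid code that compiles and runs on the workshop system can be analysed. As a purely illustrative sketch (not part of the workshop material), a minimal hybrid MPI+OpenMP program of the kind the tools handle looks like this:

    /* hybrid_hello.c - minimal hybrid MPI+OpenMP example (illustrative only).
     * Compile e.g. with: mpicc -fopenmp hybrid_hello.c -o hybrid_hello
     * (the compiler wrapper and OpenMP flag depend on the local toolchain). */
    #include <stdio.h>
    #include <mpi.h>
    #include <omp.h>

    int main(int argc, char **argv)
    {
        int provided, rank, size;

        /* Request thread support sufficient for OpenMP regions between MPI calls */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        #pragma omp parallel
        printf("Rank %d of %d, thread %d of %d\n",
               rank, size, omp_get_thread_num(), omp_get_num_threads());

        MPI_Finalize();
        return 0;
    }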

[Photos: TW27@LRZ class; TW27@LRZ social event group]

Programme in Detail (provisional)

Day 1: Monday 23 April
08:30 (registration & set-up of course accounts on workshop computers)
09:00 Welcome [Dieter Kranzlmüller, LRZ]
  • Introduction to VI-HPS & overview of tools [Martin Schulz, TUM]
  • Introduction to parallel performance engineering [Josef Weidendorfer, LRZ]
  • CooLMUC3 computer system and software environment [Volker Weinberg, LRZ]
  • Building and running NPB/BT-MZ on CooLMUC3 [Ilya Zhukov, JSC]
10:30 (break)
11:00 mpiP lightweight MPI profiling [Martin Schulz, TUM]
  • mpiP hands-on exercises
  • MAQAO performance analysis tools [Cédric Valensi, Emmanuel Oseret & Salah Ibn Amar, UVSQ]
  • MAQAO hands-on exercises (MAQAO quick reference)
12:30 (lunch)
13:30 Hands-on coaching to apply tools to analyze participants' own code(s).
17:30 Review of day and schedule for remainder of workshop
18:00 (adjourn)

Day 2: Tuesday 24 April
09:00 Score-P instrumentation & measurement toolset [Ronny Tschüter, TU Dresden]
  • Score-P hands-on exercises
  • CUBE profile explorer hands-on exercises [Ilya Zhukov, JSC]
10:30 (break and group picture)
11:00 Score-P analysis scoring & measurement filtering [Ronny Tschüter, TU Dresden]
  • Measuring hardware counters and other metrics
  • Extra-P automated performance modeling [Sergei Shudler, TU Darmstadt]
  • Extra-P hands-on exercises
12:30 (lunch)
13:30 Hands-on coaching to apply tools to analyze participants' own code(s).
17:30 Review of day and schedule for remainder of workshop
18:00 (adjourn)
* Social event: Guided tour of Weihenstephan Brewery and dinner

Day 3: Wednesday 25 April
09:00 Scalasca automated trace analysis [Ilya Zhukov, JSC]
  • Scalasca hands-on exercises
  • Vampir interactive trace analysis [Matthias Weber, TU Dresden]
  • Vampir hands-on exercises
10:30 (break)
11:00 Paraver tracing tools suite [Judit Giménez & Lau Mercadal, BSC]
  • Paraver hands-on exercises
12:30 (lunch)
13:30 Hands-on coaching to apply tools to analyze participants' own code(s).
15:00 (optional guided tour of LRZ double cube)
17:30 Review of day and schedule for remainder of workshop
18:00 (adjourn)

Day 4: Thursday 26 April
09:00 TAU performance system [Sameer Shende, UOregon]
  • TAU hands-on exercises
10:30 (break)
11:00 JUBE workflow execution environment [Thomas Breuer, JSC]
  • JUBE hands-on exercises
  • Periscope Tuning Framework [Robert Mijakovic, TUM]
  • Periscope hands-on exercises
12:30 (lunch)
13:30 Hands-on coaching to apply tools to analyze participants' own code(s).
17:30 Review of day and schedule for remainder of workshop
18:00 (adjourn)

Day 5: Friday 27 April
09:00 MUST MPI runtime error detection [Joachim Protze, RWTH]
  • MUST hands-on exercises
  • ARCHER OpenMP runtime error detection [Joachim Protze, RWTH]
  • ARCHER hands-on exercises
10:30 (break)
11:00 Kcachegrind cache analysis [Josef Weidendorfer, LRZ]
  • Kcachegrind hands-on exercises
  • Review
12:30 (lunch)
13:30 Hands-on coaching to apply tools to analyze participants' own code(s).
17:00 (adjourn)

Hardware and Software Platforms

CooLMUC3: MEGWARE KNL-based x86 Linux cluster system:

  • 148 compute nodes, each with a single Intel Xeon Phi 7210-F 'Knights Landing' MIC processor (1.3 GHz, 64 cores per processor, 4 hardware threads per core), 96 GB RAM and 16 GB HBM
  • cluster modes: quad, snc4, a2a
  • memory modes: flat, cache, hybrid
  • network: Intel Omni-Path interconnect
  • parallel filesystem: GPFS (SCRATCH & WORK)
  • software: SLES12-based GNU/Linux; Intel MPI; Intel, GCC and other compilers; SLURM batch system

The local HPC system CooLMUC3 is the primary platform for the workshop and will be used for the hands-on exercises. Course accounts will be provided during the workshop to participants without existing accounts. Other systems with up-to-date versions of the tools installed can also be used where preferred, though support may be limited and participants are expected to already have user accounts on those non-local systems. Regardless of which system they intend to use, participants should be familiar with the relevant procedures for compiling and running their parallel applications (via batch queues where appropriate).
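
The memory modes listed above are relevant for the hands-on work: in flat mode the node's 16 GB of MCDRAM is exposed as a separate allocation target that bandwidth-bound codes can use explicitly, whereas in cache mode it acts as a transparent cache. The following is a minimal sketch of explicit high-bandwidth allocation using the memkind library's hbwmalloc interface; whether memkind is actually installed on CooLMUC3 is an assumption here, so check the local module environment.

    /* hbm_alloc.c - sketch of explicit MCDRAM allocation on a KNL node booted
     * in flat memory mode, using memkind's hbwmalloc interface (assumed to be
     * installed). Compile e.g. with: cc hbm_alloc.c -lmemkind */
    #include <stdio.h>
    #include <stdlib.h>
    #include <hbwmalloc.h>

    int main(void)
    {
        size_t n = 1 << 20;              /* 1 Mi doubles, about 8 MB */
        int in_hbm = (hbw_check_available() == 0);
        double *a = in_hbm ? hbw_malloc(n * sizeof(double))  /* MCDRAM */
                           : malloc(n * sizeof(double));     /* DDR4 fallback */
        if (a == NULL)
            return 1;

        for (size_t i = 0; i < n; i++)   /* touch the memory */
            a[i] = (double)i;
        printf("allocated %zu doubles in %s\n", n, in_hbm ? "MCDRAM" : "DDR4");

        if (in_hbm)
            hbw_free(a);
        else
            free(a);
        return 0;
    }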

Registration

Register via the PRACE training portal; the number of participants is limited.

The workshop will be held at the Leibniz Supercomputing Centre (LRZ) on the university campus outside Garching bei München, approximately 25 minutes north of the Munich city centre. The U-Bahn line U6 (station: Garching-Forschungszentrum) provides a direct connection between the campus and both Munich and Garching.
Getting to/from LRZ

It is recommended to choose a hotel in Garching or in the Munich city centre and to use the U-Bahn to reach LRZ.
Accommodation in Garching
Accommodation in Munich

Contact

Tuning Workshop Series

Brian Wylie
Jülich Supercomputing Centre
Forschungszentrum Jülich GmbH
Phone: +49 2461 61-6589
Email: b.wylie[at]fz-juelich.de

Local Arrangements

Volker Weinberg
Leibniz Supercomputing Centre
Garching bei München
Phone: +49 89 35831 8863
Email: weinberg[at]lrz.de

Sponsors

This workshop is a PRACE training event, organised by VI-HPS, LRZ & IT4Innovations. The social event is sponsored by MEGWARE.