Virtual Institute — High Productivity Supercomputing

30th VI-HPS Tuning Workshop (BSC, Barcelona, Spain)

Date

Monday 21st - Friday 25th January 2019.

Location

The workshop will take place at Barcelona Supercomputing Center (BSC), C6 building E-106, BSC Campus Nord, Barcelona, Spain.

Organizing Institutions

BSC PRACE

Goals

This workshop, organized by VI-HPS for the Spanish PRACE Advanced Training Centre hosted by Barcelona Supercomputing Center, will:

  • give an overview of the VI-HPS programming tools suite
  • explain the functionality of individual tools, and how to use them effectively
  • offer hands-on experience and expert assistance using the tools

Programme Overview

Presentations and hands-on sessions are planned on the following topics:

  • Setting up, welcome and introduction
  • Paraver trace analysis tool
  • Dimemas performance prediction tool
  • Score-P instrumentation and measurement infrastructure
  • TAU performance system
  • Scalasca automated trace analysis toolset
  • Extra-P automated performance modeling
  • Vampir interactive trace analysis toolset
  • MUST runtime error detection tool for MPI
  • ARCHER runtime error detection for OpenMP
  • MAQAO binary analysis & optimization tool
  • ... and others

A brief overview of the capabilities of these and associated tools is provided in the VI-HPS Tools Guide.

The workshop will be held in English and will run from 09:00 to no later than 18:00 each day, with breaks for lunch and refreshments. There is no fee for participation; however, participants are responsible for their own travel and accommodation.

Classroom capacity is limited, so priority will be given to applicants with parallel codes already running on the workshop computer systems (MareNostrum-IV and CTE-POWER), and to those bringing codes from similar systems to work on. Participants are therefore encouraged to prepare their own MPI, OpenMP and hybrid MPI+OpenMP parallel application codes for analysis.

Programme

Day 1: Monday 21 January
09:00 Welcome and lab setup
10:45 (break)
11:15 Paraver tracing tools suite [Judit Giménez, German Llort & Lau Mercadal, BSC]
13:00 (lunch)
14:00 Hands-on coaching to apply tools to analyze participants' own code(s).
17:00 Review of day and schedule for remainder of workshop
17:30 (adjourn)

Day 2: Tuesday 22 January
09:00 Score-P instrumentation & measurement [Michael Knobloch, JSC]
  • Score-P analysis scoring & filtering [Michael Knobloch, JSC]
  • Measuring hardware counters and other metrics
10:45 (break)
11:15 TAU performance system [Sameer Shende, UOregon]
13:00 (lunch)
14:00 Hands-on coaching to apply tools to analyze participants' own code(s).
17:00 Review of day and schedule for remainder of workshop
17:30 (adjourn)

Day 3: Wednesday 23 January
09:00 Scalasca automated trace analysis [Michael Knobloch, JSC]
  • Scalasca hands-on exercises
10:45 (break)
11:15 Extra-P automated scaling analysis [Alexandru Calotoiu, TUDarmstadt]
  • Extra-P hands-on exercises
13:00 (lunch)
14:00 Hands-on coaching to apply tools to analyze participants' own code(s).
17:00 Review of day and schedule for remainder of workshop
17:30 (adjourn)

Day 4: Thursday 24 January
09:00 Vampir interactive trace analysis [Holger Brunst, TUDresden]
10:45 (break)
11:15 MUST MPI usage & ARCHER OpenMP correctness checking [Joachim Protze, RWTH]
13:00 (lunch)
14:00 Hands-on coaching to apply tools to analyze participants' own code(s).
17:00 Review of day and schedule for remainder of workshop
17:30 (adjourn)

Day 5: Friday 25 January
09:00 MAQAO x86 performance analysis tools [Emmanuel Oseret & Cedric Valensi, UVSQ]
10:45 (break)
11:15 Intel Advisor and Roofline model [Egor Kazachkov, Intel]
12:30 Review of workshop
13:00 (lunch)
14:00 Hands-on coaching to apply tools to analyze participants' own code(s).
17:00 (adjourn)
 

Hardware and Software Platforms

MareNostrum-IV: Lenovo SD530 racks with a total of 3,456 compute nodes with dual Intel Xeon Platinum 8160 (Skylake) 2.1 GHz 24-core processors with 2-way SMT and 96 GB memory per node, Intel Omni-Path interconnection network, SuSE 12SP2 Linux, Intel & GCC compilers, Intel and other MPI libraries, SLURM batch software, GPFS parallel filesystem.

CTE-POWER: IBM AC922 racks with a total of 54 compute nodes with dual IBM Power9 8335-GTG 3.0 GHz 20-core processors with 4-way SMT and 512 GB memory and four NVIDIA V100 (Volta) GPUs with 16 GB HBM2 per node (similar to the November 2018 Top500 #1 "Summit" and #2 "Sierra" systems), single-port Mellanox EDR InfiniBand interconnection network, Red Hat Enterprise Linux Server 7.4, IBM XL, GCC and PGI compilers, IBM Spectrum MPI and Open MPI, SLURM batch software, GPFS parallel filesystem.
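
As a rough cross-check of the per-node figures quoted above, the aggregate core, hardware-thread, memory and GPU counts of the two systems can be derived with a short calculation. The sketch below is illustrative only: the `System` class and its field names are hypothetical, and it assumes that every compute node has exactly the configuration stated above, which may not hold for every node of the real machines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    """Per-node figures as quoted in the platform descriptions (hypothetical helper)."""
    name: str
    nodes: int
    sockets_per_node: int
    cores_per_socket: int
    smt_ways: int
    mem_gb_per_node: int
    gpus_per_node: int = 0

    @property
    def cores(self) -> int:
        # Physical cores across all compute nodes.
        return self.nodes * self.sockets_per_node * self.cores_per_socket

    @property
    def hw_threads(self) -> int:
        # Hardware threads available when SMT is fully used.
        return self.cores * self.smt_ways

    @property
    def mem_gb(self) -> int:
        # Aggregate main memory, assuming identical nodes.
        return self.nodes * self.mem_gb_per_node

    @property
    def gpus(self) -> int:
        return self.nodes * self.gpus_per_node


# Values taken from the descriptions above; uniform nodes are an assumption.
MARENOSTRUM4 = System("MareNostrum-IV", nodes=3456, sockets_per_node=2,
                      cores_per_socket=24, smt_ways=2, mem_gb_per_node=96)
CTE_POWER = System("CTE-POWER", nodes=54, sockets_per_node=2,
                   cores_per_socket=20, smt_ways=4, mem_gb_per_node=512,
                   gpus_per_node=4)

if __name__ == "__main__":
    for s in (MARENOSTRUM4, CTE_POWER):
        print(f"{s.name}: {s.cores} cores, {s.hw_threads} hardware threads, "
              f"{s.mem_gb} GB memory, {s.gpus} GPUs")
```

Totals like these are mainly useful when deciding how many MPI ranks and OpenMP threads per node to request for a measurement run.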

The local HPC systems are the primary platform for the workshop. Course accounts will be provided during the workshop to participants. Other systems where up-to-date versions of the tools are installed can also be used when preferred, though support may be limited and participants are expected to already possess user accounts on non-local systems. Regardless of whichever systems they intend to use, participants should be familiar with the relevant procedures for compiling and running their parallel applications (via batch queues where appropriate).

Participants are expected to bring and use their own notebook computers with SSH and X11 software configured to connect to the HPC systems and run interactive graphical tools.

Registration

Registration was via the PRACE training portal.

Contact

Judit Giménez
Barcelona Supercomputing Center
Phone: +34 93 401-7178
Email: judit.gimenez@bsc.es

Brian Wylie
Jülich Supercomputing Centre
Forschungszentrum Jülich
Phone: +49 2461 61-6589
Email: b.wylie@fz-juelich.de

Sponsors

This workshop is a PRACE Training Centre (PTC) course, organised by Barcelona Supercomputing Center.