HOlistic Performance System Analysis (HOPSA)

To maximise the scientific output of a high-performance computing system, different stakeholders pursue different strategies. While individual application developers are trying to shorten the time to solution by optimising their codes, system administrators are tuning the configuration of the overall system to increase its throughput. Yet, the complexity of today's machines with their strong interrelationship between application and system performance presents serious challenges to achieving these goals.

The HOPSA project (HOlistic Performance System Analysis) therefore sets out to create an integrated diagnostic infrastructure for combined application and system tuning, with the application-tuning part provided by the EU project partners and the system-tuning part by the Russian project partners. Starting from a system-wide basic performance screening of individual jobs, an automated workflow will route findings on potential bottlenecks to either application developers or system administrators, together with recommendations on how to identify their root cause using more powerful diagnostic tools. Developers can choose from a variety of mature performance-analysis tools developed by our consortium. Within this project, the tools will be further integrated and enhanced with respect to scalability, depth of analysis, and support for asynchronous tasking, a node-level paradigm playing an increasingly important role in hybrid programs on emerging hierarchical and heterogeneous systems.

Using our infrastructure, the scientific output rate of a system will be increased in three ways. First, the enhanced tool suite will lead to better optimisation results, expanding the potential of the codes to which it is applied. Second, integrating the tools into an automated diagnostic workflow will ensure that they are used both more frequently and more effectively, further multiplying their benefit. Third, our holistic approach will lead to a more targeted optimisation of the interactions between application and system.

HOPSA is a coordinated twin project funded by the EU under FP7-ICT-2011-EU-Russia, grant number FP7-277463, and by the Russian Ministry of Education and Science under contract number 07.514.12.4001.

Downloads

Project factsheet
Slides presented at SC11 HOPSA BoF

EU Project Partners (HOPSA-EU)

Forschungszentrum Jülich (EU Coordinator)
Jülich Supercomputing Centre
Barcelona Supercomputing Center
Computer Sciences Department
German Research School for Simulation Sciences
Laboratory for Parallel Programming
Rogue Wave Software AB
(formerly ACUMEM)
Technische Universität Dresden
Center for Information Services and High Performance Computing

Russian Project Partners (HOPSA-RU)

Moscow State University (RU Coordinator)
Research Computing Center
T-Platforms
Russian Academy of Sciences
Joint Supercomputer Center
Southern Federal University
Scientific Research Institute of Multiprocessor Computer Systems

EU Contact

Bernd Mohr, Forschungszentrum Jülich GmbH, JSC, 52425 Jülich, Germany
Phone: +49 2461 613218
Email: b.mohr@fz-juelich.de

RU Contact

Vladimir Voevodin, Moscow State University, RCC, Moscow, Russia
Email: voevodin@parallel.ru

Sponsors

EU FP7 (Seventh Framework Programme)
MSE (Russian Ministry of Education and Science)