                        OPARI2 2.0.5 OPEN ISSUES
                        ========================
                                                    Effective: July 2019

This file lists known limitations and unimplemented features of the
OPARI2 component.

------------------------------------------------------------------------------

* Platform support

  - OPARI2 has been tested on the following platforms:
    + Cray XC systems
    + K Computer
    + various Linux/Intel (x86/x64/Power8/ARM AArch32/AArch64) clusters
    The provided configure options (see INSTALL) may provide a good
    basis for building and testing the toolset on other systems.

  - The following platforms have not been tested recently:
    + IBM Blue Gene/Q
    + IBM Blue Gene/P
    + Cray XT, XE, XK
    + Sun Solaris/SPARC-based clusters
    However, the supplied build system might still work on these
    systems.

  - The following platforms have not been tested:
    + NEC SX-9
    + IBM Blue Gene/L
    + SiCortex systems
    + other NEC SX systems

--------------------------------------------------------------------------------

Known issues
------------

  - All languages

    + OPARI2, by default, processes source files before the compiler
      preprocessor runs, so macros are not expanded and included files
      are not processed. Conditionally compiled source code is also
      not resolved and can therefore result in erroneous
      instrumentation of partial OpenMP directives. These limitations
      can be avoided by passing preprocessed code to OPARI2 using the
      --preprocessed flag.

    + The instrumented source files generated by OPARI2 may confuse
      automatic dependency tracking by "make", "autotools", etc. For
      autotools, configure with "--disable-dependency-tracking".

    + Literal file-filter rules like "INCLUDE bt.f" for files that
      will be processed by OPARI2 do not work, as OPARI2 changes the
      file name (e.g., to bt.opari.f).

    + Some OpenMP compilers (e.g., PGI) are non-standard-conforming in
      the way they process OpenMP directives: they do not allow macro
      replacement of OpenMP directive parameters. This results in
      error messages containing references to POMP_DLIST_#####, where
      ##### is a five-digit number.
      In this case, try the OPARI2 option "--nodecl". Unfortunately,
      this is not a perfect workaround, as it can trigger other errors
      in some rare cases.

    + When compiling with the PGI compiler version 10.1, local
      variables that are defined after an OpenMP for directive share
      the same memory address. This breaks the OPARI2 instrumentation
      for task tracking. We recommend using a newer compiler version;
      according to our tests, later compiler versions have fixed this
      issue. We tested with PGI compiler version 11.7.

    + Sometimes instrumentation of OpenMP source files works, but the
      traces become enormously large because the application uses
      large numbers (millions) of small OpenMP synchronization
      operations such as atomics, locks, or flushes, which are
      instrumented by default. The instrumentation overhead might
      then also become excessive. In that case, you can tell OPARI2
      not to instrument these directives by using the
      "--disable=omp[:directive|group[:inner],...]" option. Valid
      values for directive are: atomic, critical, flush, locks,
      master, ordered, single, or "sync", which disables all of the
      above. Of course, these directives are then not measured, and
      you should keep this in mind when you analyze the results:
      although they do not show up in the analysis report, the
      application might still have a performance problem, possibly
      one caused by too many OpenMP synchronization calls!

    + Instrumented Intel Xeon-Phi offload regions cause compile
      errors. You need to manually guard the instrumented POMP2 calls
      with "#ifndef __INTEL_OFFLOAD" preprocessor statements to
      prevent them from being compiled. The next OPARI2 release will
      automate this process.

    + Object files created by the Intel compiler with interprocedural
      optimization (-ipo) cannot be analyzed by nm for startup
      initialization of the OpenMP region handles.
      The workaround is to use only -ip during compilation or to use
      runtime initialization of the OpenMP region handles inside the
      analysis tool.

    + The option --omp-tpd cannot be used on Fujitsu systems (K, FX10,
      FX100).

    + Parentheses used in chunk size calculations within a schedule
      clause, e.g., schedule(dynamic, (a)*(b)), will cause a
      compilation error. As a workaround, you can calculate the chunk
      size into a temporary variable prior to its use in the schedule
      clause.

  - Fortran:

    + The !$OMP END DO and !$OMP END PARALLEL DO directives are
      required (and are not optional as described in the OpenMP
      specification) for F77 style do loops which end with a label
      and a continue statement.

    + The atomic expression controlled by a !$OMP ATOMIC directive has
      to be on a line all by itself in fixed form source files.

    + The !$OMP END ATOMIC directive must not be used if it is
      optional. This directive is optional unless the capture clause
      is used. If the capture clause is used, the instrumentation of
      atomic directives needs to be disabled by passing
      --disable=omp:atomic to opari2.

    + The Fortran95 statement terminator (";") is not handled
      correctly when it is used within parallel loops.

    + If an #ifdef block is used at the beginning of the variable
      definition part, instrumentation is incorrectly inserted within
      the block and is therefore not compiled when the condition
      evaluates to false.

    + Some Fortran compilers (e.g., Sun) don't fully support C
      preprocessor commands, especially the "#line" commands. If you
      track a compilation error in an OPARI2 modified/instrumented
      file down to such a statement, try using "--nosrc", as this
      suppresses the generation of "#line" statements. (With the Sun
      Fortran compiler, using "-xpp=cpp" is a better workaround.)

    + The first SECTION directive inside a SECTIONS workshare
      directive is required (and is not optional as described in the
      OpenMP specification).

    + Fortran .f files are identified as Fortran77 files even if they
      contain Fortran90 code.
      You need to manually add the --f90 option to process these
      files successfully if renaming is not an option.

    + If you use fixed form Fortran and override the fixed form
      requirement with a compiler switch, you might need to add
      --free-form to the OPARI2 command line to get compilable
      instrumentation. Use --fix-form analogously.

    + OPARI2 requires that the program is enclosed in a PROGRAM
      block. Using the compilation unit as an implicit PROGRAM block
      will result in erroneous code.

    + Compiler-specific directives starting with "!DIR$" need to be
      followed by a whitespace character before the directive
      keyword.

    + The clauses 'num_threads' and 'if' of the parallel directive
      must not be the sole clause on a line.

    + OpenMP directives that span several lines must not contain
      preprocessor directives. This is a regression from OPARI2
      version 1.1.4.

    + The default(none) clause of the parallel directive must not be
      used in a continuation line; it must be on the same line as the
      parallel directive.

    + Strings within a pair of parentheses, e.g.,
      (..., msg="my msg"), must either appear on a single line, or
      the closing parenthesis needs to appear on a line following the
      end quotation mark of the string. Otherwise, include statements
      might not be placed correctly.

  - C/C++:

    + Structured blocks describing the extent of an OpenMP pragma
      need to be either compound statements {....}, while loops, or
      simple statements. In addition, for loops are supported after
      "omp for" and "omp parallel for" pragmas. Complex statements
      like if-then-else or do-while need to be enclosed in a block
      ( {....} ).

    + C99 6.10.9 _Pragma operators are not supported.

    + Codes relying on the definition of macros to select specific
      features from system header files, e.g.,
      __STDC_CONSTANT_MACROS or _GNU_SOURCE, need to define these
      macros on the compiler command line rather than in header or
      source files.

    It is planned to address these limitations in future releases.
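
    Two of the workarounds above (the temporary chunk size variable
    for schedule clauses and the enclosing block for complex C/C++
    statements) can be sketched as follows. This is only an
    illustrative sketch: the function names scale_values and
    pick_mode and their parameters are hypothetical and do not come
    from OPARI2. The code also builds as plain serial C when OpenMP
    is not enabled, since the pragmas are then ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: scale n values with a dynamic schedule whose
 * chunk size is the product of a and b. */
void scale_values(double *data, int n, int a, int b, double factor)
{
    assert(data != NULL && n >= 0 && a > 0 && b > 0);

    /* Workaround: compute the chunk size into a temporary variable
     * instead of writing schedule(dynamic, (a)*(b)) in the clause. */
    const int chunk = a * b;
    int i;

    #pragma omp parallel for schedule(dynamic, chunk)
    for (i = 0; i < n; ++i) {
        data[i] *= factor;
    }
}

/* Hypothetical example: an if-else is a complex statement, so it is
 * enclosed in a block { ... } when it follows an OpenMP pragma. */
int pick_mode(int flag)
{
    int mode = 0;

    #pragma omp parallel
    {
        #pragma omp single
        {
            /* The braces around the if-else are the workaround. */
            if (flag) {
                mode = 1;
            } else {
                mode = 2;
            }
        }
    }
    return mode;
}
```

    For instance, calling scale_values(d, 4, 2, 1, 2.0) doubles each
    of the four entries of d using a chunk size of 2.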

--------------------------------------------------------------------------------

Please report bugs, wishes, and suggestions to .