5.8 Setting up a hybrid MPI+OpenMP FLASH application

FLASH multithreading with OpenMP is included on an experimental basis in the FLASH4 beta release. The units that support multithreading are split hydrodynamics (Section 14.1.2), unsplit hydrodynamics (Section 14.1.3), Gamma law and multigamma EOS (Section 16.2), Helmholtz EOS (Section 16.3), the improved Multipole Poisson solver (Section 8.10.2.2, with support for 2D cylindrical and 3D Cartesian geometries) and energy deposition (Section 17.4).

FLASH multithreading requires an MPI-2 installation built with thread support (building with an MPI-1 installation or an MPI-2 installation without thread support is possible but strongly discouraged). The FLASH application requests the thread support level MPI_THREAD_SERIALIZED, which allows any OpenMP thread to call MPI functions provided that no two threads do so at the same time. You should also make sure that your compiler provides a version of OpenMP that complies with at least the OpenMP 2.5 (200505) standard; older versions may also work but have not been tested.

To use the multithreaded code you must set up your application with one of the setup variables threadBlockList, threadWithinBlock or threadRayTrace equal to True, e.g.

 

./setup Sedov -auto threadBlockList=True
./setup Sedov -auto threadBlockList=True +mpi1 (compatible with MPI-1 - unsafe!)

When you do this the setup script will insert USEOPENMP = 1 instead of USEOPENMP = 0 in the generated Makefile. When USEOPENMP is equal to 1 the Makefile prepends an OpenMP variable to the FFLAGS, CFLAGS and LFLAGS variables.
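
One way to confirm that setup enabled OpenMP is to search the generated Makefile for the USEOPENMP line. The following is a minimal sketch for a POSIX shell, not part of FLASH: the function name check_useopenmp is hypothetical, and the exact spacing of the USEOPENMP line is assumed to match the form quoted above.

```shell
# Report whether a generated FLASH Makefile has OpenMP enabled.
# check_useopenmp is a hypothetical helper, not a FLASH tool.
# Usage: check_useopenmp path/to/Makefile
check_useopenmp() {
    makefile=$1
    if [ ! -f "$makefile" ]; then
        echo "no such file: $makefile" >&2
        return 2
    fi
    # Accept "USEOPENMP = 1" with any amount of blank space around "=".
    if grep -Eq '^[[:space:]]*USEOPENMP[[:space:]]*=[[:space:]]*1[[:space:]]*$' "$makefile"; then
        echo "OpenMP enabled"
    else
        echo "OpenMP disabled"
        return 1
    fi
}
```

Call the function with the path to the Makefile that setup generated in your build directory. The non-zero return status for a disabled build makes it usable as a guard in a build script.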

Makefile.h variables: In general you should not define FFLAGS, CFLAGS and LFLAGS in your Makefile.h. It is much better to define FFLAGS_OPT, FFLAGS_TEST, FFLAGS_DEBUG, CFLAGS_OPT, CFLAGS_TEST, CFLAGS_DEBUG, LFLAGS_OPT, LFLAGS_TEST and LFLAGS_DEBUG in your Makefile.h. The setup script will then initialize the FFLAGS, CFLAGS and LFLAGS variables in the Makefile appropriately for an optimized, test or debug build.

The OpenMP variables should be defined in your Makefile.h and contain a compiler flag to recognize OpenMP directives. In most cases it is sufficient to define a single variable named OPENMP, but you may encounter special situations when you need to define OPENMP_FORTRAN, OPENMP_C and OPENMP_LINK. If you want to build FLASH with the GNU Fortran compiler gfortran and the GNU C compiler gcc then your Makefile.h should contain

 

OPENMP = -fopenmp

If you want to do something more complicated like build FLASH with the Lahey Fortran compiler lf90 and the GNU C compiler gcc then your Makefile.h should contain

 

OPENMP_FORTRAN = -openmp -Kpureomp
OPENMP_C = -fopenmp
OPENMP_LINK = -openmp -Kpureomp

When you run the hybrid FLASH application it will print the level of thread support provided by the MPI library and the number of OpenMP threads in each parallel region:

 

[Driver_initParallel]: Called MPI_Init_thread - requested level   2, given level   2
[Driver_initParallel]: Number of OpenMP threads in each parallel region  4

Note that the FLASH application will still run if the MPI library does not provide the requested level of thread support, but it will print a warning message alerting you to an unsafe level of MPI thread support. There is no guarantee that the program will work! We strongly recommend that you stop using such an application: build an MPI-2 library with thread support and then rebuild FLASH.
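
The comparison between the requested and given levels can be automated. The sketch below is an illustration only: the function name check_thread_level is hypothetical, and it assumes the Driver_initParallel line has exactly the wording shown in the sample output above. It relies on the MPI standard's guarantee that the thread support constants are ordered, so a given level at or above the requested level is sufficient.

```shell
# Read FLASH standard output on stdin and compare the requested and
# given MPI thread support levels printed by Driver_initParallel.
# check_thread_level is a hypothetical helper, not a FLASH tool.
check_thread_level() {
    awk '
        /Called MPI_Init_thread/ {
            # Extract the integers after "requested level" and "given level".
            req = $0
            sub(/.*requested level[ \t]*/, "", req)
            sub(/[^0-9].*/, "", req)
            giv = $0
            sub(/.*given level[ \t]*/, "", giv)
            sub(/[^0-9].*/, "", giv)
            found = 1
            if (giv + 0 >= req + 0) {
                print "safe: given " giv " >= requested " req
            } else {
                print "unsafe: given " giv " < requested " req
                bad = 1
            }
        }
        END {
            if (!found) { print "no MPI_Init_thread line found"; exit 2 }
            exit bad + 0
        }
    '
}
```

Feed it a saved copy of the application's standard output on stdin. It returns 0 for a safe level, 1 for an unsafe level and 2 if the line is missing.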

We record extra version and runtime information in the FLASH log file for a threaded application. Table 5.7 shows log file entries from a threaded FLASH application along with example safe and unsafe values. Unsafe values are marked with an asterisk (*).


Table 5.7: Log file entries showing safe and unsafe threaded FLASH applications

Log file stamp                 safe      unsafe (1)  unsafe (2)  unsafe (3)
Number of MPI tasks:           1         1           1           1
MPI version:                   2         1*          2           2
MPI subversion:                2         2           1           2
MPI thread support:            T         F*          F*          F*
OpenMP threads/MPI task:       2         2           2           2
OpenMP version:                200805    200505      200505      200805
Is “_OPENMP” macro defined:    T         T           T           F*


The FLASH applications in Table 5.7 are unsafe because

  1. we are using an MPI-1 implementation.

  2. we are using an MPI-2 implementation which is not built with thread support - the “MPI thread support in OpenMPI” Flash tip may help.

  3. we are using a compiler that does not define the macro _OPENMP when it compiles source files with OpenMP support (see the OpenMP standard). We have observed that Absoft 64-bit Pro Fortran 11.1.3 for Linux x86_64 does not define this macro. We use this macro in Driver_initParallel.F90 to conditionally initialize MPI with MPI_Init_thread. If you find that _OPENMP is not defined you should define it in your Makefile.h in a manner similar to the following:
    OPENMP_FORTRAN = -openmp -D_OPENMP=200805
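
The three unsafe cases in Table 5.7 can be detected by scanning the log file for the corresponding stamps. The sketch below is an illustration under stated assumptions: the function name check_flash_log is hypothetical, and each entry is assumed to appear on its own line in the form "stamp: value", as in the table. Adjust the patterns if your log file is laid out differently.

```shell
# Scan a FLASH log file for the threading entries listed in Table 5.7
# and report any unsafe values.
# check_flash_log is a hypothetical helper, not a FLASH tool.
# Usage: check_flash_log path/to/logfile
check_flash_log() {
    awk -F: '
        # Return the text after the last colon with blanks removed.
        function value() { v = $NF; gsub(/[ \t\r]/, "", v); return v }
        /MPI version/ {
            seen = 1
            if (value() + 0 < 2) { print "unsafe: MPI version is " value(); bad = 1 }
        }
        /MPI thread support/ {
            seen = 1
            if (value() != "T") { print "unsafe: MPI thread support is " value(); bad = 1 }
        }
        /_OPENMP.*macro defined/ {
            seen = 1
            if (value() != "T") { print "unsafe: _OPENMP macro is not defined"; bad = 1 }
        }
        END {
            if (!seen) { print "no threading entries found"; exit 2 }
            if (!bad)  { print "no unsafe values found" }
            exit bad + 0
        }
    ' "$1"
}
```

The function returns 0 when all three entries are safe, 1 when at least one is unsafe and 2 when none of the stamps are present, which usually means the log does not come from a threaded build.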
    

MPI thread support in OpenMPI: A default installation of OpenMPI-1.5 (and earlier) does not provide any level of MPI thread support. To include MPI thread support you must configure OpenMPI-1.5 with --enable-mpi-thread-multiple or --enable-opal-multi-threads. We prefer to configure with --enable-mpi-thread-multiple so that we can (in future) use the highest level of thread support. The configure option is named --enable-mpi-threads in earlier versions of OpenMPI.

MPI-IO issues when using a threaded FLASH application: The ROMIO in older versions of MPICH2 and OpenMPI is known to be buggy. We have encountered a segmentation fault on one platform and a deadlock on another platform during MPI-IO when we used OpenMPI-1.4.4 with a multithreaded FLASH application. We resolved both problems by using OpenMPI-1.5.4. It should be possible to use OpenMPI-1.5.2 or greater because the release notes for OpenMPI-1.5.2 state “- Updated ROMIO from MPICH v1.3.1 (plus one additional patch).” We have not tested to find the minimum version of MPICH2 but MPICH2-1.4.1p1 works fine. If it is not possible to use a newer MPI implementation you can avoid MPI-IO altogether by setting up your FLASH application with +serialIO.

You should not set up a FLASH application with both threadBlockList and threadWithinBlock equal to True because nested OpenMP parallelism is not supported. For further information about FLASH multithreaded applications please refer to Chapter 42.