
42.3 Running multithreaded FLASH

42.3.1 OpenMP variables

Several OpenMP environment variables control the threaded runtime. The most frequently used is OMP_NUM_THREADS, which sets the number of OpenMP threads in each parallel region. Export it to your environment from your current shell. The following example specifies that each OpenMP parallel region will be executed by 4 threads (assuming a bash shell):

export OMP_NUM_THREADS=4

You may also need to set OMP_STACKSIZE to increase the stack size of the threads created by the OpenMP runtime. Some test problems exceed the default limit, which leads to a segmentation fault in an otherwise innocent piece of source code. In practice we have only encountered this situation in a White Dwarf Supernova simulation with block list threading on a machine which had a default stack size of 4MB. For safety we now set a value of 16MB for all of our runs to provide plenty of stack space for each thread:

export OMP_STACKSIZE=16M

If you need to use a job submission script to run FLASH on compute nodes then you should export the OpenMP variables in your job submission script.
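
As a rough illustration of the point above, the body of a job submission script might look like the following sketch. The scheduler directives are omitted because they are site-specific, and the executable name (flash4), the launcher (mpirun) and the process count are assumptions rather than values taken from this guide. Some launchers also need an explicit option to forward environment variables to the compute nodes, so check your site documentation.

```shell
#!/bin/bash
# Sketch of the body of a job submission script.
# Scheduler directives (queue, node count, wall time) are omitted on purpose.

# OpenMP settings from section 42.3.1, exported so the FLASH processes inherit them.
export OMP_NUM_THREADS=4
export OMP_STACKSIZE=16M

# Hypothetical launch line: executable name, launcher and process count are assumptions.
FLASH_EXE=./flash4
NUM_MPI_PROCS=2

if [ -x "$FLASH_EXE" ]; then
    mpirun -n "$NUM_MPI_PROCS" "$FLASH_EXE"
else
    # Outside a FLASH object directory, just report what would have been run.
    echo "would run: mpirun -n $NUM_MPI_PROCS $FLASH_EXE (OMP_NUM_THREADS=$OMP_NUM_THREADS, OMP_STACKSIZE=$OMP_STACKSIZE)"
fi
```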


42.3.2 FLASH variables

FLASH provides runtime parameters which can be set to .true. or .false. to turn the parallel regions in different units on or off. They are:

Unit               Block list threading   Within block threading   Ray trace threading
Hydro              threadHydroBlockList   threadHydroWithinBlock   N/A
EOS                N/A                    threadEosWithinBlock     N/A
Multipole          threadMpoleBlockList   threadMpoleWithinBlock   N/A
Energy Deposition  N/A                    N/A                      threadRayTrace

There is no parameter for block list threading of EOS because, quite simply, there is no block list in EOS. In many FLASH simulations the EOS subroutines are only called from the hydrodynamic subroutines which means the EOS will be called in parallel if you thread the Hydro unit block list loop.

In general you do not need to worry about these runtime parameters because their default values are set according to the setup variable passed to the setup script. For example, if you set up a Sedov simulation with block list threading then .true. is the default for threadHydroBlockList and .false. is the default for threadHydroWithinBlock. The runtime control of parallel regions may be useful if we ever need to investigate a possible multithreading bug in a particular unit.
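
To make the debugging use case concrete, the sketch below writes a small parameter fragment that keeps Hydro block list threading on and turns Multipole block list threading off. It assumes an application that includes both units and was set up with block list threading. The file name is purely illustrative; in practice you would place the two parameter lines in your own flash.par.

```shell
# Illustrative file name; in a real run these lines belong in your flash.par.
PARFILE=flash_threading_debug.par

# Write the threading overrides. Lines starting with '#' are comments in a par file.
cat > "$PARFILE" <<'EOF'
# Isolate a suspected threading problem in the Multipole unit
threadHydroBlockList = .true.
threadMpoleBlockList = .false.
EOF

# Show the parameters that were written.
grep '^thread' "$PARFILE"
```

If the problem disappears with one unit's parallel region switched off, that unit is the natural place to start looking.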

42.3.3 FLASH constants

There are FLASH constant runtime parameters which describe the threading strategies included in your FLASH application. The constants are named threadBlockListBuild, threadWithinBlockBuild and threadRayTraceBuild. We query these constants in the unit initialization file (e.g. Hydro_init.F90) to verify that your FLASH application supports the parallel regions requested by the runtime parameters in Section 42.3.2. If the parallel regions are not supported you will get a warning message in your FLASH log file and the parallel regions will be disabled. To use the requested parallel regions in that case, you must re-setup FLASH with the appropriate threading strategy. In general you do not need to worry about the existence of these constants because if you set up the application with, say, block list threading it only makes sense for you to adjust the block list threading runtime parameters in your flash.par.

A nice property of using constant runtime parameters instead of pre-processor defines is that -noclobber rebuilds are extremely fast. This means you can re-setup a threadBlockList application with threadWithinBlock and only rebuild the handful of source files that actually changed. If we used pre-processor defines to identify the thread strategy then we would have to rebuild every single file because any file could make use of the define.

We do, however, take advantage of the forced rebuild from changing, adding or removing a pre-processor define when we switch between a threaded and a non-threaded application. OpenMP directives are conditionally compiled, taking effect only when a compiler option such as -fopenmp (GNU) is added to the compile line. Any source file could contain OpenMP directives, so we must do a complete rebuild to ensure correctness. We force a complete rebuild by defining FLASH_OPENMP in Flash.h for any type of multithreaded build.
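
The re-setup workflow described above might look like the following sketch. The setup variable spellings (threadBlockList=True, threadWithinBlock=True) are assumptions inferred from the strategy names used in this section, and Sedov is used only because it is the example mentioned in 42.3.2. The small run helper is hypothetical: it executes the command when a setup script is present in the current directory and otherwise only prints it.

```shell
# Hypothetical helper: run the command from the top-level FLASH directory,
# or just print it when no setup script is available here.
run() {
    if [ -x ./setup ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Initial setup with block list threading (setup variable spelling is an assumption).
run ./setup Sedov -auto threadBlockList=True

# Switch strategy; -noclobber keeps existing object files so that only the
# source files that actually changed are rebuilt.
run ./setup Sedov -auto -noclobber threadWithinBlock=True
```

Switching between a threaded and a non-threaded build still triggers a full rebuild, for the reason given above.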