[FLASH-USERS] Load Balancing with the Burn Unit
Sean Couch
smc at flash.uchicago.edu
Fri Apr 25 15:57:16 EDT 2014
Hi James,
I’ve done this before with what I think was good success. It’s a pretty straightforward, if hack-tastic, modification. I modified bn_burner.F90 so that the variable xoktot became a counter of the maximum number of burn sub-steps taken by any zone in a given block. I use this as an estimate of the “work” required to advance the block, and this is what goes into the Morton curve weighting. Around line 198 of bn_burner.F90 I had:
Line 198 (original) / Line 198 (modified):
      end if
    - xoktot = xoktot + real(nok)
    + !!xoktot = xoktot + real(nok)
    + !! Custom usage by SMC
    + xoktot = max(xoktot, real(nok))
      xbadtot = xbadtot + real(nbad)
I then changed Burn.F90 to simply reset this counter:
Line 54 (original) / Line 54 (modified):
      use Burn_data, ONLY: bn_nuclearTempMin, bn_nuclearTempMax, bn_nuclearDensMin, &
      & bn_nuclearDensMax, bn_nuclearNI56Max, bn_useShockBurn, &
    - & bn_smallx, bn_useBurn, bn_meshMe
    + & bn_smallx, bn_useBurn, bn_meshMe, xoktot
      use bn_interface, ONLY : bn_mapNetworkToSpecies, bn_burner
Line 114 (original) / Line 114 (modified):
      ! start the timer ticking
      call Timers_start("burn")
    + ! Restart counter
    + xoktot = 0.0
      ! make sure that guardcells are up to date
      if (.NOT. bn_useShockBurn) then
      call Grid_fillGuardCells(CENTER, ALLDIR)
      endif
Now in Burn_computeDt.F90 I accessed some private Grid data (shame on me, but hey, I said it was hack-tastic):
Line 79 (original) / Line 79 (modified):
      solnData, &
      dt_burn, dt_minloc)
    - use Burn_data, ONLY: bn_enucDtFactor, bn_useBurn, bn_meshMe
    + use Burn_data, ONLY: bn_enucDtFactor, bn_useBurn, bn_meshMe, xoktot
      use Driver_interface, ONLY : Driver_abortFlash
    + use tree, ONLY : bflags
      implicit none
      #include "constants.h"
Line 130 (original) / Line 131 (modified):
      ! the inverse of what we want, and then only (un)invert that inverse
      ! if it is a reasonable number.
      energyRatioInv = abs(solnData(ENUC_VAR,i,j,k)) / eint_zone
    + !if (energyRatioInv > 1.0e-1) bflags(1,blockid) = 4.0
    + bflags(1,blockid) = xoktot
    + #ifdef DTBN_VAR
    + solnData(DTBN_VAR,i,j,k) = xoktot
    + #endif
      if (energyRatioInv > dt_tempInv) then
      dt_tempInv = energyRatioInv
      dt_temp = 1.0 / energyRatioInv
You can ignore DTBN_VAR; that was just a variable I used for diagnostics. The key is the bflags array, which comes from the PARAMESH tree data module. PARAMESH does nothing else with it, so it’s a handy place to stash the per-block work estimate.
Then, finally, in source/Grid/GridMain/paramesh/paramesh4/Paramesh4dev/PM4_package/mpi_source/mpi_amr_refine_derefine.F90 I added the following:
Line 309 (original) / Line 309 (modified):
      work_block(:) = 0.
      Do i = 1,lnblocks
      if (nodetype(i).eq.1) then
    - work_block(i) = 2. !<<< USER EDIT
    + ! work_block(i) = 2. !<<< USER EDIT
    + work_block(i) = max(2.,float(bflags(1,i))) !<<< by SMC
      #ifdef FLASH_DEBUG_AMR
      lnblocks_leaf = lnblocks_leaf + 1
      #endif
This sets the Morton curve weighting parameter for the block to the maximum number of burn sub-steps that were required for advancement, with a floor of 2 (the usual leaf-block weight). In my limited experimentation, this nicely straightened out the Morton curve in regions of rapid burning: there were fewer ‘burning’ blocks per MPI rank, and the efficiency of the simulations I was running increased quite a bit. Caveat emptor: this all could use some tweaking for your particular application and YMMV.
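To illustrate why the per-block weight matters, here is a minimal standalone sketch of splitting blocks, already ordered along the Morton curve, across ranks by cumulative work. It is not PARAMESH's actual routine: the program name, the toy weights, and the midpoint cut rule are all assumptions made for illustration.

```fortran
program weighted_partition_sketch
  ! Hypothetical standalone illustration; not PARAMESH code.
  implicit none
  integer, parameter :: nblocks = 8, nprocs = 4
  real    :: work(nblocks), ideal, running
  integer :: owner(nblocks), i, rank

  ! Toy weights (assumed): ordinary leaf blocks cost 2,
  ! two "burning" blocks cost 10.
  work = (/ 2., 2., 10., 10., 2., 2., 2., 2. /)

  ideal   = sum(work) / real(nprocs)   ! ideal work per rank
  running = 0.
  do i = 1, nblocks
     ! Assign the block to the rank whose slice of the cumulative-work
     ! axis contains the midpoint of this block's work interval.
     rank     = int((running + 0.5*work(i)) / ideal)
     owner(i) = min(rank, nprocs - 1)
     running  = running + work(i)
  end do

  do i = 1, nblocks
     print '(a,i2,a,f5.1,a,i2)', 'block ', i, '  work ', work(i), '  rank ', owner(i)
  end do
end program weighted_partition_sketch
```

With these toy weights the two expensive blocks land on different ranks, whereas with a uniform weight of 2 they would share one rank. That is the effect the bflags-based weighting is meant to produce.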
Best regards,
Sean
--------------------------------------------------------
Sean M. Couch
Hubble Fellow
Flash Center for Computational Science
Department of Astronomy & Astrophysics
The University of Chicago
5747 S Ellis Ave, Jo 315
Chicago, IL 60637
(773) 702-3899
www.flash.uchicago.edu/~smc
On Apr 22, 2014, at 3:00 PM, James Guillochon <jguillochon at cfa.harvard.edu> wrote:
> Hi all, I'm running a simulation using one of the burning networks, which unfortunately is leading to a runtime efficiency of 35% as only ~10% of blocks are above the burning network thresholds, but those blocks take several times longer than non-burning regions.
>
> I noticed that the only weighting done currently is to give leaf blocks a factor of "2", and everything else "1". I think it would be relatively easy to count cells that have burned in a block, and then add an additional work factor to account for this overhead (say number of cells burned times a constant, with a block in which all cells are burning being a factor of 5-10 times more expensive). Ideally I'd want FLASH's efficiency to be 80%+ no matter what the Burn unit is doing.
>
> My question is in implementation: Would it make sense to add to the "work_block" (which is in the "tree" module) scaling factor directly in the Burn unit? Or is this the wrong place in the code to make this change?
>
> Thanks!
> - James
>
> --
> James Guillochon
> Einstein Fellow at the Harvard-Smithsonian CfA
> jguillochon at cfa.harvard.edu
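
For comparison, the weighting proposed in the quoted message (work proportional to the number of burning cells, with a fully burning block costing 5-10 times more) could be sketched as below. This is only an illustration: the program, the function name block_work, the 512-cell block size, and the factor of 5 are assumptions, not FLASH code.

```fortran
program burn_weight_sketch
  ! Hypothetical illustration of the proposed weighting; not FLASH code.
  implicit none
  integer, parameter :: ncells = 512            ! assumed 8x8x8 cells per block
  real, parameter    :: base_work = 2.0         ! existing leaf-block weight
  real, parameter    :: full_burn_factor = 5.0  ! assumed cost ratio when all cells burn
  integer :: nburn

  do nburn = 0, ncells, 128
     print '(a,i4,a,f6.2)', 'burning cells ', nburn, '  work ', block_work(nburn)
  end do

contains

  ! Linear interpolation between base_work (no cells burning)
  ! and base_work*full_burn_factor (all cells burning).
  real function block_work(nburning)
    integer, intent(in) :: nburning
    block_work = base_work * (1.0 + (full_burn_factor - 1.0) * real(nburning) / real(ncells))
  end function block_work

end program burn_weight_sketch
```

The sub-step counter described above weights a block by its most expensive zone instead, which tracks the actual cost of the burner without having to guess a per-cell constant.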