[FLASH-USERS] Improving the load balancing in FLASH

Anshu Dubey dubey at flash.uchicago.edu
Tue Sep 8 09:56:45 EDT 2009


I suppose if you have an extremely unbalanced oct-tree that could work out.

But I'd have to work through an example to be fully convinced of it. I
have a feeling we will find that the factor-of-two constraint will
limit much of the gain to be had by subcycling.
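
As a rough sketch of such an example, the snippet below counts block
updates needed to advance the whole mesh by one coarsest-level timestep,
with and without subcycling. It rests on assumptions that are not from
FLASH or PARAMESH: the timestep halves with each refinement level, every
block update costs the same, only leaf blocks do work, and the overheads
of keeping levels in sync are ignored. The function name and the block
counts are hypothetical.

```python
def subcycling_speedup(leaf_blocks_per_level):
    """Ratio of block updates with one global timestep to updates with subcycling.

    leaf_blocks_per_level[l] is the number of leaf blocks on level l
    (0 = coarsest). Assumptions: the timestep halves with each level and
    every block update costs the same, whatever its level.
    """
    n_levels = len(leaf_blocks_per_level)
    # Global timestep: every block takes as many steps as the finest level needs.
    finest_substeps = 2 ** (n_levels - 1)
    global_work = sum(leaf_blocks_per_level) * finest_substeps
    # Subcycling: a block on level l takes only 2**l steps per coarse step.
    subcycled_work = sum(n * 2 ** level
                         for level, n in enumerate(leaf_blocks_per_level))
    return global_work / subcycled_work


# Hypothetical 3-d oct-trees with four levels (each refined block has 8 children).
# Deep, narrow refinement: one block refined per level.
narrow = [7, 7, 7, 8]
# Broad refinement: seven of every eight blocks refined on each level.
broad = [1, 7, 49, 2744]

print("narrow:", subcycling_speedup(narrow))
print("broad: ", subcycling_speedup(broad))
```

Under these assumptions the narrow tree gains a factor of roughly 2.05
(232 updates versus 113), while the broadly refined tree gains only about
1 percent (22408 versus 22163), because the finest level holds nearly all
of the blocks.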

Anshu

On Tue, Sep 8, 2009 at 8:43 AM, Mike Zingale <mzingale at scotty.ess.sunysb.edu> wrote:
> Brian is right -- the situation is entirely problem dependent.  For cases
> where you are locally refining only a small region of the domain, subcycling
> in time can give you a big boost.  Jonathan and I considered cases that were
> typical of the applications that the FLASH Center was working on at the time
> -- large regions were typically refined.
>
> Mike
>
>
> On Tue, 8 Sep 2009, Brian O'Shea wrote:
>
>> In all fairness, the statement that Anshu made depends a lot on the type
>> of problem you are using.  For example, the Enzo AMR code is mainly used for
>> astrophysical applications that are gravity-dominated, so most of the action
>> takes place in a very small fraction of the simulation volume, and can use
>> many levels of refinement.  The only way that this sort of problem is
>> tractable is by using adaptive time-stepping.  One of the main complications
>> of such an adaptive time-stepping scheme is keeping all of the mesh levels
>> synched up.  It also makes load-balancing a significant challenge - one is
>> forced to distribute grids on a level-by-level basis, and if only a small
>> fraction of the volume is refined (as in our cosmology simulations...) you
>> destroy grid locality, which increases communication and decreases
>> scalability.
>>
>> Brian
>>
>>> This was considered several years ago. Then Mike Zingale and Jonathan
>>> Dursi proved that the computational savings from coarse blocks having
>>> a coarse time step are too insignificant to be worth pursuing. This is
>>> because the computational time in the finer blocks completely
>>> dominates that in coarser blocks for the same area of computational
>>> domain. I believe that study appeared in some proceedings; if you are
>>> interested, I can dig up the reference.
>>>
>>> Anshu
>>>
>>> On Tue, Sep 8, 2009 at 4:17 AM, Ross Parkin<phy1erp at leeds.ac.uk> wrote:
>>>>
>>>> Hello,
>>>>
>>>> I've been thinking about the load balancing method used in FLASH. The
>>>> timestep used by the hydro units is the same on all refinement levels,
>>>> so blocks are essentially distributed by PARAMESH so that all processors
>>>> have roughly the same number of blocks. Has anyone tried modifying FLASH
>>>> so that each refinement level has its own timestep? To do this properly
>>>> you would need to modify how PARAMESH distributes blocks amongst
>>>> processors, taking account of the relative work done by each block on
>>>> each refinement level, i.e. a block on a finer refinement level may need
>>>> to do twice as many hydro loops before its simulation time equals that
>>>> of a coarser block.
>>>>
>>>> Any ideas guys?
>>>>
>>>> Cheers,
>>>>
>>>> Ross Parkin
>>>>
>>
>
>
> -----------------------------------------------------------------------------
> Michael Zingale (mzingale at mail.astro.sunysb.edu)
> Assistant Professor
>
> Dept. of Physics and Astronomy     office: ESS 440
> Stony Brook University             phone:  631-632-8225
> Stony Brook, NY 11794-3800         web: http://www.astro.sunysb.edu/mzingale
> -----------------------------------------------------------------------------
>
>


