[FLASH-USERS] Some problems with FLASH3.1.1 Jeans (was Re: Negative/zero densities ...)

Seyit Hocuk seyit at astro.rug.nl
Wed Apr 29 12:25:27 EDT 2009


Thanks Tomek,

You really have a good understanding of these things. I always worry 
about unknown unknowns, as I like to call them. You have to be very 
careful. As a colleague of mine says, "Even one dot matters a lot". I 
wish I had the time to check everything.

Seyit




Tomasz Plewa wrote:
> Seyit -
>
> The following has potential to mask real problems:
>
> - use of thresholds ("small" parameters)
> - use of hybrid energy formulation (eint_switch or eintSwitch)
> - simulation passing a critical (crash) point upon restart
>
> There are perhaps a few more of this kind. The last case, for example, 
> likely indicates that either the code restart does not really work or 
> else the code memory becomes clogged and results in accessing invalid 
> data (flushed out upon restart). The former is generally harmful, the 
> latter is an implementation problem; both need fixing.
>
> FLASH is currently relatively densely populated with knobs, switches, 
> and special application-related features potentially masking bugs. I 
> expect the number of such extra features to grow with the growing 
> number of applications. This is perhaps not very different from other 
> codes. Every use of such features should be scrutinized, 
> well-justified, and documented. Otherwise it is (very) difficult to 
> trust the results. For example, responsibility for using thresholds 
> could be delegated to the user by setting their default values to 
> zero. Adding a paragraph on the practice of using such extra features 
> as early as the quickstart section would not hurt.
>
> Obviously the code might crash for many different reasons. It is a 
> complex machinery and can easily be misused (e.g. overly relaxed 
> refinement criteria). In a sense, for real applications it does have a 
> steep learning curve.
>
> Tomek
> --
> Seyit Hocuk wrote:
>> Thanks Klaus, for your long and useful explanations.
>>
>> I had two problems that looked similar. One was a crash due to 
>> negative/zero values when using cooling/heating. The other was the 
>> appearance of all these warnings. Jeans refinement was also problematic; 
>> I am not using it anymore, but it is great that you solved it. 
>> And patching with a diff file is a great way to do it; I didn't know that 
>> was possible :), so simple. Anybody else who uses or starts with the 
>> standard Jeans setup should use Klaus' great patch!
>>
>> By the way, I had already figured out the first problem. As you 
>> also mentioned, there were some mix-ups between lb and blocklist(lb), or 
>> blockID and blocklist(blockID), which resulted in trying to cool/heat some 
>> parent/ancestor blocks, whose values can be negative, hence the crash.
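
The lb versus blocklist(lb) mix-up described above can be illustrated with a toy model. This is a minimal Python sketch, not FLASH code: the node types, the block numbering, and the names `nodetype`, `block_list`, `blocks_touched_wrong` and `blocks_touched_right` are all hypothetical and chosen only for illustration. It shows how using the loop counter as a block ID can silently select parent or ancestor blocks.

```python
# Toy model (NOT FLASH code): all names and values here are hypothetical.
LEAF, PARENT, ANCESTOR = 1, 2, 3

# Node type per block ID; index 0 is unused to mimic 1-based block IDs.
nodetype = [None, ANCESTOR, PARENT, LEAF, LEAF, PARENT, LEAF]

# List of leaf block IDs, playing the role of a block list array.
block_list = [b for b in range(1, len(nodetype)) if nodetype[b] == LEAF]


def blocks_touched_wrong(block_list):
    """Bug: the loop counter lb itself is used as the block ID."""
    return [lb for lb in range(1, len(block_list) + 1)]


def blocks_touched_right(block_list):
    """Correct: the block ID is looked up through the list."""
    return [block_list[lb - 1] for lb in range(1, len(block_list) + 1)]


wrong = blocks_touched_wrong(block_list)
right = blocks_touched_right(block_list)

print("wrong IDs:", wrong, [nodetype[b] for b in wrong])
print("right IDs:", right, [nodetype[b] for b in right])
```

In this toy setup the buggy loop visits an ancestor and a parent block, while the corrected loop visits leaf blocks only.
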
>>
>> As for the second problem, the warnings in sanitize: I did the two things 
>> you suggested, but I still get a lot of warnings about values that go below 
>> my small values, though no more negative ones! This could be natural, due to 
>> cooling, shocks, adiabatic cooling, or some other physical effect. Lowering 
>> small doesn't solve this. Whatever that value is, I always have some 
>> cells at the minimum values somehow. This can change suddenly from one 
>> timestep to the next. In my experience it depends strongly on 
>> the switch parameter, i.e., on the way internal energy is 
>> calculated. And I do have quite regular crashes (nonconvergence in 
>> subroutine rieman) that might be related to this. Sometimes one or a few 
>> values just go crazy for no apparent reason, and changing almost anything 
>> just slightly (switch, hybrid_rieman, refmax/min, threshold, etc.) somehow 
>> avoids it after restart.
>>
>> Anyway, I think you managed to suppress the warnings that were coming 
>> from "sanitizing" non-LEAF blocks. However, since I still get a lot of 
>> these annoying warnings, I completely commented out the call statements for 
>> sanitize. I will follow your suggestion on using Convert...ForMesh..., a 
>> long name blah ;).
>>
>> Kind Regards,
>> Seyit
>>
>>
>>
>> Klaus Weide wrote:
>>   
>>> On Tue, 14 Apr 2009, Seyit Hocuk wrote:
>>>
>>>   
>>>     
>>>> Hi Anshu,
>>>>
>>>> Thanks for your reply. Of the solutions you proposed for the warnings (not on
>>>> the crash), I have a few remarks.
>>>>
>>>>     
>>>>       
>>> [Anshu:]
>>>   
>>>     
>>>>> To reduce or suppress the
>>>>> annoying messages, you could do either of two things:
>>>>> 1) Remove or comment out the line '#define DEBUG_CONSCONV'
>>>>>  in gr_sanitizeDataAfterInterp.F90.  You should then see only
>>>>>  messages with one line per block / variable on standard output, no
>>>>>  dumping of the block contents.
>>>>> 2) Change gr_sanitizeDataAfterInterp so it only checks leaf blocks, not
>>>>>  leaf and parent blocks. That is, you might suppressing the checking
>>>>>  or the output if (nodetype(block)==2) .
>>>>>
>>>>>       
>>>>>         
>>>> Doing 1) still gives the warnings, although, as you say, fewer of them.
>>>> With 2) you must mean if "nodetype(block)==1" for the leaf/child blocks.
>>>>     
>>>>       
>>> Seyit,
>>>
>>> That part of Anshu's message should have read:
>>>   
>>>     
>>>>> 2) Change gr_sanitizeDataAfterInterp so it only checks leaf blocks, not
>>>>>  leaf and parent blocks. That is, you might suppress the checking
>>>>>  or the output if (nodetype(block)==2) .
>>>>>       
>>>>>         
>>> "Suppressing checking/output if (nodetype(block)==2)" was intended to mean 
>>> "doing the checking/output only if (nodetype(block)==1)". Sorry if this 
>>> was unclear (I originally wrote that part of the msg.)
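
The leaf-only check discussed above can be sketched with a toy model. This is a minimal Python illustration, not the actual gr_sanitizeDataAfterInterp logic: the function `blocks_below_threshold`, the block dictionary layout, and all values are hypothetical. Only the node type numbering (1 = LEAF, 2 = PARENT, 3 = ANCESTOR) follows the thread.

```python
# Toy model (NOT FLASH code): names, data layout and values are hypothetical.
LEAF, PARENT, ANCESTOR = 1, 2, 3


def blocks_below_threshold(blocks, small_dens, leaf_only=True):
    """Return the IDs of blocks whose minimum density is below small_dens.

    blocks maps block_id -> (nodetype, list of cell densities).
    With leaf_only=True, only nodetype == LEAF blocks are checked.
    """
    flagged = []
    for block_id, (ntype, dens) in sorted(blocks.items()):
        if leaf_only and ntype != LEAF:
            continue  # skip PARENT and ANCESTOR blocks entirely
        if min(dens) < small_dens:
            flagged.append(block_id)
    return flagged


example_blocks = {
    1: (ANCESTOR, [-0.5, 1.0]),   # stale, possibly invalid data
    2: (PARENT, [1e-12, 2.0]),
    3: (LEAF, [1.0, 1.5]),
    4: (LEAF, [1e-12, 0.9]),      # a genuine undershoot on a leaf block
}

print(blocks_below_threshold(example_blocks, 1e-10, leaf_only=True))
print(blocks_below_threshold(example_blocks, 1e-10, leaf_only=False))
```

With the leaf-only filter, a report is produced only for the leaf block that really undershoots; without it, the parent and ancestor blocks are flagged as well.
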
>>>
>>>   
>>>     
>>>> However, that also does not completely remove the issue. The values can still
>>>> fall below the "small" values. I will still stick with the old
>>>> "convertToConsvdForMeshCalls" parameter.
>>>>     
>>>>       
>>> I recommend that you disable convertToConsvdForMeshCalls, enable
>>> convertToConsvdInMeshInterp (as per default), and see my previous
>>> message from earlier today for a way to avoid misleading warnings;
>>> in particular, change gr_hgSolve.F90 as indicated there.
>>>
>>> If, after doing that, you again see warnings about min. dens (or ener, or 
>>> eint) values below their "small" lower thresholds, then this would be
>>> reason for real concern - except PERHAPS if it happens only occasionally
>>> and/or there is only slight undershooting of the thresholds, not negative
>>> values.  You should then try to understand why interpolation during guard
>>> cell filling would generate such values in your simulations, rather than
>>> just trying to find a combination of runtime parameters that avoids the
>>> checks being done.
>>>
>>> Klaus
>>>   
>>>     
>>
>>
>>
>> Klaus Weide wrote:
>>   
>>> Flash Users,
>>>
>>> This message is to explain, and help circumvent, some of the problems 
>>> that came up earlier this month with the Jeans problem in FLASH 3.1.1.
>>> Some of this will probably apply to other problems that were brought 
>>> up in the same context. In particular, private copies of 
>>> Grid_markRefineDerefine should be checked for errors similar to those 
>>> described below (near the end).
>>>
>>> Seyit,
>>>
>>> After applying all three runtime parameter changes, I was indeed also 
>>> able to reproduce the warning messages you had reported getting from 
>>> Jeans.
>>>
>>> As you already seem to have found out yourself, the 
>>> Grid_markRefineDerefine.F90 provided with the Jeans problem is buggy.  
>>> A patch is attached.  The original version resulted (with your changed 
>>> runtime parameters) in a pattern where a region of the domain was left
>>> at a coarser refinement than the remainder, without a physical reason.
>>> This exposed some further problems that become obscured once 
>>> Grid_markRefineDerefine is fixed. Let me therefore first discuss
>>> those problems.
>>>
>>> On Thu, 9 Apr 2009, Seyit Hocuk wrote:
>>>     
>>>> I also set convertToConsvdForMeshCalls to .True. in runtime 
>>>> parameters instead of using convertToConsvdInMeshInterp, which should 
>>>> have worked better because it is supposed to avoid spurious things in 
>>>> paramesh3+. However convertToConsvdInMeshInterp calls at some point 
>>>> gr_sanitizeDataAfterInterp, which creates all the weirdness. It is my 
>>>> humble opinion that something goes wrong with conservation there.
>>>>       
>>> It is not (runtime parameter) convertToConsvdInMeshInterp or 
>>> (subroutine) gr_sanitizeDataAfterInterp that created any problems. 
>>> gr_sanitizeDataAfterInterp (despite its name) only checks and reports 
>>> some problems with data, it does not modify any data; and it happens 
>>> to be called only when convertToConsvdInMeshInterp is TRUE. Now 
>>> gr_sanitizeDataAfterInterp was checking maybe a bit too aggressively. 
>>> That is, as Anshu has already written, it was checking guard cells in 
>>> PARENT blocks for plausible values even though the values in those 
>>> cells cannot impact the propagation of solutions *on LEAF blocks* at 
>>> all.  And, as has just been also pointed out on FLASH-USERS, FLASH (as 
>>> provided) only cares about evolving solutions on leaf blocks.
>>>
>>> At the point where the checking occurs, these guard cells have indeed
>>> just been updated by PARAMESH, by copying values from a neighboring
>>> block; so they should have valid data IF the neighbor did.  But if
>>> this neighbor of the PARENT (nodetype==2) block is an ANCESTOR
>>> (nodetype==3) block, its data may not be valid.  (It can be shown that
>>> in THIS situation at a refinement boundary, the resulting PARENT cell
>>> values do not impact solutions on the child (nodetype==1 aka LEAF)
>>> blocks, even though in other cases PARENT cell values can and do
>>> impact child solutions via interpolation/prolongation.)
>>>
>>> Normally during most simulations, ANCESTOR blocks always contain 
>>> reasonable-looking values for all variables in their interior cells, 
>>> even though FLASH doesn't explicitly evolve ANCESTOR (or PARENT) 
>>> blocks in its solvers.  These values either come from the initial 
>>> values established when Simulation_initBlock was called during 
>>> initialization, or from higher-resolution children via 
>>> LEAF->PARENT->ANCESTOR restriction done before checkpoints or plot 
>>> files are written. They may therefore be out of date with respect to 
>>> the current simulation time, but will not trigger 
>>> gr_sanitizeDataAfterInterp warnings.
>>>
>>> However, this does not apply to the Jeans simulation in question, as
>>> far as density DENS_VAR is concerned, because of a side effect of the
>>> Multigrid solver.  For certain kinds of boundary conditions, this
>>> solver temporarily modifies the density values in blocks by
>>> subtracting an offset, and later undoes this change by adding the same
>>> value.  There is, however, an inconsistency in the sets of blocks to
>>> which these changes are applied: the subtraction is applied to all
>>> blocks, the addition only to leaf blocks.  I would like to emphasize
>>> that this inconsistency causes no differences in results as long as
>>> one only expects meaningful solution data in leaf blocks; and the only 
>>> negative effects are unnecessary WARNINGs from gr_sanitizeDataAfterInterp
>>> as described above.
>>>
>>> The following one-line change will make the addition apply to the same
>>> set of blocks as the subtraction, and will thus avoid those warnings:
>>>
>>> --- source/Grid/GridSolvers/Multigrid/gr_hgSolve.F90    (revision 10484)
>>> +++ source/Grid/GridSolvers/Multigrid/gr_hgSolve.F90    (working copy)
>>> @@ -185,7 +185,7 @@
>>>    ! when using periodic/Neumann boundary conditions).
>>>
>>>    do m = 1, gr_hgMeshRefineMax
>>> -     call gr_hgLevelAddScalar(m, gr_iSource, gr_hgAvgSource, MG_NODES_LEAF_ONLY) !oK
>>> +     call gr_hgLevelAddScalar(m, gr_iSource, gr_hgAvgSource, MG_NODES_ALL_NODES) !oK
>>>    enddo
>>>
>>>    ! Leave boundary zones properly updated.
>>>
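
The effect of the inconsistency described above (offset subtracted on all blocks, added back on leaf blocks only) can be reproduced in a toy model. This is a minimal Python sketch under stated assumptions, not the Multigrid solver: `add_scalar`, `round_trip`, `make_blocks`, and the density values are hypothetical, and the solver step itself is omitted.

```python
# Toy model (NOT FLASH code): names and values are hypothetical.
LEAF, PARENT, ANCESTOR = 1, 2, 3


def make_blocks():
    """Fresh toy blocks: one ANCESTOR, one PARENT, one LEAF."""
    return [
        {"nodetype": ANCESTOR, "dens": [1.0, 2.0]},
        {"nodetype": PARENT, "dens": [1.0, 2.0]},
        {"nodetype": LEAF, "dens": [1.0, 2.0]},
    ]


def add_scalar(blocks, value, leaf_only):
    """Add value to every density cell, optionally on LEAF blocks only."""
    for blk in blocks:
        if leaf_only and blk["nodetype"] != LEAF:
            continue
        blk["dens"] = [d + value for d in blk["dens"]]


def round_trip(blocks, avg, restore_leaf_only):
    """Subtract avg on all blocks, then add it back on the chosen set."""
    add_scalar(blocks, -avg, leaf_only=False)
    # ... the solver would run here (omitted in this sketch) ...
    add_scalar(blocks, +avg, leaf_only=restore_leaf_only)
    return blocks


inconsistent = round_trip(make_blocks(), 1.5, restore_leaf_only=True)
consistent = round_trip(make_blocks(), 1.5, restore_leaf_only=False)

for label, blocks in (("leaf-only restore", inconsistent),
                      ("all-nodes restore", consistent)):
    print(label, [min(b["dens"]) for b in blocks])
```

In this sketch the leaf block is restored in both cases, while the parent and ancestor blocks keep the shifted (here negative) densities when the restore is applied to leaf blocks only. That is the kind of stale value a plausibility check would then report.
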
>>>
>>> (Ultimately gr_sanitizeDataAfterInterp checking should be changed so
>>> that it only checks blocks (and cells) where valid data can always be
>>> expected.)
>>>
>>>  - * - * -
>>>
>>> Finally, attached is a patch that fixes a few things specific to the
>>> Jeans simulation:
>>>
>>> -  Initialize internal energy properly in Simulation_initBlock.
>>> -  Fix inconsistent use of lb vs gr_blkList(lb), a common mistake.
>>> -  Change LEAF -> ACTIVE_BLKS; the logic involving delta_max_par
>>>    requires it.
>>> -  Make sure delta_max_par, delta_max array elements are initialized.
>>>
>>> Apply in the top directory (one level above source) with
>>>
>>>   patch -p0 < forSeyit.diff
>>>
>>>
>>>
>>> Klaus
>>>     
>>
>>   



