[FLASH-USERS] Some problems with FLASH3.1.1 Jeans (was Re: Negative/zero densities ...)
Seyit Hocuk
seyit at astro.rug.nl
Wed Apr 29 12:25:27 EDT 2009
Thanks Tomek,
You really have a good understanding of these things. I always worry
about unknown unknowns, as I like to call them. You have to be very
careful. As a colleague of mine says, "Even one dot matters a lot". I
wish I had the time to check everything.
Seyit
Tomasz Plewa wrote:
> Seyit -
>
> The following has potential to mask real problems:
>
> - use of thresholds ("small" parameters)
> - use of hybrid energy formulation (eint_switch or eintSwitch)
> - simulation passing a critical (crash) point upon restart
>
> There are perhaps a few more of this kind. The last case, for example,
> likely indicates that either the code restart does not really work or
> else the code memory becomes clogged and results in accessing invalid
> data (flushed out upon restart). The former is generally harmful, the
> latter is an implementation problem; both need fixing.
>
> FLASH is currently relatively densely populated with knobs, switches,
> and special application-related features potentially masking bugs. I
> expect the number of such extra features to grow with the growing
> number of applications. This is perhaps not very different from other
> codes. Every use of such features should be scrutinized,
> well-justified, and documented. Otherwise it is (very) difficult to
> trust the results. For example, responsibility for using thresholds
> could be delegated to the user by setting their default values to
> zero. Adding a paragraph that discusses the practice of using any extra
> features, already in the quickstart section, would not hurt.
>
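One way to see how a "small" threshold can hide a problem is a toy floor routine, sketched here in Python rather than FLASH's Fortran. Everything in it (the `floor_with_report` name, the sample densities, the value of `small_dens`) is an assumption made up for illustration and is not FLASH code. After flooring, the data look healthy; only the count of changed cells reveals that something unphysical was there.

```python
# Toy model (not FLASH code): a floor on density silently replaces
# unphysical values, so the floored data alone no longer show the problem.

def floor_with_report(values, small):
    """Clamp values to at least `small`; also return how many cells changed."""
    floored = [max(v, small) for v in values]
    n_changed = sum(1 for v in values if v < small)
    return floored, n_changed

dens = [1.0, -0.2, 3.0e-12, 0.7]   # made-up cell densities, one is negative
small_dens = 1.0e-10               # hypothetical threshold value

floored, n_changed = floor_with_report(dens, small_dens)

# The floored data pass a positivity check even though a cell was negative.
print(min(floored) >= small_dens, n_changed)
```

Keeping and logging a count like `n_changed` is one possible way to make the use of such a threshold visible instead of silent.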
> Obviously the code might crash for many different reasons. It is a
> complex piece of machinery and can easily be misused (e.g. overly relaxed
> refinement criteria). In a sense, for real applications it does have a
> steep learning curve.
>
> Tomek
> --
> Seyit Hocuk wrote:
>> Thanks Klaus, for your long and useful explanations.
>>
>> I had two problems that looked similar. One was crashing due to
>> negative/zero values when using cooling/heating. The other was the
>> appearance of all these warnings. Jeans refinement was also problematic;
>> I am not using it anymore, but it is great that you solved it.
>> And patching with a diff file is a great way to do it, I didn't know that
>> was possible :), so simple. Anybody else who uses or starts with the
>> standard Jeans setup should use Klaus' great patch!
>>
>> By the way, I had already figured out the first problem. As you
>> also mentioned, there were some mistakes with lb and blocklist(lb) or
>> blockID and blocklist(blockID), which resulted in trying to cool/heat some
>> parent/ancestor blocks, whose values can be negative, hence the crash.
>>
>> As for the second problem, the warnings in sanitize: I did the two things
>> you suggested, but I still get a lot of warnings about values that go below
>> my small values, though no more negative ones! This could be natural, due to
>> cooling/shocks/adiabatic cooling or some other physical effect. Lowering
>> small doesn't solve this. Whatever that value is, I always have some
>> cells at the minimum values somehow. This can change suddenly from one
>> timestep to the next. In my experience it is very dependent on
>> the switch parameter, i.e., on the way internal energy is
>> calculated. And I do have quite regular crashes (non-convergence in
>> subroutine rieman) that might be related to this. Sometimes one or a few
>> values just go crazy without any apparent reason, and changing almost
>> anything just slightly (switch, hybrid_rieman, refmax/min, threshold, etc.)
>> somehow avoids it after restart.
>>
>> Anyway, I think you managed to suppress the warnings that were coming
>> from "sanitizing" non-LEAF blocks. However, since I still get a lot of
>> these annoying warnings, I have completely commented out the call
>> statements for sanitize. I will follow your suggestion on using
>> Convert...ForMesh..., a long name blah ;).
>>
>> Kind Regards,
>> Seyit
>>
>>
>>
>> Klaus Weide wrote:
>>
>>> On Tue, 14 Apr 2009, Seyit Hocuk wrote:
>>>
>>>
>>>
>>>> Hi Anshu,
>>>>
>>>> Thanks for your reply. Of the solutions you proposed for the warnings (not on
>>>> the crash), I have a few remarks.
>>>>
>>>>
>>>>
>>> [Anshu:]
>>>
>>>
>>>>> To reduce or suppress the
>>>>> annoying messages, you could do either of two things:
>>>>> 1) Remove or comment out the line '#define DEBUG_CONSCONV'
>>>>> in gr_sanitizeDataAfterInterp.F90. You should then see only
>>>>> messages with one line per block / variable on standard output, no
>>>>> dumping of the blocks contents.
>>>>> 2) Change gr_sanitizeDataAfterInterp so it only checks leaf blocks, not
>>>>> leaf and parent blocks. That is, you might suppressing the checking
>>>>> or the output if (nodetype(block)==2) .
>>>>>
>>>>>
>>>>>
>>>> Doing 1) still gives the warnings, although, as you say, fewer of them.
>>>> With 2) you must mean if "nodetype(block)==1" for the leaf/child blocks.
>>>>
>>>>
>>> Seyit,
>>>
>>> That part of Anshu's message should have read:
>>>
>>>
>>>>> 2) Change gr_sanitizeDataAfterInterp so it only checks leaf blocks, not
>>>>> leaf and parent blocks. That is, you might suppress the checking
>>>>> or the output if (nodetype(block)==2) .
>>>>>
>>>>>
>>> "Suppressing checking/output if (nodetype(block)==2)" was intended to mean
>>> "doing the checking/output only if (nodetype(block)==1)". Sorry if this
>>> was unclear (I originally wrote that part of the msg.)
>>>
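A minimal sketch of the leaf-only check discussed above, written in Python rather than FLASH's Fortran. The block dictionaries, the `check_blocks` name and the sample values are made up for illustration; only the nodetype numbering (1 = LEAF, 2 = PARENT) is taken from this thread. The sketch only reports and never modifies the data.

```python
# Toy model (not FLASH code): report blocks whose minimum density falls
# below a threshold, optionally restricting the check to leaf blocks.
LEAF, PARENT = 1, 2  # nodetype numbering as quoted in this thread

def check_blocks(blocks, small_dens, leaf_only=True):
    """Return (block index, minimum density) for each offending block."""
    warnings = []
    for idx, blk in enumerate(blocks):
        if leaf_only and blk["nodetype"] != LEAF:
            continue  # skip PARENT (and any other non-leaf) blocks
        lowest = min(blk["dens"])
        if lowest < small_dens:
            warnings.append((idx, lowest))
    return warnings

blocks = [
    {"nodetype": PARENT, "dens": [1.0, -0.3]},   # stale data in a parent
    {"nodetype": LEAF, "dens": [1.0, 2.0]},
    {"nodetype": LEAF, "dens": [5.0e-11, 1.0]},  # genuine undershoot on a leaf
]

leaf_warnings = check_blocks(blocks, 1.0e-10)
all_warnings = check_blocks(blocks, 1.0e-10, leaf_only=False)
print(leaf_warnings)
print(all_warnings)
```

With `leaf_only=True` the parent block's negative value is no longer reported, while the undershoot on the leaf block still is.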
>>>
>>>
>>>> However, that also does not completely remove the issue. The values can still
>>>> fall below the "small" values. I will still stick with the old
>>>> "convertToConsvdForMeshCalls" parameter.
>>>>
>>>>
>>> I recommend that you disable convertToConsvdForMeshCalls, enable
>>> convertToConsvdInMeshInterp (as per default), and see my previous
>>> message from earlier today for a way to avoid misleading warnings;
>>> in particular, change gr_hgSolve.F90 as indicated there.
>>>
>>> If, after doing that, you again see warnings about min. dens (or ener, or
>>> eint) values below their "small" lower thresholds, then this would be
>>> reason for real concern - except PERHAPS if it happens only occasionally
>>> and/or there is only slight undershooting of the thresholds, not negative
>>> values. You should then try to understand why interpolation during guard
>>> cell filling would generate such values in your simulations, rather than
>>> just trying to find a combination of runtime parameters that avoids the
>>> checks being done.
>>>
>>> Klaus
>>>
>>>
>>
>>
>>
>> Klaus Weide wrote:
>>
>>> Flash Users,
>>>
>>> This message is to explain, and help circumvent, some of the problems
>>> that came up earlier this month with the Jeans problem in FLASH 3.1.1.
>>> Some of this will probably apply to other problems that were brought
>>> up in the same context. In particular, private copies of
>>> Grid_markRefineDerefine should be checked for errors similar to those
>>> described below (near the end).
>>>
>>> Seyit,
>>>
>>> After applying all three runtime parameter changes, I was indeed also
>>> able to reproduce the warning messages you had reported getting from
>>> Jeans.
>>>
>>> As you already seem to have found out yourself, the
>>> Grid_markRefineDerefine.F90 provided with the Jeans problem is buggy.
>>> A patch is attached. The original version resulted (with your changed
>>> runtime parameters) in a pattern where a region of the domain was left
>>> at a coarser refinement than the remainder, without a physical reason.
>>> This showed up some further problems that become obscured after
>>> Grid_markRefineDerefine is fixed. Let me therefore first discuss
>>> those problems.
>>>
>>> On Thu, 9 Apr 2009, Seyit Hocuk wrote:
>>>
>>>> I also set convertToConsvdForMeshCalls to .True. in runtime
>>>> parameters instead of using convertToConsvdInMeshInterp, which should
>>>> have worked better because it is supposed to avoid spurious things in
>>>> paramesh3+. However convertToConsvdInMeshInterp calls at some point
>>>> gr_sanitizeDataAfterInterp, which creates all the weirdness. It is my
>>>> humble opinion that something goes wrong with conservation there.
>>>>
>>> It is not (runtime parameter) convertToConsvdInMeshInterp or
>>> (subroutine) gr_sanitizeDataAfterInterp that created any problems.
>>> gr_sanitizeDataAfterInterp (despite its name) only checks and reports
>>> some problems with data, it does not modify any data; and it happens
>>> to be called only when convertToConsvdInMeshInterp is TRUE. Now
>>> gr_sanitizeDataAfterInterp was checking maybe a bit too aggressively.
>>> That is, as Anshu has already written, it was checking guard cells in
>>> PARENT blocks for plausible values even though the values in those
>>> cells cannot impact the propagation of solutions *on LEAF blocks* at
>>> all. And, as has just been also pointed out on FLASH-USERS, FLASH (as
>>> provided) only cares about evolving solutions on leaf blocks.
>>>
>>> At the point where the checking occurs, these guard cells have indeed
>>> just been updated by PARAMESH, by copying values from a neighboring
>>> block; so they should have valid data IF the neighbor did. But if
>>> this neighbor of the PARENT (nodetype==2) block is an ANCESTOR
>>> (nodetype==3) block, its data may not be valid. (It can be shown that
>>> in THIS situation at a refinement boundary, the resulting PARENT cell
>>> values do not impact solutions on the child (nodetype==1 aka LEAF)
>>> blocks, even though in other cases PARENT cell values can and do
>>> impact child solutions via interpolation/prolongation.)
>>>
>>> Normally during most simulations, ANCESTOR blocks always contain
>>> reasonable-looking values for all variables in their interior cells,
>>> even though FLASH doesn't explicitly evolve ANCESTOR (or PARENT)
>>> blocks in its solvers. These values either come from the initial
>>> values established when Simulation_initBlock was called during
>>> initialization, or from higher-resolution children via
>>> LEAF->PARENT->ANCESTOR restriction done before checkpoints or plot
>>> files are written. They may therefore be out of date with respect to
>>> the current simulation time, but will not trigger
>>> gr_sanitizeDataAfterInterp warnings.
>>>
>>> However, this does not apply to the Jeans simulation in question, as
>>> far as density DENS_VAR is concerned, because of a side effect of the
>>> Multigrid solver. For certain kinds of boundary conditions, this
>>> solver temporarily modifies the density values in blocks by
>>> subtracting an offset, and later undoes this change by adding the same
>>> value. There is, however, an inconsistency in the sets of blocks to
>>> which these changes are applied: the subtraction is applied to all
>>> blocks, the addition only to leaf blocks. I would like to emphasize
>>> that this inconsistency causes no differences in results as long as
>>> one only expects meaningful solution data in leaf blocks; and the only
>>> negative effects are unnecessary WARNINGs from gr_sanitizeDataAfterInterp
>>> as described above.
>>>
>>> The following one-line change will make the addition apply to the same
>>> set of blocks as the subtraction, and will thus avoid those warnings:
>>>
>>> --- source/Grid/GridSolvers/Multigrid/gr_hgSolve.F90 (revision 10484)
>>> +++ source/Grid/GridSolvers/Multigrid/gr_hgSolve.F90 (working copy)
>>> @@ -185,7 +185,7 @@
>>> ! when using periodic/Neumann boundary conditions).
>>>
>>> do m = 1, gr_hgMeshRefineMax
>>> - call gr_hgLevelAddScalar(m, gr_iSource, gr_hgAvgSource, MG_NODES_LEAF_ONLY) !oK
>>> + call gr_hgLevelAddScalar(m, gr_iSource, gr_hgAvgSource, MG_NODES_ALL_NODES) !oK
>>> enddo
>>>
>>> ! Leave boundary zones properly updated.
>>>
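The effect of the inconsistency described above, and of restoring the offset on the same set of blocks, can be sketched with a toy model in Python. This is not FLASH code: the block dictionaries, the `add_scalar` helper and the offset value are assumptions made up for illustration; only the nodetype convention (1 = LEAF, 2 = PARENT, 3 = ANCESTOR) is taken from this thread.

```python
# Toy model (not FLASH code): subtract an offset on all blocks, then add it
# back either on leaf blocks only or on all blocks, and compare the results.
LEAF, PARENT, ANCESTOR = 1, 2, 3  # nodetype convention quoted in this thread

def add_scalar(blocks, value, leaf_only):
    """Hypothetical helper: add `value` to the density of selected blocks."""
    for blk in blocks:
        if leaf_only and blk["nodetype"] != LEAF:
            continue
        blk["dens"] = [d + value for d in blk["dens"]]

def make_blocks():
    return [
        {"nodetype": ANCESTOR, "dens": [1.0, 1.0]},
        {"nodetype": PARENT, "dens": [1.0, 2.0]},
        {"nodetype": LEAF, "dens": [0.5, 3.0]},
    ]

def nodetypes_with_negative_dens(blocks):
    return [b["nodetype"] for b in blocks if min(b["dens"]) < 0.0]

avg_source = 1.5  # made-up offset standing in for the average source

# Inconsistent variant: subtract everywhere, restore on leaves only.
inconsistent = make_blocks()
add_scalar(inconsistent, -avg_source, leaf_only=False)
add_scalar(inconsistent, +avg_source, leaf_only=True)

# Consistent variant: restore on the same set of blocks as the subtraction.
consistent = make_blocks()
add_scalar(consistent, -avg_source, leaf_only=False)
add_scalar(consistent, +avg_source, leaf_only=False)

print(nodetypes_with_negative_dens(inconsistent))
print(nodetypes_with_negative_dens(consistent))
```

In the inconsistent variant the leaf data are intact, but the ancestor and parent blocks keep shifted (here negative) densities, which is the kind of value a checker that also looks at non-leaf blocks would flag.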
>>>
>>> (Ultimately gr_sanitizeDataAfterInterp checking should be changed so that
>>> it only checks blocks (and cells) where valid data can always be expected.)
>>>
>>> - * - * -
>>>
>>> Finally, attached is a patch that fixes a few things specific to the
>>> Jeans simulation:
>>>
>>> - Initialize internal energy properly in Simulation_initBlock.
>>> - Fix inconsistent use of lb vs gr_blkList(lb), a common mistake.
>>> - Change LEAF -> ACTIVE_BLKS; the logic involving delta_max_par
>>> requires it.
>>> - Make sure delta_max_par, delta_max array elements are initialized.
>>>
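The lb vs gr_blkList(lb) mix-up in the list above is easy to reproduce in a toy example. The sketch below is Python, not FLASH code, and all names and values in it (`block_data`, `blk_list`, the block IDs) are made up for illustration. It mimics a Fortran-style 1-based loop counter being used directly as a block ID instead of as an index into the list of block IDs.

```python
# Toy illustration (not FLASH code): a loop counter is not a block ID.
block_data = {1: "ancestor", 2: "parent", 5: "leaf A", 7: "leaf B"}  # by block ID
blk_list = [5, 7]  # IDs of the blocks that should be processed

n_blocks = len(blk_list)

# Buggy: the 1-based loop counter lb is used as if it were the block ID.
buggy = [block_data[lb] for lb in range(1, n_blocks + 1)]

# Fixed: the block ID is looked up through the list (lb - 1 because Python
# lists are 0-based, while the counter mimics Fortran's 1-based loop).
fixed = [block_data[blk_list[lb - 1]] for lb in range(1, n_blocks + 1)]

print(buggy)
print(fixed)
```

The buggy loop touches the ancestor and parent entries instead of the two leaves, which matches the symptom reported earlier in this thread of cooling/heating being applied to parent/ancestor blocks.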
>>> Apply in the top directory (one level above source) with
>>>
>>> patch -p0 < forSeyit.diff
>>>
>>>
>>>
>>> Klaus
>>>
>>
>>