[FLASH-BUGS] Species Flux Handling with USM MHD in FLASH 4.3
Jason Galyardt
jason.galyardt at gmail.com
Fri Dec 18 20:27:37 CST 2015
Klaus,
A partial answer is inline below. I'll have to get back to you on
experiments A) & B), once I have a chance to make the changes and run a
test. I ran Experiment C) (using PPM instead of MH reconstruction) earlier
today, having the same idea; it results in the same failure mode at the
same place.
Regards,
Jason
On Fri, Dec 18, 2015 at 5:38 PM, Klaus Weide <klaus at flash.uchicago.edu>
wrote:
> On Fri, 18 Dec 2015, Jason Galyardt wrote:
> Jason,
>
> I don't have enough information to reproduce this problem, and don't have
> access to the same compiler. Perhaps you can help narrow down the cause.
>
> Are you using OpenMP threading in your code?
>
No, I'm not using OpenMP.
> Is the compiler using some sort of automatic threading for arrays
> statements?
>
Not to my knowledge, I used the debug Makefile options.
> Could you try to lower the compiler optimization options and see whether
> the same problem still occurs?
>
The problem I reported occurs both in optimized and unoptomized versions;
the backtrace below was produced with the unoptomized binary. The only
difference in the two is that the backtrace associated with the optimized
binary refers to the next line in the same file (line 367).
>
> The line number points to the line
>
> Sr = Sc+0.5*delbarSp
>
> (do you have the same line at no. 66?)
>
Yes, this is what I have.
>
> which is an array assignment, with the arrays involved declared as
> follows:
>
> real,intent(OUT),dimension(HY_NSPEC), optional :: Sr
> real, dimension(HY_NSPEC), target :: Sc
> real, dimension(HY_NSPEC) :: delbarSp
>
Yes, I've got these too.
>
> The most obvious proximate cause of a SIGSEGV error here would be if
> the optional dummy argument Sr is not actually present in the call to
> hy_uhd_DataReconstructNormalDir_MH on line 535 of
> hy_uhd_dataReconstOneStep.F90, or if the actual argument is somehow
> invalid (under the hood, pointing to invalid memory).
>
Right; that was my thought as well, but as I said, I couldn't find an
obvious invalid memory access 'manually'. It might also be helpful to run
the process in a debugger; I'll see what I can do in that regard.
> Supposedly that will never be the case, but it is possible that the logic
> that should prevent it is faulty.
>
> It would be helpful if you could try both of the following experiments.
>
> A. In hy_uhd_unsplit.F90 -- make sure you edit the version of this
> file that actually gets used for your setup! -- find the lines
>
> #if (NSPECIES+NMASS_SCALARS) > 0
> if (hy_fullSpecMsFluxHandling .AND. hy_fullRiemannStateArrays &
> .AND. NFACE_VARS > 0 .AND. NDIM > 1) then
> ! for hy_spcR:
> call hy_memAllocScratch(CENTER,&
> 1,&
> hy_numXN,&
> 2,0,0, &
> blockList(1:blockCount), &
> highSize=NDIM)
> ! for hy_spcL:
> call hy_memAllocScratch(SCRATCH_CTR,&
> 1,&
> hy_numXN,&
> 2,0,0, &
> blockList(1:blockCount), &
> highSize=NDIM)
> end if
> #endif
>
> Change the two lines
> 2,0,0, &
> into
> 3,0,0, &
> or
> 4,0,0, &
> each. (This should result in allocating some auxiliary arrays with more
> space for guard cells, rather than trying to economize on memory space;
> slices of these auxiliary arrays are what eventually gets passed to
> hy_uhd_DataReconstructNormalDir_MH as optional arguments Sl and Sr.)
>
> B. Put a protection against accessing optionals arguments that are
> not present into hy_uhd_DataReconstructNormalDir_MH.F90.
> The region with the offending statement might end up looking like this:
>
> #if (NSPECIES+NMASS_SCALARS) > 0
> if (hy_fullSpecMsFluxHandling) then
> if (present(Sr) .AND. present(Sl)) then
> .....
> Sr = Sc+0.5*delbarSp
> .....
> endif
> endif ! (hy_fullSpecMsFluxHandling)
> #endif /* (NSPECIES+NMASS_SCALARS) > 0 */
>
>
> Finally, you could test whether this problem is somehow specific to the
> MUSCL-Hancock (MH) reconstruction method. In particular try (C.)
> order = 3
> in your parfile, so that
> hy_uhd_DataReconstructNormalDir_PPP
> will be called instead of
> hy_uhd_DataReconstructNormalDir_MH .
>
>
The same failure mode happens in both MH and PPM.
>
> > When I set hy_fullSpecMsFluxHandling=.false. in my flash.par file, the
> > crash does not happen (which is what I expect, looking at the source file
> > in question). Since this parameter defaults to .true., I would expect it
> to
> > play nice with USM. If it's behavior is expected to be unpredictable,
> > perhaps we need a warning to that effect (or simply set it to .false.
> when
> > USM is in use).
> >
> > I'm using the Intel Fortran compiler version 14.0.0.080. For reference,
> > here's my setup line:
> >
> > ./setup magnetoHD/SmithMotion -auto -3d -maxblocks=600 +usm
> > nonIdealMHD=True withDarkMatter=False gravMode=1 withNEI=True
> > -objdir=smithMotion_NoDM_nonIdealB_dbug -makefile=intel -debug
> >
> > Note that I've defined the setup flags nonIdealMHD, withDarkMatter,
> > gravMode, and withNEI in the Config file for my simulation.
>
> These setup variables must be doing something specific to your simulation.
> Hopefully you don't have code in SimulationMain/magnetoHD/SmithMotion
> (or code changes elsewhere) that would interfere with the internals of the
> Hydro unit.
>
Yes, these are simulation-specific setup variables. I am using a private
Gravity unit implementation, but I wouldn't guess that it should affect the
Hydro unit in this way (my code doesn't use pointers, so there really
shouldn't be any way that it could overwrite memory from another portion of
FLASH; correct me if I'm wrong). I have made some modifications to the
unsplit Hydro unit, but they were minor, modifying the calls to
hy_uhd_addGravityUnsplit() to use the full block (blkLimits ->
blkLimitsGC), for consistency with the calls to hy_uhd_putGravityUnsplit()
in hy_uhd_unsplit.F90:
!!!!!!!!!!!
! I changed the following, on line 1158:
! call hy_uhd_addGravityUnsplit(blockID,blkLimits,dataSize,dt,&
! gravX(:,:,:),gravY(:,:,:),gravZ(:,:,:))
! to use the full block (including guard cells):
call hy_uhd_addGravityUnsplit(blockID,blkLimitsGC,dataSize,dt,&
gravX(:,:,:),gravY(:,:,:),gravZ(:,:,:))
!!!!!!!!!!!!
I also changed the call to hy_uhd_energyFix() similarly (around line 1174).
This last change is probably unwarranted. When I get a chance, I'll try it
again without these changes to hy_uhd_unsplit.F90.
>
> Klaus
>
More information about the flash-bugs
mailing list