[FLASH-BUGS] Species Flux Handling with USM MHD in FLASH 4.3

Klaus Weide klaus at flash.uchicago.edu
Fri Dec 18 16:38:08 CST 2015


On Fri, 18 Dec 2015, Jason Galyardt wrote:

> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> Image              PC                Routine            Line
> Source
> libintlc.so.5      00002AB348C8C1F9  Unknown               Unknown  Unknown
> libintlc.so.5      00002AB348C8AB70  Unknown               Unknown  Unknown
> libifcore.so.5     00002AB347AF7DDF  Unknown               Unknown  Unknown
> libifcore.so.5     00002AB347A5F4CF  Unknown               Unknown  Unknown
> libifcore.so.5     00002AB347A70BC3  Unknown               Unknown  Unknown
> libpthread.so.0    000000308D00ECA0  Unknown               Unknown  Unknown
> flash4             000000000112A509  hy_uhd_datarecons         366
> hy_uhd_DataReconstructNormalDir_MH.F90
> flash4             00000000012F58C4  hy_uhd_datarecons         535
> hy_uhd_dataReconstOneStep.F90
> flash4             00000000015AAD98  hy_uhd_getriemann         621
> hy_uhd_getRiemannState.F90
> flash4             00000000016AC695  hy_uhd_unsplit_           494
> hy_uhd_unsplit.F90
> flash4             000000000075A25D  hydro_                     67
> Hydro.F90
> flash4             00000000004A289D  driver_evolveflas         287
> Driver_evolveFlash.F90
> flash4             0000000000503EE1  MAIN__                     51
> Flash.F90
> flash4             000000000041A176  Unknown               Unknown  Unknown
> libc.so.6          000000308C41D9C4  Unknown               Unknown  Unknown
> flash4             000000000041A079  Unknown               Unknown  Unknown
> 
> I've checked line 366 of hy_uhd_DataReconstructNormalDir_MH.F90, but I
> couldn't find any obvious instances of array indices being out of bounds.

Jason,

I don't have enough information to reproduce this problem, and don't have 
access to the same compiler.  Perhaps you can help narrow down the cause.

Are you using OpenMP threading in your code?
Is the compiler applying some sort of automatic threading to array 
statements?
Could you try to lower the compiler optimization options and see whether 
the same problem still occurs?
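For the optimization experiment, something like the following might work; the exact variable names depend on your site Makefile.h (the FFLAGS_DEBUG name here is the usual FLASH convention, but check your Makefile.h.intel), and the listed flags are standard ifort options:

```shell
# Sketch: rebuild with optimization off and runtime checks on, then retest.
# In your Makefile.h (intel), set the debug flags to something like:
#   FFLAGS_DEBUG = -g -O0 -traceback -check bounds -check pointers
# then force a clean rebuild in the object directory:
make clean
make
```

If the crash disappears at -O0, that points toward an optimizer or threading issue rather than an indexing bug; if -check bounds fires first, it points at the array accesses.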

The line number points to the line

           Sr = Sc+0.5*delbarSp

(do you have the same line at no. 366?)

which is an array assignment, with the arrays involved declared as 
follows:

   real,intent(OUT),dimension(HY_NSPEC),   optional :: Sr
   real, dimension(HY_NSPEC), target :: Sc
   real, dimension(HY_NSPEC) :: delbarSp

The most obvious proximate cause of a SIGSEGV error here would be if
the optional dummy argument Sr is not actually present in the call to 
hy_uhd_DataReconstructNormalDir_MH on line 535 of 
hy_uhd_dataReconstOneStep.F90, or if the actual argument is somehow 
invalid (under the hood, pointing to invalid memory).
In principle that should never be the case, but it is possible that the 
logic that is supposed to prevent it is faulty.
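To illustrate the failure mode: an absent OPTIONAL argument is typically passed as a null pointer under the hood, so touching it without a present() guard is exactly the kind of wild store that shows up as SIGSEGV. A minimal standalone sketch (not FLASH code, names are made up):

```fortran
! Minimal sketch: referencing a non-present OPTIONAL array argument is
! illegal Fortran and usually manifests as a segmentation fault.
program optional_demo
  implicit none
  real :: a(4)
  a = 1.0
  call recon(a)            ! Sr omitted here: present(Sr) is .false.
contains
  subroutine recon(Sc, Sr)
    real, intent(in)            :: Sc(:)
    real, intent(out), optional :: Sr(:)
    if (present(Sr)) then
      Sr = Sc + 0.5        ! safe: only executed when Sr was really passed
    end if
    ! Without the present() guard, "Sr = Sc + 0.5" would write through an
    ! invalid address whenever the caller omits Sr.
  end subroutine recon
end program optional_demo
```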

It would be helpful if you could try both of the following experiments.

A. In hy_uhd_unsplit.F90 -- make sure you edit the version of this
file that actually gets used for your setup! -- find the lines

#if (NSPECIES+NMASS_SCALARS) > 0
     if (hy_fullSpecMsFluxHandling .AND. hy_fullRiemannStateArrays &
          .AND. NFACE_VARS > 0 .AND. NDIM > 1) then
        ! for hy_spcR:
	call hy_memAllocScratch(CENTER,&
             1,&
             hy_numXN,&
             2,0,0, &
             blockList(1:blockCount), &
             highSize=NDIM)
        ! for hy_spcL:
	call hy_memAllocScratch(SCRATCH_CTR,&
             1,&
             hy_numXN,&
             2,0,0, &
             blockList(1:blockCount), &
             highSize=NDIM)
     end if
#endif

Change the two lines
             2,0,0, &
into
             3,0,0, &
or
             4,0,0, &
each. (This should result in allocating these auxiliary arrays with more 
space for guard cells, rather than trying to economize on memory;
slices of these auxiliary arrays are what eventually get passed to 
hy_uhd_DataReconstructNormalDir_MH as the optional arguments Sl and Sr.)

B. Put a guard against accessing optional arguments that are 
not present into hy_uhd_DataReconstructNormalDir_MH.F90.
The region with the offending statement might end up looking like this:

#if (NSPECIES+NMASS_SCALARS) > 0
     if (hy_fullSpecMsFluxHandling) then
      if (present(Sr) .AND. present(Sl)) then
        .....
        Sr = Sc+0.5*delbarSp
        .....
      endif
     endif ! (hy_fullSpecMsFluxHandling)
#endif /*  (NSPECIES+NMASS_SCALARS) > 0 */


Finally, you could test whether this problem is somehow specific to the 
MUSCL-Hancock (MH) reconstruction method.  In particular try (C.)
    order = 3
in your parfile, so that
   hy_uhd_DataReconstructNormalDir_PPM
will be called instead of
   hy_uhd_DataReconstructNormalDir_MH .


> When I set hy_fullSpecMsFluxHandling=.false. in my flash.par file, the
> crash does not happen (which is what I expect, looking at the source file
> in question). Since this parameter defaults to .true., I would expect it to
> play nice with USM. If its behavior is expected to be unpredictable,
> perhaps we need a warning to that effect (or simply set it to .false. when
> USM is in use).
> 
> I'm using the Intel Fortran compiler version 14.0.0.080. For reference,
> here's my setup line:
> 
> ./setup magnetoHD/SmithMotion -auto -3d -maxblocks=600 +usm
> nonIdealMHD=True withDarkMatter=False gravMode=1 withNEI=True
> -objdir=smithMotion_NoDM_nonIdealB_dbug -makefile=intel -debug
> 
> Note that I've defined the setup flags nonIdealMHD, withDarkMatter,
> gravMode, and withNEI in the Config file for my simulation.

These setup variables must be doing something specific to your simulation.
Hopefully you don't have code in SimulationMain/magnetoHD/SmithMotion
(or code changes elsewhere) that would interfere with the internals of the 
Hydro unit.

Klaus

