[FLASH-USERS] Runtime error when trying to test Sod or Sedov

Chris Daley cdaley at flash.uchicago.edu
Wed Mar 16 17:01:26 EDT 2011


Hi Irakli,

I recommend checking that the FFLAGS macros in your Makefile.h promote
reals to double precision.  After that I recommend checking the
consistency of your software stack:

1). Are you using the same MPI installation to build FLASH (mpif90) as
you are to launch FLASH (mpirun)?
2). Are you building FLASH with the same compiler that was used to
build HDF5?
3). If you have a shared library installation of HDF5 you should
check that the expected version of HDF5 is loaded (see output of ldd
./flash3).
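
As a rough illustration of checks 1) and 3), the Python sketch below
compares where two MPI tools resolve on PATH and pulls the HDF5 entries
out of saved ldd output.  The function names are hypothetical, the
directory comparison is only a heuristic (wrapper scripts or module
systems can defeat it), and the ldd line format is assumed to be the
usual "name => path (address)" layout.

```python
import os
import re
import shutil


def same_bin_dir(tool_a, tool_b):
    """Heuristic: True if both tools resolve on PATH to the same directory."""
    path_a, path_b = shutil.which(tool_a), shutil.which(tool_b)
    if path_a is None or path_b is None:
        return False  # at least one tool is not on PATH at all
    # Follow symlinks so that aliases into one installation compare equal.
    dir_a = os.path.dirname(os.path.realpath(path_a))
    dir_b = os.path.dirname(os.path.realpath(path_b))
    return dir_a == dir_b


def hdf5_libs_from_ldd(ldd_output):
    """Return (library name, resolved path) pairs for HDF5 lines in ldd output."""
    libs = []
    for line in ldd_output.splitlines():
        # Assumed layout: "libhdf5.so.0 => /some/path/libhdf5.so.0 (0x...)"
        match = re.match(r"\s*(libhdf5\S*)\s+=>\s+(not found|\S+)", line)
        if match:
            libs.append((match.group(1), match.group(2)))
    return libs


if __name__ == "__main__":
    print("mpif90 and mpirun share a directory:", same_bin_dir("mpif90", "mpirun"))
    # Paste the output of "ldd ./flash3" into this string to inspect it.
    sample = "\tlibhdf5.so.0 => /opt/hdf5/lib/libhdf5.so.0 (0x00002b0000000000)\n"
    for name, path in hdf5_libs_from_ldd(sample):
        print(name, "->", path)
```

A path that points at an unexpected HDF5 installation, or a "not found"
entry, would be the thing to look for here.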

For testing purposes you can build FLASH without I/O using the +noio
setup shortcut.  This may help you to isolate the problem.

You mention that you have run FLASH under valgrind and that you see
memory leaks.  For the purposes of finding the actual error I would
ignore the leaks and focus on "invalid read" and "invalid write"
valgrind messages that indicate memory problems.  When running under
valgrind, I recommend building FLASH applications with pm4dev (+pm4dev
setup shortcut) because the output is less noisy than with pm40 (the
default).  I spent some time last year removing false positives, but
only in FLASH applications that use the pm4dev mesh package.
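
One way to skip the leak reports and keep only the invalid accesses is
to filter a saved valgrind log.  The sketch below is a minimal example
of that.  It assumes the usual memcheck line format ("==PID== Invalid
read of size N") and the function name is hypothetical.

```python
import re

# Assumed memcheck format: "==12345== Invalid read of size 8"
INVALID_ACCESS = re.compile(
    r"^==\d+==\s+(Invalid (?:read|write) of size \d+)", re.IGNORECASE
)


def invalid_accesses(log_text):
    """Return (line number, message) for each invalid read/write in the log."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        match = INVALID_ACCESS.match(line)
        if match:
            hits.append((number, match.group(1)))
    return hits


if __name__ == "__main__":
    log = (
        "==4242== Invalid write of size 8\n"
        "==4242==    at 0x4A0: foo (bar.c:10)\n"
        "==4242== 16 bytes in 1 blocks are definitely lost in loss record 1 of 2\n"
    )
    for number, message in invalid_accesses(log):
        print(number, message)
```

The reported line numbers point back into the log, where the stack
trace that follows each message shows where the access happened.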

If you are still having trouble you should send us your setup line and
Makefile.h.
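
Before sending the Makefile.h, the FFLAGS check mentioned at the top
can be approximated with a short script.  This is only a sketch: the
flag list is an assumption covering a few common compilers (gfortran
-fdefault-real-8, Intel/PGI -r8, Intel -real-size 64, IBM XL
-qrealsize=8), the function name is hypothetical, and macros that
reference other macros or use line continuations are not expanded.

```python
import re

# Assumed spellings of "promote reals to double precision" flags.
PROMOTE_FLAGS = ("-fdefault-real-8", "-r8", "-real-size 64", "-qrealsize=8")


def fflags_missing_promotion(makefile_text):
    """Return names of FFLAGS* macros whose value has none of the known flags."""
    missing = []
    for line in makefile_text.splitlines():
        # Matches e.g. "FFLAGS_OPT = ...", "FFLAGS_DEBUG := ...", "FFLAGS += ..."
        match = re.match(r"\s*(FFLAGS\w*)\s*[:+]?=\s*(.*)", line)
        if not match:
            continue
        name, value = match.group(1), match.group(2)
        normalized = " ".join(value.split())  # collapse runs of whitespace
        if not any(flag in normalized for flag in PROMOTE_FLAGS):
            missing.append(name)
    return missing


if __name__ == "__main__":
    with open("Makefile.h") as handle:
        print(fflags_missing_promotion(handle.read()))
```

Any macro name it prints is worth a second look by hand, since the
script cannot tell whether the flag arrives through another variable.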

Chris


Garishvili, Irakli wrote:
> Dear Flash Experts,
> 
> I am trying to test the newest FLASH version (3.3) on a “chaos_4_x86_64” 
> linux machine.
> 
> I separately tried to build/run Sod and Sedov problems and I got pretty 
> much the same errors in both
> cases. Setup and build (gmake) processes give me absolutely no trouble. 
>  However, I get “hdf5 I/O”-related runtime errors
>  after executing the ‘mpirun -np 1 flash3’ command.
> 
> The strange thing is that runtime error messages totally change if I 
> modify “lrefine_max” (default was set to 6) value inside
>  flash.par file. Below I copied last parts of 2 different error 
> printouts I get when running Sod problem.
> First when lrefine_max = 6,  and second when lrefine_max = 3.
> 
> I also tested flash3 executable using valgrind which seemed to indicate 
> some memory leak issues.
> 
> I’d really appreciate any suggestion on this matter.
> Cheers,
> Irakli.  
> 
> 
> 1.)
> 
>       84 1.9390E-01 1.5864E-03 |  1.58640E-03
>       85 1.9707E-01 1.5865E-03 |  1.58646E-03
>   iteration, no. not moved =            0           0
>  refined: total leaf blocks =          541
>  refined: total blocks =          721
>       86 2.0024E-01 1.5865E-03 |  1.58647E-03
> HDF5-DIAG: Error detected in HDF5 library version: 1.6.5 thread 0.  Back 
> trace follows.
>   #000: H5Dio.c line 587 in H5Dwrite(): can't write data
>     major(15): Dataset interface
>     minor(25): Write failed
>   #001: H5Dio.c line 953 in H5D_write(): can't write data
>     major(15): Dataset interface
>     minor(25): Write failed
>   #002: H5Dio.c line 1324 in H5D_contig_write(): optimized write failed
>     major(15): Dataset interface
>     minor(25): Write failed
>   #003: H5Dselect.c line 654 in H5D_select_write(): write error
>     major(14): Dataspace interface
>     minor(25): Write failed
>   #004: H5Dcontig.c line 739 in H5D_contig_writevv(): block write failed
>     major(05): Low-level I/O layer
>     minor(25): Write failed
>   #005: H5F.c line 3338 in H5F_block_write(): file write failed
>     major(05): Low-level I/O layer
>     minor(25): Write failed
>   #006: H5FD.c line 3486 in H5FD_write(): driver write request failed
>     major(22): Virtual File Layer
>     minor(25): Write failed
>   #007: H5FDsec2.c line 804 in H5FD_sec2_write(): file write failed
>     major(05): Low-level I/O layer
>     minor(25): Write failed
> Driver_abortC called
> Error: Unable to write unknowns
> 
> Calling MPI_Abort for immediate shutdown
> [0] [MPI Abort by user] Aborting Program!
> mpirun_rsh: Abort signaled from [1]
> 
> 
> 2.)
> 
>       24 3.3554E-03 1.6777E-03 |  1.92661E-02
>       25 6.7109E-03 3.3554E-03 |  1.73809E-02
>       26 1.3422E-02 6.7109E-03 |  1.52721E-02
>       27 2.6844E-02 1.3422E-02 |  1.36666E-02
>       28 5.3687E-02 1.2506E-02 |  1.25062E-02
>       29 7.8699E-02 1.2473E-02 |  1.24730E-02
>       30 1.0365E-01 1.2544E-02 |  1.25439E-02
>       31 1.2873E-01 1.2581E-02 |  1.25813E-02
>       32 1.5390E-01 1.2615E-02 |  1.26146E-02
>       33 1.7913E-01 1.2648E-02 |  1.26484E-02
>       34 2.0442E-01 1.2663E-02 |  1.26628E-02
>  *** Wrote checkpoint file to sod_hdf5_chk_0001 ****
>  exiting: reached max SimTime
>  *** Wrote plotfile to sod_forced_hdf5_plt_cnt_0000 ****
> Segmentation fault (core dumped)
> 
> ------ End of Forwarded Message



