[FLASH-USERS] Problems running the Sedov test case

Anshu Dubey dubey at flash.uchicago.edu
Mon Feb 23 11:06:43 EST 2009


Are you running the problem in 3D? If so, and you haven't reduced
lrefine_max, you might be running out of memory. All aborts in FLASH call
MPI_Abort, so that is the expected behavior.
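
To give a rough feel for why lrefine_max matters so much more in 3D, here
is a back-of-the-envelope sketch in Python. It only counts the leaf blocks
of an oct-tree (or quad-tree) mesh in the worst case where the whole domain
is refined to lrefine_max. The block size (8 cells per side), guard-cell
count (4), number of variables (10), single root block, and lrefine_max = 6
are all assumptions chosen for illustration, not values read from your
setup, and the functions are hypothetical helpers, not part of FLASH.

```python
def leaf_blocks(ndim, lrefine_max, root_blocks=1):
    """Leaf-block count if every block is refined down to lrefine_max.

    Each refinement step splits a block into 2**ndim children, so a fully
    refined mesh has root_blocks * (2**ndim)**(lrefine_max - 1) leaves.
    """
    return root_blocks * (2 ** ndim) ** (lrefine_max - 1)


def approx_bytes(ndim, lrefine_max, cells_per_side=8, nguard=4,
                 nvars=10, root_blocks=1):
    """Worst-case storage for the leaf blocks alone, in bytes.

    Assumes double precision (8 bytes) and guard cells on every side.
    All default values are illustrative assumptions.
    """
    padded = cells_per_side + 2 * nguard
    cells_per_block = padded ** ndim
    return leaf_blocks(ndim, lrefine_max, root_blocks) * cells_per_block * nvars * 8


if __name__ == "__main__":
    for ndim in (2, 3):
        n = leaf_blocks(ndim, 6)
        gib = approx_bytes(ndim, 6) / 2 ** 30
        print(f"{ndim}D, lrefine_max=6: {n} leaf blocks, ~{gib:.3f} GiB")
```

A real run refines only near the shock, so actual usage is far below this
bound, but the scaling (a factor of 8 per level in 3D versus 4 in 2D) is
why dropping lrefine_max by one or two levels helps so much.
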
Anshu

On Mon, Feb 23, 2009 at 9:56 AM, Elliot Parkin <phy1erp at leeds.ac.uk> wrote:
> Dear flash-users,
>
> I've been attempting to run the Sedov test case and I've been getting some
> errors when I try different things. I've listed the problems below:
>
> 1) When running the Sedov test case on 1 processor the run terminates with
> the following error:
>
>  Driver_abort called. See log file for details.
>  Error message is tmr_buildSummary: ran out of space building summary
>  Calling MPI_Abort() for immediate shutdown!
>
> ... should the run terminate with an MPI_Abort? The run reaches the desired
> end-time, but I would have thought it would end neatly rather than hitting
> an abort.
>
> 2) When running the Sedov case with tmax changed to 1.0 (from 0.05) I get
> the following error:
>
>
> HDF5-DIAG: Error detected in HDF5 library version: 1.6.5 thread 0.  Back
> trace follows.
>  #000: H5Dio.c line 587 in H5Dwrite(): can't write data
>    major(15): Dataset interface
>    minor(25): Write failed
>  #001: H5Dio.c line 953 in H5D_write(): can't write data
>    major(15): Dataset interface
>    minor(25): Write failed
>  #002: H5Dio.c line 1324 in H5D_contig_write(): optimized write failed
>    major(15): Dataset interface
>    minor(25): Write failed
>  #003: H5Dselect.c line 654 in H5D_select_write(): write error
>    major(14): Dataspace interface
>    minor(25): Write failed
>  #004: H5Dcontig.c line 739 in H5D_contig_writevv(): block write failed
>    major(05): Low-level I/O layer
>    minor(25): Write failed
>  #005: H5F.c line 3338 in H5F_block_write(): file write failed
>    major(05): Low-level I/O layer
>    minor(25): Write failed
>  #006: H5FD.c line 3486 in H5FD_write(): driver write request failed
>    major(22): Virtual File Layer
>    minor(25): Write failed
>  #007: H5FDsec2.c line 804 in H5FD_sec2_write(): file write failed
>    major(05): Low-level I/O layer
>    minor(25): Write failed
>
>
> ... I suspect this is a problem with my version of HDF5 (I'm using
> HDF5-1.6.5)?
>
> 3) If I attempt to use 2 processors (i.e. mpiexec -n 2 ./flash3):
>
> HDF5-DIAG: Error detected in HDF5 library version: 1.6.5 thread 0.  Back
> trace follows.
>  #000: H5S.c line 1584 in H5Screate_simple(): zero sized dimension for
> non-unlimited dimension
>    major(01): Function arguments
>    minor(05): Bad value
> HDF5-DIAG: Error detected in HDF5 library version: 1.6.5 thread 0.  Back
> trace follows.
>  #000: H5Dio.c line 561 in H5Dwrite(): not a data space
>    major(01): Function arguments
>    minor(03): Inappropriate type
>
> ... and this error repeats every time that output of an HDF file is
> attempted. Again I suspect this may be my version of HDF5.
>
> 4) If I attempt to use 4 processors (i.e. mpiexec -n 4 ./flash3):
>
>
> *** glibc detected *** ./flash3: double free or corruption (out):
> 0x0000000019893fd0 ***
> ======= Backtrace: =========
> *** glibc detected *** ./flash3: free(): invalid next size (normal):
> 0x000000000d7f5d70 ***
> ======= Backtrace: =========
> /lib64/libc.so.6[0x3d9086f4f4]
> /lib64/libc.so.6(cfree+0x8c)[0x3d90872b1c]
> /usr/lib64/libgfortran.so.1(_gfortran_deallocate+0x26)[0x2aaaaadf32a6]
> ./flash3[0x478948]
> ./flash3[0x43d717]
> ./flash3[0x459035]
> ./flash3[0x4bacf4]
> ./flash3[0x4e1ef4]
> ./flash3[0x40fb33]
> ./flash3[0x41bfb1]
> ./flash3[0x47fb44]
> ./flash3[0x41b6ea]
> ./flash3[0x4088f3]
> ./flash3[0x40d5f9]
> ./flash3[0x5c5d4e]
> /lib64/libc.so.6(__libc_start_main+0xf4)[0x3d9081d8a4]
> ./flash3[0x403899]
> ======= Memory map: ========
> 00400000-00609000 r-xp 00000000 09:02 26644999
>  /import/ross_a/erp/FLASH3.1.1a/object/flash3
> 00809000-0080c000 rw-p 00209000 09:02 26644999
>  /import/ross_a/erp/FLASH3.1.1a/object/flash3
> 0080c000-0350c000 rw-p 0080c000 00:00 0
> 0d7c2000-0d882000 rw-p 0d7c2000 00:00 0
> 3d90400000-3d9041a000 r-xp 00000000 09:00 1305626
>  /lib64/ld-2.5.so
> 3d90619000-3d9061a000 r--p 00019000 09:00 1305626
>  /lib64/ld-2.5.so
> 3d9061a000-3d9061b000 rw-p 0001a000 09:00 1305626
>  /lib64/ld-2.5.so
> 3d90800000-3d90946000 r-xp 00000000 09:00 1305789
>  /lib64/libc-2.5.so
> 3d90946000-3d90b46000 ---p 00146000 09:00 1305789
>  /lib64/libc-2.5.so
> 3d90b46000-3d90b4a000 r--p 00146000 09:00 1305789
>  /lib64/libc-2.5.so
> 3d90b4a000-3d90b4b000 rw-p 0014a000 09:00 1305789
>  /lib64/libc-2.5.so
> 3d90b4b000-3d90b50000 rw-p 3d90b4b000 00:00 0
> 3d90c00000-3d90c82000 r-xp 00000000 09:00 1305792
>  /lib64/libm-2.5.so
> 3d90c82000-3d90e81000 ---p 00082000 09:00 1305792
>  /lib64/libm-2.5.so
> 3d90e81000-3d90e82000 r--p 00081000 09:00 1305792
>  /lib64/libm-2.5.so
> 3d90e82000-3d90e83000 rw-p 00082000 09:00 1305792
>  /lib64/libm-2.5.so
> 3d91400000-3d91415000 r-xp 00000000 09:00 1305796
>  /lib64/libpthread-2.5.so
> 3d91415000-3d91614000 ---p 00015000 09:00 1305796
>  /lib64/libpthread-2.5.so
> 3d91614000-3d91615000 r--p 00014000 09:00 1305796
>  /lib64/libpthread-2.5.so
> 3d91615000-3d91616000 rw-p 00015000 09:00 1305796
>  /lib64/libpthread-2.5.so
> 3d91616000-3d9161a000 rw-p 3d91616000 00:00 0
> 3d91800000-3d91814000 r-xp 00000000 09:00 2470772
>  /usr/lib64/libz.so.1.2.3
> 3d91814000-3d91a13000 ---p 00014000 09:00 2470772
>  /usr/lib64/libz.so.1.2.3
> 3d91a13000-3d91a14000 rw-p 00013000 09:00 /lib64/libc.so.6[0x3d9086f4f4]
> rank 2 in job 35  phy-pc086.leeds.ac.uk_51446   caused collective abort of
> all ranks
>  exit status of rank 2: killed by signal 9
> 2470772                        /usr/lib64/libz.so.1.2.3
> 3d91c00000-3d91c07000 r-xp 00000000 09:00 1305865
>  /lib64/librt-2.5.so
> 3d91c07000-3d91e07000 ---p 00007000 09:00 1305865
>  /lib64/librt-2.5.so
> 3d91e07000-3d91e08000 r--p 00007000 09:00 1305865
>  /lib64/librt-2.5.so
> 3d91e08000-3d91e09000 rw-p 00008000 09:00 1305865
>  /lib64/librt-2.5.so
> 3d9f200000-3d9f20d000 r-xp 00000000 09:00 1305794
>  /lib64/libgcc_s-4.1.2-20070626.so.1
> 3d9f20d000-3d9f40d000 ---p 0000d000 09:00 1305794
>  /lib64/libgcc_s-4.1.2-20070626.so.1
> 3d9f40d000-3d9f40e000 rw-p 0000d000 09:00 1305794
>  /lib64/libgcc_s-4.1.2-20070626.so.1
> 2aaaaaaab000-2aaaaaaad000 rw-p 2aaaaaaab000 00:00 0
> 2aaaaaaad000-2aaaaabc4000 r-xp 00000000 09:02 26739102
>  /import/ross_a/erp/hdf5-1.6.5/hdf5/lib/libhdf5.so.0.0.0
> 2aaaaabc4000-2aaaaadc3000 ---p 00117000 09:02 26739102
>  /import/ross_a/erp/hdf5-1.6.5/hdf5/lib/libhdf5.so.0.0.0
> 2aaaaadc3000-2aaaaadca000 rw-p 00116000 09:02 26739102
>  /import/ross_a/erp/hdf5-1.6.5/hdf5/lib/libhdf5.so.0.0.0
> 2aaaaade1000-2aaaaade2000 rw-p 2aaaaade1000 00:00 0
> 2aaaaade2000-2aaaaae78000 r-xp 00000000 09:00 2454495
>  /usr/lib64/libgfortran.so.1.0.0
> 2aaaaae78000-2aaaab077000 ---p 00096000 09:00 2454495
>  /usr/lib64/libgfortran.so.1.0.0
> 2aaaab077000-2aaaab079000 rw-p 00095000 09:00 2454495
>  /usr/lib64/libgfortran.so.1.0.0
> 2aaaab079000-2aaaab07c000 rw-p 2aaaab079000 00:00 0
> 2aaaac000000-2aaaac021000 rw-p 2aaaac000000 00:00 0
> 2aaaac021000-2aaab0000000 ---p 2aaaac021000 00:00 0
> 7fff5629e000-7fff562b5000 rw-p 7fff5629e000 00:00 0
>  [stack]
> ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0
>  [vdso]
>
>
>
>
> Sorry to chuck all these errors in one email.
>
> Thanks in advance for any assistance,
>
> Ross Parkin
>
>
