[FLASH-USERS] Problems running the Sedov test case

Elliot Parkin phy1erp at leeds.ac.uk
Mon Feb 23 12:04:44 EST 2009


Hello again,

The first problem doesn't appear to be related to the number of  
refinement levels as I still get the same error when I set lrefine_max  
= 3.

Thanks,

Ross



Quoting Anshu Dubey <dubey at flash.uchicago.edu>:

> Are you running the problem in 3D? If so, and you haven't reduced
> lrefine_max, you might be running out of memory. All of the aborts in
> FLASH call MPI_Abort, so that is the expected behavior.
> Anshu
>
> On Mon, Feb 23, 2009 at 9:56 AM, Elliot Parkin <phy1erp at leeds.ac.uk> wrote:
>> Dear flash-users,
>>
>> I've been attempting to run the Sedov test case and I've been getting some
>> errors when I try different things. I've listed the problems below:
>>
>> 1) When running the Sedov test case on 1 processor the run terminates with
>> the following error:
>>
>>  Driver_abort called. See log file for details.
>>  Error message is tmr_buildSummary: ran out of space building summary
>>  Calling MPI_Abort() for immediate shutdown!
>>
>> ... should the run terminate with an MPI_Abort? The run reaches the desired
>> end-time, but I would have thought it would end neatly rather than hitting
>> an abort.
>>
>> 2) When running the Sedov case with tmax changed to 1.0 (from 0.05) I get
>> the following error:
>>
>>
>> HDF5-DIAG: Error detected in HDF5 library version: 1.6.5 thread 0.  Back
>> trace follows.
>>  #000: H5Dio.c line 587 in H5Dwrite(): can't write data
>>    major(15): Dataset interface
>>    minor(25): Write failed
>>  #001: H5Dio.c line 953 in H5D_write(): can't write data
>>    major(15): Dataset interface
>>    minor(25): Write failed
>>  #002: H5Dio.c line 1324 in H5D_contig_write(): optimized write failed
>>    major(15): Dataset interface
>>    minor(25): Write failed
>>  #003: H5Dselect.c line 654 in H5D_select_write(): write error
>>    major(14): Dataspace interface
>>    minor(25): Write failed
>>  #004: H5Dcontig.c line 739 in H5D_contig_writevv(): block write failed
>>    major(05): Low-level I/O layer
>>    minor(25): Write failed
>>  #005: H5F.c line 3338 in H5F_block_write(): file write failed
>>    major(05): Low-level I/O layer
>>    minor(25): Write failed
>>  #006: H5FD.c line 3486 in H5FD_write(): driver write request failed
>>    major(22): Virtual File Layer
>>    minor(25): Write failed
>>  #007: H5FDsec2.c line 804 in H5FD_sec2_write(): file write failed
>>    major(05): Low-level I/O layer
>>    minor(25): Write failed
>>
>>
>> ... I suspect this is a problem with my version of HDF5 (I'm using
>> HDF5-1.6.5)?
>>
>> 3) If I attempt to use 2 processors (i.e. mpiexec -n 2 ./flash3):
>>
>> HDF5-DIAG: Error detected in HDF5 library version: 1.6.5 thread 0.  Back
>> trace follows.
>>  #000: H5S.c line 1584 in H5Screate_simple(): zero sized dimension for
>> non-unlimited dimension
>>    major(01): Function arguments
>>    minor(05): Bad value
>> HDF5-DIAG: Error detected in HDF5 library version: 1.6.5 thread 0.  Back
>> trace follows.
>>  #000: H5Dio.c line 561 in H5Dwrite(): not a data space
>>    major(01): Function arguments
>>    minor(03): Inappropriate type
>>
>> ... and this error repeats every time that output of an HDF file is
>> attempted. Again I suspect this may be my version of HDF5.
>>
>> 4) If I attempt to use 4 processors (i.e. mpiexec -n 4 ./flash3):
>>
>>
>> *** glibc detected *** ./flash3: double free or corruption (out):
>> 0x0000000019893fd0 ***
>> ======= Backtrace: =========
>> *** glibc detected *** ./flash3: free(): invalid next size (normal):
>> 0x000000000d7f5d70 ***
>> ======= Backtrace: =========
>> /lib64/libc.so.6[0x3d9086f4f4]
>> /lib64/libc.so.6(cfree+0x8c)[0x3d90872b1c]
>> /usr/lib64/libgfortran.so.1(_gfortran_deallocate+0x26)[0x2aaaaadf32a6]
>> ./flash3[0x478948]
>> ./flash3[0x43d717]
>> ./flash3[0x459035]
>> ./flash3[0x4bacf4]
>> ./flash3[0x4e1ef4]
>> ./flash3[0x40fb33]
>> ./flash3[0x41bfb1]
>> ./flash3[0x47fb44]
>> ./flash3[0x41b6ea]
>> ./flash3[0x4088f3]
>> ./flash3[0x40d5f9]
>> ./flash3[0x5c5d4e]
>> /lib64/libc.so.6(__libc_start_main+0xf4)[0x3d9081d8a4]
>> ./flash3[0x403899]
>> ======= Memory map: ========
>> 00400000-00609000 r-xp 00000000 09:02 26644999
>>  /import/ross_a/erp/FLASH3.1.1a/object/flash3
>> 00809000-0080c000 rw-p 00209000 09:02 26644999
>>  /import/ross_a/erp/FLASH3.1.1a/object/flash3
>> 0080c000-0350c000 rw-p 0080c000 00:00 0
>> 0d7c2000-0d882000 rw-p 0d7c2000 00:00 0
>> 3d90400000-3d9041a000 r-xp 00000000 09:00 1305626
>>  /lib64/ld-2.5.so
>> 3d90619000-3d9061a000 r--p 00019000 09:00 1305626
>>  /lib64/ld-2.5.so
>> 3d9061a000-3d9061b000 rw-p 0001a000 09:00 1305626
>>  /lib64/ld-2.5.so
>> 3d90800000-3d90946000 r-xp 00000000 09:00 1305789
>>  /lib64/libc-2.5.so
>> 3d90946000-3d90b46000 ---p 00146000 09:00 1305789
>>  /lib64/libc-2.5.so
>> 3d90b46000-3d90b4a000 r--p 00146000 09:00 1305789
>>  /lib64/libc-2.5.so
>> 3d90b4a000-3d90b4b000 rw-p 0014a000 09:00 1305789
>>  /lib64/libc-2.5.so
>> 3d90b4b000-3d90b50000 rw-p 3d90b4b000 00:00 0
>> 3d90c00000-3d90c82000 r-xp 00000000 09:00 1305792
>>  /lib64/libm-2.5.so
>> 3d90c82000-3d90e81000 ---p 00082000 09:00 1305792
>>  /lib64/libm-2.5.so
>> 3d90e81000-3d90e82000 r--p 00081000 09:00 1305792
>>  /lib64/libm-2.5.so
>> 3d90e82000-3d90e83000 rw-p 00082000 09:00 1305792
>>  /lib64/libm-2.5.so
>> 3d91400000-3d91415000 r-xp 00000000 09:00 1305796
>>  /lib64/libpthread-2.5.so
>> 3d91415000-3d91614000 ---p 00015000 09:00 1305796
>>  /lib64/libpthread-2.5.so
>> 3d91614000-3d91615000 r--p 00014000 09:00 1305796
>>  /lib64/libpthread-2.5.so
>> 3d91615000-3d91616000 rw-p 00015000 09:00 1305796
>>  /lib64/libpthread-2.5.so
>> 3d91616000-3d9161a000 rw-p 3d91616000 00:00 0
>> 3d91800000-3d91814000 r-xp 00000000 09:00 2470772
>>  /usr/lib64/libz.so.1.2.3
>> 3d91814000-3d91a13000 ---p 00014000 09:00 2470772
>>  /usr/lib64/libz.so.1.2.3
>> 3d91a13000-3d91a14000 rw-p 00013000 09:00 /lib64/libc.so.6[0x3d9086f4f4]
>> rank 2 in job 35  phy-pc086.leeds.ac.uk_51446   caused collective abort of
>> all ranks
>>  exit status of rank 2: killed by signal 9
>> 2470772                        /usr/lib64/libz.so.1.2.3
>> 3d91c00000-3d91c07000 r-xp 00000000 09:00 1305865
>>  /lib64/librt-2.5.so
>> 3d91c07000-3d91e07000 ---p 00007000 09:00 1305865
>>  /lib64/librt-2.5.so
>> 3d91e07000-3d91e08000 r--p 00007000 09:00 1305865
>>  /lib64/librt-2.5.so
>> 3d91e08000-3d91e09000 rw-p 00008000 09:00 1305865
>>  /lib64/librt-2.5.so
>> 3d9f200000-3d9f20d000 r-xp 00000000 09:00 1305794
>>  /lib64/libgcc_s-4.1.2-20070626.so.1
>> 3d9f20d000-3d9f40d000 ---p 0000d000 09:00 1305794
>>  /lib64/libgcc_s-4.1.2-20070626.so.1
>> 3d9f40d000-3d9f40e000 rw-p 0000d000 09:00 1305794
>>  /lib64/libgcc_s-4.1.2-20070626.so.1
>> 2aaaaaaab000-2aaaaaaad000 rw-p 2aaaaaaab000 00:00 0
>> 2aaaaaaad000-2aaaaabc4000 r-xp 00000000 09:02 26739102
>>  /import/ross_a/erp/hdf5-1.6.5/hdf5/lib/libhdf5.so.0.0.0
>> 2aaaaabc4000-2aaaaadc3000 ---p 00117000 09:02 26739102
>>  /import/ross_a/erp/hdf5-1.6.5/hdf5/lib/libhdf5.so.0.0.0
>> 2aaaaadc3000-2aaaaadca000 rw-p 00116000 09:02 26739102
>>  /import/ross_a/erp/hdf5-1.6.5/hdf5/lib/libhdf5.so.0.0.0
>> 2aaaaade1000-2aaaaade2000 rw-p 2aaaaade1000 00:00 0
>> 2aaaaade2000-2aaaaae78000 r-xp 00000000 09:00 2454495
>>  /usr/lib64/libgfortran.so.1.0.0
>> 2aaaaae78000-2aaaab077000 ---p 00096000 09:00 2454495
>>  /usr/lib64/libgfortran.so.1.0.0
>> 2aaaab077000-2aaaab079000 rw-p 00095000 09:00 2454495
>>  /usr/lib64/libgfortran.so.1.0.0
>> 2aaaab079000-2aaaab07c000 rw-p 2aaaab079000 00:00 0
>> 2aaaac000000-2aaaac021000 rw-p 2aaaac000000 00:00 0
>> 2aaaac021000-2aaab0000000 ---p 2aaaac021000 00:00 0
>> 7fff5629e000-7fff562b5000 rw-p 7fff5629e000 00:00 0
>>  [stack]
>> ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0
>>  [vdso]
>>
>>
>>
>>
>> Sorry to chuck all these errors in one email.
>>
>> Thanks in advance for any assistance,
>>
>> Ross Parkin
>>
>>
>




