[FLASH-USERS] Problems running the Sedov test case
Elliot Parkin
phy1erp at leeds.ac.uk
Mon Feb 23 10:56:30 EST 2009
Dear flash-users,
I've been attempting to run the Sedov test case and have hit errors in
several different configurations. The problems are listed below:
1) When running the Sedov test case on 1 processor the run terminates
with the following error:
Driver_abort called. See log file for details.
Error message is tmr_buildSummary: ran out of space building summary
Calling MPI_Abort() for immediate shutdown!
... should the run terminate with an MPI_Abort? The run reaches the
desired end-time, but I would have thought it would end neatly rather
than hitting an abort.
2) When running the Sedov case with tmax changed to 1.0 (from 0.05) I
get the following error:
HDF5-DIAG: Error detected in HDF5 library version: 1.6.5 thread 0.
Back trace follows.
#000: H5Dio.c line 587 in H5Dwrite(): can't write data
major(15): Dataset interface
minor(25): Write failed
#001: H5Dio.c line 953 in H5D_write(): can't write data
major(15): Dataset interface
minor(25): Write failed
#002: H5Dio.c line 1324 in H5D_contig_write(): optimized write failed
major(15): Dataset interface
minor(25): Write failed
#003: H5Dselect.c line 654 in H5D_select_write(): write error
major(14): Dataspace interface
minor(25): Write failed
#004: H5Dcontig.c line 739 in H5D_contig_writevv(): block write failed
major(05): Low-level I/O layer
minor(25): Write failed
#005: H5F.c line 3338 in H5F_block_write(): file write failed
major(05): Low-level I/O layer
minor(25): Write failed
#006: H5FD.c line 3486 in H5FD_write(): driver write request failed
major(22): Virtual File Layer
minor(25): Write failed
#007: H5FDsec2.c line 804 in H5FD_sec2_write(): file write failed
major(05): Low-level I/O layer
minor(25): Write failed
... could this be a problem with my version of HDF5 (I'm using
HDF5-1.6.5)?
3) If I attempt to use 2 processors (i.e. mpiexec -n 2 ./flash3):
HDF5-DIAG: Error detected in HDF5 library version: 1.6.5 thread 0.
Back trace follows.
#000: H5S.c line 1584 in H5Screate_simple(): zero sized dimension
for non-unlimited dimension
major(01): Function arguments
minor(05): Bad value
HDF5-DIAG: Error detected in HDF5 library version: 1.6.5 thread 0.
Back trace follows.
#000: H5Dio.c line 561 in H5Dwrite(): not a data space
major(01): Function arguments
minor(03): Inappropriate type
... and this error repeats every time output of an HDF file is
attempted. Again, I suspect this may be my version of HDF5.
4) If I attempt to use 4 processors (i.e. mpiexec -n 4 ./flash3):
*** glibc detected *** ./flash3: double free or corruption (out):
0x0000000019893fd0 ***
======= Backtrace: =========
*** glibc detected *** ./flash3: free(): invalid next size (normal):
0x000000000d7f5d70 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3d9086f4f4]
/lib64/libc.so.6(cfree+0x8c)[0x3d90872b1c]
/usr/lib64/libgfortran.so.1(_gfortran_deallocate+0x26)[0x2aaaaadf32a6]
./flash3[0x478948]
./flash3[0x43d717]
./flash3[0x459035]
./flash3[0x4bacf4]
./flash3[0x4e1ef4]
./flash3[0x40fb33]
./flash3[0x41bfb1]
./flash3[0x47fb44]
./flash3[0x41b6ea]
./flash3[0x4088f3]
./flash3[0x40d5f9]
./flash3[0x5c5d4e]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x3d9081d8a4]
./flash3[0x403899]
======= Memory map: ========
00400000-00609000 r-xp 00000000 09:02 26644999
/import/ross_a/erp/FLASH3.1.1a/object/flash3
00809000-0080c000 rw-p 00209000 09:02 26644999
/import/ross_a/erp/FLASH3.1.1a/object/flash3
0080c000-0350c000 rw-p 0080c000 00:00 0
0d7c2000-0d882000 rw-p 0d7c2000 00:00 0
3d90400000-3d9041a000 r-xp 00000000 09:00 1305626
/lib64/ld-2.5.so
3d90619000-3d9061a000 r--p 00019000 09:00 1305626
/lib64/ld-2.5.so
3d9061a000-3d9061b000 rw-p 0001a000 09:00 1305626
/lib64/ld-2.5.so
3d90800000-3d90946000 r-xp 00000000 09:00 1305789
/lib64/libc-2.5.so
3d90946000-3d90b46000 ---p 00146000 09:00 1305789
/lib64/libc-2.5.so
3d90b46000-3d90b4a000 r--p 00146000 09:00 1305789
/lib64/libc-2.5.so
3d90b4a000-3d90b4b000 rw-p 0014a000 09:00 1305789
/lib64/libc-2.5.so
3d90b4b000-3d90b50000 rw-p 3d90b4b000 00:00 0
3d90c00000-3d90c82000 r-xp 00000000 09:00 1305792
/lib64/libm-2.5.so
3d90c82000-3d90e81000 ---p 00082000 09:00 1305792
/lib64/libm-2.5.so
3d90e81000-3d90e82000 r--p 00081000 09:00 1305792
/lib64/libm-2.5.so
3d90e82000-3d90e83000 rw-p 00082000 09:00 1305792
/lib64/libm-2.5.so
3d91400000-3d91415000 r-xp 00000000 09:00 1305796
/lib64/libpthread-2.5.so
3d91415000-3d91614000 ---p 00015000 09:00 1305796
/lib64/libpthread-2.5.so
3d91614000-3d91615000 r--p 00014000 09:00 1305796
/lib64/libpthread-2.5.so
3d91615000-3d91616000 rw-p 00015000 09:00 1305796
/lib64/libpthread-2.5.so
3d91616000-3d9161a000 rw-p 3d91616000 00:00 0
3d91800000-3d91814000 r-xp 00000000 09:00 2470772
/usr/lib64/libz.so.1.2.3
3d91814000-3d91a13000 ---p 00014000 09:00 2470772
/usr/lib64/libz.so.1.2.3
3d91a13000-3d91a14000 rw-p 00013000 09:00 /lib64/libc.so.6[0x3d9086f4f4]
rank 2 in job 35 phy-pc086.leeds.ac.uk_51446 caused collective
abort of all ranks
exit status of rank 2: killed by signal 9
2470772 /usr/lib64/libz.so.1.2.3
3d91c00000-3d91c07000 r-xp 00000000 09:00 1305865
/lib64/librt-2.5.so
3d91c07000-3d91e07000 ---p 00007000 09:00 1305865
/lib64/librt-2.5.so
3d91e07000-3d91e08000 r--p 00007000 09:00 1305865
/lib64/librt-2.5.so
3d91e08000-3d91e09000 rw-p 00008000 09:00 1305865
/lib64/librt-2.5.so
3d9f200000-3d9f20d000 r-xp 00000000 09:00 1305794
/lib64/libgcc_s-4.1.2-20070626.so.1
3d9f20d000-3d9f40d000 ---p 0000d000 09:00 1305794
/lib64/libgcc_s-4.1.2-20070626.so.1
3d9f40d000-3d9f40e000 rw-p 0000d000 09:00 1305794
/lib64/libgcc_s-4.1.2-20070626.so.1
2aaaaaaab000-2aaaaaaad000 rw-p 2aaaaaaab000 00:00 0
2aaaaaaad000-2aaaaabc4000 r-xp 00000000 09:02 26739102
/import/ross_a/erp/hdf5-1.6.5/hdf5/lib/libhdf5.so.0.0.0
2aaaaabc4000-2aaaaadc3000 ---p 00117000 09:02 26739102
/import/ross_a/erp/hdf5-1.6.5/hdf5/lib/libhdf5.so.0.0.0
2aaaaadc3000-2aaaaadca000 rw-p 00116000 09:02 26739102
/import/ross_a/erp/hdf5-1.6.5/hdf5/lib/libhdf5.so.0.0.0
2aaaaade1000-2aaaaade2000 rw-p 2aaaaade1000 00:00 0
2aaaaade2000-2aaaaae78000 r-xp 00000000 09:00 2454495
/usr/lib64/libgfortran.so.1.0.0
2aaaaae78000-2aaaab077000 ---p 00096000 09:00 2454495
/usr/lib64/libgfortran.so.1.0.0
2aaaab077000-2aaaab079000 rw-p 00095000 09:00 2454495
/usr/lib64/libgfortran.so.1.0.0
2aaaab079000-2aaaab07c000 rw-p 2aaaab079000 00:00 0
2aaaac000000-2aaaac021000 rw-p 2aaaac000000 00:00 0
2aaaac021000-2aaab0000000 ---p 2aaaac021000 00:00 0
7fff5629e000-7fff562b5000 rw-p 7fff5629e000 00:00 0
[stack]
ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0
[vdso]
Sorry to chuck all these errors in one email.
Thanks in advance for any assistance,
Ross Parkin