[FLASH-USERS] Problems running the Sedov test case

Elliot Parkin phy1erp at leeds.ac.uk
Mon Feb 23 10:56:30 EST 2009


Dear flash-users,

I've been attempting to run the Sedov test case and have hit several
errors under different configurations. The problems are listed below:

1) When running the Sedov test case on 1 processor the run terminates  
with the following error:

  Driver_abort called. See log file for details.
  Error message is tmr_buildSummary: ran out of space building summary
  Calling MPI_Abort() for immediate shutdown!

... Should the run terminate with an MPI_Abort? The run reaches the
desired end time, but I would have expected it to end cleanly rather
than abort.

2) When running the Sedov case with tmax changed to 1.0 (from 0.05) I  
get the following error:


HDF5-DIAG: Error detected in HDF5 library version: 1.6.5 thread 0.  Back trace follows.
   #000: H5Dio.c line 587 in H5Dwrite(): can't write data
     major(15): Dataset interface
     minor(25): Write failed
   #001: H5Dio.c line 953 in H5D_write(): can't write data
     major(15): Dataset interface
     minor(25): Write failed
   #002: H5Dio.c line 1324 in H5D_contig_write(): optimized write failed
     major(15): Dataset interface
     minor(25): Write failed
   #003: H5Dselect.c line 654 in H5D_select_write(): write error
     major(14): Dataspace interface
     minor(25): Write failed
   #004: H5Dcontig.c line 739 in H5D_contig_writevv(): block write failed
     major(05): Low-level I/O layer
     minor(25): Write failed
   #005: H5F.c line 3338 in H5F_block_write(): file write failed
     major(05): Low-level I/O layer
     minor(25): Write failed
   #006: H5FD.c line 3486 in H5FD_write(): driver write request failed
     major(22): Virtual File Layer
     minor(25): Write failed
   #007: H5FDsec2.c line 804 in H5FD_sec2_write(): file write failed
     major(05): Low-level I/O layer
     minor(25): Write failed


... Could this be a problem with my version of HDF5 (I'm using
HDF5-1.6.5)?

3) If I attempt to use 2 processors (i.e. mpiexec -n 2 ./flash3):

HDF5-DIAG: Error detected in HDF5 library version: 1.6.5 thread 0.  Back trace follows.
   #000: H5S.c line 1584 in H5Screate_simple(): zero sized dimension for non-unlimited dimension
     major(01): Function arguments
     minor(05): Bad value
HDF5-DIAG: Error detected in HDF5 library version: 1.6.5 thread 0.  Back trace follows.
   #000: H5Dio.c line 561 in H5Dwrite(): not a data space
     major(01): Function arguments
     minor(03): Inappropriate type

... and this error repeats every time an HDF5 output file is written.
Again, I suspect this may be my version of HDF5.

4) If I attempt to use 4 processors (i.e. mpiexec -n 4 ./flash3):


*** glibc detected *** ./flash3: double free or corruption (out): 0x0000000019893fd0 ***
======= Backtrace: =========
*** glibc detected *** ./flash3: free(): invalid next size (normal): 0x000000000d7f5d70 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3d9086f4f4]
/lib64/libc.so.6(cfree+0x8c)[0x3d90872b1c]
/usr/lib64/libgfortran.so.1(_gfortran_deallocate+0x26)[0x2aaaaadf32a6]
./flash3[0x478948]
./flash3[0x43d717]
./flash3[0x459035]
./flash3[0x4bacf4]
./flash3[0x4e1ef4]
./flash3[0x40fb33]
./flash3[0x41bfb1]
./flash3[0x47fb44]
./flash3[0x41b6ea]
./flash3[0x4088f3]
./flash3[0x40d5f9]
./flash3[0x5c5d4e]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x3d9081d8a4]
./flash3[0x403899]
======= Memory map: ========
00400000-00609000 r-xp 00000000 09:02 26644999 /import/ross_a/erp/FLASH3.1.1a/object/flash3
00809000-0080c000 rw-p 00209000 09:02 26644999 /import/ross_a/erp/FLASH3.1.1a/object/flash3
0080c000-0350c000 rw-p 0080c000 00:00 0
0d7c2000-0d882000 rw-p 0d7c2000 00:00 0
3d90400000-3d9041a000 r-xp 00000000 09:00 1305626 /lib64/ld-2.5.so
3d90619000-3d9061a000 r--p 00019000 09:00 1305626 /lib64/ld-2.5.so
3d9061a000-3d9061b000 rw-p 0001a000 09:00 1305626 /lib64/ld-2.5.so
3d90800000-3d90946000 r-xp 00000000 09:00 1305789 /lib64/libc-2.5.so
3d90946000-3d90b46000 ---p 00146000 09:00 1305789 /lib64/libc-2.5.so
3d90b46000-3d90b4a000 r--p 00146000 09:00 1305789 /lib64/libc-2.5.so
3d90b4a000-3d90b4b000 rw-p 0014a000 09:00 1305789 /lib64/libc-2.5.so
3d90b4b000-3d90b50000 rw-p 3d90b4b000 00:00 0
3d90c00000-3d90c82000 r-xp 00000000 09:00 1305792 /lib64/libm-2.5.so
3d90c82000-3d90e81000 ---p 00082000 09:00 1305792 /lib64/libm-2.5.so
3d90e81000-3d90e82000 r--p 00081000 09:00 1305792 /lib64/libm-2.5.so
3d90e82000-3d90e83000 rw-p 00082000 09:00 1305792 /lib64/libm-2.5.so
3d91400000-3d91415000 r-xp 00000000 09:00 1305796 /lib64/libpthread-2.5.so
3d91415000-3d91614000 ---p 00015000 09:00 1305796 /lib64/libpthread-2.5.so
3d91614000-3d91615000 r--p 00014000 09:00 1305796 /lib64/libpthread-2.5.so
3d91615000-3d91616000 rw-p 00015000 09:00 1305796 /lib64/libpthread-2.5.so
3d91616000-3d9161a000 rw-p 3d91616000 00:00 0
3d91800000-3d91814000 r-xp 00000000 09:00 2470772 /usr/lib64/libz.so.1.2.3
3d91814000-3d91a13000 ---p 00014000 09:00 2470772 /usr/lib64/libz.so.1.2.3
3d91a13000-3d91a14000 rw-p 00013000 09:00 2470772 /usr/lib64/libz.so.1.2.3
/lib64/libc.so.6[0x3d9086f4f4]
rank 2 in job 35  phy-pc086.leeds.ac.uk_51446   caused collective abort of all ranks
   exit status of rank 2: killed by signal 9
3d91c00000-3d91c07000 r-xp 00000000 09:00 1305865 /lib64/librt-2.5.so
3d91c07000-3d91e07000 ---p 00007000 09:00 1305865 /lib64/librt-2.5.so
3d91e07000-3d91e08000 r--p 00007000 09:00 1305865 /lib64/librt-2.5.so
3d91e08000-3d91e09000 rw-p 00008000 09:00 1305865 /lib64/librt-2.5.so
3d9f200000-3d9f20d000 r-xp 00000000 09:00 1305794 /lib64/libgcc_s-4.1.2-20070626.so.1
3d9f20d000-3d9f40d000 ---p 0000d000 09:00 1305794 /lib64/libgcc_s-4.1.2-20070626.so.1
3d9f40d000-3d9f40e000 rw-p 0000d000 09:00 1305794 /lib64/libgcc_s-4.1.2-20070626.so.1
2aaaaaaab000-2aaaaaaad000 rw-p 2aaaaaaab000 00:00 0
2aaaaaaad000-2aaaaabc4000 r-xp 00000000 09:02 26739102 /import/ross_a/erp/hdf5-1.6.5/hdf5/lib/libhdf5.so.0.0.0
2aaaaabc4000-2aaaaadc3000 ---p 00117000 09:02 26739102 /import/ross_a/erp/hdf5-1.6.5/hdf5/lib/libhdf5.so.0.0.0
2aaaaadc3000-2aaaaadca000 rw-p 00116000 09:02 26739102 /import/ross_a/erp/hdf5-1.6.5/hdf5/lib/libhdf5.so.0.0.0
2aaaaade1000-2aaaaade2000 rw-p 2aaaaade1000 00:00 0
2aaaaade2000-2aaaaae78000 r-xp 00000000 09:00 2454495 /usr/lib64/libgfortran.so.1.0.0
2aaaaae78000-2aaaab077000 ---p 00096000 09:00 2454495 /usr/lib64/libgfortran.so.1.0.0
2aaaab077000-2aaaab079000 rw-p 00095000 09:00 2454495 /usr/lib64/libgfortran.so.1.0.0
2aaaab079000-2aaaab07c000 rw-p 2aaaab079000 00:00 0
2aaaac000000-2aaaac021000 rw-p 2aaaac000000 00:00 0
2aaaac021000-2aaab0000000 ---p 2aaaac021000 00:00 0
7fff5629e000-7fff562b5000 rw-p 7fff5629e000 00:00 0 [stack]
ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0 [vdso]




Sorry to chuck all these errors in one email.

Thanks in advance for any assistance,

Ross Parkin



