[FLASH-USERS] restarting FLASH

Josef Stöckl josef.stoeckl at uibk.ac.at
Mon Aug 25 18:43:39 EDT 2008


Hi Mateusz,

There is a distinct restart bug in the serial HDF5 IO unit, which I described
a few months back on the flash-bugs mailing list. Basically, in a 3D problem
the wrong buffer variable gets used (most likely due to a copy-and-paste
error): the z-face data is received into unk instead of faceZBuf. I also
posted a fix, which consists of modifying one line in the file
IO/IOMain/hdf5/serial/PM/io_readData.F90:

     if(NDIM .gt. 2) then

           allocate(faceZBuf(NUNK_VARS, NXB, NYB, NZB+1, localNumBlocks))
-          call MPI_RECV(unk(i,:,:,:,1:localNumBlocks), &
+          call MPI_RECV(faceZBuf(i,:,:,:,1:localNumBlocks), &
                NXB*NYB*(NZB+1)*localNumBlocks, &
                FLASH_REAL, MASTER_PE, &
                9+i+NUNK_VARS+(NFACE_VARS*2), &
                MPI_COMM_WORLD, status, ierr)
           facevarz(i,io_ilo:io_ihi, io_jlo:io_jhi, &
                io_klo:io_khi+1,1:localNumBlocks) = &
                faceZBuf(i,1:NXB,1:NYB,1:NZB+1,1:localNumBlocks)
           deallocate(faceZBuf)

     end if

(The snippet above uses a unified-diff-like notation: the line marked "-" is
removed and the line marked "+" replaces it.)
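
As a rough illustration of why faceZBuf is the natural receive target, the
following stand-alone sketch checks that the element count passed to MPI_RECV
in the patch equals the number of elements in one variable's slice of
faceZBuf. It does not call MPI and is not part of FLASH; the program name and
the block sizes and counts (NXB, NYB, NZB, NUNK_VARS, localNumBlocks set to
small made-up values) are assumptions chosen only for the example.

```fortran
program face_buffer_count
  implicit none
  ! Illustration values only; a real FLASH setup defines these elsewhere.
  integer, parameter :: NXB = 8, NYB = 8, NZB = 8
  integer, parameter :: NUNK_VARS = 3
  integer, parameter :: localNumBlocks = 4
  real, allocatable :: faceZBuf(:,:,:,:,:)
  integer :: i, recvCount, sliceSize

  ! Same shape as the allocation in the patched code: z faces need NZB+1 planes.
  allocate(faceZBuf(NUNK_VARS, NXB, NYB, NZB+1, localNumBlocks))
  faceZBuf = 0.0

  i = 1
  ! Count expression copied from the MPI_RECV call in the patch.
  recvCount = NXB*NYB*(NZB+1)*localNumBlocks
  ! Number of elements in the slice that the fixed line receives into.
  sliceSize = size(faceZBuf(i,:,:,:,1:localNumBlocks))

  print *, 'count passed to MPI_RECV   :', recvCount
  print *, 'elements in faceZBuf slice :', sliceSize
  if (recvCount /= sliceSize) error stop 'count and buffer slice disagree'

  deallocate(faceZBuf)
end program face_buffer_count
```

The check passes for any positive sizes, since both numbers are the same
product; the point is only that the count in the call was written for
faceZBuf, not for unk.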

I hope this helps you!

Best regards,
Josef


-----Original Message-----
From: flash-users-bounces at flash.uchicago.edu
[mailto:flash-users-bounces at flash.uchicago.edu] On Behalf Of
mateuszr at umich.edu
Sent: Saturday, 23 August 2008 20:53
To: flash-users at flash.uchicago.edu
Subject: [FLASH-USERS] restarting FLASH


   Hi all,

I am having trouble restarting the code. More specifically, I am
getting a segmentation fault when I try to restart from a valid
checkpoint file. The details are enclosed below. I would be very
grateful for some clues.

   thanks,
     Mateusz


[mateuszr at galaxy ~/FLASH]$ more out.txt

  [io_readData] Opening test_hdf5_chk_0011 for restart
rank 1 in job 2  galaxy001.astro.lsa.umich.edu_38345   caused  
collective abort of all ranks
   exit status of rank 1: killed by signal 9

[mateuszr at galaxy ~/FLASH]$ more flash_run.err
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source
flash3             000000000050CC48  Unknown               Unknown  Unknown
flash3             000000000043EEBA  Unknown               Unknown  Unknown
flash3             000000000043C359  Unknown               Unknown  Unknown
flash3             0000000000410513  Unknown               Unknown  Unknown
flash3             0000000000415C19  Unknown               Unknown  Unknown
flash3             0000000000406982  Unknown               Unknown  Unknown
libc.so.6          00000035AFE1D8A4  Unknown               Unknown  Unknown
flash3             00000000004068A9  Unknown               Unknown  Unknown
touch: cannot touch `/scratch///.keep': Permission denied






