[FLASH-USERS] Parallel I/O on Stampede

Christopher Daley cdaley at flash.uchicago.edu
Wed May 8 11:58:41 EDT 2013


Hi Meghann,

The issue you report is caused by an incompatibility between
different I/O implementations in FLASH.  Traditionally, a FLASH
variable, such as 'dens', is stored in an HDF5 dataset with a name that
is padded to 4 characters.  This means that a FLASH variable named 'h'
is actually stored in an HDF5 dataset named 'h   '.  The newer
+hdf5typeio does not pad variable names, so 'h' is stored in an HDF5
dataset named 'h'.
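
To make the naming difference concrete, here is a small Python sketch.  It is
an illustration only, not FLASH or HDF5 code: the helper padded_name and the
in-memory set of dataset names are made up for this example, and the variable
list is taken from the screen output quoted below.

```python
# Illustration only: mimics the two dataset-naming conventions described above.
# `padded_name` is a hypothetical helper, not part of FLASH or HDF5.

def padded_name(var: str, width: int = 4) -> str:
    """Return the variable name right-padded with spaces to `width` characters."""
    return var.ljust(width)

VARIABLES = ("dens", "h", "hd", "hel", "hep")

# Dataset names as the traditional I/O would write them (padded to 4 characters).
datasets_in_file = {padded_name(v) for v in VARIABLES}

for var in VARIABLES:
    # An unpadded lookup, as a reader using the +hdf5typeio convention would do.
    found = var in datasets_in_file
    print(f"{var!r}: stored as {padded_name(var)!r}, unpadded lookup found = {found}")
```

Only names that are already 4 characters long, such as 'dens', match under
both conventions; shorter names such as 'h', 'hd', 'hel' and 'hep' do not.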

You will only encounter this incompatibility if you use different
FLASH I/O implementations during the course of a FLASH run,
e.g. if you create a checkpoint file with the default I/O (+serialio) and
then restart from this checkpoint file using a FLASH application set up
with derived datatype parallel I/O (+hdf5typeio).

I've created a patch file against the FLASH-4.0.1 release which will allow
you to switch between different I/O implementations.  There may be
issues applying this patch against older versions of FLASH.
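
For readers who want a feel for how a reader could tolerate both naming
conventions, here is a hedged Python sketch.  It is not taken from the patch
and does not claim to describe what the patch actually changes; the function
resolve_dataset is hypothetical and simply tries the unpadded name first and
the 4-character padded name second.

```python
# Hypothetical sketch, NOT the contents of the patch: resolve a FLASH variable
# name to a dataset name, accepting either naming convention.
from typing import Iterable, Optional


def resolve_dataset(var: str, names: Iterable[str], width: int = 4) -> Optional[str]:
    """Return the dataset name matching `var`, or None if neither form exists."""
    available = set(names)
    # Try the unpadded form first, then the space-padded form.
    for candidate in (var, var.ljust(width)):
        if candidate in available:
            return candidate
    return None


# A file written with padded names and a file written with unpadded names.
padded_file = ["dens", "h   ", "hd  "]
unpadded_file = ["dens", "h", "hd"]

print(repr(resolve_dataset("h", padded_file)))
print(repr(resolve_dataset("h", unpadded_file)))
print(repr(resolve_dataset("hel", padded_file)))
```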


To apply the patch file:

  * Download the file fix_io_incompatibility.diff from
http://flash.uchicago.edu/~cdaley/FLASH4.0.1_patches/fix_io_incompatibility.diff

  * Change to the top level directory (i.e., the directory containing the
    source, sites, tools, etc. subdirectories) of your existing FLASH 4.0.1
    directory tree.  Move the patch file into this directory.

  * Run

     patch -p0 --dry-run < fix_io_incompatibility.diff

    to verify that the patch can be applied cleanly.

    If you get error messages, you are probably in the wrong directory, or
    your FLASH version is not 4.0.1, or you have made local changes that
    interfere with the patch. If you cannot resolve the problem,
    revert to a clean version of FLASH-4.0.1 and try again.

  * Run

     patch -p0 < fix_io_incompatibility.diff

    to apply the patch.
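
The two patch steps above can also be chained so that the patch is applied
only if the dry run succeeds.  The following Python sketch is one way to do
that; the function names are made up for this example, and it assumes that
the patch program is on your PATH and that the patch file has already been
moved into the top level FLASH directory.

```python
# Sketch: run "patch -p0 --dry-run" first and apply the patch only if it is clean.
# Function names are hypothetical; the patch options are the ones used above.
import subprocess
from pathlib import Path
from typing import List

PATCH_FILE = "fix_io_incompatibility.diff"


def patch_command(dry_run: bool) -> List[str]:
    """Build the patch command line; the patch itself is fed on stdin."""
    cmd = ["patch", "-p0"]
    if dry_run:
        cmd.append("--dry-run")
    return cmd


def run_patch(patch_file: str, dry_run: bool, cwd: str = ".") -> bool:
    """Run patch inside `cwd`; return True if it exits with status 0."""
    with open(Path(cwd) / patch_file, "rb") as fh:
        result = subprocess.run(patch_command(dry_run), stdin=fh, cwd=cwd)
    return result.returncode == 0


def apply_if_clean(patch_file: str = PATCH_FILE, cwd: str = ".") -> bool:
    """Apply the patch only when the dry run reports no problems."""
    if not run_patch(patch_file, dry_run=True, cwd=cwd):
        print("Dry run failed; nothing was changed.")
        return False
    return run_patch(patch_file, dry_run=False, cwd=cwd)
```

To use it, call apply_if_clean(cwd=...) with the path of your own FLASH 4.0.1
top level directory.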


Chris


On 04/28/2013 07:07 AM, Meghann Agarwal wrote:
> Hello,
>
> Has anyone been able to run FLASH4 on Stampede with parallel I/O?
>
> My code successfully compiles after setup with flag +hdf5typeio and 
> loading module phdf5.  I restart from a checkpoint file written in 
> serial, but it seems that not all grid variables/data from the input 
> file are transferred to memory.
>
> Thanks,
> Meghann
>
> Some screen output:
>
>  file: rad_hdf5_chk_0028 opened for restart
>  [io_h5_read_file_format.c]: File format version is 9.
>  [io_h5_xfer_mesh_dataset.c]: Couldn't find variable 'h' in file, so skipping it.
>  [io_h5_xfer_mesh_dataset.c]: Couldn't find variable 'hd' in file, so skipping it.
>  [io_h5_xfer_mesh_dataset.c]: Couldn't find variable 'hel' in file, so skipping it.
>  [io_h5_xfer_mesh_dataset.c]: Couldn't find variable 'hep' in file, so skipping it.
>  read_data:  read        59241  blocks.
>  io_readData:  finished reading input file.
> ...
> WARNING after gc filling: min. unk(DENS_VAR)=0.5002375168268448933932E-40  PE=124  block=1  type=1
> ...
>
>



