9.9 Output Formats

HDF5 is our most widely used IO library, although Parallel-NetCDF is rapidly gaining acceptance in the high-performance computing community. In FLASH4 we also offer a serial, direct FORTRAN IO, which is currently only implemented for the uniform grid. This option is intended to give users a way to output data when they do not have access to HDF5 or PnetCDF. Additionally, if HDF5 or PnetCDF are not performing well on a given platform, the direct IO implementation can be used as a last resort. Our tools, fidlr and sfocu (Prt:Tools), do not currently support the direct IO implementation, and the output files from this mode are not portable across platforms.


9.9.1 HDF5

HDF5 is supported on a large variety of platforms and offers large-file support and parallel IO via MPI-IO. Information about the different versions of HDF can be found at https://support.hdfgroup.org/documentation/. The FLASH4 IO implementations require HDF5 1.4.0 or later. Please note that HDF5 1.6.2 requires IDL 1.6 or higher in order to use fidlr3.0 for post-processing.

Implementations of the HDF5 IO unit use the HDF5 application programming interface (API) for organizing data in a database-like fashion. In addition to the raw data, the file stores the data type, byte ordering (little- or big-endian), rank, and dimensions of each dataset. This makes the HDF5 format extremely portable across platforms: different packages can query a file for its contents without knowing the details of the routine that generated the data.

FLASH provides different HDF5 IO unit implementations: serial and parallel versions for each supported grid, the Uniform Grid and PARAMESH. It is important to match the IO implementation with the correct grid, although the setup script generally takes care of this matching. PARAMESH 2, PARAMESH 4.0, and PARAMESH 4dev all work with the PARAMESH (PM) implementation of IO. Nonfixed-blocksize IO has its own parallel implementation and is presently not supported in serial mode. Examples of the five different HDF5 IO implementations are given below.

 

./setup Sod -2d -auto -unit=IO/IOMain/hdf5/serial/PM (included by default)
./setup Sod -2d -auto -unit=IO/IOMain/hdf5/parallel/PM
./setup Sod -2d -auto -unit=Grid/GridMain/UG -unit=IO/IOMain/hdf5/serial/UG
./setup Sod -2d -auto -unit=Grid/GridMain/UG -unit=IO/IOMain/hdf5/parallel/UG
./setup Sod -2d -auto -nofbs -unit=Grid/GridMain/UG -unit=IO/IOMain/hdf5/parallel/NoFbs

The default IO implementation is IO/IOMain/hdf5/serial/PM. It can be included simply by adding -unit=IO to the setup line. In FLASH4, the user can set up shortcuts for various implementations. See Chp:The FLASH configuration script for more information about creating shortcuts.

The format of the HDF5 output files produced by these various IO implementations is identical; only the method by which they are written differs. It is possible to create a checkpoint file with the parallel routines and restart FLASH from that file using the serial routines, or vice versa. (Such a switch requires re-running setup and recompiling to produce an executable with the other version of IO.) When outputting with the Uniform Grid, some data is stored that isn't strictly necessary for data analysis or visualization, but is retained so that the Uniform Grid output format matches that of PARAMESH. See Sec:Data Format for more information on output data formats. For example, the refinement level in the Uniform Grid case is always equal to 1, as is the nodetype array; a tree structure for the Uniform Grid is 'faked' for visualization purposes. In a similar way, the non-fixedblocksize mode outputs all of the data stored by the grid as though it were one large block. This allows restarting with a different number of processors and decomposing the domain in an arbitrary fashion in the Uniform Grid.

Parallel HDF5 mode has two runtime parameters useful for debugging: chkGuardCellsInput and chkGuardCellsOutput. When these runtime parameters are true, the FLASH4 input and output routines read and/or output the guard cells in addition to the normal interior cells. Note that the HDF5 files produced are not compatible with the visualization and analysis tools provided with FLASH4.
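For example, a flash.par fragment that turns on both parameters (for debugging only) might read:

# include guard cells when writing and reading checkpoint data;
# the resulting files cannot be used with the FLASH4 analysis tools
chkGuardCellsOutput = .true.
chkGuardCellsInput = .true.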

FLASH3 Transition: We recommend using HDF5 version 1.6 or later with the FLASH4 HDF5 IO implementations. While it is possible to use any version from 1.4.0 onward, files produced with versions predating 1.6 will not be compatible with code built against HDF5 1.6 and later.

Caution: If you are using HDF5 version 1.8 or later, then you must explicitly use the HDF5 1.6 API bindings. Either build the HDF5 library with the --with-default-api-version=v16 configure option, or compile FLASH with the C preprocessor definition H5_USE_16_API. Our preference is to set the CFLAGS_HDF5 Makefile.h variable, e.g., for compilation with most compilers:
CFLAGS_HDF5 = -I${HDF5_PATH}/include -DH5_USE_16_API


9.9.1.1 Collective Mode

By default, the parallel mode of HDF5 uses an independent access pattern for writing datasets and performs IO without aggregating the disk access for writing. Parallel HDF5 can also be run so that the writes to the file's datasets are aggregated, allowing the data from multiple processors to be written to disk in fewer operations. This can greatly increase the performance of IO on filesystems that support this behavior. FLASH4 can make use of this mode by setting the runtime parameter useCollectiveHDF5 to true.
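Enabling this behavior is a one-line change in the flash.par file:

# aggregate dataset writes across processors (parallel HDF5 only)
useCollectiveHDF5 = .true.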

9.9.1.2 Machine Compatibility

The HDF5 modules have been tested successfully on the ASC platforms and on Linux clusters. Performance varies widely across platforms, but the parallel version is usually faster than the serial version. Experience with parallel IO on a Linux cluster using PVFS is reported in Ross et al. (2001). Note that on clusters without a parallel filesystem, you should not use the parallel HDF5 IO module with an NFS-mounted filesystem. In this case, all of the data would still have to pass through the node hosting the disk, resulting in contention. It is recommended that a serial version of the HDF5 unit be used instead.


9.9.1.3 HDF5 Data Format

The HDF5 data format for FLASH4 is identical to that of FLASH2 for all grid variables and the data structures used to recreate the tree and neighbor data, with one exception: bounding box, coordinates, and block size are now sized as mdim (the maximum number of dimensions supported by FLASH's grids, which is three) rather than ndim. PARAMESH 4.0 and PARAMESH 4dev, however, do require a few additional tree data structures to be output, which are described below. The format of the metadata stored in the HDF5 files has changed to reduce the number of writes required. Additionally, scalar data such as time, dt, nstep, etc. are now stored in a linked list and written all at one time. Any unit can add scalar data to the checkpoint file by calling the routine IO_setScalar; see Sec:output scalars for more details. The FLASH4 HDF5 format is summarized in Table 9.7.

Table 9.7: FLASH HDF5 file format.
Record label Description of the record
   
Simulation Meta Data: included in all files
   
sim info Stores simulation metadata in a user-defined C structure. The structure datatype and the attributes of the structure are described below.
 
typedef struct sim_info_t {
  int file_format_version;
  char setup_call[400];
  char file_creation_time[MAX_STRING_LENGTH];
  char flash_version[MAX_STRING_LENGTH];
  char build_date[MAX_STRING_LENGTH];
  char build_dir[MAX_STRING_LENGTH];
  char build_machine[MAX_STRING_LENGTH];
  char cflags[400];
  char fflags[400];
  char setup_time_stamp[MAX_STRING_LENGTH];
  char build_time_stamp[MAX_STRING_LENGTH];
} sim_info_t;
sim_info_t sim_info;
 
sim_info.file_format_version: An integer giving the version number of the HDF5 file format. This is incremented anytime changes are made to the layout of the file.
   
sim_info.setup_call: The complete syntax of the setup command used when creating the current FLASH executable.
   
sim_info.file_creation_time: The time and date that the file was created.
   
sim_info.flash_version: The version of FLASH used for the current simulation. This is returned by routine setup_flashVersion.
   
sim_info.build_date: The date and time that the FLASH executable was compiled.
   
sim_info.build_dir: The complete path to the FLASH root directory of the source tree used when compiling the FLASH executable. This is generated by the subroutine setup_buildstats which is created at compile time by the Makefile.
   
sim_info.build_machine: The name of the machine (and anything else returned from uname -a) on which FLASH was compiled.
   
sim_info.cflags: The C compiler flags used in the given simulation. The routine setup_buildstats is written by the setup script at compile time and also includes the fflags below.
   
sim_info.fflags: The Fortran compiler flags used in the given simulation.
   
sim_info.setup_time_stamp: The date and time the given simulation was setup. The routine setup_buildstamp is created by the setup script at compile time.
   
sim_info.build_time_stamp: The date and time the given simulation was built. The routine setup_buildstamp is created by the setup script at compile time.
   
RuntimeParameter and Scalar data
Runtime parameter and scalar data are stored in linked lists; the node structure for each list type is shown below.
 
typedef struct int_list_t {
  char name[MAX_STRING_LENGTH];
  int value;
} int_list_t;

typedef struct real_list_t {
  char name[MAX_STRING_LENGTH];
  double value;
} real_list_t;

typedef struct str_list_t {
  char name[MAX_STRING_LENGTH];
  char value[MAX_STRING_LENGTH];
} str_list_t;

typedef struct log_list_t {
  char name[MAX_STRING_LENGTH];
  int value;
} log_list_t;

int_list_t *int_list;
real_list_t *real_list;
str_list_t *str_list;
log_list_t *log_list;
 
integer runtime parameters int_list_t int_list(numIntParams)
  A linked list holding the names and values of all the integer runtime parameters.
   
real runtime parameters real_list_t real_list(numRealParams)
  A linked list holding the names and values of all the real runtime parameters.
   
string runtime parameters str_list_t str_list(numStrParams)
  A linked list holding the names and values of all the string runtime parameters.
   
logical runtime parameters log_list_t log_list(numLogParams)
  A linked list holding the names and values of all the logical runtime parameters.
   
integer scalars int_list_t int_list(numIntScalars)
  A linked list holding the names and values of all the integer scalars.
   
real scalars real_list_t real_list(numRealScalars)
  A linked list holding the names and values of all the real scalars.
   
string scalars str_list_t str_list(numStrScalars)
  A linked list holding the names and values of all the string scalars.
   
logical scalars log_list_t log_list(numLogScalars)
  A linked list holding the names and values of all the logical scalars.
   
Grid data: included only in checkpoint files and plotfiles
unknown names character*4 unk_names(nvar)
This array contains four-character names corresponding to the first index of the unk array. They serve to identify the variables stored in the 'unknowns' records.
   
refine level integer lrefine(globalNumBlocks)
  This array stores the refinement level for each block.
   
node type integer nodetype(globalNumBlocks)
  This array stores the node type for a block. Blocks with node type 1 are leaf nodes, and their data will always be valid. The leaf blocks contain the data which is to be used for plotting purposes.
   
gid integer gid(nfaces+1+nchild,globalNumBlocks)
This is the global identification array. For a given block, this array gives the block number of the blocks that neighbor it and the block numbers of its parent and children. In 2-d, for example, nfaces = 4 and nchild = 4, so each column holds nine entries: the four face neighbors, then the parent, then the four children.
   
coordinates real coord(mdim,globalNumBlocks)
  This array stores the coordinates of the center of the block.
      coord(1,blockID) = x-coordinate
      coord(2,blockID) = y-coordinate
       coord(3,blockID) = z-coordinate
   
block size real size(mdim,globalNumBlocks)
  This array stores the dimensions of the current block.
      size(1,blockID) = x size
      size(2,blockID) = y size
      size(3,blockID) = z size
bounding box real bnd_box(2,mdim,globalNumBlocks)
  This array stores the minimum (bnd_box(1,:,:)) and maximum (bnd_box(2,:,:)) coordinate of a block in each spatial direction.
which child (Paramesh4.0 and Paramesh4dev only!) integer which_child(globalNumBlocks)
  An integer array identifying which part of the parents' volume this child corresponds to.
variable real unk(nxb,nyb,nzb,globalNumBlocks)
      nxb = number of cells per block in x
      nyb = number of cells per block in y
      nzb = number of cells per block in z
  This array holds the data for a single variable. The record label is identical to the four-character variable name stored in the record unknown names. Note that, for a plot file with CORNERS=.true. in the parameter file, the data are interpolated to the cell corners and stored.
   
   
Particle Data: included in checkpoint files and particle files
localnp integer localnp(globalNumBlocks)
  This array holds the number of particles on each processor.
   
particle names character*24 particle_labels(NPART_PROPS)
This array contains twenty-four-character names corresponding to the attributes in the particles array. They serve to identify the variables stored in the 'particles' record.
   
tracer particles real particles(NPART_PROPS, globalNumParticles)
  Real array holding the particles data structure. The first dimension holds the various particle properties, such as velocity, tag, etc.; the second dimension is sized as the total number of particles in the simulation. Note that all particle properties are real values.
   

 
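As an illustration of how external tools can query these records, the following minimal C sketch opens a checkpoint and counts the leaf blocks via the node type record. It assumes the HDF5 1.6 API bindings discussed above; the checkpoint file name is hypothetical, and error checking is kept to a minimum.

#include <stdio.h>
#include <stdlib.h>
#include <hdf5.h>   /* compile with -DH5_USE_16_API for HDF5 >= 1.8 */

int main(void)
{
  /* Open a FLASH checkpoint read-only; the file name is hypothetical. */
  hid_t file = H5Fopen("sod_hdf5_chk_0000", H5F_ACC_RDONLY, H5P_DEFAULT);
  if (file < 0) { fprintf(stderr, "cannot open checkpoint\n"); return 1; }

  /* "node type" is one of the records listed in Table 9.7. */
  hid_t dset  = H5Dopen(file, "node type");       /* 1.6-style binding */
  hid_t space = H5Dget_space(dset);
  hsize_t nblocks;
  H5Sget_simple_extent_dims(space, &nblocks, NULL);  /* rank-1 dataset */

  int *nodetype = malloc(nblocks * sizeof(int));
  H5Dread(dset, H5T_NATIVE_INT, H5S_ALL, H5S_ALL, H5P_DEFAULT, nodetype);

  /* Node type 1 marks leaf blocks, whose data are always valid. */
  int leaves = 0;
  for (hsize_t i = 0; i < nblocks; i++)
    if (nodetype[i] == 1) leaves++;
  printf("%llu blocks, %d leaves\n", (unsigned long long)nblocks, leaves);

  free(nodetype);
  H5Sclose(space); H5Dclose(dset); H5Fclose(file);
  return 0;
}

Built against HDF5 1.8 or later, the same source compiles unchanged as long as H5_USE_16_API is defined, which is exactly the compatibility point made in the Caution above.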

9.9.1.4 Split File IO

On machines with large numbers of processors, IO may perform better if all processors write to a limited number of separate files rather than to one single file. This technique can help mitigate IO bottlenecks and contention issues on these large machines better than even parallel-mode IO can. In addition, this technique has the benefit of keeping the number of output files much lower than if every processor wrote its own file. Split file IO can be enabled by setting the outputSplitNum parameter to the number of files desired (i.e., if outputSplitNum is set to 4, every checkpoint, plotfile, and particle file will be broken into 4 files, split by processor number). This feature is only available with the HDF5 parallel IO mode and is still experimental; users should use it at their own risk.
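For example, in flash.par (the value 4 is illustrative):

# write each checkpoint, plotfile, and particle file as 4 split files
outputSplitNum = 4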


9.9.2 Parallel-NetCDF

Another implementation of the IO unit uses the Parallel-NetCDF library, available at
http://www.mcs.anl.gov/parallel-netcdf/. At this time, the FLASH code requires version 1.1.0 or higher. Our testing shows the performance of the PnetCDF library to be very similar to that of HDF5 when collective I/O optimizations are used in parallel I/O mode.

There are two different PnetCDF IO unit implementations. Both are parallel implementations, one for each supported grid, the Uniform Grid and PARAMESH. It is important to remember to match the IO implementation with the correct grid. To include PnetCDF IO in a simulation the user should add -unit=IO/IOMain/pnetcdf..... to the setup line. See examples below for the two different PnetCDF IO implementations.

 

./setup Sod -2d -auto -unit=IO/IOMain/pnetcdf/PM
./setup Sod -2d -auto -unit=Grid/GridMain/UG -unit=IO/IOMain/pnetcdf/UG

The paths to these IO implementations can be long and tedious to type, so users are advised to set up shortcuts for the various implementations. See Chp:The FLASH configuration script for information about creating shortcuts.

To the end-user, the PnetCDF data format is very similar to the HDF5 format. (Under the hood, the data storage is quite different.) Where HDF5 has datasets and dataspaces, PnetCDF has dimensions and variables. All the same data are stored in a PnetCDF checkpoint as in an HDF5 checkpoint file, although there are some differences in how the data are stored. The grid data are stored in multidimensional arrays, as in HDF5: unknown names, refine level, node type, gid, coordinates, proc number, block size, and bounding box. The particles data structure is also stored in the same way. The simulation metadata, like file format version, file creation time, setup command line, etc., are stored as global attributes, as are the runtime parameters, the output scalars, and the unk and particle labels. In PnetCDF, all global quantities must be consistent across all processors involved in a write to a file, or the write will fail. All IO calls are run in collective mode in PnetCDF.
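A comparable minimal C sketch (again with a hypothetical file name) opens a PnetCDF checkpoint and lists its variables and the number of global attributes, reflecting the dimensions/variables/attributes layout just described:

#include <stdio.h>
#include <mpi.h>
#include <pnetcdf.h>

int main(int argc, char **argv)
{
  int ncid, ndims, nvars, ngatts, unlimdim;
  char name[NC_MAX_NAME + 1];

  MPI_Init(&argc, &argv);
  /* Open a FLASH PnetCDF checkpoint read-only; the name is hypothetical. */
  ncmpi_open(MPI_COMM_WORLD, "sod_ncmpi_chk_0000",
             NC_NOWRITE, MPI_INFO_NULL, &ncid);

  /* Counts of dimensions, variables, and global attributes. */
  ncmpi_inq(ncid, &ndims, &nvars, &ngatts, &unlimdim);
  printf("%d dimensions, %d variables, %d global attributes\n",
         ndims, nvars, ngatts);

  /* Variable names correspond to the records of Table 9.7. */
  for (int i = 0; i < nvars; i++) {
    ncmpi_inq_varname(ncid, i, name);
    printf("variable %2d: %s\n", i, name);
  }

  ncmpi_close(ncid);
  MPI_Finalize();
  return 0;
}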

9.9.3 Direct IO

As mentioned above, the direct IO implementation has been added so that users can always output data, even if the HDF5 or PnetCDF libraries are unavailable. The user should examine the two helper routines io_writeData and io_readData, copy the base implementation to a simulation directory, and modify them to write out exactly what is needed. To include the direct IO implementation, add one of the following to your setup line:
-unit=IO/IOMain/direct/UG or -unit=IO/IOMain/direct/PM

9.9.4 Output Side Effects

In FLASH4, when plotfiles or checkpoint files are output by IO_output, the grid is fully restricted and user variables are computed prior to writing the file. By default, IO_writeCheckpoint and IO_writePlotfile do not perform this step themselves. Restriction can be forced for all writes by setting the runtime parameter alwaysRestrictCheckpoint to true, and the user variables can always be computed prior to output by setting alwaysComputeUserVars to true.
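For example, to force both behaviors on every write, set in flash.par:

alwaysRestrictCheckpoint = .true.
alwaysComputeUserVars = .true.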