[FLASH-USERS] How to generate big checkpoint files in FLASH

Klaus Weide klaus at flash.uchicago.edu
Wed Mar 2 12:23:45 EST 2016


On Tue, 1 Mar 2016, Zheng Yuan wrote:

> Dear all,
> 
> My goal is to run FLASH using 1K MPI processes and get large checkpoint files
> (500 GB each. These large checkpoint files will be used as the test data of a
> parallel data compression algorithm proposed by our group.). Does anybody know
> how to set the parameters to run FLASH in large scale and get large checkpoint
> files?

Hello,

If you really "just" want large checkpoint files, you should start with a 3D 
setup!


> Currently, I am trying to run FLASH/Sedov to generate large checkpoint files.
> However, FLASH reports an error when I increase the dimension of the grid.
> 
> I configured FLASH using:
> 
> ./setup Sedov -auto +pnetcdf -objdir=sedov_2d -parfile=sedov_io_69b_2d.par -2d
> -nxb=64 -nyb=64 -maxblocks=200
> 
> To run more iterations, I changed the 'nend' parameter to 10 in
> sedov_2d/flash.par and make sedov_2d
> 
> I run flash using command:
> 
> mpirun -n 4 ./flash4
> 
> 
> FLASH terminate at the third iteration. The error is:

 *** Wrote plotfile to sedov_2d_6lev_ncmpi_plt_cnt_0001 ****
       n          t         dt  (         x,          y,          z) |  dt_hydro 
       1 2.0000E-10 2.5000E-05  ( 4.993E-01,  4.875E-01,  0.000E+00) |  1.265E-05
 *** Wrote checkpoint file to sedov_2d_6lev_ncmpi_chk_0001 ****
 *** Wrote plotfile to sedov_2d_6lev_ncmpi_plt_cnt_0002 ****
       2 5.0000E-05 2.5000E-05  ( 5.134E-01,  5.017E-01,  0.000E+00) |  8.673E-06
 *** Wrote checkpoint file to sedov_2d_6lev_ncmpi_chk_0002 ****
 *** Wrote plotfile to sedov_2d_6lev_ncmpi_plt_cnt_0003 ****
       3 1.0000E-04 2.5000E-05  ( 5.139E-01,  4.988E-01,  0.000E+00) |  6.407E-07
 *** Wrote checkpoint file to sedov_2d_6lev_ncmpi_chk_0003 ****
  
 Nonconvergence in subroutine rieman
 .....


By increasing NXB and NYB, you have increased the resolution. Higher
resolution requires a smaller timestep for stability; see the "dt_hydro"
column. But your flash.par fixes dtmin and dtmax at 2.5e-5, and the
direct effect of this can be seen in the "dt" column. So you are forcing
FLASH to do Hydro advances with a time step that is too large for
stability, and it is not surprising that the simulation fails after a
few steps.
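
To make the comparison between the "dt" and "dt_hydro" columns concrete, here is a minimal Python sketch that reads the step lines from the log quoted above and flags every step where the time step actually used exceeds the hydro stability limit. The regular expression and the function name unstable_steps are my own assumptions about the line layout, not part of FLASH.

```python
import re

# Step lines copied from the log above (header and "Wrote ..." lines omitted).
LOG = """\
       1 2.0000E-10 2.5000E-05  ( 4.993E-01,  4.875E-01,  0.000E+00) |  1.265E-05
       2 5.0000E-05 2.5000E-05  ( 5.134E-01,  5.017E-01,  0.000E+00) |  8.673E-06
       3 1.0000E-04 2.5000E-05  ( 5.139E-01,  4.988E-01,  0.000E+00) |  6.407E-07
"""

# Assumed layout: n, t, dt, "( x, y, z)", "|", dt_hydro
STEP = re.compile(r"^\s*(\d+)\s+(\S+)\s+(\S+)\s+\(.*\)\s*\|\s*(\S+)\s*$")


def unstable_steps(log_text):
    """Return (n, dt, dt_hydro, dt/dt_hydro) for steps where dt > dt_hydro."""
    flagged = []
    for line in log_text.splitlines():
        match = STEP.match(line)
        if match is None:
            continue  # not a step line
        n = int(match.group(1))
        dt = float(match.group(3))
        dt_hydro = float(match.group(4))
        if dt > dt_hydro:
            flagged.append((n, dt, dt_hydro, dt / dt_hydro))
    return flagged


for n, dt, dt_hydro, ratio in unstable_steps(LOG):
    print(f"step {n}: dt={dt:.3e} exceeds dt_hydro={dt_hydro:.3e} by {ratio:.1f}x")
```

For the three steps shown, the forced dt is larger than dt_hydro every time, and the gap grows from roughly 2x at step 1 to roughly 39x at step 3.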

Basically, making your runtime parameter "dtmin" much smaller should
get you a simulation that runs.

Btw, a more common way to increase resolution (thus file sizes) would
be to increase lrefine_min and/or lrefine_max and/or Nblock{X,Y,Z},
rather than keep increasing the block size.
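
As a rough illustration of how these parameters drive file size, here is a back-of-the-envelope Python sketch. It rests on several simplifying assumptions that are mine, not FLASH's: the grid is refined uniformly to a single level, only leaf blocks are counted, every value is an 8-byte double, and the number of stored variables (nvars=10) is a made-up placeholder. File metadata is ignored, so treat the result as an order-of-magnitude estimate only. The function name estimate_checkpoint_bytes is hypothetical.

```python
def estimate_checkpoint_bytes(ndim, nxb, nyb, nzb, level,
                              nblockx=1, nblocky=1, nblockz=1,
                              nvars=10, bytes_per_value=8):
    """Rough size of the cell data for a grid uniformly refined to `level`.

    Assumptions (not taken from FLASH itself): leaf blocks only, uniform
    refinement, nvars is a placeholder, no metadata overhead.
    """
    root_blocks = nblockx * nblocky * nblockz
    # Each refinement level splits a block into 2**ndim children.
    leaf_blocks = root_blocks * (2 ** ndim) ** (level - 1)
    cells_per_block = nxb * nyb * nzb
    return leaf_blocks * cells_per_block * nvars * bytes_per_value


# 2D with 64x64 blocks versus 3D with 16x16x16 blocks, both at level 6
# (the level is an illustrative value only).
two_d = estimate_checkpoint_bytes(ndim=2, nxb=64, nyb=64, nzb=1, level=6)
three_d = estimate_checkpoint_bytes(ndim=3, nxb=16, nyb=16, nzb=16, level=6)
print(f"2D: {two_d / 2**30:.2f} GiB, 3D: {three_d / 2**30:.2f} GiB")
```

Under these assumptions each extra refinement level multiplies the data volume by 4 in 2D and by 8 in 3D, which is why a 3D setup with a higher lrefine_max reaches large files much faster than bigger 2D blocks.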

Klaus


