[FLASH-USERS] flash (sedov) runs with 4 processors, but no more ..

Anshu Dubey dubey at flash.uchicago.edu
Tue Jan 24 15:55:12 EST 2012


Could you please send me your logfile and the setup_units file from the
object directory?
I am assuming that you used the default flash.par.

Anshu

On Tue, Jan 24, 2012 at 2:49 PM, Diego <dlopezc at ncsu.edu> wrote:

> Hi everybody, I am rather new to FLASH and am having a problem when
> trying to run a simple test problem on many processors.
>
> I hope somebody has come across this kind of problem and been able to
> solve it.
>
> So:
>
> Running the Sedov test problem, I am able to run it with at most 4
> processors (with either 4 processors on one blade, 2 processors on each
> of 2 blades, or 1 processor on each of 4 blades). This runs beautifully
> for any refinement criteria, refinement levels, Courant criteria, Riemann
> solver, and interpolation scheme.
>
> Then when I try to solve exactly the same test problem with more than 4
> processors, for any blade-processor combination, I get an MPI abort.
>
> For a failing job asking for 16 processors, the last lines of sedov.log are
>
> [ 01-24-2012  14:42:52.608 ] message: rss   (MB):        29.66
> (min)         29.96 (max)         29.78 (avg)
>  [ 01-24-2012  14:42:56.165 ] [IO_writeCheckpoint] open: type=checkpoint
> name=sedov_hdf5_chk_0000
>  [ 01-24-2012  14:42:56.285 ] [IO_writeCheckpoint] close: type=checkpoint
> name=sedov_hdf5_chk_0000
>  WARNING you have called IO_writePlotfile but no plot_vars are defined.
>  put the vars you want in the plotfile in your flash.par (plot_var_1 =
> "dens")
>  [ 01-24-2012  14:42:56.309 ] [IO_writePlotfile] open: type=plotfile
> name=sedov_hdf5_plt_cnt_0000
>  [ 01-24-2012  14:42:56.418 ] [IO_writePlotfile] close: type=plotfile
> name=sedov_hdf5_plt_cnt_0000
>  [ 01-24-2012  14:42:56.508 ] message: vsize (MB):        97.23
> (min)         98.40 (max)         97.40 (avg)
>  [ 01-24-2012  14:42:56.508 ] message: rss   (MB):        29.74
> (min)         32.81 (max)         30.03 (avg)
>  [ 01-24-2012  14:42:56.508 ] [Driver_evolveFlash]: Entering evolution loop
>  [ 01-24-2012  14:42:56.509 ] step: n=1 t=0.000000E+00 dt=1.000000E-10
>  [ 01-24-2012  14:42:56.739 ] [mpi_amr_comm_setup]: buffer_dim_send=7087,
> buffer_dim_recv=2865
>
> -------------------------------------------------------------
> It seems odd that buffer_dim_send is much larger than buffer_dim_recv.
> (However, I can find other examples that continue to run with the send
> buffer only slightly larger than the receive buffer.)
>
> ---------------------------------------------------------------
> I also experimented with flash.par, specifically the lines
>
> iGridSize = 16   #global number of gridpoints along x, excluding gcells
> jGridSize = 16   #global number of gridpoints along y, excluding gcells
> kGridSize = 1
> iProcs = 4      #num procs in i direction
> jProcs = 4      #num procs in j direction
> kProcs = 1
>
> Originally iGridSize, jGridSize, and kGridSize were commented out, and
> iProcs and jProcs were set to 1. (Sedov runs OK with iProcs and jProcs
> set to 2, running on 4 processors, but not with 16.)
>
> -----------------------------------------------------------------------
> In the compilation:
>
> FDEFINES has -DNX=8 and -DNY=8; I do not know how that is set.
>
> From the sedov.log file, the compile lines were (pgi10.5 and mpich2 with
> "hydra"):
>
>  mpif90 -I/include -c -r8 -i4 -fastsse -Mnovect -pc 64 -DMAXBLOCKS=1000
> -DNXB=8 -DNYB=8 -DNZB=1 -DN_DIM=2
>  c compiler flags:
>  mpicc -I/usr/local/apps/hdf/64hydra-pgi105/5-1.8.5-patch1-i8/include
> -DH5_USE_16_API -I/include -c -O2 -DM
>
> Any help, comments, or useful information is very welcome.
>
> Thanks.
>
> Diego
>
>
>
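
The buffer sizes quoted above can be pulled out of a logfile mechanically, which makes it easier to compare a working 4-processor run with a failing 16-processor one. The following is only a minimal sketch: it assumes the log line looks exactly like the one quoted from sedov.log once the e-mail line wrap is undone, and the names LINE, PATTERN and buffer_dims are hypothetical helpers, not part of FLASH.

```python
import re

# The log line quoted in the message, with the e-mail line wrap undone.
LINE = ("[ 01-24-2012  14:42:56.739 ] [mpi_amr_comm_setup]: "
        "buffer_dim_send=7087, buffer_dim_recv=2865")

# Assumed format: "buffer_dim_send=<int>, buffer_dim_recv=<int>".
PATTERN = re.compile(r"buffer_dim_send=(\d+),\s*buffer_dim_recv=(\d+)")


def buffer_dims(text):
    """Return every (send, recv) pair found in text as integers."""
    return [(int(send), int(recv)) for send, recv in PATTERN.findall(text)]


for send, recv in buffer_dims(LINE):
    # Print the raw values and their ratio for side-by-side comparison.
    print(f"send={send} recv={recv} ratio={send / recv:.2f}")
```

Passing the full contents of two logfiles to buffer_dims gives one list per run; the ratio by itself does not say whether the run will abort.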

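The iProcs/jProcs/kProcs and iGridSize/jGridSize/kGridSize values quoted above can be sanity-checked before submitting a job. This sketch assumes the uniform-grid convention, in which the product of the three Procs values should equal the number of MPI processes and each global grid size should divide evenly among the processors in that direction. Whether these parameters matter for this particular run is itself an assumption: the mpi_amr_comm_setup line in the log suggests the run uses PARAMESH rather than the uniform grid. The function check_layout is a hypothetical helper, not a FLASH routine.

```python
def check_layout(grid, procs, nprocs):
    """Return a list of problems with a processor layout (empty if none).

    grid   -- (iGridSize, jGridSize, kGridSize)
    procs  -- (iProcs, jProcs, kProcs)
    nprocs -- number of MPI processes requested for the job
    """
    problems = []

    # Assumed rule: the processor grid must account for every MPI process.
    total = 1
    for p in procs:
        total *= p
    if total != nprocs:
        problems.append(f"iProcs*jProcs*kProcs = {total}, but job uses {nprocs}")

    # Assumed rule: each direction must split into equal-sized pieces.
    for axis, n, p in zip("ijk", grid, procs):
        if n % p != 0:
            problems.append(f"{axis}GridSize={n} not divisible by {axis}Procs={p}")

    return problems


# Values from the quoted flash.par, for a 16-process job.
print(check_layout((16, 16, 1), (4, 4, 1), 16))
# The 2x2 layout that worked on 4 processors, reused unchanged on 16.
print(check_layout((16, 16, 1), (2, 2, 1), 16))
```

An empty list only means the layout is self-consistent under these assumptions; it does not rule out other causes of the abort.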

-- 
**********************************************************************************************************
Anshu Dubey
Associate Director and CS/Applications Group Leader
Flash Center for Computational Science
Fellow, Computation Institute
University of Chicago and Argonne National Laboratory
5747 S. Ellis Avenue 3rd Flr.
773 834 2999 (office)
312 420 0033 (mobile)
773 834 3230 (fax)
**********************************************************************************************************