Could you please send me your log file and the setup_units file from the object directory?<br>I am assuming that you used the default flash.par.<br><br>Anshu<br><br><div class="gmail_quote">On Tue, Jan 24, 2012 at 2:49 PM, Diego <span dir="ltr">&lt;<a href="mailto:dlopezc@ncsu.edu">dlopezc@ncsu.edu</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi everybody, I am rather new to FLASH and am having a problem running a simple test problem on many processors.<br>
<br>I hope somebody has come across this kind of problem and been able to solve it.<br><br>
So:<br><br>I can run the Sedov test problem on at most 4 processors (4 processors on one blade, 2 processors on each of 2 blades, or 1 processor on each of 4 blades). It runs beautifully for any refinement criteria, refinement level, Courant criterion, Riemann solver, and interpolation scheme.<br>
<br>When I try to run exactly the same test problem on more than 4 processors, for any blade-processor combination, I get an MPI abort.<br><br>For a failing job requesting 16 processors, the last lines of sedov.log are:<br>
<br>[ 01-24-2012 14:42:52.608 ] message: rss (MB): 29.66 (min) 29.96 (max) 29.78 (avg)<br> [ 01-24-2012 14:42:56.165 ] [IO_writeCheckpoint] open: type=checkpoint name=sedov_hdf5_chk_0000<br> [ 01-24-2012 14:42:56.285 ] [IO_writeCheckpoint] close: type=checkpoint name=sedov_hdf5_chk_0000<br>
WARNING you have called IO_writePlotfile but no plot_vars are defined.<br> put the vars you want in the plotfile in your flash.par (plot_var_1 = "dens")<br> [ 01-24-2012 14:42:56.309 ] [IO_writePlotfile] open: type=plotfile name=sedov_hdf5_plt_cnt_0000<br>
[ 01-24-2012 14:42:56.418 ] [IO_writePlotfile] close: type=plotfile name=sedov_hdf5_plt_cnt_0000<br> [ 01-24-2012 14:42:56.508 ] message: vsize (MB): 97.23 (min) 98.40 (max) 97.40 (avg)<br> [ 01-24-2012 14:42:56.508 ] message: rss (MB): 29.74 (min) 32.81 (max) 30.03 (avg)<br>
[ 01-24-2012 14:42:56.508 ] [Driver_evolveFlash]: Entering evolution loop<br> [ 01-24-2012 14:42:56.509 ] step: n=1 t=0.000000E+00 dt=1.000000E-10<br> [ 01-24-2012 14:42:56.739 ] [mpi_amr_comm_setup]: buffer_dim_send=7087, buffer_dim_recv=2865<br>
<br>-------------------------------------------------------------<br>It seems odd that buffer_dim_send is so much larger than buffer_dim_recv (although I can find other examples that continue to run, in which the send buffer is only slightly larger than the receive buffer).<br><br>---------------------------------------------------------------<br>
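To compare the two buffer sizes across several runs, the values can be pulled out of each log with a short script. The following is only a sketch in Python; it assumes the mpi_amr_comm_setup line always has the format shown in the log excerpt above, and the helper name buffer_dims is made up for this illustration.<br>
<pre>
```python
import re

# Matches the mpi_amr_comm_setup line as it appears in the log excerpt above.
# Assumption: the two fields always appear in this order, separated by a comma.
BUFFER_RE = re.compile(r"buffer_dim_send=(\d+),\s*buffer_dim_recv=(\d+)")


def buffer_dims(log_text):
    """Return a list of (send, recv) integer pairs found in log_text."""
    return [(int(send), int(recv)) for send, recv in BUFFER_RE.findall(log_text)]


# The last log line quoted in this message, used as sample input.
sample = (
    " [ 01-24-2012 14:42:56.739 ] [mpi_amr_comm_setup]: "
    "buffer_dim_send=7087, buffer_dim_recv=2865"
)

for send, recv in buffer_dims(sample):
    print(f"send={send} recv={recv} ratio={send / recv:.2f}")
```
</pre>
Running the same function over the log of a job that completes would show whether the ratio really differs between working and failing runs.<br>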
I also experimented with flash.par, specifically the lines:<br><br>iGridSize = 16 #global number of gridpoints along x, excluding gcells<br>jGridSize = 16 #global number of gridpoints along y, excluding gcells<br>kGridSize = 1<br>iProcs = 4 #num procs in i direction<br>
jProcs = 4 #num procs in j direction<br>kProcs = 1<br><br>Originally iGridSize, jGridSize, and kGridSize were commented out, and iProcs and jProcs were set to 1. (With these lines, Sedov runs fine with iProcs and jProcs set to 2, running on 4 processors, but not on 16.)<br>
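<br>A quick consistency check of these settings against the job size can be scripted. This is a hedged sketch in Python, not part of FLASH: it assumes the convention that iProcs*jProcs*kProcs should equal the number of MPI processes and that each grid size should divide evenly among its processors. Whether these parameters are read at all in an adaptive-mesh build is not established in this thread. The names parse_par and check_proc_grid are hypothetical.<br>
<pre>
```python
def parse_par(text):
    """Parse 'name = value  # comment' lines into a dict of strings."""
    params = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop any trailing comment
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        params[name.strip()] = value.strip()
    return params


def check_proc_grid(params, nprocs):
    """Return a list of problems; an empty list means the layout is consistent.

    Assumed rules (not confirmed for every FLASH grid type):
      1. iProcs * jProcs * kProcs equals the number of MPI processes.
      2. Each global grid size is divisible by the processor count on that axis.
    """
    problems = []
    procs = [int(params.get(key, 1)) for key in ("iProcs", "jProcs", "kProcs")]
    product = procs[0] * procs[1] * procs[2]
    if product != nprocs:
        problems.append(
            f"iProcs*jProcs*kProcs = {product}, but the job has {nprocs} processes"
        )
    for size_key, count in zip(("iGridSize", "jGridSize", "kGridSize"), procs):
        if size_key in params and int(params[size_key]) % count != 0:
            problems.append(f"{size_key} is not divisible by {count}")
    return problems


# The flash.par lines quoted in this message.
sample_par = """
iGridSize = 16 #global number of gridpoints along x, excluding gcells
jGridSize = 16 #global number of gridpoints along y, excluding gcells
kGridSize = 1
iProcs = 4 #num procs in i direction
jProcs = 4 #num procs in j direction
kProcs = 1
"""

for problem in check_proc_grid(parse_par(sample_par), nprocs=16):
    print(problem)
```
</pre>
For the values quoted above and a 16-process job the check reports nothing, so under these assumptions the processor layout itself is not the inconsistency.<br>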
<br>-----------------------------------------------------------------------<br>
In the compilation:<br><br>FDEFINES has -DNX=8 and -DNY=8; I do not know how that is set.<br>
<br>According to sedov.log, the compile lines were (PGI 10.5 and MPICH2 with "hydra"):<br><br> mpif90 -I/include -c -r8 -i4 -fastsse -Mnovect -pc 64 -DMAXBLOCKS=1000 -DNXB=8 -DNYB=8 -DNZB=1 -DN_DIM=2<br> c compiler flags:<br>
mpicc -I/usr/local/apps/hdf/64hydra-pgi105/5-1.8.5-patch1-i8/include -DH5_USE_16_API -I/include -c -O2 -DM<br><br>Any help, comments, or useful information is very welcome.<br><br>Thanks.<span class="HOEnZb"><font color="#888888"><br>
<br>Diego<br><br><br>
</font></span></blockquote></div><br><br clear="all"><br>-- <br>**********************************************************************************************************<br>Anshu Dubey<br>Associate Director and CS/Applications Group Leader 5747 S. Ellis Avenue 3rd Flr.<br>
Flash Center for Computational Science 773 834 2999 (office)<br>Fellow, Computation Institute 312 420 0033 (mobile)<br>University of Chicago and Argonne National Laboratory 773 834 3230 (fax)<br>
**********************************************************************************************************<br><br>