<div class="moz-cite-prefix">Hi Marco, <br>
<br>
I think the error is still happening because maxblocks_alloc is still too small. Please experiment with increasing the value even more. The default values in a 3D FLASH application are maxblocks=200 and maxblocks_alloc=2000 (maxblocks_alloc = maxblocks * 10); you have maxblocks_alloc=80. It is perfectly fine to reduce the value of maxblocks (and thus maxblocks_alloc), but there comes a point when the buffers are too small for Paramesh to operate.

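As described in my earlier email below, the change goes in Paramesh's amr_initialize.F90; roughly speaking (the exact surrounding code may differ between Paramesh/FLASH versions), it amounts to:

  ! Illustrative only -- enlarge the Paramesh communication buffers
  ! relative to maxblocks (the default factor is 10).
  maxblocks_alloc = maxblocks * 20
  maxblocks_tr    = maxblocks * 20   ! keep the tree buffer the same size
                                     ! (maxblocks_tr is declared in the tree module)
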
I've copied this email to the flash-users mailing list so that others can see our complete email exchange, which includes debugging segmentation faults and running FLASH on BG/Q.

For your other questions:

(i) The BG/Q error you show seems to be related to your runtime environment and not to FLASH. We use Cobalt on Mira BG/Q, so your job submission script is unfamiliar to me; however, it looks like bg_size in your script specifies the number of nodes you want. If so, it should be set to 32 (not 64) to give 1024 total MPI ranks at 32 ranks per node.

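I can't be sure without your scheduler's documentation, but as a rough illustration (LoadLeveler-style keywords assumed here; treat this as a sketch, not a recipe for your site):

  # @ job_type = bluegene
  # @ bg_size  = 32                  # 32 nodes
  # 32 nodes x 32 ranks per node = 1024 MPI ranks
  runjob --np 1024 --ranks-per-node 32 : ./flash4
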
(ii) See the first paragraph of this email.

(iii) Never use direct I/O. You should be able to get the performance you need from the FLASH parallel I/O implementations. Please see "A case study for scientific I/O: improving the FLASH astrophysics code" (http://iopscience.iop.org/1749-4699/5/1/015001) for a discussion of FLASH parallel I/O and usage of collective optimizations.

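For what it's worth, the collective optimizations discussed there ultimately come down to requesting collective MPI-IO transfers on the HDF5 dataset transfer property list. A rough sketch of that call sequence (illustrative only, not our actual io_h5* code):

  ! Illustrative only: build a transfer property list that requests
  ! collective MPI-IO for subsequent h5dwrite_f calls (passed via xfer_prp=).
  subroutine make_collective_xfer_plist(xfer_plist)
    use hdf5
    implicit none
    integer(hid_t), intent(out) :: xfer_plist
    integer :: herr

    call h5pcreate_f(H5P_DATASET_XFER_F, xfer_plist, herr)
    call h5pset_dxpl_mpio_f(xfer_plist, H5FD_MPIO_COLLECTIVE_F, herr)
  end subroutine make_collective_xfer_plist
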
(iv) The advantage of PFFT over FFTW is that PFFT was written by Anshu, so we have in-house knowledge of how it works. I am unaware of any performance comparisons between PFFT and FFTW.

It is probably possible to integrate FFTW into FLASH. We have mapping code inside the PFFT unit which is general enough to take FLASH mesh data and create a slab decomposition for FFTW (where each MPI rank has a complete 2D slab) instead of a pencil decomposition for PFFT (where each MPI rank has a complete 1D pencil).

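To make the slab idea concrete, here is a rough, self-contained sketch of FFTW's MPI slab decomposition (this assumes FFTW 3.3+ and its Fortran 2003 MPI interface); the remapping of FLASH block data into the slab array is exactly the part our PFFT mapping code would have to provide:

  program fftw_slab_sketch
    ! Illustrative only: each MPI rank owns local_n0 complete 2D planes
    ! (slabs) of an N^3 complex array.
    use, intrinsic :: iso_c_binding
    implicit none
    include 'mpif.h'
    include 'fftw3-mpi.f03'

    integer(C_INTPTR_T), parameter :: N = 128
    type(C_PTR) :: plan, cdata
    complex(C_DOUBLE_COMPLEX), pointer :: slab(:,:,:)
    integer(C_INTPTR_T) :: alloc_local, local_n0, local_0_start
    integer :: ierr

    call MPI_Init(ierr)
    call fftw_mpi_init()

    ! Ask FFTW how many planes this rank owns (dimensions given in C order).
    alloc_local = fftw_mpi_local_size_3d(N, N, N, MPI_COMM_WORLD, &
                                         local_n0, local_0_start)
    cdata = fftw_alloc_complex(alloc_local)
    call c_f_pointer(cdata, slab, [N, N, local_n0])

    plan = fftw_mpi_plan_dft_3d(N, N, N, slab, slab, MPI_COMM_WORLD, &
                                FFTW_FORWARD, FFTW_ESTIMATE)

    ! FLASH mesh data would be remapped into 'slab' here.
    slab = (0.0_C_DOUBLE, 0.0_C_DOUBLE)

    call fftw_mpi_execute_dft(plan, slab, slab)

    call fftw_destroy_plan(plan)
    call fftw_free(cdata)
    call MPI_Finalize(ierr)
  end program fftw_slab_sketch
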
(v) I don't know. Maybe someone else can help with this.

Chris


On 01/29/2013 09:07 AM, Marco Mazzuoli wrote:

Dear Christopher,

you were right, of course. The error was due to memory corruption caused by a conflict between "qrealsize=8" and a parameter I had defined as real(kind(1D0)) instead of real(kind(1.2E0)). Now Flash4 (version FLASH4-alpha_release) runs, and the IO/IOMain/hdf5/parallel/PM unit works too.

I tried to move to the latest FLASH4 version available on the website, but I met some trouble both at compile time (in the interface of some Grid subroutines), which I have fixed, and at runtime. As for the runtime, I would like to ask you a couple of questions.
i) What does the following output, printed just before the simulation aborts, mean?

--------------------------------------------------------------------------------------------------------------------
 [Driver_initParallel]: Called MPI_Init_thread - requested level 2, given level 2
 flash4: binding.c:290: _xlsmp_get_cpu_topology_lop: Assertion `thr < ncpus' failed.
--------------------------------------------------------------------------------------------------------------------

Please find attached (jobANDmake.tar) the input file ("job.cmd") and the makefile with the compilation flags I put into "sites" ("Makefile.h").
Essentially I use bgsize=64 (i.e. 1024 cores) with 32 ranks per node on the BG/Q machine. Block size = 16x16x16, maxblocks=40, max refinement level = 6 (everywhere).

ii) Even though I changed both maxblocks_tr (in tree.F90) and maxblocks_alloc (in paramesh_dimensions.F90) to 20*maxblocks, the following error persists when I use larger block dimensions (32x32x32) and maxblocks=8 (max refinement level = 5):

--------------------------------------------------------------------------------------------------------------------
 ...
 ...
 ERROR in process_fetch_list : guard block starting index -15 not larger than lnblocks 5 processor no. 40 maxblocks_alloc 80
 ERROR in process_fetch_list : guard block starting index -3 not larger than lnblocks 5 processor no. 442 maxblocks_alloc 80
 ERROR in process_fetch_list : guard block starting index -11 not larger than lnblocks 5 processor no. 92 maxblocks_alloc 80
 ERROR in process_fetch_list : guard block starting index -11 not larger than lnblocks 5 processor no. 804 maxblocks_alloc 80
 ERROR in process_fetch_list : guard block starting index -3 not larger than lnblocks 5 processor no. 218 maxblocks_alloc 80
 Abort(0) on node 396 (rank 396 in comm -2080374784): application called MPI_Abort(comm=0x84000000, 0) - process 396
 ...
 ...
--------------------------------------------------------------------------------------------------------------------
Why do you think it still occurs?

iii) Do you know what speed-up is obtained by using IO/IOMain/direct/PM? Apart from the huge number of files, is there any other contraindication?

iv) From your experience, could you tell me whether there is any advantage, and if so what it is, in using PFFT instead of another parallel FFT library, e.g. fftw3?
v) The direct solver implemented in the multigrid Poisson solver is based on Ricker's algorithm (2008) and, for a rectangular domain with Dirichlet boundary conditions, it makes use of the Integrated Green's Function technique. The transform-space Green's function reads:

G_{ijk}^l = -16\pi \left[ \Delta_{x_l}^{-2}\sin(i\pi/(2n_x)) + \Delta_{y_l}^{-2}\sin(j\pi/(2n_y)) + \Delta_{z_l}^{-2}\sin(k\pi/(2n_z)) \right]^{-1} \qquad (*)

Could you suggest how (*) is obtained, so that I can compute a similar Green's function that solves a non-homogeneous Helmholtz problem with Dirichlet boundary conditions?

I realize that questions iv and v fall outside ordinary support, so I am only asking whether you can give me some hints.

Thank you so much for your suggestions, Christopher. They have been valuable.

Sincerely,

Marco

Ing. Marco Mazzuoli
Dipartimento di Ingegneria
delle Costruzioni, dell'Ambiente e
del Territorio (DICAT)
via Montallegro 1
16145 GENOVA-ITALY
tel. +39 010 353 2497
cell. +39 338 7142904
e-mail marco.mazzuoli@unige.it
       marco.mazzuoli84@gmail.com

Date: Mon, 28 Jan 2013 11:13:04 -0600
From: cdaley@flash.uchicago.edu
To: marco.mazzuoli@unige.it
Subject: Re: [FLASH-USERS] ERROR in mpi_morton_bnd_prol

Hi Marco,

I've had a quick glance at the minGridSpacing subroutine and it looks OK. Nonsensical results like this are normally an indication of earlier memory corruption. One other thing that is possible is that the size of dxMin is different from that of dxMinTmp. If dxMin is declared with an explicit "kind=" then it may not be the same size as that specified by MPI_DOUBLE_PRECISION.

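As a hypothetical illustration of that mismatch (not your actual declarations):

  ! Hypothetical example only.  MPI_DOUBLE_PRECISION describes the
  ! library's DOUBLE PRECISION (normally 8 bytes), so both buffers in
  ! the reduction must really be declared with that same kind.
  subroutine reduce_dx_example(meshComm)
    implicit none
    include 'mpif.h'
    integer, intent(in) :: meshComm
    integer, parameter  :: dp = kind(1.0d0)
    real(dp) :: dxMinTmp(3), dxMin(3)   ! a smaller explicit kind here
    integer  :: ierr                    ! would corrupt the reduction

    dxMinTmp = 9999.0_dp
    call MPI_ALLREDUCE(dxMinTmp, dxMin, 3, MPI_DOUBLE_PRECISION, &
                       MPI_MIN, meshComm, ierr)
  end subroutine reduce_dx_example
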
Since you already have "MaxRef", I think you can simplify your subroutine to just:

  use Grid_data, ONLY : gr_delta
  dxMin = gr_delta(1:MDIM,MaxRef)

You attached a serial I/O FLASH subroutine, not the parallel I/O one. Again I suspect earlier memory corruption is causing the problem.

Try the following unit test on BG/Q. It uses very similar units to yours. If it works, that is a strong indication that your custom code has a bug which is causing memory corruption.

  ./setup unitTest/PFFT_PoissonFD -auto -3d -maxblocks=200 +pm4dev -parfile=test_paramesh_3d_64cube.par

Try it with 1 node and 16 MPI ranks. Then add 1 to both lrefine_min and lrefine_max and run it again with 8 nodes and 128 MPI ranks. Repeat as you wish and go as big as you like. You should find that this unit test works without problems, and the Linf value should keep getting smaller as you add refinement levels.

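For example, if the parfile shipped with lrefine_min = lrefine_max = 3 (illustrative values only; check the actual file), the next run would set:

  lrefine_min = 4
  lrefine_max = 4
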
I would remove the -qsmp=omp:noauto -L/opt/ibmcmp/xlsmp/bg/3.1/lib64/ -lxlsmp flags because you are not using OpenMP. In general, be careful with the -qsmp flag, as by default it adds the aggressive -qhot compilation option.

It is only in the last few months that our target application for Early Science on Mira BG/Q can use the aggressive compilation flag -qhot without generating bad answers. Maybe it is still causing problems in your application.

Chris

On 01/28/2013 08:04 AM, Marco Mazzuoli wrote:

<div dir="ltr"> Dear Christopher,<br>
<br>
I tried the code both by using serial hdf5 writing and
parallel hdf5 writing.<br>
<br>
i) As for the former case, I found that, when running the code on the BG/Q machine, an error occurs in the following (homemade) subroutine, which saves the minimal grid spacing in each Cartesian direction in the array dXmin [real, dimension(3)]:
---------------------------------------------------------------------------------
SUBROUTINE minGridSpacing()

  USE Grid_data,      ONLY: gr_meshComm
  USE Dns_data,       ONLY: dXmin, MaxRef
  USE Grid_interface, ONLY: Grid_getListOfBlocks, Grid_getDeltas
  USE tree,           ONLY: lrefine

  IMPLICIT NONE

#include "constants.h"
#include "Flash.h"
#include "Flash_mpi.h"

  INTEGER, DIMENSION(MAXBLOCKS) :: blockList
  INTEGER                       :: blockCount
  INTEGER                       :: lb, ierr

  REAL, DIMENSION(MDIM) :: dXminTMP

  CALL Grid_getListOfBlocks(ALL_BLKS, blockList, blockCount)

  ! Initialization of dXminTMP
  dXminTMP(:) = 9999.

  DO lb = 1, blockCount

     IF (lrefine(lb) .EQ. MaxRef) THEN
        CALL Grid_getDeltas(blockList(lb), dXminTMP)
        IF (ANY(dXminTMP .GE. 0.)) EXIT
     END IF

  END DO
  ! ****** PRINT 1 ******
  PRINT*, dXminTMP(1), dXminTMP(2), dXminTMP(3), MaxRef
  ! *********************
  ! CALL MPI_Barrier (gr_meshComm, ierr)

  ! find the smallest grid spacing for each direction among all the blocks:
  CALL MPI_ALLREDUCE(dXminTMP, dXmin, 3, MPI_DOUBLE_PRECISION, &
                     MPI_MIN, gr_meshComm, ierr)

  ! ****** PRINT 2 ******
  PRINT*, dXmin(1), dXmin(2), dXmin(3)
  ! *********************

  IF (ierr .NE. 0) PRINT*, "minGridSpacing(): MPI error!"

END SUBROUTINE minGridSpacing
---------------------------------------------------------------------------------

The output of the PRINTs is:

 0.970785156250000003E-01  0.112096874999999999  0.112096874999999999  6
 0.20917539062499999891198143586734659  0.11209687499999999860111898897230276  0.00000000000000000000000000000000000E+00

(each one repeated 1024 times)
I do not know why the variables dXminTMP and dXmin contain different values (none of the printed dXminTMP values is equal to dXmin). Could you suggest why? Is it normal that the output format is so different? Maybe one of the following flags I pass to the Fortran compiler (mpixlf90_r) is wrong?

  FFLAGS_OPT = -g -O3 -qstrict -qsimd -qunroll=yes -qarch=qp -qtune=qp -q64 -qrealsize=8 -qthreaded -qnosave -qfixed -c \
               -qlistopt -qreport -WF,-DIBM -qsmp=omp:noauto -L/opt/ibmcmp/xlsmp/bg/3.1/lib64/ -lxlsmp

ii) As for the latter case (parallel HDF5 writing), I hit the error I reported last week (see the previous conversation below). It occurs between line 660 and line 705 of "io_writeData.F90", which you can find attached here. Do you know why the segmentation fault may occur there?

Thank you again for your help, Christopher.

Sincerely,

Marco

Date: Thu, 24 Jan 2013 10:19:34 -0600
From: cdaley@flash.uchicago.edu
To: marco.mazzuoli@unige.it
Subject: Re: [FLASH-USERS] ERROR in mpi_morton_bnd_prol

Unfortunately, none of the units that I have multithreaded are in your simulation.

If your problem fits in 512 MB of memory, I recommend you run FLASH with 32 MPI ranks per node on BG/Q. If I recall correctly, the PPC core can execute 2 instructions per clock cycle, but the instructions must be issued by different hardware threads. Placing multiple MPI ranks per core achieves this aim and allows us to hide memory latency. Have a look at http://flash.uchicago.edu/~cdaley/Siam/Daley_MS30-3.pdf, page 12. By eye, a run with 16 MPI ranks per node and 1 thread took ~275 seconds, and a run with 32 MPI ranks per node and 1 thread took ~190 seconds.

You should also set up FLASH with +pm4dev. This is the latest Paramesh with Flash Center enhancements to make it scale better. You should also use the latest FLASH release.

In terms of debugging, you really need to look at core file 948 to find the instruction which caused the segmentation fault. Most likely, there is some form of memory corruption which you need to identify.

It may be useful to set up FLASH without I/O (+noio) and see if your simulation still fails. You can compare the integrated quantities file (default name is flash.dat) with a healthy simulation run on your local cluster to see if it is working correctly.

It may be useful to remove compiler optimization flags and use -O0 to see if optimization is causing a problem.

Good luck,
Chris

On 01/24/2013 03:51 AM, Marco Mazzuoli wrote:

<div dir="ltr"> Dear Christopher,<br>
<br>
thank you again. Of course, with the
multithreading up to 4 tasks per core (64 ranks
per node) using parallelization OpenMP you should
obtain the best performance because the threading
is still hardware, not software. Are there
routines which use also OpenMP libraries for
multithreading at the moment?<br>
Anyway, please find attached the setup_units file from my object directory. I have heavily modified the FLASH code (Physics) in order to adapt it to my aims.
However, the IO, the Poisson solver, the Paramesh4 AMR, as well as the basic structure of FLASH, have been kept unchanged.
In particular, nothing has been changed in the initialization (which is why I am asking for your help).

I suppose the error I am seeing comes from a writing error on the BG/Q machine, because the same code works very well on a Linux cluster.
If you have any further ideas about my troubles, please let me know.
Thank you very much, Christopher.

Sincerely,

Marco

Date: Wed, 23 Jan 2013 11:43:57 -0600
From: cdaley@flash.uchicago.edu
To: marco.mazzuoli@unige.it
Subject: Re: [FLASH-USERS] ERROR in mpi_morton_bnd_prol

It is failing with a segmentation fault (signal 11).

You should look at the stderr file and also at core file 948. There is a tool named bgq_stack which reports the stack trace in a core file.

  bgq_stack ./flash4 core.948

If it is unavailable, you can translate the stack addresses in this core file one at a time using addr2line.

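For example, with a placeholder address taken from the core file:

  addr2line -e ./flash4 0x01b2c4d8
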
If the failing line is an innocent-looking piece of FLASH code, then most likely there is some form of memory corruption earlier on. Maybe there is something specific to your simulation, or perhaps your HDF5 needs to be recompiled against the latest driver version on BG/Q.

A good debugging method is to try to reproduce the same error in a smaller problem so that you can then repeat the run on a local workstation. Once on a local workstation you can debug much more interactively and use excellent tools like valgrind.

Could you send me the setup_units file from your FLASH object directory? I'm curious what units you are using. We have been doing a lot of work on the BG/Q recently, including multithreading the FLASH units needed for Supernova simulations. Our Supernova simulations are currently running on Mira BG/Q - we are using 16 MPI ranks per node and 4 OpenMP threads per MPI rank.

Chris

On 01/23/2013 11:19 AM, Marco Mazzuoli wrote:

<div dir="ltr"> Thank you Christopher,<br>
<br>
indeed I could use the UG at this stage, but
I am t
esting the code on the bluegene machine in
order asap to make larger simulations which
need the AMR to be used.<br>
<br>
Actually, I have already solved the problem
by reducing the dimensions of the blocks
(16x16x16) and by introducing a finer
refinement level. But of course the solution
you proposed sounds better. I guess it is
better to use the largest blocks with the
smallest number of cores.<br>
<br>
If I can I would ask you an other question.
Just after the initialization, when the code
writes the first checkpoint file (at row54
of io_initFile.F90: "call
io_h5init_file(fileID, filename, io_comm,
io_outputSplitNum)"), the run crashes giving
the following message in the stdout
put:<br>
<br>
---------------------------------------------------------------------------------------------------------
 RuntimeParameters_read: ignoring unknown parameter "igrav"...
 RuntimeParameters_read: ignoring unknown parameter "mgrid_max_iter_change"...
 RuntimeParameters_read: ignoring unknown parameter "mgrid_solve_max_iter"...
 RuntimeParameters_read: ignoring unknown parameter "mgrid_print_norm"...
 RuntimeParameters_read: ignoring unknown parameter "msgbuffer"...
 RuntimeParameters_read: ignoring unknown parameter "eint_switch"...
 RuntimeParameters_read: ignoring unknown parameter "order"...
 RuntimeParameters_read: ignoring unknown parameter "slopeLimiter"...
 RuntimeParameters_read: ignoring unknown parameter "LimitedSlopeBeta"...
 RuntimeParameters_read: ignoring unknown parameter "charLimiting"...
 RuntimeParameters_read: ignoring unknown parameter "use_avisc"...
 RuntimeParameters_read: ignoring unknown parameter "use_flattening"...
 RuntimeParameters_read: ignoring unknown parameter "use_steepening"...
 RuntimeParameters_read: ignoring unknown parameter "use_upwindTVD"...
 RuntimeParameters_read: ignoring unknown parameter "RiemannSolver"...
 RuntimeParameters_read: ignoring unknown parameter "entropy"...
 RuntimeParameters_read: ignoring unknown parameter "shockDetect"...
 MaterialProperties initialized
 Cosmology initialized
 Source terms initialized
 iteration, no. not moved =  0  0
 refined: total leaf blocks =  1
 refined: total blocks =  1
 starting MORTON ORDERING
 tot_blocks after  1
 max_blocks 2  1
 min_blocks 2  0
 Finished initialising block: 1
 INFO: Grid_fillGuardCells is ignoring masking.
 iteration, no. not moved =  0  0
 refined: total leaf blocks =  8
 refined: total blocks =  9
 iteration, no. not moved =  0  7
 iteration, no. not moved =  1  0
 refined: total leaf blocks =  64
 refined: total blocks =  73
 iteration, no. not moved =  0  70
 iteration, no. not moved =  1  7
 iteration, no. not moved =  2  0
 refined: total leaf blocks =  512
 refined: total blocks =  585
 iteration, no. not moved =  0  583
 iteration, no. not moved =  1  21
 iteration, no. not moved =  2  0
 refined: total leaf blocks =  4096
 refined: total blocks =  4681
 Finished initialising block: 5
 Finished initialising block: 6
 iteration, no. not moved =  0  904
 iteration, no. not moved =  1  0
 refined: total leaf blocks =  32768
 refined: total blocks =  37449
 Finished initialising block: 6
 Finished initialising block: 7
 Finished initialising block: 8
 Finished initialising block: 9
 Finished initialising block: 10
 Finished initialising block: 11
 Finished initialising block: 12
 Finished initialising block: 13
 Finished initialising block: 15
 Finished initialising block: 16
 Finished initialising block: 17
 Finished initialising block: 18
 Finished initialising block: 19
 Finished initialising block: 20
 Finished initialising block: 21
 Finished initialising block: 22
 Finished initialising block: 24
 Finished initialising block: 25
 Finished initialising block: 26
 Finished initialising block: 27
 Finished initialising block: 28
 ...
 ...
 Finished with Grid_initDomain, no restart
 Ready to call Hydro_init
 Hydro initialized
 Gravity initialized
 Initial dt verified
 2013-01-23 17:43:12.006 (WARN ) [0x40000e98ba0] :97333:ibm.runjob.client.Job: terminated by signal 11
 2013-01-23 17:43:12.011 (WARN ) [0x40000e98ba0] :97333:ibm.runjob.client.Job: abnormal termination by signal 11 from rank 948
---------------------------------------------------------------------------------------------------------

In particular, 8 cores call the subroutine "io_h5init_file" before the run crashes (I checked this during debugging).
What do you think it could depend on?

Thank you again, Christopher.
Sincerely,

Marco

<font style="font-size:10pt" color="#002060"
size="2"> <br>
<br>
<div>
<hr id="ecxstopSpelling">Date: Wed, 23
Jan 2013 10:16:20 -0600<br>
From: <a moz-do-not-send="true"
class="ecxmoz-txt-link-abbreviated"
href="mailto:cdaley@flash.uchicago.edu">cdaley@flash.uchicago.edu</a><br>
To: <a moz-do-not-send="true"
class="ecxmoz-txt-link-abbreviated"
href="mailto:marco.mazzuoli@unige.it">marco.mazzuoli@unige.it</a><br>
CC: <a moz-do-not-send="true"
class="ecxmoz-txt-link-abbreviated"
href="mailto:flash-users@flash.uchicago.edu">flash-users@flash.uchicago.ed
u</a><br>
Subject: Re: [FLASH-USERS] ERROR in
mpi_morton_bnd_prol<br>
<br>
<div class="ecxmoz-cite-prefix">Hi
Marco,<br>
<br>
You should increase maxblocks because
a value of maxblocks=4 is too<br>
low. Y
ou are using large blocks (32^3) and
so memory usage prevents<br>
you setting maxblocks too high, but I
would suggest for this problem<br>
you need a value of at least 10 for
maxblocks.<br>
<br>
The specific error you show is
happening because data cannot be found<br>
in a array of size maxblocks_alloc,
where the default value of<br>
maxblocks_alloc is maxblocks * 10.
Inte rnally Paramesh has many<br>
arrays of size maxblocks_alloc which h
old e.g. various information<br>
about the local view of the oct-tree.
A technique we have used in the<br>
past when there is insufficie
nt memory to make maxblocks much
larger<br>
and we need to avoid errors like you
show is to make maxblocks_alloc<br>
larger in amr_initialize.F90, e.g.
maxblocks_alloc = maxblocks * 20.<br>
You should also change maxblocks_tr to
be the same size as<br>
maxblocks_alloc.<br>
<br>
Finally, if you don't need AMR then you should use the FLASH uniform grid (you have a fully refined domain at level 5). Memory usage will be lower and guard cell fills will be much faster.

Chris

On 01/18/2013 10:47 AM, Marco Mazzuoli wrote:

<div dir="ltr"> Dear Flash users,<br>
<br>
I am trying to run Flash on a
bluegene-type supercomputer.<br>
The details of the present run are:<br>
<br>
#procs=1024 on #64 nodes (#16procs per
node).<br>
<br>
The domain is rectangular.<br>
Block size = 32x32x32 computational
points<br>
Max refinement level = 5<br>
The whole domain is refined at level =
5 such that N°bloc
ks=1+8+64+512+4096=4681<br>
Max_blocks per core = 4<br>
<br>
Do you know what the initialization
error visualized in the standard
output and proposed in the following,
could depend on?<br>
<br>
-----------------------------------------------------------------------------------------------------------------------
 RuntimeParameters_read: ignoring unknown parameter "igrav"...
 RuntimeParameters_read: ignoring unknown parameter "mgrid_max_iter_change"...
 RuntimeParameters_read: ignoring unknown parameter "mgrid_solve_max_iter"...
 RuntimeParameters_read: ignoring unknown parameter "mgrid_print_norm"...
 RuntimeParameters_read: ignoring unknown parameter "msgbuffer"...
 RuntimeParameters_read: ignoring unknown parameter "eint_switch"...
 RuntimeParameters_read: ignoring unknown parameter "order"...
 RuntimeParameters_read: ignoring unknown parameter "slopeLimiter"...
 RuntimeParameters_read: ignoring unknown parameter "LimitedSlopeBeta"...
 RuntimeParameters_read: ignoring unknown parameter "charLimiting"...
 RuntimeParameters_read: ignoring unknown parameter "use_avisc"...
 RuntimeParameters_read: ignoring unknown parameter "use_flattening"...
 RuntimeParameters_read: ignoring unknown parameter "use_steepening"...
 RuntimeParameters_read: ignoring unknown parameter "use_upwindTVD"...
 RuntimeParameters_read: ignoring unknown parameter "RiemannSolver"...
 RuntimeParameters_read: ignoring unknown parameter "entropy"...
 RuntimeParameters_read: ignoring unknown parameter "shockDetect"...
 MaterialProperties initialized
 Cosmology initialized
 Source terms initialized
 iteration, no. not moved =  0  0
 refined: total leaf blocks =  1
 refined: total blocks =  1
 starting MORTON ORDERING
 tot_blocks after  1
 max_blocks 2  1
 min_blocks 2  0
 Finished initialising block: 1
 INFO: Grid_fillGuardCells is ignoring masking.
 iteration, no. not moved =  0  0
 refined: total leaf blocks =  8
 refined: total blocks =  9
 iteration, no. not moved =  0  7
 iteration, no. not moved =  1  0
 refined: total leaf blocks =  64
 refined: total blocks =  73
 iteration, no. not moved =  0  70
 iteration, no. not moved =  1  7
 iteration, no. not moved =  2  0
 refined: total leaf blocks =  512
 refined: total blocks =  585
 ERROR in mpi_morton_bnd_prol : guard block starting index -3 not larger than lnblocks 1 processor no. 8 maxblocks_alloc 40
 ERROR in mpi_morton_bnd_prol : guard block starting index -3 not larger than lnblocks 1 processor no. 496 maxblocks_alloc 40
 ERROR in mpi_morton_bnd_prol : guard block starting index -3 not larger than lnblocks 1 processor no. 569 maxblocks_alloc 40
 ERROR in mpi_morton_bnd_prol : guard block starting index -3 not larger than lnblocks 1 processor no. 172 maxblocks_alloc 40
 ERROR in mpi_morton_bnd_prol : guard block starting index -12 not larger than lnblocks 1 processor no. 368 maxblocks_alloc 40
 ERROR in mpi_morton_bnd_prol : guard block starting index -12 not larger than lnblocks 1 processor no. 189 maxblocks_alloc 40
 ...
 ...
 ...
 Abort(1076419107) on node 442 (rank 442 in comm -2080374784): application called MPI_Abort(comm=0x84000000, 1076419107) - process 442
-----------------------------------------------------------------------------------------------------------------------

Thank you in advance.

Sincerely,

Marco Mazzuoli