[FLASH-USERS] Flash4b hangs on BlueGene/Q system
Mirko Cestari
m.cestari at cineca.it
Fri Nov 23 04:11:48 EST 2012
Dear Chris,
thanks a lot for your answer. You have probably identified the
problem. Indeed, the __mcount_internal symbol is present
in my executable:
nm flash4 | egrep '( _mcount$|__mcount_internal$)'
0000000001904e78 D __mcount_internal
00000000019039f0 D _mcount
This is an example of how an object file is compiled and how the application is linked:
============
mpixlf77_r -O0 -g -qstrict -q64 -qzerosize -c -cpp -qsuffix=cpp=F:cpp=F90 -qfree=f90 -WF,-DMAXBLOCKS=100 -WF,-DNXB=8 -WF,-DNYB=8 -WF,-DNZB=8 -WF,-DN_DIM=3 rp_getOpt.F90
mpixlf77_r -o flash4 Burn.o Burn_computeDt.o Burn_finalize.o ...
==============================
I use no "-pg" argument, so I don't really understand where the __mcount_internal
function comes from.
Mirko
--
Mirko Cestari, PhD
m.cestari at cineca.it
CINECA - SuperComputing Applications and Innovation Department - SCAI
via Magnanelli, 6/3 40033 Casalecchio di Reno (Bologna) - ITALY
www.cineca.it
----- Original Message -----
From: "Christopher Daley" <cdaley at flash.uchicago.edu>
To: "Mirko Cestari" <m.cestari at cineca.it>
Cc: flash-users at flash.uchicago.edu
Sent: Wednesday, November 21, 2012 6:42:36 PM
Subject: Re: [FLASH-USERS] Flash4b hangs on BlueGene/Q system
Hi Mirko,
We are able to run FLASH simulations on the Argonne BG/Q.
I would double check that you have no '-pg' in your compile
lines. Maybe you also have custom static pattern rules in your
Makefile.h that contain '-pg'?
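One way to run this check is sketched below. It is only an illustration: the helper name find_pg_flags is hypothetical, and it assumes a grep that supports -n, -H and -E (GNU and BSD grep both do). It matches '-pg' only as a standalone flag, so longer options that merely begin with those characters are not reported.

```shell
# Hypothetical helper: print every line (prefixed with file name and line
# number) that passes -pg as a standalone flag.
# Arguments: the files to scan, e.g. Makefile.h.
find_pg_flags() {
    # -pg must be preceded by start-of-line, whitespace or '=',
    # and followed by whitespace or end-of-line.
    grep -nHE -- '(^|[[:space:]=])-pg([[:space:]]|$)' "$@"
}

# Example (run from the directory that holds Makefile.h):
#   find_pg_flags Makefile.h
```

The function returns a non-zero status when no file contains the flag, so it can also be used in a shell conditional.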
Based on your stack trace, you should see the _mcount symbol in your
Flash.o and flash4 binary, i.e.
$ nm -A Flash.o flash4 | egrep '( _mcount$|__mcount_internal$)'
Flash.o: U _mcount
flash4:00000000023a47e0 D __mcount_internal
flash4:00000000023a3628 D _mcount
$
I can only get these symbols by compiling Flash.F90 with -pg.
If I remove -pg then I get
$ nm -A Flash.o flash4 | egrep '( _mcount$|__mcount_internal$)'
$
You should create a flash4 binary that has no reference to
_mcount and then repeat your flash4 run.
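To narrow down which object files pull in the symbol, the nm output can be filtered for undefined references. The following is a minimal sketch, not part of FLASH: the function name objects_referencing_mcount is hypothetical, and it assumes the 'file: U symbol' layout that nm -A prints above, plus standard awk and sort.

```shell
# Hypothetical helper: read `nm -A` output on stdin and print the name of
# each object file that has an undefined (U) reference to _mcount.
objects_referencing_mcount() {
    # For undefined symbols nm -A prints "<file>: U <symbol>", so the last
    # field is the symbol and the one before it is the type letter.
    awk '$NF == "_mcount" && $(NF-1) == "U" { sub(/:$/, "", $1); print $1 }' |
        sort -u
}

# Example (run in the directory that holds the object files):
#   nm -A *.o | objects_referencing_mcount
```

Any file it lists was compiled with instrumentation that calls _mcount; an empty result means no object file references the symbol directly, in which case it is being introduced at link time.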
(The only "embedded" profiling in FLASH4 needs to be explicitly
included in your FLASH application at setup time using
-unit=monitors/Profiler/ProfilerMain/mpihpm.)
Chris
On 11/21/2012 09:10 AM, Mirko Cestari wrote:
> Dear users,
> we are experiencing some issues in trying to run Flash4b on
> our BG/Q system when it is compiled with the native XL compilers.
> We didn't experience any problems on our previous
> BG/P system (we compiled and ran the same code/input successfully).
>
> The program seems to hang indefinitely at the very beginning of
> the simulation run. Checking with a debugging tool (TotalView), the stack
> trace turns out to be
>
> Stack Trace
> C __mcount_internal, FP=19ffffb980
> ._mcount, FP=19ffffba00
> f90 flash, FP=19ffffbaa0
> .generic_start_main, FP=19ffffbd80
> C __libc_start_main, FP=19ffffbe40
>
> the execution stops at
>
> => program Flash
>
> in Flash.F90. More precisely, the execution spins in an infinite while loop
> in the function __mcount_internal in mcount.c
>
> => while (atomic_compare_and_exchange_bool_acq (&p->mcount_hwthd, hwthd, -1));
>
> which, to my understanding, is a profiling function. Please note that
> no profiling flags were used to compile the program. Is there
> any profiling embedded in the code?
>
> Compiling with gcc gives worse performance but no "freezing" problems.
>
> Have you run into similar problems on BG/Q systems? Can you point
> me to someone who might have encountered the same problem?
>
> Thanks in advance,
> Mirko
>
> --
> Mirko Cestari, PhD
> m.cestari at cineca.it
> CINECA - SuperComputing Applications and Innovation Department - SCAI
> via Magnanelli, 6/3 40033 Casalecchio di Reno (Bologna) - ITALY
> www.cineca.it
>