[FLASH-USERS] Flash4b hangs on BlueGene/Q system

Christopher Daley cdaley at flash.uchicago.edu
Wed Nov 21 12:42:36 EST 2012


Hi Mirko,

We are able to run FLASH simulations on the Argonne BG/Q.

I would double check that you have no '-pg' in your compile
lines.  Maybe you also have custom static pattern rules in your
Makefile.h that contain '-pg'?
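
As a rough sketch of that check, something like the following lists
every line that passes -pg as a standalone flag. The helper name
find_pg_flags and the file names in the comments are assumptions for
illustration, not part of FLASH.

```shell
# Hypothetical helper: print every line in the given files that passes
# -pg as a standalone flag (so -pgx or --foo-pg do not match).
find_pg_flags() {
    grep -n -E -e '(^|[[:space:]=])-pg([[:space:]]|$)' "$@"
}

# Example use from the FLASH object directory (file names are assumptions):
#   find_pg_flags Makefile.h
#   make 2>&1 | tee make.log; find_pg_flags make.log
```

No output and a non-zero exit status mean that no -pg flag was found
in the files given.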

Based on your stack trace, you should see the _mcount symbol in your
Flash.o object file and flash4 binary, i.e.

$ nm -A Flash.o flash4 | egrep '( _mcount$|__mcount_internal$)'
Flash.o:                 U _mcount
flash4:00000000023a47e0 D __mcount_internal
flash4:00000000023a3628 D _mcount
$

I can only get these symbols by compiling Flash.F90 with -pg.
If I remove -pg, the same command prints nothing:
$ nm -A Flash.o flash4 | egrep '( _mcount$|__mcount_internal$)'
$

You should create a flash4 binary that has no reference to
_mcount and then repeat your flash4 run.
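
A minimal sketch of how that check could be scripted before
resubmitting the job. The helper name no_mcount is hypothetical. It
reads `nm -A` output on standard input and applies the same egrep
pattern used above.

```shell
# Hypothetical helper: reads `nm -A` output on stdin and succeeds (exit 0)
# only if no _mcount / __mcount_internal symbol is present.
no_mcount() {
    # grep prints any offending symbol lines and returns 0 when it finds one.
    if grep -E '( _mcount$|__mcount_internal$)'; then
        echo "profiling symbols found: rebuild without -pg" >&2
        return 1
    fi
    echo "no _mcount references"
}

# Example use from the object directory:
#   nm -A Flash.o flash4 | no_mcount && echo "safe to rerun flash4"
```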


(The only "embedded" profiling in FLASH4 needs to be explicitly
included in your FLASH application at setup time using
-unit=monitors/Profiler/ProfilerMain/mpihpm.)

Chris


On 11/21/2012 09:10 AM, Mirko Cestari wrote:
> Dear users,
> we are experiencing some issues in trying to run Flash4b on
> our BG/Q system when it is compiled with the native XL compilers.
> We didn't experience any problem on our previous
> BG/P system (we compiled and ran the same code/input successfully).
>
> The program seems to hang indefinitely at the very beginning of
> the simulation run. Checking with a debugging tool (TotalView), the stack
> trace turns out to be
>
> Stack Trace
> C    __mcount_internal,     FP=19ffffb980
>       ._mcount,              FP=19ffffba00
> f90  flash,                 FP=19ffffbaa0
>       .generic_start_main,   FP=19ffffbd80
> C    __libc_start_main,     FP=19ffffbe40
>
> the execution stops at
>
>    =>  program Flash
>
> in Flash.F90, more precisely, the execution runs in an infinite (while) loop
> in the function  __mcount_internal in mcount.c
>
>   =>            while (atomic_compare_and_exchange_bool_acq (&p->mcount_hwthd, hwthd, -1));
>
> which to my understanding is a profiling function. Please note that
> no profiling flags have been used to compile the program. Is there
> any profiling embedded in the code?
>
> Compiling with gcc gives worse performance but no "freezing" problems.
>
> Have you run into similar problems on BG/Q systems? Can you point
> me to someone who might have encountered the same problem?
>
> Thanks in advance,
> Mirko
>
> --
> Mirko Cestari, PhD
> m.cestari at cineca.it
> CINECA - SuperComputing Applications and Innovation Department - SCAI
> via Magnanelli, 6/3 40033 Casalecchio di Reno (Bologna) - ITALY
> www.cineca.it
>




