[FLASH-USERS] Flash3 hanging in refinement
Dave
david.john.williamson at gmail.com
Mon May 14 12:14:05 EDT 2012
Hi,
I would have thought that in that case it would have triggered the error
message and quit, but it's possible it hung while trying to abort. I
should be able to support more than 173352 blocks with my settings for
MAXBLOCKS & my number of processors, but I guess that number is just the
number of blocks that are being moved between threads, not the total
number of blocks in the system?
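One possible reason no error message appeared: the abort test quoted below only fires when the "not moved" count has stopped changing, and in the dump it was still dropping (18939 at iteration 99, 18282 at iteration 100). A minimal sketch of that check follows. It assumes nm2_old and nm2 hold the "not moved" counts from consecutive iterations and nit is the iteration number; that reading is inferred from the variable names and the print statements, not verified against the full PARAMESH source.

```fortran
program check_abort_condition
  implicit none
  integer :: nm2_old, nm2, nit
  logical :: would_abort

  ! Values from the last two lines of the standard dump.
  ! ASSUMPTION: nm2_old/nm2 are the "not moved" counts of
  ! consecutive iterations and nit is the iteration number.
  nm2_old = 18939   ! not moved after iteration 99
  nm2     = 18282   ! not moved after iteration 100
  nit     = 100

  ! Same logical test as the quoted snippet from mpi_amr_redist_blk.F90.
  would_abort = (nm2_old == nm2 .and. nm2 /= 0 .and. nit >= 100)

  print *, 'would_abort = ', would_abort
end program check_abort_condition
```

With these numbers the test evaluates to .false. because nm2_old and nm2 differ, so under the stated assumptions the ERROR branch would not be reached at iteration 100. This does not explain what the code did after that point.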
-David
On 14/05/2012 12:53 p.m., Seyit Hocuk wrote:
> Hi,
>
> It seems like you do not have enough blocks per CPU, i.e., the run is
> reaching the maximum number of blocks. Either increase MAXBLOCKS
> (when compiling, or by editing the files Makefile & Flash.h and
> rebuilding the directory), or increase the number of processors you
> are using.
>
> Best,
> Seyit
>
>
>
> On 05/14/2012 05:50 PM, Dave wrote:
>> Hi,
>>
>> I am finding that FLASH hangs in the refinement step. The final
>> output in the log file is:
>>
>> ----
>> [ 05-13-2012 03:56:30.023 ] step: n=2132 t=4.078276E+12 dt=6.783726E+08
>> [ 05-13-2012 04:00:08.019 ] [mpi_amr_comm_setup]: buffer_dim_send=1519097, buffer_dim_recv=1485373
>> [ 05-13-2012 04:00:38.738 ] [GRID amr_refine_derefine]: initiating refinement
>> ----
>>
>> and the final output in the standard dump is:
>>
>> ----
>> iteration, no. not moved = 0 173352
>> iteration, no. not moved = 1 169097
>> iteration, no. not moved = 2 165742
>> iteration, no. not moved = 3 162432
>> <etc etc>
>> iteration, no. not moved = 98 19621
>> iteration, no. not moved = 99 18939
>> iteration, no. not moved = 100 18282
>> ----
>>
>> The code isn't crashing, and it appears that the processors are still
>> running according to qstat etc, but it hasn't produced any output at
>> all in over 24 hours.
>>
>> It looks like it isn't redistributing blocks properly, but I would
>> have assumed that would trigger an error message in
>> mpi_amr_redist_blk.F90:
>>
>> ----
>> if (nm2_old.eq.nm2.and.nm2.ne.0.and.nit>=100) then
>>    if (mype.eq.0) then
>>       print *,' ERROR: could not move all blocks in amr_redist_blk'
>>       print *,' Try increasing maxblocks or use more processors'
>>       print *,' nm2_old, nm2 = ',nm2_old,nm2
>>       print *,' ABORTING !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!'
>>    end if
>>    call MPI_ABORT(MPI_COMM_WORLD,errorcode,ierr)
>> end if
>> ----
>>
>> Any clues or suggestions?
>>
>> Cheers,
>> David Williamson
>>
>> PhD Candidate
>> St. Mary's University
>> Halifax, NS
>