[FLASH-USERS] Flash3 hanging in refinement

Dave david.john.williamson at gmail.com
Mon May 14 12:14:05 EDT 2012


Hi,

I would have thought that in that case it would have triggered the error 
message and quit, but it's possible it hung while trying to abort. I 
should be able to support more than 173352 blocks with my settings for 
MAXBLOCKS & my number of processors, but I guess that number is just the 
number of blocks that are being moved between threads, not the total 
number of blocks in the system?
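
One way to check whether the quoted abort test could have fired is to evaluate it against the "no. not moved" lines from the standard output. The following is only a minimal sketch, not FLASH or PARAMESH code: the helper names parse_not_moved and abort_would_fire are hypothetical, and it assumes that nit corresponds to the printed iteration number, nm2 to the printed count, and nm2_old to the count from the previous printed iteration.

```python
import re

# Matches lines such as "  iteration, no. not moved =           98       19621".
# A leading ">> " quote prefix is harmless because the pattern is searched, not anchored.
LINE_RE = re.compile(r"iteration, no\. not moved\s*=\s*(\d+)\s+(\d+)")


def parse_not_moved(text):
    """Return (iteration, blocks_not_moved) pairs found in the output text."""
    return [(int(m.group(1)), int(m.group(2))) for m in LINE_RE.finditer(text)]


def abort_would_fire(trace):
    """Evaluate the quoted condition: nm2_old == nm2, nm2 != 0, nit >= 100.

    ASSUMPTION: nit is the printed iteration number, nm2 the printed count,
    and nm2_old the count printed for the previous iteration.
    """
    if len(trace) < 2:
        return False
    (_, nm2_old), (nit, nm2) = trace[-2], trace[-1]
    return nm2_old == nm2 and nm2 != 0 and nit >= 100


# The lines quoted in this thread; the elided iterations stay elided.
sample = """\
  iteration, no. not moved =            0      173352
  iteration, no. not moved =            1      169097
  iteration, no. not moved =            2      165742
  iteration, no. not moved =            3      162432
<etc etc>
  iteration, no. not moved =           98       19621
  iteration, no. not moved =           99       18939
  iteration, no. not moved =          100       18282
"""

if __name__ == "__main__":
    trace = parse_not_moved(sample)
    print("last two entries:", trace[-2:])
    print("abort condition satisfied:", abort_would_fire(trace))
```

With the figures quoted here, the count is still falling between iterations 99 and 100 (18939 to 18282), so under the mapping assumed above the equality nm2_old == nm2 would not hold and the error branch would not be taken. Whether that explains the hang depends on what the surrounding loop in mpi_amr_redist_blk.F90 does next, which the snippet does not show.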

-David

On 14/05/2012 12:53 p.m., Seyit Hocuk wrote:
> Hi,
>
> It seems you do not have enough blocks per CPU, i.e., the run is 
> reaching the maximum number of blocks. Either increase MAXBLOCKS 
> (at compile time, or in the files Makefile and Flash.h, and then 
> rebuild), or increase the number of processors you are 
> using.
>
> Best,
> Seyit
>
>
>
> On 05/14/2012 05:50 PM, Dave wrote:
>> Hi,
>>
>> I am finding that FLASH hangs in the refinement step. The final 
>> output in the log file is:
>>
>> ----
>>  [ 05-13-2012  03:56:30.023 ] step: n=2132 t=4.078276E+12 dt=6.783726E+08
>>  [ 05-13-2012  04:00:08.019 ] [mpi_amr_comm_setup]: buffer_dim_send=1519097, buffer_dim_recv=1485373
>>  [ 05-13-2012  04:00:38.738 ] [GRID amr_refine_derefine]: initiating refinement
>> ----
>>
>> and the final output in the standard dump is:
>>
>> ----
>>   iteration, no. not moved =            0      173352
>>   iteration, no. not moved =            1      169097
>>   iteration, no. not moved =            2      165742
>>   iteration, no. not moved =            3      162432
>> <etc etc>
>>   iteration, no. not moved =           98       19621
>>   iteration, no. not moved =           99       18939
>>   iteration, no. not moved =          100       18282
>> ----
>>
>> The code isn't crashing, and it appears that the processors are still 
>> running according to qstat etc, but it hasn't produced any output at 
>> all in over 24 hours.
>>
>> It looks like it isn't redistributing blocks properly, but I would 
>> have assumed that would trigger an error message in 
>> mpi_amr_redist_blk.F90 :
>>
>> ----
>>       if (nm2_old.eq.nm2.and.nm2.ne.0.and.nit>=100) then
>>          if (mype.eq.0) then
>>           print *,' ERROR: could not move all blocks in amr_redist_blk'
>>           print *,' Try increasing maxblocks or use more processors'
>>           print *,' nm2_old, nm2 = ',nm2_old,nm2
>>           print *,' ABORTING !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!'
>>          end if
>>          call MPI_ABORT(MPI_COMM_WORLD,errorcode,ierr)
>>       end if
>> ----
>>
>> Any clues or suggestions?
>>
>> Cheers,
>> David Williamson
>>
>> PhD Candidate
>> St. Mary's University
>> Halifax, NS
>



