<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div class="">Hello,</div><div class=""><br class=""></div><div class="">   My current FLASH build worked fine on the original Stampede and on small local clusters. But on both KNL and SKX nodes on Stampede2, it hangs during refinement in mpi_amr_redist_blk. If I build the initial simulation on a different cluster, the hang happens on Stampede2 the first time the grid structure changes. If I build the initial simulation on Stampede2, it hangs after triggering level 6 in the initial Simulation_initBlk loop, again in mpi_amr_redist_blk. </div><div class=""><br class=""></div><div class="">Setup call: </div><div class="">   ./setup -auto -3d -nxb=32 -nyb=16 -nzb=8 -maxblocks=200 species=rock,watr +uhd3tr mgd_meshgroups=1 Simulation_Buiild</div><div class=""><br class=""></div><div class="">   Using: ifort (IFORT) 17.0.4 20170411 </div><div class=""><br class=""></div><div class="">The tail of the log file is in Logfile.pdf.</div><div class=""><br class=""></div><div class="">I’ve changed maxblocks and the number of nodes, but neither resolved the issue. </div><div class=""><br class=""></div><div class="">I’ve changed the “iteration, no. not moved” output to print on every processor, and they all print identical, correct info. I’ve also added per-processor print statements before the nrecv>0 waitall and the nsend>0 waitall in mpi_amr_redist_blk.F90. About 25% of processors wait indefinitely in the nrecv>0 waitall, while the other 75% complete the redist_blk subroutine and then wait later for the remaining processors to finish. </div><div class=""><br class=""></div><div class="">I’ve tried adding sleep(1) inside the niter loop, as suggested in the past for someone who found niter going to 100 (note, I’m getting niter = 2 with no. 
not moved = 0, so all processors successfully exit that loop but hang later). This didn’t change the result.</div><div class=""><br class=""></div><div class="">Has anyone else seen a similar hang, on any cluster? Any suggestions for overcoming it? </div><div class=""><br class=""></div><div class="">Thank you for your help,</div><div class="">  -Mark</div><div class=""><br class=""></div><div class="">  </div></body></html>