<div dir="ltr">Dear all,<div>I am trying to replicate the multithreaded runs of the Multipole solver shown in Section 43.5.1, Figure 43.1 of the FLASH 4.8 User's Guide. I compiled the AMR "threadBlockList" case with the following command:</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">./setup unitTest/Gravity/Poisson3 -auto -3d -maxblocks=600 -opt +pm4dev +newMpole +noio threadBlockList=True timeMultipole=True</blockquote><div><br></div><div>and submitted the jobs to the cluster using the following Slurm directives:</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">#SBATCH --nodes=1<br>#SBATCH --ntasks=1<br>#SBATCH --cpus-per-task=&lt;T&gt;</blockquote><div><br></div><div>where &lt;T&gt; ranged from 1 to 8, one value per submitted job. The flash.log file for the &lt;T&gt; = 3 case, for example, confirms that the number of OpenMP threads per MPI task is indeed 3:</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> Number of MPI tasks:                  1<br> MPI version:                          3<br> MPI subversion:                       1<br> MPI thread support:                   T<br> OpenMP threads/MPI task:              3<br> OpenMP version:                       201511<br> Is "_OPENMP" macro defined:           T </blockquote><div><br></div><div>I get the following speedup results:</div><div></div><img src="cid:ii_mhp1sbcg1" alt="image.png" width="503" height="342" style="margin-right: 0px;"><br><div></div><div>whereas the User's Guide shows the following speedup (Figure 43.1):</div><div><img src="cid:ii_mhp1ubx32" alt="image.png" width="503" height="377"></div><div><br></div><div>I would appreciate any help in resolving this discrepancy.</div><div><br></div><div>Kind 
regards,</div><div>Seyed<br></div></div>