<div dir="ltr">Hi Zhao,<div><br></div><div>I'm having some trouble understanding exactly the cases you're comparing. If you attach logfiles for each case, that should clear things up. </div><div><br></div><div>More generally, using more than one node / more processors increases the communication time so if your problem doesn't scale well (e.g., if you've written a lot of MPI_BCAST or MPI_ALL_REDUCE calls) then using more processors can result in a slower solution time.</div><div><br></div><div>Best,<br clear="all"><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div dir="ltr">--------<div>Ryan</div></div></div></div></div></div></div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Jun 17, 2022 at 7:19 AM 赵旭 <<a href="mailto:xuzhao1994@sjtu.edu.cn">xuzhao1994@sjtu.edu.cn</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Dear all,<br>

On Fri, Jun 17, 2022 at 7:19 AM 赵旭 <xuzhao1994@sjtu.edu.cn> wrote:

Dear all,

Sorry about the typo:

"the results show that in case 2) it takes double or even more time than 1)"

That is, when I use more than 1 node, it takes more time.

----- Original Message -----
From: "赵旭" <xuzhao1994@sjtu.edu.cn>
To: "flash-users" <flash-users@flash.rochester.edu>
Sent: Friday, June 17, 2022, 11:39:48 AM
Subject: [FLASH-USERS] MORE nodes MORE time on HPC

Dear FLASH users & developers,

I have a question about running the FLASH code on HPC. I am running a LaserSlab case modified from the default one (I changed the .par file and initBlock.F90 for the laser and target). I tried:
1) running on 1 node with 40 cores (1 node contains 40 cores), and
2) running on 2 nodes with 80 cores, with the same setup and parameters as in 1).

The results show that case 2) takes double or even more time than case 1), which seems counterintuitive, because if I run a case with a larger simulation box or finer resolution I have to use more cores.

I tried two HPC systems: a) 1 node = 40 cores with 192 GB total memory, and b) 1 node = 64 cores with 512 GB total memory. Running on 2 nodes takes about 2 times as long on a) and nearly 5 times as long on b).

I don't know whether this problem comes from settings related to the HPC (like the MPI and HYPRE versions, or the job system) or from settings related to the FLASH code (like in some source code files).

I use GCC 7.5, Python 3.8, MPICH 3.3.2, HYPRE 2.11.2, and HDF5 1.10.5.

Both HPCs use the Slurm job system, with a submission script like the one below:

#!/bin/bash

#SBATCH --job-name=          # job name
#SBATCH --partition=64c512g  # partition
#SBATCH -n 128               # total tasks
#SBATCH --ntasks-per-node=64 # tasks per node
#SBATCH --output=%j.out
#SBATCH --error=%j.err

mpirun ./flash4 > laser_slab.log

I would appreciate any help.

Thanks!

--
Zhao Xu
Laboratory for Laser Plasmas (MoE)
Shanghai Jiao Tong University
800 Dongchuan Rd, Shanghai 200240
_______________________________________________
flash-users mailing list
flash-users@flash.rochester.edu

For list info, including unsubscribe:
https://flash.rochester.edu/mailman/listinfo/flash-users