[FLASH-USERS] Scaling problems

pedro romero pedro.romero at greentownsbyfusion.com
Thu Apr 6 09:33:34 EDT 2023


Thank you very much for your replies. I will try with a higher number of zones per processor and if the sigbus error persists after some tests, I will message again.

Thank you all, once again.

Kind regards,

Pedro

From: flash-users <flash-users-bounces at flash.rochester.edu> On behalf of Hansen, Eddie
Sent: Thursday, April 6, 2023 1:56
To: Ryan Farber <rjfarber at umich.edu>; Leland Ellison <c.leland.ellison at gmail.com>
CC: flash-users at flash.rochester.edu
Subject: Re: [FLASH-USERS] Scaling problems

Hello all,

Another point about running with uniform grid…

If you use +ug, then you are setting a fixed block size during the setup. You then have to set these parameters in your par file: iProcs, jProcs, kProcs. As mentioned, these determine the number of blocks and the number of processors you must use.

You can instead run setup with +ug +nofbs. This is “no fixed block size”. In this case, you set these parameters in your par file: iGridSize, jGridSize, kGridSize (in addition to iProcs, jProcs, kProcs). The ‘Procs’ parameters still need to match the number of processors being used, but the block size is now determined by iGridSize/iProcs, jGridSize/jProcs, kGridSize/kProcs. So, in some sense, this might make it simpler to do strong or weak scaling.

For strong scaling, keep the ‘GridSize’ parameters constant and increase the ‘Procs’ parameters. Just don’t increase ‘Procs’ so much that your block size becomes < 8 cells (or start with larger ‘GridSize’ values).

For weak scaling, increase the ‘GridSize’ and ‘Procs’ parameters so that GridSize/Procs remains constant.
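
The two recipes above can be sketched in a few lines of Python. This is only an illustration of the GridSize/Procs arithmetic, not FLASH code: the function names, the example grid sizes, and the requirement that the grid divide evenly among processors are assumptions of the sketch. The 8-cell lower limit is the one mentioned above.

```python
MIN_BLOCK_CELLS = 8  # lower limit on block size quoted in the message above


def block_size(grid_size, procs):
    """Cells per block along each axis: GridSize / Procs (hypothetical helper)."""
    sizes = []
    for cells, nprocs in zip(grid_size, procs):
        # Assumption of this sketch: the grid divides evenly among processors.
        if cells % nprocs != 0:
            raise ValueError(f"{cells} cells do not divide evenly over {nprocs} procs")
        sizes.append(cells // nprocs)
    return tuple(sizes)


def is_large_enough(block):
    """True if no axis of the block has fewer than MIN_BLOCK_CELLS cells."""
    return all(cells >= MIN_BLOCK_CELLS for cells in block)


# Strong scaling: GridSize stays constant, Procs increases.
grid = (256, 256)  # illustrative iGridSize, jGridSize
for procs in [(2, 2), (8, 8), (32, 32), (64, 64)]:
    block = block_size(grid, procs)
    print("strong", procs, block, "ok" if is_large_enough(block) else "block too small")

# Weak scaling: GridSize grows with Procs so GridSize/Procs stays constant.
cells_per_block = 64  # illustrative
for procs in [(2, 2), (4, 4), (8, 8)]:
    grid = (cells_per_block * procs[0], cells_per_block * procs[1])
    print("weak", procs, grid, block_size(grid, procs))
```

In the strong-scaling loop the block shrinks from 128x128 to 4x4 cells, and the last case falls below the 8-cell limit. In the weak-scaling loop the block stays at 64x64 throughout.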

--
Eddie Hansen
Research Scientist
Flash Center for Computational Science


From: flash-users <flash-users-bounces at flash.rochester.edu> on behalf of Ryan Farber <rjfarber at umich.edu>
Date: Wednesday, April 5, 2023 at 2:34 PM
To: Leland Ellison <c.leland.ellison at gmail.com>
Cc: flash-users at flash.rochester.edu
Subject: Re: [FLASH-USERS] Scaling problems
Thanks to Lee for pointing this out and sorry to Pedro for glossing over your initial message where you mention that you tune iprocs, jprocs, nxb, and nyb to have about the same number of grid points (so indeed looking at strong scaling rather than weak scaling).

One more point regarding the SIGBUS issue: adding -mcmodel=large to your FFLAGS_* (in your Makefile.h) may help. On an Intel system, at least, this enforces absolute addressing, whereas the default mcmodel=small uses relative addressing. From what I recall, I saw a ~2% performance decrease when switching to mcmodel=large (from mcmodel=medium), but it helped with memory issues in the past at large core counts.

Best,
--------
Ryan


On Wed, Apr 5, 2023 at 5:06 PM Ryan Farber <rjfarber at umich.edu> wrote:
Hi Pedro,

One point of follow-up regarding your sigbus error - it looks like this is a memory access error. I'm wondering if you're requesting more logical cores than physical cores exist on the machine. I've found that to be problematic in the past.

It sounds like Lee might have the answer to your issue regarding too few zones per proc. One point I'm confused about, though, is whether you're studying strong or weak scaling. Based on your response to Paco, I checked the FLASH user's guide: in uniform grid mode there is one block per processor, and the number of zones per block is (usually) fixed at compile time. Doesn't that mean you're increasing the amount of work in proportion to the number of processors you use? In that case, seeing a constant "speedup" suggests good weak scaling.
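
To make the strong/weak distinction concrete, here is a minimal Python sketch of the usual efficiency definitions. The function names are hypothetical, and the timings in the example are placeholders chosen only to show the arithmetic; they are not taken from the log files in this thread.

```python
def strong_scaling_efficiency(t_base, n_base, t_n, n):
    """Fixed total problem size: ideally the time drops by a factor n / n_base."""
    speedup = t_base / t_n
    return speedup / (n / n_base)


def weak_scaling_efficiency(t_base, t_n):
    """Fixed work per processor: ideally the time stays constant."""
    return t_base / t_n


# Placeholder wall-clock times in seconds (NOT measured values).
t_12_cores = 100.0
t_36_cores = 95.0

# Read as a strong-scaling test, nearly flat timings are poor:
print(f"strong: {strong_scaling_efficiency(t_12_cores, 12, t_36_cores, 36):.2f}")

# Read as a weak-scaling test, the same timings would be good:
print(f"weak:   {weak_scaling_efficiency(t_12_cores, t_36_cores):.2f}")
```

The same pair of timings gives an efficiency near 0.35 under the strong-scaling definition and slightly above 1 under the weak-scaling one, which is why it matters which of the two is being measured.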

Best,
--------
Ryan


On Wed, Apr 5, 2023 at 4:17 PM Leland Ellison <c.leland.ellison at gmail.com> wrote:
Hi Pedro,

I suspect your per-block zone counts are too low to see benefits of adding more procs at this point. The details will depend on your specific problem and hardware of course, but when I've done strong scaling studies I've found (rapidly) diminishing returns to adding more procs once I fall below about 1000 zones per proc. I think this is what you're seeing in your nxb=nyb=16 and nxb=nyb=28 cases. If your scaling study continues up to several thousand zones per proc, I bet you'll see more of the expected behavior.
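
As a quick check of the zone counts involved, the following Python sketch works out zones per proc for the two runs described in the original message (12 cores with nxb=nyb=28 and 36 cores with nxb=nyb=16). It assumes a 2D uniform grid with one block per processor, as discussed elsewhere in this thread; the helper name is hypothetical.

```python
def zones_per_proc(nxb, nyb, blocks_per_proc=1):
    """Zones handled by each processor on a 2D grid (hypothetical helper)."""
    return nxb * nyb * blocks_per_proc


# (cores, nxb, nyb) for the two runs mentioned in the original message.
runs = [(12, 28, 28), (36, 16, 16)]

for ncores, nxb, nyb in runs:
    per_proc = zones_per_proc(nxb, nyb)
    total = ncores * per_proc
    print(f"{ncores} cores: {per_proc} zones per proc, {total} zones in total")
```

This prints 784 zones per proc (9408 in total) for the 12-core run and 256 zones per proc (9216 in total) for the 36-core run, so both runs cover roughly the same grid and both sit below the ~1000 zones per proc mentioned above.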

Hope this helps!
Lee

 ________________
Leland Ellison PhD
Computational Physicist
https://www.linkedin.com/in/clelandellison/
https://scholar.google.com/citations?user=1rfcVWgAAAAJ

On Wed, Apr 5, 2023 at 6:46 AM pedro romero <pedro.romero at greentownsbyfusion.com> wrote:
Hi Paco,

As far as I know, using uniform grid fixes the number of blocks as one per processor. Am I wrong? Do you mean to fix nxb and nyb while varying the cores?

From: Francisco Holguin <opaco at umich.edu>
Sent: Wednesday, April 5, 2023 15:17
To: pedro romero <pedro.romero at greentownsbyfusion.com>
CC: flash-users at flash.rochester.edu
Subject: Re: [FLASH-USERS] Scaling problems

Hi Pedro,

What if you just fix the number of blocks, and vary the cores?

-Paco

On Wed, Apr 5, 2023 at 5:13 AM pedro romero <pedro.romero at greentownsbyfusion.com> wrote:
Hi all,

I am trying to scale up on computational resources and I came across a few issues. First of all, I am running the same +ug example (a modification of the 2D Zpinch template) while varying the number of cores, nxb and nyb, but it shows no speedup as the number of cores increases (I am tuning iProcs, jProcs, nxb and nyb to always get an approximately equal grid).

Furthermore, at a certain number of cores the program execution is interrupted, and I get a SIGBUS error (which I attach to this message). Am I missing something? Is there anything else to consider?

I will also attach the log file of one successful run using 36 cores and nxb=nyb=16 (which shows little or no speed up in comparison to a run on 12 cores and nxb=nyb=28). Thank you in advance for any help.

Pedro


_______________________________________________
flash-users mailing list
flash-users at flash.rochester.edu

For list info, including unsubscribe:
https://flash.rochester.edu/mailman/listinfo/flash-users