[FLASH-USERS] Problem linked to I/O on Lustre filesystem

Bertini, Denis Dr. D.Bertini at gsi.de
Thu May 23 14:43:28 EDT 2024


Dear FLASH developers,


We are running the LaserSlab simulation (FLASH 4.8) on a Lustre filesystem, v2.12.5.

When the number of MPI tasks per node exceeds a certain threshold (typically 100 MPI tasks),
the flash4 process hangs during I/O and the Lustre working directory in use becomes
inaccessible.

The remaining flash4 processes cannot be killed with standard signals, and we have to reboot
the node in order to free the resources.
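
For reference, processes that ignore standard signals are often blocked in uninterruptible sleep (state D) inside a kernel I/O call. Below is a minimal sketch, assuming a Linux node with /proc mounted, of how one might list the flash4 processes in that state. The function names parse_stat and uninterruptible are hypothetical; this is a diagnostic illustration, not part of FLASH or Lustre.

```python
import os


def parse_stat(stat_line):
    """Return (pid, command, state) from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split around the last ')'.
    left, right = stat_line.rsplit(")", 1)
    pid, comm = left.split("(", 1)
    state = right.split()[0]
    return int(pid), comm, state


def uninterruptible(name="flash4", proc="/proc"):
    """List PIDs of processes called `name` that are in state D."""
    hits = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except OSError:
            continue  # process exited while we were scanning
        if comm == name and state == "D":
            hits.append(pid)
    return sorted(hits)


if __name__ == "__main__":
    print(uninterruptible())
```

If the hung processes show up here, they are waiting inside the kernel (for example on a filesystem request), which would be consistent with signals having no effect until that call returns.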

Do you have any experience with such a problem on a Lustre filesystem?



Thanks in advance for any help.

Best regards,

Denis



PS: We use the option +hdf5iotype, so we are using MPI collective I/O.

I have attached to this mail a Darshan report on the FLASH I/O performance
on our filesystem.



---------
Denis Bertini
Abteilung: CIT
Ort: SB3 2.265a

Tel: +49 6159 71 2240
Fax: +49 6159 71 2986
E-Mail: d.bertini at gsi.de

GSI Helmholtzzentrum für Schwerionenforschung GmbH
Planckstraße 1, 64291 Darmstadt, Germany, www.gsi.de

-------------- next part --------------
A non-text attachment was scrubbed...
Name: dbertini_flash4_id2069359-578838_5-23-70004-3217764188434307772_1.darshan.pdf
Type: application/pdf
Size: 84062 bytes
Desc: dbertini_flash4_id2069359-578838_5-23-70004-3217764188434307772_1.darshan.pdf
URL: <http://flash.rochester.edu/pipermail/flash-users/attachments/20240523/bf9729ba/attachment-0001.pdf>

