<div dir="ltr"><div dir="ltr">Hi Gabriel,<div><br></div><div>I've seen stalls like this when my Makefile pointed to a different mpi library than I was using to execute the program. I'll echo Adam Reyes' suggestion to carefully check which mpi versions you're using throughout the build (especially if your "site" changed when you switched OS?) and execution. </div><div><br></div><div>Good luck!</div><div>Leland</div></div><br clear="all"><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><span style="color:rgb(34,34,34)">________________</span><div style="color:rgb(34,34,34)">Leland Ellison PhD<div>Pacific Fusion</div><div>Lead - Modeling and Simulations</div></div></div></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Oct 15, 2024 at 5:01 AM Ryan Farber <<a href="mailto:rjfarber@umich.edu">rjfarber@umich.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Hi Gabriel,<br><br>Unfortunately, I haven't had a "stalling" FLASH issue in quite some time. From what I recall (aligned with what Adam mentioned), I found my issue occurred during MPI communication. I think my fix was to switch MPI packages. So, you might want to try using a newer version of openmpi (in case openSUSE had a bug in 4.0.5 which got patched) or switch to mpich etc. 
If you really want to trace where the stall occurs, you can try using a parallel debugging tool to step through your code (or manually by adding write statements).<div><div><br></div><div>Best wishes,<br><div><div><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div dir="ltr">--------<div>Ryan</div></div></div></div></div></div></div><br></div></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Oct 15, 2024 at 4:37 AM Gabriel Pérez Callejo <<a href="mailto:gabriel.perez.callejo@uva.es" target="_blank">gabriel.perez.callejo@uva.es</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

  
  <div>
    <p>Dear both,</p>
    <p>Thanks for the help. I have added the DEBUG line to the setup and
      the "print" line to both Timers_start and Timers_stop. I am
      attaching the new log, stdout and stderr files (the STDOUT file is
      now significantly larger). I am still quite unsure what is causing
      the stall... The stalling does indeed only happen for parallel runs.</p>
    <p>Thanks again,<br>
    </p>
    <div><b>Gabriel Pérez-Callejo</b> <br>
      Profesor Ayudante Doctor (Assistant Professor) <br>
      Departamento de Física Teórica, Atómica y Óptica <br>
      Universidad de Valladolid <br>
      Valladolid, Spain <br>
      +34 983 18 6513 <br>
      <br>
      <br>
    </div>
    <div>On 15/10/24 at 12:37, Reyes, Adam
      wrote:<br>
    </div>
    <blockquote type="cite">
      
      Hi Gabriel & Ryan,
      <div><br>
      </div>
      <div>Does the stalling happen only for parallel runs (mpirun -np
        with more than one process)? It seems likely that it is stalling
        during some MPI communication. In my experience, making sure all the
        dependencies are built consistently with the same compilers
        and MPI has helped.</div>
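      <div><br>
      </div>
      <div>A minimal sketch of such a consistency check, assuming a Linux system with ldd available and the executable name flash4 used in this thread. The helper name mpi_libs is hypothetical. It prints which mpirun and mpif90 are first on the PATH and which MPI shared libraries the executable resolves to, so a mismatch between build and run would show up as different installation prefixes:</div>
      <pre>
```shell
#!/bin/sh
# Sketch: compare the MPI found on the PATH with the MPI library that the
# executable is linked against. "./flash4" is the executable name used in
# this thread; everything else here is an illustrative assumption.

# Read ldd-style output on stdin and print the resolved path of every
# MPI shared library (or mark it as not found).
mpi_libs() {
    awk '/libmpi/ { print ($3 == "not" ? $1 " (not found)" : $3) }'
}

exe=./flash4

if command -v mpirun >/dev/null 2>&1; then
    echo "mpirun: $(command -v mpirun)"
    mpirun --version 2>&1 | head -n 1
fi

if command -v mpif90 >/dev/null 2>&1; then
    echo "mpif90: $(command -v mpif90)"
fi

if [ -x "$exe" ] && command -v ldd >/dev/null 2>&1; then
    echo "MPI libraries resolved for $exe:"
    ldd "$exe" | mpi_libs
fi
```
      </pre>
      <div>If the library paths point to a different installation than the one providing mpirun, the build and the run are probably using different MPI packages.</div>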
      <div><br>
      </div>
      <div>As for pinpointing where exactly the stall is happening there
        are a couple of things you can try:</div>
      <div><br>
      </div>
      <div>* Set up with "-defines=DEBUG_ALL"; this will turn on a lot of
        debugging messages in all the FLASH units.</div>
      <div><br>
      </div>
      <div>* You can add a line like </div>
      <div>
        <blockquote type="cite">print *, "timers start", name</blockquote>
      </div>
      <div> to Timers_start.F90 and Timers_stop.F90.</div>
      <div><br>
      </div>
      <div>Both of these should print plenty of messages to help narrow
        down where exactly the code is stalling.</div>
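      <div><br>
      </div>
      <div>One possible way to sift through those messages, sketched under assumptions: if the run is launched with Open MPI's --tag-output option, each output line should carry a prefix such as [1,0]&lt;stdout&gt;: that identifies the rank. The helper below (last_timer_per_rank, a hypothetical name) reports the last "timers start" message seen from each rank. The file name flash_stdout.txt is a placeholder.</div>
      <pre>
```shell
#!/bin/sh
# Sketch: report the last "timers start" message printed by each MPI rank.
# Assumes the run was launched with Open MPI's --tag-output, for example
#   mpirun --tag-output -np 3 ./flash4 > flash_stdout.txt 2>&1
# so that stdout lines look like "[1,0]<stdout>: timers start ...".
# The tag format and the file name flash_stdout.txt are assumptions.

last_timer_per_rank() {
    awk -F'<stdout>:' '
        /timers start/ {
            msg = $2
            sub(/^ +/, "", msg)   # drop leading blanks from the Fortran output
            last[$1] = msg        # keep only the most recent message per rank tag
        }
        END { for (tag in last) print tag, last[tag] }
    ' | LC_ALL=C sort
}

if [ -f flash_stdout.txt ]; then
    last_timer_per_rank < flash_stdout.txt
fi
```
      </pre>
      <div>If every rank stops at the same timer, the stall is probably inside that section. If one rank lags behind, the others may be waiting for it in a collective call.</div>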
      <div>
        <div><span>*********************************************<br>
            Adam Reyes</span><br>
          <span>Code Group Leader, Flash Center for Computational Science</span><br>
          <span>Research Scientist, Dept. of Physics and Astronomy</span><br>
          <span>University of Rochester</span><br>
          <span>River Campus: Bausch and Lomb Hall, 369</span><br>
          <span>500 Wilson Blvd. PO Box 270171, Rochester, NY 14627</span><br>
          <span>Email <a href="mailto:adam.reyes@rochester.edu" target="_blank">adam.reyes@rochester.edu</a></span><br>
          <span>Web <a href="https://flash.rochester.edu" target="_blank">https://flash.rochester.edu</a></span><br>
          <span>(he / him / his)</span><br>
          <div>*********************************************</div>
        </div>
        <div><br>
          <blockquote type="cite">
            <div>On Oct 15, 2024, at 12:18 PM, Gabriel Pérez Callejo
              <a href="mailto:gabriel.perez.callejo@uva.es" target="_blank"><gabriel.perez.callejo@uva.es></a> wrote:</div>
            <br>
            <div>
              
              <div>
                <p>Hi Ryan,</p>
                <p>Thanks for the quick response. I am attaching to this
                  email the STDOUT, STDERR and log files.</p>
                <p>To answer your questions, the simulation does stall.
                  The ps command shows the parallel processes, as well
                  as mpirun, as active, but no progress is made: nothing
                  is printed to the log, STDOUT or STDERR files, and
                  <i>top</i> shows that the machine is not doing any
                  work on FLASH.</p>
                <p>I have retried with +noio and -debug in my setup
                  command, but the behavior is identical: the same
                  problem occurs.</p>
                <p>Best,<br>
                </p>
                <div>On 15/10/24 at 12:00,
                  Ryan Farber wrote:<br>
                </div>
                <blockquote type="cite">
                  <div dir="ltr">Hi Gabriel,
                    <div><br>
                    </div>
                    <div>I have encountered (and am to some extent still
                      trying to understand) a similar, possibly the
                      same, issue (also with FLASH 4.8). I think the
                      issue I usually encounter is caused by running
                      out of memory, but it may also be related to
                      HDF5...</div>
                    <div><br>
                    </div>
                    <div>Regarding your issue, does the run just stall,
                      such that ps aux | grep flash shows the process is
                      running but the simulation makes no progress in
                      writing to your log file or STDOUT/STDERR
                      file(s)?<br>
                      <br>
                      Or does the run die? [Some error is encountered, or
                      ps no longer shows the process, or it's in a
                      completing, I/O, or zombie state.]<br>
                      <br>
                      It would be helpful if you can attach your log
                      file and your STDOUT/STDERR file(s). It would also
                      be useful if you try using +noio to determine if
                      you have an HDF5 issue, and -debug to provide a
                      traceback if an exception is raised.<br>
                      <br>
                      It's interesting that this happened just from
                      changing distributions. I hope you
                      reinstalled openmpi, hdf5, etc. on the new OS
                      rather than copying your installations from your
                      old OS?<br>
                      <br>
                      Best wishes,<br clear="all">
                      <div>
                        <div dir="ltr" class="gmail_signature">
                          <div dir="ltr">
                            <div>
                              <div dir="ltr">
                                <div dir="ltr">--------
                                  <div>Ryan</div>
                                </div>
                              </div>
                            </div>
                          </div>
                        </div>
                      </div>
                      <br>
                    </div>
                  </div>
                  <br>
                  <div class="gmail_quote">
                    <div dir="ltr" class="gmail_attr">On Tue, Oct 15,
                      2024 at 2:51 AM Gabriel Pérez Callejo <<a href="mailto:gabriel.perez.callejo@uva.es" target="_blank">gabriel.perez.callejo@uva.es</a>>
                      wrote:<br>
                    </div>
                    <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                      <div>
                        <p>Dear all,</p>
                        <p>I have been using FLASH for a while on Ubuntu
                          18 and am now moving to the openSUSE Linux
                          distribution. However, when running
                          FLASH in parallel mode, I am encountering the
                          following problem.</p>
                        <p>I am testing the LaserSlab example, with
                          FLASH4.6.2, using hdf5-1.10.7, hypre-2.11.2
                          and openmpi-4.0.5 (same as I used in Ubuntu
                          18). <br>
                        </p>
                        <p>I set up the simulation with <i>"./setup
                            -auto LaserSlab -2d +cylindrical -nxb=16
                            -nyb=16 +hdf5typeio species=cham,targ +mtmmmt
                            +laser +uhd3t +mgd mgd_meshgroups=6
                            -parfile=example.par"</i>, then
                          move to the <i>object</i> directory, run
                          <i>"make -j"</i> and, after it reports SUCCESS,
                          run <i>"mpirun -np 3 flash4"</i>. <br>
                        </p>
                        <p>This is what I used to do on Ubuntu, but
                          here the calculation is initialized and, after
                          printing <i>"Initial dt verified"</i>, nothing
                          else happens. The code does not move forward. I can
                          see that the chk_0000 file has been generated,
                          but not the plt_0000 file.</p>
                        <p>Has anyone encountered this problem before?
                          Does anyone have any suggestions on how to fix
                          it?</p>
                        <p>Best,<br>
                        </p>
                      </div>
                      _______________________________________________<br>
                      flash-users mailing list<br>
                      <a href="mailto:flash-users@flash.rochester.edu" target="_blank">flash-users@flash.rochester.edu</a><br>
                      <br>
                      For list info, including unsubscribe:<br>
                      <a href="https://flash.rochester.edu/mailman/listinfo/flash-users" rel="noreferrer" target="_blank">https://flash.rochester.edu/mailman/listinfo/flash-users</a><br>
                    </blockquote>
                  </div>
                </blockquote>
              </div>
              <span id="m_-5201361556784230196m_2347196953211066131cid:F1587759-672D-4571-B16D-58436AC90EA1"><lasslab.log></span><span id="m_-5201361556784230196m_2347196953211066131cid:822F86A6-25C0-4EC4-8773-CBFDBE295191"><STDERR></span><span id="m_-5201361556784230196m_2347196953211066131cid:2F8F9989-548F-4AA5-A5BE-BEFDC4643712"><STDOUT.txt></span>
            </div>
          </blockquote>
        </div>
        <br>
      </div>
    </blockquote>
  </div>
</blockquote></div>
</blockquote></div>