[FLASH-USERS] Timing problem in LaserSlab example

Reem Alraddadi raba500 at york.ac.uk
Tue Aug 20 11:29:08 EDT 2013

Hi Klaus,

When I increased the number of processes to 32 and decreased nend to 3000, the
run worked without any problem and completed successfully, reaching 109 ps.
However, when I changed only the Cu temperature from 25 eV to 50 eV, the
problem occurred again. Any ideas?


> On Fri, 16 Aug 2013, Reem Alraddadi wrote:
> > Hi all,
> >
> > I am running a modified LaserSlab example for three materials with the
> > following setup line:
> >
> > Slab3 -2d +pm4dev -nxb=16 -nyb=16 +mtmmmt +mgd species=cham,targ,foil
> > mgd_meshgroups=6 -parfile=example.par +3t -maxblocks=1000
> >
> > The run went OK the first time with lrefine_max=6, but when I used
> > lrefine_max=9 I got the following warning message:
> >
> > Warning: The initial timestep is too large.
> >
> > initial timestep = 1.000000000000000E-014
> >
> > CFL timestep = 5.139420973957487E-014
> >
> > Resetting dtinit to TIMESTEP_SLOW_START_FACTOR*dtcfl.
> [...]
> > So I reduced the initial time step, setting dtinit=dtmin=0.1e-14, and I
> > increased nend to 8000, as I am really interested in the time from 50 ps
> > to 200 ps. After doing that, the warning message no longer appears,
> > but ...
> Reem,
> You did not necessarily have to modify dtinit, since (as the Warning
> message tries to say) the initial dt was reset to 0.5139420973957487e-14
> automatically. (I believe TIMESTEP_SLOW_START_FACTOR is hardwired to 0.1.)
> It is good that you decreased dtmin. It may also be good that you
> decreased dtinit to below that automatic value.
> However, those things are probably unrelated to the ultimate failure of
> your runs.
> >   I found the run didn't complete; it only reached t=5e-11, and the step
> > count n was only 1809. Also, I found an output file related to the Hypre
> > library, which I didn't understand. I have attached to this e-mail my
> > flash.par file, out.log, lasslab.log and the output messages related to
> > Hypre. I need my run to cover the time from 50 ps to 200 ps. Could you
> > help me with this, please?
> It seems you got to ~ 53 ps. Then the run(s) failed.
> You did not provide the contents of the err.err file, even though
> your output.log says (twice):
>    PS:
>    Read file <err.err> for stderr output of this job.
> Maybe there are some messages in that file explaining why the run(s)
> failed.
> You may have exceeded the allowed CPU time in your batch system, or may
> have run out of memory, or there may be some similar reason.
> > PS: the flash.par is the same as example.par. Also, I wonder if the
> > problem arose because I ran the same simulation twice. I found that I
> > had run the same file twice by mistake. Is this the reason that the run
> > didn't complete?
> I don't know. But it seems you had two instances of the same simulation
> running in the same directory, overwriting each other's output files
> and both writing interspersed lines to the log file. It is unnecessarily
> difficult to analyze from this output what has actually happened.
> I would suggest you
>  * restart, making sure this time that only one instance of the code is
>    running.
>  * You may restart from a checkpoint instead of starting from step 1,
>    to save time (you seem to have dumped checkpoints frequently).
>  * If the run fails again, make sure you look at all available output
>    including any err.err file.
> > And what does the hypre message mean?
> Apparently the run (or at least one of them? - it is unclear whether
> the hypre stack traces were generated by one or by both runs) died while
> at least some processors were in a hypre function.  It is unclear
> to me whether those processors actually caused signals that resulted in
> aborting the runs.
> Btw. it isn't clear what version of FLASH you were using. I hope you are
> using 4.0.1, because it includes a hypre-related fix to 4.0.
> Klaus
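
The arithmetic behind the quoted warning can be checked with a short
Python sketch. It is only an illustration, not FLASH code: the function
name effective_dtinit is hypothetical, the factor 0.1 is the value Klaus
believes is hardwired (not verified here), and the reset condition
(dtinit larger than factor*dtcfl) is an assumption inferred from the
warning text. The two timestep values are copied from the warning.

```python
import math

# Values copied from the quoted warning message.
DT_INIT_FROM_WARNING = 1.0e-14            # "initial timestep"
DT_CFL_FROM_WARNING = 5.139420973957487e-14  # "CFL timestep"

# Assumption: 0.1, as stated (with "I believe") in the quoted reply.
TIMESTEP_SLOW_START_FACTOR = 0.1


def effective_dtinit(dt_requested, dt_cfl, factor=TIMESTEP_SLOW_START_FACTOR):
    """Hypothetical helper: return the initial timestep after the reset.

    Assumed rule: if the requested dtinit exceeds factor * dt_cfl, it is
    replaced by factor * dt_cfl; otherwise it is used unchanged.
    """
    limit = factor * dt_cfl
    return limit if dt_requested > limit else dt_requested


if __name__ == "__main__":
    # Original setting: 1e-14 is above the limit, so it would be reset.
    reset_value = effective_dtinit(DT_INIT_FROM_WARNING, DT_CFL_FROM_WARNING)
    print("dtinit after reset:", reset_value)

    # Reduced setting from the thread: 0.1e-14 is below the limit.
    reduced = effective_dtinit(0.1e-14, DT_CFL_FROM_WARNING)
    print("dtinit kept as requested:", reduced)

    # Agrees with the 0.5139420973957487e-14 quoted in the reply.
    print(math.isclose(reset_value, 0.5139420973957487e-14))
```

Under these assumptions, dtinit=0.1e-14 lies below the automatic value, so
no reset takes place, which matches the warning no longer being printed.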
