[FLASH-USERS] Issues with MPI and meshreplication and rad-hydro problems

Nelson, Peter nelspete at oregonstate.edu
Mon Mar 9 14:33:01 EDT 2020


Hello Flash-Users,

I had some more time to review the issues I'm having.  It appears I have
two separate issues.
1) My MPI isn't working right on one of my installs.  I have two
other installs that run MPI fine, so that's less of a concern at this
point.

2) meshCopyCount seems to multiply the internal energy initialized in the
Simulation_initBlock routine.  If I print solnData at the end of
Simulation_initBlock, everything looks fine, but once the simulation runs
the internal energy ends up multiplied by meshCopyCount.  I see the same
issue on all three systems (all FLASH4.6.2 installs).  I checked several of
the supplied test problems, and they all show the same problem.
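
To make the check in 2) concrete, here is a minimal sketch of how the
scaling can be tested from two runs of the same problem. It assumes the
integrated energies have been read by hand from the two .dat files; the
function names and the numbers are hypothetical placeholders, not FLASH
output.

```python
def energy_ratio(e_reference, e_replicated):
    """Ratio of the integrated energy in the replicated run to the reference run."""
    if e_reference == 0.0:
        raise ValueError("reference energy must be nonzero")
    return e_replicated / e_reference


def looks_like_copy_count_scaling(e_reference, e_replicated,
                                  mesh_copy_count, rel_tol=1e-6):
    """True if e_replicated is approximately mesh_copy_count * e_reference."""
    ratio = energy_ratio(e_reference, e_replicated)
    return abs(ratio - mesh_copy_count) <= rel_tol * mesh_copy_count


# Placeholder values for illustration only (not taken from an actual run).
e_one_copy = 1.0e12    # integrated energy with meshCopyCount = 1
e_two_copies = 2.0e12  # integrated energy with meshCopyCount = 2
print(looks_like_copy_count_scaling(e_one_copy, e_two_copies, 2))
```

A ratio that tracks meshCopyCount for several values (2, 3, 4, ...) would
point to the replication step rather than to the initial conditions.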

Am I supposed to change something in the initialization if I use
meshCopyCount > 1?
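
For reference on the MPI error in my earlier message (quoted below), the
sketch here spells out the condition MPI_Recv is reporting: a source rank
must lie in [0, size) for the communicator (ignoring the special values
MPI_ANY_SOURCE and MPI_PROC_NULL). The second helper rests on an assumption
I have not verified in the source: that mesh replication divides the
processes evenly among the mesh copies. Both function names are
hypothetical and are not part of FLASH or MPI.

```python
def is_valid_source_rank(src, comm_size):
    """MPI requires 0 <= src < comm_size for an explicit source rank."""
    return 0 <= src < comm_size


def ranks_per_mesh_copy(n_procs, mesh_copy_count):
    """Processes available to each mesh copy.

    Assumption: the processes are split evenly among the copies, so
    n_procs must be a multiple of mesh_copy_count.
    """
    if mesh_copy_count < 1 or n_procs % mesh_copy_count != 0:
        raise ValueError("n_procs must be a positive multiple of mesh_copy_count")
    return n_procs // mesh_copy_count


# The failing case from the error stack: src=40 with -np 2.
print(is_valid_source_rank(40, 2))

# With -np 2: one copy spans both ranks, two copies get one rank each.
print(ranks_per_mesh_copy(2, 1), ranks_per_mesh_copy(2, 2))
```

If that assumption holds, running with meshCopyCount equal to -np leaves
each mesh copy on a single process, which would explain why those runs
never hit the bad MPI_Recv.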

Regards,

Pete Nelson

On Tue, Mar 3, 2020 at 4:49 PM Nelson, Peter <nelspete at oregonstate.edu>
wrote:

> Hello Flash-Users,
>
>  I am having some trouble understanding how to run MPI with some rad-hydro
> problems.  At first I thought it was just my setup, but I experience the
> same issue with the laserSlab problem.
>
>   I have a 1-d, spherical, 3T, MGD Rad-hydro, parameshAMR simulation
> (FLASH4.6.2).   I initialize with a high radiation internal energy over a
> few cells at the center of the sphere.  When I initialize with a
> meshcopycount value of 1, the values I set during simulation_initBlock are
> correct before the simulation begins to evolve.  If I set the meshcopycount
> to be greater than 1, the total internal energy that is written into the
> interior cells looks correct when I check them before and after the
> eos_wrapped call in simulation_initBlock, but when the simulation starts
> the integrated internal energy is a factor of meshcopycount greater than
> the value I am trying to set.  The integrated internal energy is seen in
> the .dat file, and in the effect on the simulation.
>
> Changing the domain size, nblockx, split/unsplit, refinement level (same
> issue in uniform grid) does not change the internal energy values.  Only
> the meshcopycount value seems to cause the issue.
>
> I followed the setup suggestions regarding MGD.  I calculate the radiation
> temperature, use that to set the group energy values via RadTrans_mgdEFromT
> then put the tradActual value into the grid cell.
>
> I see the same issue in the laserSlab and Shafranov example problems.   I
> tried the RadBlastWave problem as well and it has essentially the same
> problem: E_total in the .dat file doubles when meshcopycount = 2.
>
> In all of the rad-hydro simulations, the only way I can run on multiple
> processors is by increasing the meshcopycount.  If I leave meshcopycount at
> 1, then I get an error such as (for the laserSlab example with -np 2):
>  "Driver init all done
> Fatal error in MPI_Recv: Invalid rank, error stack:
> MPI_Recv(204): MPI_Recv(buf=0x2ced6e0, count=40, MPI_BYTE, src=40,
> tag=1000, comm=0x84000000, status=0x7fffccf359c0) failed
> MPI_Recv(117): Invalid rank has value 40 but must be nonnegative and less
> than 2
> Fatal error in MPI_Recv: Invalid rank, error stack:
> MPI_Recv(204): MPI_Recv(buf=0x1f97f50, count=40, MPI_BYTE, src=40,
> tag=1000, comm=0x84000000, status=0x7ffe3edb32d0) failed
> MPI_Recv(117): Invalid rank has value 40 but must be nonnegative and less
> than 2"
>
> Anyone have thoughts on what might be causing this and how to avoid it?
>
> I ran a Sedov problem (hydro only) and had no issues with varying the -np
> #.
>
>
> Regards,
>
> Pete Nelson
>
> Oregon State University
> PhD Candidate
>
>