[FLASH-USERS] Fatal error in MPI_Sendrecv: Unknown error class, error stack
g.granda at irya.unam.mx
Sun Dec 23 20:24:15 EST 2018
Hello Flash users,
I've been running hydrodynamical simulations on a uniform grid for a
while without any trouble. However, I recently got an error after
increasing the resolution. I had increased the resolution of these
simulations before without any problem, which makes me guess that this
issue could be related to running out of memory, but I'm not sure.
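As a rough sanity check on the memory guess, here is a back-of-the-envelope
estimate of the per-rank solution-array size (a sketch only: the block size
and guard-cell count are taken from the log below, while nvar, the number of
stored variables, is an assumed round number, since the real count depends
on the setup):

```python
# Rough per-rank memory estimate for the unk array in a FLASH uniform grid run.
# nxb/nyb/nzb and nguard come from the log below; nvar is an ASSUMED value.
nxb, nyb, nzb = 64, 128, 128   # zones per block (-nxb, -nyb, -nzb)
nguard = 4                     # guard cells per side (iguard/jguard/kguard)
nvar = 20                      # assumed number of solution variables
bytes_per_real = 8             # -fdefault-real-8

# Each block is padded by guard cells on both sides in every direction.
cells = (nxb + 2*nguard) * (nyb + 2*nguard) * (nzb + 2*nguard)
unk_bytes = cells * nvar * bytes_per_real
print(f"unk array per rank: {unk_bytes / 1024**2:.0f} MiB")
# -> unk array per rank: 203 MiB
```

That is before Hydro scratch arrays and MPI buffers, so the true footprint
per rank is larger, but it gives a lower bound to compare against the
memory available per core on the cluster nodes.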
My log file shows the following:
FLASH log file: 12-23-2018 16:37:39.277 Run number: 1
==============================================================================
Number of MPI tasks: 1024
MPI version: 3
MPI subversion: 1
Dimensionality: 3
Max Number of Blocks/Proc: 1
Number x zones: 64
Number y zones: 128
Number z zones: 128
Setup stamp: Wed Dec 12 23:25:03 2018
Build stamp: Wed Dec 12 23:25:32 2018
System info: Linux mouruka.crya.privado 2.6.32-504.16.2.el6.x86_64
#1 SMP Wed Apr 22 06:48:29
Version: FLASH 4.5_release
Build directory: /home/guido/FLASH4.5/obj_lr
Setup syntax: /home/guido/FLASH4.5/bin/setup.py LinearRegime_mz
-auto -3d -objdir=obj_lr +ug -site=irya.guido -nxb=64 -nyb=128 -nzb=128
f compiler flags:
/home/guido/libraries/compiled_with_gcc-7.3.0/mpich-3.2.1/bin/mpif90 -g
-c -O2 -fdefault-real-8 -fdefault-double-8 -ffree-line-length-none
-Wuninitialized -ggdb -c -O2 -fdefault-real-8 -fdefault-double-8
-ffree-line-length-none -Wuninitialized -DMAXBLOCKS=1 -DNXB=64 -DNYB=128
-DNZB=128 -DN_DIM=3
c compiler flags:
/home/guido/libraries/compiled_with_gcc-7.3.0/mpich-3.2.1/bin/mpicc
-I/home/guido/libraries/compiled_with_gcc-7.3.0/hdf5-1.8.20/include
-DH5_USE_16_API -O2 -c -DMAXBLOCKS=1 -DNXB=64 -DNYB=128 -DNZB=128
-DN_DIM=3
==============================================================================
Comment: Linear Regime
==============================================================================
FLASH Units used:
Driver/DriverMain/Unsplit
Driver/localAPI
Grid/GridBoundaryConditions
Grid/GridMain/UG
Grid/localAPI
IO/IOMain/hdf5/serial/UG
IO/localAPI
PhysicalConstants/PhysicalConstantsMain
RuntimeParameters/RuntimeParametersMain
Simulation/SimulationMain/LinearRegime_mz
flashUtilities/contiguousConversion
flashUtilities/general
flashUtilities/interpolation/oneDim
flashUtilities/nameValueLL
flashUtilities/system/memoryUsage/legacy
monitors/Logfile/LogfileMain
monitors/Timers/TimersMain/MPINative
physics/Eos/EosMain/Gamma
physics/Eos/localAPI
physics/Hydro/HydroMain/unsplit/Hydro_Unsplit
physics/Hydro/localAPI
physics/sourceTerms/Cool/CoolMain/equilibrium_cooling
==============================================================================
RuntimeParameters:
==============================================================================
bndpriorityone = 1
bndprioritythree = 3
bndprioritytwo = 2
checkpointfileintervalstep = 100000000 [CHANGED]
checkpointfilenumber = 0
dr_abortpause = 2
dr_dtminbelowaction = 1
dr_numposdefvars = 4
drift_break_inst = 0
drift_trunc_mantissa = 2
drift_verbose_inst = 0
eos_entrelescalechoice = 6
eos_loglevel = 700
fileformatversion = 9
forcedplotfilenumber = 0
hy_3torder = -1
hydrocomputedtoption = -1
igridsize = 1
iprocs = 16 [CHANGED]
iguard = 4
irenorm = 0
jgridsize = 1
jprocs = 8 [CHANGED]
jguard = 4
kgridsize = 1
kprocs = 8 [CHANGED]
kguard = 4
memory_stat_freq = 100000
meshcopycount = 1
nbegin = 1
nblockx = 1
nblocky = 1
nblockz = 1
nend = 1000000 [CHANGED]
nsteptotalsts = 5
order = 2
outputsplitnum = 1
plotfileintervalstep = 1000000 [CHANGED]
plotfilenumber = 0
rolling_checkpoint = 10000
sim_nk = 2 [CHANGED]
sweeporder = 123
transorder = 1
wr_integrals_freq = 1
limitedslopebeta = 0.100E+01
t_cool_min = 0.100E+02 [CHANGED]
cfl = 0.500E+00 [CHANGED]
checkpointfileintervaltime = 0.343E+14 [CHANGED]
checkpointfileintervalz = 0.180+309
cvisc = 0.100E+00
dr_dtmincontinue = 0.000E+00
dr_posdefdtfactor = 0.100E+01
dr_tstepslowstartfactor = 0.100E+00
dtinit = 0.137E+13 [CHANGED]
dtmax = 0.137E+13 [CHANGED]
dtmin = 0.137E+11 [CHANGED]
eintswitch = 0.000E+00
eos_singlespeciesa = 0.100E+01
eos_singlespeciesz = 0.100E+01
gamma = 0.167E+01 [CHANGED]
hy_cflfallbackfactor = 0.900E+00
hy_fpresinmomflux = 0.100E+01
hybridorderkappa = 0.000E+00
mu_mol = 0.127E+01
nd_cool_max = 0.100E+21
nd_cool_min = 0.100E-01
nusts = 0.100E+00
plotfileintervaltime = 0.137E+13 [CHANGED]
plotfileintervalz = 0.180+309
radiusgp = 0.200E+01
rss_limit = -0.100E+01
sigmagp = 0.300E+01
sim_num_dens = 0.300E+01
sim_rho_amp = 0.100E-02
sim_temp = 0.730E+03
small = 0.100E-39 [CHANGED]
smalle = 0.100E-09
smallp = 0.100E-21 [CHANGED]
smallt = 0.100E+01 [CHANGED]
smallu = 0.100E-39 [CHANGED]
smallx = 0.100E-09
smlrho = 0.100E-29 [CHANGED]
tinitial = 0.000E+00
tiny = 0.100E-15
tmax = 0.137E+15 [CHANGED]
tstep_change_factor = 0.200E+01
wall_clock_checkpoint = 0.108E+05 [CHANGED]
wall_clock_time_limit = 0.605E+06
xmax = 0.154E+20 [CHANGED]
xmin = -0.154E+20 [CHANGED]
ymax = 0.154E+20 [CHANGED]
ymin = -0.154E+20 [CHANGED]
zfinal = 0.000E+00
zinitial = -0.100E+01
zmax = 0.154E+20 [CHANGED]
zmin = -0.154E+20 [CHANGED]
riemannsolver = HLLC
unitsystem = CGS [CHANGED]
basenm = lr_ [CHANGED]
dr_posdefvar_1 = none
dr_posdefvar_2 = none
dr_posdefvar_3 = none
dr_posdefvar_4 = none
entropyfixmethod = HARTENHYMAN
eosmode = dens_pres [CHANGED]
eosmodeinit = dens_ie
geometry = cartesian
grav_boundary_type = isolated
hy_eosmodegc = see eosMode
log_file = lr.log [CHANGED]
output_directory =
pc_unitsbase = CGS
plot_grid_var_1 = none
plot_grid_var_10 = none
plot_grid_var_11 = none
plot_grid_var_12 = none
plot_grid_var_2 = none
plot_grid_var_3 = none
plot_grid_var_4 = none
plot_grid_var_5 = none
plot_grid_var_6 = none
plot_grid_var_7 = none
plot_grid_var_8 = none
plot_grid_var_9 = none
plot_var_1 = dens [CHANGED]
plot_var_10 = none
plot_var_11 = none
plot_var_12 = none
plot_var_2 = pres [CHANGED]
plot_var_3 = temp [CHANGED]
plot_var_4 = eint [CHANGED]
plot_var_5 = velx [CHANGED]
plot_var_6 = vely [CHANGED]
plot_var_7 = velz [CHANGED]
plot_var_8 = none
plot_var_9 = none
prof_file = profile.dat
refine_var_thresh = none
run_comment = Linear Regime [CHANGED]
run_number = 1
slopelimiter = vanLeer
stats_file = flash.dat
wenomethod = WENO5
xl_boundary_type = periodic
xr_boundary_type = periodic
yl_boundary_type = periodic
yr_boundary_type = periodic
zl_boundary_type = periodic
zr_boundary_type = periodic
eosforriemann = F
addthermalflux = T
allowdtstsdominate = F
alwayscomputeuservars = T
alwaysrestrictcheckpoint = T
charlimiting = T
chkguardcellsinput = F
chkguardcellsoutput = F
compute_grid_size = T
conserveangmom = F
converttoconsvdformeshcalls = F
corners = F
dr_printtsteploc = T
dr_shortenlaststepbeforetmax = F
dr_useposdefcomputedt = F
drift_tuples = F
eachprocwritesownabortlog = F
eachprocwritessummary = F
entropy = F
flux_correct = F
geometryoverride = F
gr_bcenableapplymixedgds = T
hy_fallbacklowercfl = F
hy_fullspecmsfluxhandling = T
ignoreforcedplot = F
io_writemscalarintegrals = F
plotfilegridquantitydp = F
plotfilemetadatadp = F
reducegcellfills = F
restart = F
shockdetect = F
shocklowercfl = F
summaryoutputonly = F
threadblocklistbuild = F
threaddriverblocklist = F
threaddriverwithinblock = F
threadeoswithinblock = F
threadhydroblocklist = F
threadhydrowithinblock = F
threadraytracebuild = F
threadwithinblockbuild = F
typematchedxfer = T
unbiased_geometry = F
updatehydrofluxes = T
useburn = F
usecollectivehdf5 = T
useconductivity = F
usecool = F
usecosmology = F
usedeleptonize = F
usediffuse = F
usediffusecomputedtspecies = F
usediffusecomputedttherm = F
usediffusecomputedtvisc = F
usediffusecomputedtmagnetic = F
useenergydeposition = F
useflame = F
usegravity = F
useheat = F
useheatexchange = F
usehydro = T
useincompns = F
useionize = F
uselegacylabels = T
usemagneticresistivity = F
usemassdiffusivity = F
useopacity = F
useparticles = F
useplasmastate = F
usepolytrope = F
useprimordialchemistry = F
useprotonemission = F
useprotonimaging = F
useradtrans = F
useraytrace = F
usests = F
usestsfordiffusion = F
usestir = F
usethomsonscattering = F
usetreeray = F
useturb = T
useviscosity = F
usexrayimaging = F
use_3dfullctu = T
use_auxeinteqn = T
use_avisc = F
use_cma_advection = F
use_cma_flattening = F
use_flattening = F
use_gravhalfupdate = T
use_hybridorder = F
use_steepening = F
use_upwindtvd = F
writestatsummary = T
==============================================================================
Known units of measurement:

        Unit        CGS Value     Base Unit
   1    cm          1.0000        cm
   2    s           1.0000        s
   3    g           1.0000        g
   4    K           1.0000        K
   5    esu         1.0000        esu
   6    mol         1.0000        mol
   7    m           100.00        cm
   8    km          1.00000E+05   cm
   9    pc          3.08568E+18   cm
  10    kpc         3.08568E+21   cm
  11    Mpc         3.08568E+24   cm
  12    Gpc         3.08568E+27   cm
  13    Rsun        6.96000E+10   cm
  14    AU          1.49598E+13   cm
  15    yr          3.15569E+07   s
  16    Myr         3.15569E+13   s
  17    Gyr         3.15569E+16   s
  18    kg          1000.0        g
  19    Msun        1.98892E+33   g
  20    amu         1.66054E-24   g
  21    eV          11605.        K
  22    C           2.99792E+09   esu
  23    LFLY        3.08568E+24   cm
  24    TFLY        2.05759E+17   s
  25    MFLY        9.88470E+45   g
  26    clLength    3.08568E+24   cm
  27    clTime      3.15569E+16   s
  28    clMass      1.98892E+48   g
  29    clTemp      1.16045E+07   K
-----------End of Units--------------------
Known physical constants:

        Constant Name        Constant Value    cm     s     g     K    esu   mol
   1    Newton               6.67408E-08       3.0  -2.0  -1.0   0.0   0.0   0.0
   2    speed of light       2.99792E+10       1.0  -1.0   0.0   0.0   0.0   0.0
   3    Planck               6.62607E-27       2.0  -1.0   1.0   0.0   0.0   0.0
   4    electron charge      4.80320E-10       0.0   0.0   0.0   0.0   1.0   0.0
   5    electron mass        9.10938E-28       0.0   0.0   1.0   0.0   0.0   0.0
   6    proton mass          1.67262E-24       0.0   0.0   1.0   0.0   0.0   0.0
   7    fine-structure       7.29735E-03       0.0   0.0   0.0   0.0   0.0   0.0
   8    Avogadro             6.02214E+23       0.0   0.0   0.0   0.0   0.0  -1.0
   9    Boltzmann            1.38065E-16       2.0  -2.0   1.0  -1.0   0.0   0.0
  10    ideal gas constant   8.31446E+07       2.0  -2.0   1.0  -1.0   0.0  -1.0
  11    Wien                 0.28978           1.0   0.0   0.0   1.0   0.0   0.0
  12    Stefan-Boltzmann     5.67037E-05       0.0  -3.0   1.0  -4.0   0.0   0.0
  13    Radiation Constant   7.56572E-15      -1.0  -2.0   1.0  -4.0   0.0   0.0
  14    pi                   3.1416            0.0   0.0   0.0   0.0   0.0   0.0
  15    e                    2.7183            0.0   0.0   0.0   0.0   0.0   0.0
  16    Euler                0.57722           0.0   0.0   0.0   0.0   0.0   0.0
==============================================================================
Multifluid database: not configured in
==============================================================================
[ 12-23-2018 16:37:39.287 ] [gr_initGeometry] checking BCs for idir: 1
[ 12-23-2018 16:37:39.288 ] [gr_initGeometry] checking BCs for idir: 2
[ 12-23-2018 16:37:39.289 ] [gr_initGeometry] checking BCs for idir: 3
Meanwhile, the error file shows:
Fatal error in MPI_Sendrecv: Unknown error class, error stack:
MPI_Sendrecv(237)..........: MPI_Sendrecv(sbuf=0x184d22a0, scount=1,
dtype=USER<hvector>, dest=1, stag=1, rbuf=0x185982a0, rcount=1,
dtype=USER<hvector>, src=3, rtag=1, comm=0x84000007,
status=0x7fffd42b79b0) failed
MPID_nem_tcp_connpoll(1845): Communication error with rank 794:
Connection reset by peer
/var/spool/torque/mom_priv/jobs/2392.mouruka.crya.privado.SC: line 12:
/home/guido: is a directory
Fatal error in PMPI_Bcast: Unknown error class, error stack:
PMPI_Bcast(1600)........: MPI_Bcast(buf=0x7fff76136018, count=1,
MPI_INTEGER, root=0, MPI_COMM_WORLD) failed
MPIR_Bcast_impl(1452)...:
MPIR_Bcast(1476)........:
MPIR_Bcast_intra(1249)..:
MPIR_SMP_Bcast(1081)....:
MPIR_Bcast_binomial(285):
MPIC_Send(300)..........:
MPID_Send(75)...........: Communication error with rank 8
MPIR_SMP_Bcast(1088)....:
MPIR_Bcast_binomial(310): Failure during collective
Fatal error in PMPI_Bcast: Unknown error class, error stack:
PMPI_Bcast(1600)........: MPI_Bcast(buf=0x7fff908136d8, count=1,
MPI_INTEGER, root=0, MPI_COMM_WORLD) failed
MPIR_Bcast_impl(1452)...:
MPIR_Bcast(1476)........:
MPIR_Bcast_intra(1249)..:
MPIR_SMP_Bcast(1081)....:
MPIR_Bcast_binomial(310): Failure during collective
MPIR_SMP_Bcast(1088)....:
MPIR_Bcast_binomial(310): Failure during collective
Fatal error in PMPI_Bcast: Unknown error class, error stack:
PMPI_Bcast(1600)........: MPI_Bcast(buf=0x7fff1bc337d8, count=1,
MPI_INTEGER, root=0, MPI_COMM_WORLD) failed
MPIR_Bcast_impl(1452)...:
MPIR_Bcast(1476)........:
MPIR_Bcast_intra(1249)..:
MPIR_SMP_Bcast(1088)....:
MPIR_Bcast_binomial(310): Failure during collective
Fatal error in PMPI_Bcast: Unknown error class, error stack:
PMPI_Bcast(1600)........: MPI_Bcast(buf=0x7fffa3446a98, count=1,
MPI_INTEGER, root=0, MPI_COMM_WORLD) failed
MPIR_Bcast_impl(1452)...:
MPIR_Bcast(1476)........:
MPIR_Bcast_intra(1249)..:
MPIR_SMP_Bcast(1088)....:
MPIR_Bcast_binomial(310): Failure during collective
Fatal error in PMPI_Bcast: Unknown error class, error stack:
PMPI_Bcast(1600)........: MPI_Bcast(buf=0x7fffe4a93648, count=1,
MPI_INTEGER, root=0, MPI_COMM_WORLD) failed
MPIR_Bcast_impl(1452)...:
MPIR_Bcast(1476)........:
MPIR_Bcast_intra(1249)..:
MPIR_SMP_Bcast(1088)....:
I didn't find a similar error in the FLASH forum archives. Do you know
what is going on?
Cheers,