[FLASH-USERS] Problem with mpirun on multiple cores

Rahul Kashyap rkashyap at umassd.edu
Fri Oct 3 15:52:20 EDT 2014


Hi,

I'm having a problem running FLASH on multiple cores (details below). If
anyone has encountered this kind of problem before, any help would be
highly appreciated.

The underlying OS for the cluster is Red Hat Enterprise Linux Server
release 6.4 (Santiago).
I have built openmpi-1.8.3, and then built hdf5-1.8.12 against that
openmpi-1.8.3 installation. The compilers are gfortran and gcc.

It runs fine with ./flash4 and also with mpirun -np 1 ./flash4, but it
gives the error below with mpirun -np 2 ./flash4.
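Since the trace below shows the crash inside the Fortran MPI_Bcast binding
while two ranks on one node communicate over the shared-memory (vader) BTL,
a minimal stand-alone broadcast test, independent of FLASH, should exercise
the same path. The following is only a sketch of such a test (the file name
bcast_test.F90 is hypothetical), not anything taken from FLASH:

  program bcast_test
     use mpi
     implicit none
     integer :: ierr, rank, nprocs
     integer :: buf(4)

     call MPI_Init(ierr)
     call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
     call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)

     ! Rank 0 fills the buffer; MPI_Bcast copies it to every other rank.
     buf = 0
     if (rank == 0) buf = (/ 1, 2, 3, 4 /)
     call MPI_Bcast(buf, 4, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr)

     print *, 'rank ', rank, ' of ', nprocs, ' got ', buf
     call MPI_Finalize(ierr)
  end program bcast_test

Compiled with mpif90 and run with mpirun -np 2 ./bcast_test, this would
show whether the openmpi-1.8.3 build itself reproduces the segmentation
fault or whether the problem is specific to flash4.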

Thanks a lot,
Rahul Kashyap
Physics, UMASS Dartmouth.

[ghpcc06:62116] *** Process received signal ***
[ghpcc06:62116] Signal: Segmentation fault (11)
[ghpcc06:62116] Signal code:  (128)
[ghpcc06:62116] Failing at address: (nil)
[ghpcc06:62116] [ 0] /lib64/libpthread.so.0[0x3608e0f500]
[ghpcc06:62116] [ 1]
/home/rk40d/sw/openmpi-1.8.3/lib/openmpi/mca_allocator_bucket.so(mca_allocator_bucket_alloc+0x4d)[0x7ff193398e1d]
[ghpcc06:62116] [ 2]
/home/rk40d/sw/openmpi-1.8.3/lib/openmpi/mca_pml_ob1.so(+0xab1b)[0x7ff19123ab1b]
[ghpcc06:62116] [ 3]
/home/rk40d/sw/openmpi-1.8.3/lib/openmpi/mca_pml_ob1.so(+0xb1a4)[0x7ff19123b1a4]
[ghpcc06:62116] [ 4]
/home/rk40d/sw/openmpi-1.8.3/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_recv_frag_callback_match+0xd3)[0x7ff19123b603]
[ghpcc06:62116] [ 5]
/home/rk40d/sw/openmpi-1.8.3/lib/openmpi/mca_btl_vader.so(+0x2c68)[0x7ff191872c68]
[ghpcc06:62116] [ 6]
/home/rk40d/sw/openmpi-1.8.3/lib/libopen-pal.so.6(opal_progress+0x4a)[0x7ff195c9869a]
[ghpcc06:62116] [ 7]
/home/rk40d/sw/openmpi-1.8.3/lib/libmpi.so.1(ompi_request_default_wait+0x5d)[0x7ff19651a0cd]
[ghpcc06:62116] [ 8]
/home/rk40d/sw/openmpi-1.8.3/lib/openmpi/mca_coll_tuned.so(ompi_coll_tuned_bcast_intra_generic+0x4b7)[0x7ff1901a0517]
[ghpcc06:62116] [ 9]
/home/rk40d/sw/openmpi-1.8.3/lib/openmpi/mca_coll_tuned.so(ompi_coll_tuned_bcast_intra_binomial+0xd8)[0x7ff1901a0728]
[ghpcc06:62116] [10]
/home/rk40d/sw/openmpi-1.8.3/lib/openmpi/mca_coll_tuned.so(ompi_coll_tuned_bcast_intra_dec_fixed+0xcc)[0x7ff1901968cc]
[ghpcc06:62116] [11]
/home/rk40d/sw/openmpi-1.8.3/lib/libmpi.so.1(MPI_Bcast+0x130)[0x7ff19652c8a0]
[ghpcc06:62116] [12]
/home/rk40d/sw/openmpi-1.8.3/lib/libmpi_mpifh.so.2(pmpi_bcast+0x7e)[0x7ff1967ecd6e]
[ghpcc06:62116] [13] ./flash4[0x5c47b5]
[ghpcc06:62116] [14] ./flash4[0x43f331]
[ghpcc06:62116] [15] ./flash4[0x40cbc5]
[ghpcc06:62116] [16] ./flash4[0x412d71]
[ghpcc06:62116] [17] ./flash4[0x7fd03a]
[ghpcc06:62116] [18] /lib64/libc.so.6(__libc_start_main+0xfd)[0x30fbe1ecdd]
[ghpcc06:62116] [19] ./flash4[0x405149]
[ghpcc06:62116] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 62116 on node ghpcc06 exited on
signal 11 (Segmentation fault).
--------------------------------------------------------------------------

