[FLASH-USERS] Unable to write unknowns

Paul M. Rich richp at flash.uchicago.edu
Fri Dec 18 16:55:16 EST 2009


Mateusz,

This is very strange.  Does it occur consistently on this particular
plotfile during your run?  Does it occur only in a run started from
scratch, or also after a restart?  Also, what architecture are you
running on, how many processes are you using, and how large is your
problem?  I would expect to see this error if FLASH tried to write while
the disk was nearly full, if you ran out of memory at this point, if
memory was being trampled elsewhere, or if there was indeed some problem
with the system's software or hardware.  Without more information,
though, this is hard to narrow down.
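
To rule out the nearly-full-disk case, a quick check along the following
lines may help.  This is only a minimal sketch, not part of FLASH: the
helper name has_room_for and the 2 GiB figure are assumptions made for
illustration, and a per-user quota may not be reflected in the number the
operating system reports.

```python
import shutil


def has_room_for(path, needed_bytes):
    """Return (ok, free_bytes) for the filesystem that holds `path`.

    Hypothetical helper: `needed_bytes` should be an estimate of the
    plotfile size, e.g. the size of the largest plotfile written so far.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free >= needed_bytes, usage.free


if __name__ == "__main__":
    # Assumed values: check the current directory against a 2 GiB estimate.
    ok, free = has_room_for(".", 2 * 1024**3)
    print("free bytes:", free)
    print("enough room for the next plotfile:", ok)
```

Run it in the directory where the plotfiles are written, ideally just
before the output step that fails.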

Paul Rich
-----------------------------------------------
ASC FLASH Center
University of Chicago
richp at flash.uchicago.edu

Mateusz Ruszkowski wrote:
>
>
>   Hi all,
>
> The code is crashing while writing a plot file late in the evolution.
> It complains that it is "Unable to write unknowns". Before the crash,
> the code produced many other files (even bigger ones). Is it possible
> that this is a hardware-related problem? I am enclosing more
> information below. Has anybody seen this problem before?
>
>   thanks,
>    Mateusz
>
> ------------------------------------------------------------------
>
> in standard output I have:
> .
> .
> .
>  gr_hgSolve: ite 12: norm(residual)/norm(src) =  1.720294E-06
>  gr_hgSolve: ite 13: norm(residual)/norm(src) =  6.809702E-07
> Error: Unable to write unknowns
> Driver_abortC called
> Error: Unable to write unknowns
>
> Calling MPI_Abort for immediate shutdown
>
>
> in the log file I have:
>
> .
> .
> .
>  [ 12-18-2009  03:05:23.451 ] [gr_hgSolve]: gr_hgSolve: ite 13: 
> norm(residual)/norm(src) =  6.809702E-07
>  [ 12-18-2009  03:05:27.702 ] [IO_writePlotfile] open: type=plotfile 
> name=Run_hdf5_plt_cnt_0015
>
>
>
> in the error file I have:
>
> .
> .
> .
>
> HDF5-DIAG: Error detected in HDF5 library version: 1.6.5 thread 0.  
> Back trace follows.
>   #000: H5Dio.c line 587 in H5Dwrite(): can't write data
>     major(15): Dataset interface
>     minor(25): Write failed
> .
> .
> .
>
> MPI: --------stack traceback-------
> MPI: Using host libthread_db library "/lib64/libthread_db.so.1".
> MPI: Attaching to program: /proc/29261/exe, process 29261
> MPI: [Thread debugging using libthread_db enabled]
> MPI: [New Thread 46912529168064 (LWP 29261)]
> MPI: 0x00002aaaaad5ba85 in waitpid () from /lib64/libc.so.6
> MPI: (gdb) #0  0x00002aaaaad5ba85 in waitpid () from /lib64/libc.so.6
> MPI: #1  0x00002aaaab021baf in mpi_sgi_system (
> MPI: #2  0x00002aaaab022369 in MPI_SGI_stacktraceback (
> MPI:     header=0x7fffffffc8f0 "MPI: Global rank 0 is aborting with 
> error code 1.\n     Process ID: 29261, Host: r1i0n7, Program: 
> /nb/mr/flash3\n") at sig.c:353
> MPI: #3  0x00002aaaaaf8f439 in print_traceback (ecode=1) at abort.c:168
> MPI: #4  0x00002aaaaaf8f19c in PMPI_Abort (comm=1, errorcode=1) at 
> abort.c:59
> MPI: #5  0x0000000000420e30 in Driver_abortFlashC ()
> MPI: #6  0x00000000005b8633 in io_h5write_unknowns_sp_ ()
> MPI: #7  0x00000000005d1956 in io_writedata_ ()
> MPI: #8  0x000000000045e9f1 in io_writeplotfile_ ()
> MPI: #9  0x000000000045c06c in io_output_ ()
> MPI: #10 0x000000000042411c in driver_evolveflash_ ()
> MPI: #11 0x000000000042f396 in MAIN__ ()
> MPI: #12 0x0000000000418392 in main ()
> MPI: (gdb) The program is running.  Quit anyway (and detach it)? (y or 
> n) [answered Y; input not from terminal]
> MPI: Detaching from program: /proc/29261/exe, process 29261
>
> MPI: -----stack traceback ends-----
> MPI: MPI_COMM_WORLD rank 0 has terminated without calling MPI_Finalize()
> MPI: aborting job



