<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=utf-8"><meta name=Generator content="Microsoft Word 14 (filtered medium)"><style><!--
/* Font Definitions */
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Tahoma;
panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman","serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
span.EmailStyle17
{mso-style-type:personal-reply;
font-family:"Calibri","sans-serif";
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri","sans-serif";}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body lang=EN-US link=blue vlink=purple><div class=WordSection1><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Hi Ryan,<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>I have over 15 TB of free space; so unfortunately (or maybe fortunately), I do not have a space problem. <o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>If I restart from a check point, the code seems to crash only when it writes. This is highly suspicious. I will try your suggestions and get back to you. <o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>-Chris <o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><b><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'>From:</span></b><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'> Ryan Farber [mailto:rjfarber@umich.edu] <br><b>Sent:</b> Saturday, March 16, 2019 4:05 PM<br><b>To:</b> Plechaty, Christopher<br><b>Cc:</b> flash-users@flash.uchicago.edu<br><b>Subject:</b> Re: [FLASH-USERS] SIGSEGV 174 error in Flash 4.5<o:p></o:p></span></p><p class=MsoNormal><o:p> </o:p></p><div><div><p class=MsoNormal>Hi Chris,<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>I've seen plenty of segfaults though I haven't seen one happen during I/O before. Could you check if your cluster ran out of disk? (E.g., run "df -h") It is very odd if you were writing checkpoint files perfectly fine earlier and nothing changed and it suddenly stopped working. If you restart from the checkpoint before the issue does it crash at the same point or randomly later on? If it crashes at the same point, then you could use DDT (parallel debugging tool) or litter io_xfer_mesh_dataset.c with "printf" statements to localize the error (you might have to put "#include <stdio.h>" at the top of the file (that is, the version of the file you'd copy from object to your problem directory to preserve the original)).<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>Best,<o:p></o:p></p></div><div><div><div><div><div><div><div><p class=MsoNormal>--------<o:p></o:p></p><div><p class=MsoNormal>Ryan<o:p></o:p></p></div></div></div></div></div></div></div><p class=MsoNormal><o:p> </o:p></p></div></div><p class=MsoNormal><o:p> </o:p></p><div><div><p class=MsoNormal>On Fri, Mar 15, 2019 at 9:17 AM Plechaty, Christopher <<a href="mailto:cplechaty@riversideresearch.org">cplechaty@riversideresearch.org</a>> wrote:<o:p></o:p></p></div><blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in'><div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>To all,<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>I am experiencing a SIGSEGV 174 error that I cannot seem to figure it out. <o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>I have been running the lasslab example (in 3D) which has been modified for my purposes. The simulation runs great, and happily dumps restart and plot data files for a long time. However, after some time, FLASH decides to write a restart dump and suddenly crashes due to a SIGSEGV 174 error (placed below). <o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>I am running:<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>Flash 4.5<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>Intel compliers and mpi (2018)<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>HDF 1.8.13<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>Hypre 2.11.2<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>My cluster runs Centos 7. <o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>The error is as follows:<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>forrtl: severe (174): SIGSEGV, segmentation fault occurred<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>Image PC Routine Line Source <o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>flash4debug 0000000000E4EA7D Unknown Unknown Unknown<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>libpthread-2.17.s 00002ACEDECD56D0 Unknown Unknown Unknown<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>libmpifort.so.12. 00002ACEDDB60460 __I_MPI___intel_a Unknown Unknown<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>libmpi.so.12.0 00002ACEDDEFE836 Unknown Unknown Unknown<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>libmpi.so.12 00002ACEDDF08184 ADIOI_GEN_WriteSt Unknown Unknown<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>libmpi.so.12.0 00002ACEDE326ABC Unknown Unknown Unknown<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>libmpi.so.12 00002ACEDE327B35 PMPI_File_write_a Unknown Unknown<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>flash4debug 000000000098661C Unknown Unknown Unknown<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>flash4debug 0000000000981469 Unknown Unknown Unknown<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>flash4debug 000000000096F95A Unknown Unknown Unknown<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>flash4debug 0000000000972536 Unknown Unknown Unknown<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>flash4debug 000000000095036D Unknown Unknown Unknown<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>flash4debug 00000000009509B7 Unknown Unknown Unknown<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>flash4debug 000000000094D295 Unknown Unknown Unknown<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>flash4debug 0000000000706157 Unknown Unknown Unknown<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>flash4debug 00000000007152FD Unknown Unknown Unknown<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>flash4debug 0000000000714B10 io_xfer_mesh_data 362 io_xfer_mesh_data.F90<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>flash4debug 0000000000713FA1 io_writedata_ 341 io_writeData.F90<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>flash4debug 000000000049A5B1 io_writecheckpoin 129 IO_writeCheckpoint.F90<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>flash4debug 000000000049962F io_output_ 267 IO_output.F90<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>flash4debug 0000000000417F6F driver_evolveflas 423 Driver_evolveFlash.F90<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>flash4debug 000000000044D805 MAIN__ 51 Flash.F90<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>flash4debug 0000000000407E9E Unknown Unknown Unknown<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><a href="http://libc-2.17.so" target="_blank">libc-2.17.so</a> 00002ACEDF5E1445 __libc_start_main Unknown Unknown<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>flash4debug 0000000000407DA9 Unknown Unknown Unknown<o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>Has anyone seen this type of error before? <o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>-Chris <o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p></div></div></blockquote></div></div></body></html>