38.3 Typical workflow

Drift has two levels of output verbosity, referred to here as verbose and non-verbose. In non-verbose mode, drift generates output only when directly told to do so through the Driver_driftUnk API call. This call tells drift to generate a checksum for each unk variable over all blocks in the domain and then log those checksums that have changed since the last call to Driver_driftUnk. Verbose mode generates the same information and additionally logs the per-block checksums at every call to Grid_releaseBlkPtr. Verbose mode can generate a lot of log data, so it should be activated only when the simulation nears the point at which divergence originates. This is the purpose of the drift_verbose_inst runtime parameter.

Drift internally maintains an "instance" counter that is incremented with every intercepted call to Grid_releaseBlkPtr. This is drift's way of enumerating the program states. When comparing two drift logs, if the first checksum discrepancy occurs at instance number 1349 (an arbitrary example), then it is clear that a divergent event occurred somewhere between the 1348th and 1349th calls to Grid_releaseBlkPtr.
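As an illustration of how the instance counter brackets a divergent event, the following is a minimal Python sketch. It is not part of drift: the function name bracket_divergence and the sample checksums are made up for the example, and it assumes the per-instance checksums of each run have already been collected into a dictionary keyed by instance number.

```python
def bracket_divergence(sums_a, sums_b):
    """Locate the first instance at which two runs disagree.

    sums_a, sums_b: dicts mapping instance number -> checksum string
    (a hypothetical in-memory form of two drift logs).

    Returns (previous_instance, first_mismatching_instance), or None if
    every instance present in both runs has matching checksums.
    """
    # Only instances recorded by both runs can be compared.
    for inst in sorted(sums_a.keys() & sums_b.keys()):
        if sums_a[inst] != sums_b[inst]:
            # The divergent event lies between calls inst-1 and inst.
            return inst - 1, inst
    return None


# Made-up checksums: the runs agree up to instance 1348 and differ at 1349.
run_a = {1347: "A1", 1348: "B2", 1349: "C3"}
run_b = {1347: "A1", 1348: "B2", 1349: "FF"}
print(bracket_divergence(run_a, run_b))  # (1348, 1349)
```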

The suggested workflow once drift is enabled is to first run both simulations with verbose mode off (drift_verbose_inst=0). The main Driver_evolveFlash implementations have calls to Driver_driftUnk between all calls to FLASH unit advancement routines, so by default drift generates multiple unk-wide checksums for each variable per timestep. The two resulting drift logs should be compared to find the first entry with a mismatched checksum. Each entry generated by Driver_driftUnk contains an instance range, as in the following:

 

step=1
from=Driver_evolveFlash.F90:276
unks inst=1234 to 2345
 dens 9CF3C169A5BB129C
 eint 9573173C3B51CD12
 ener 028A5D0DED1BC399
 ...

The line "unks inst=1234 to 2345" informs us that these checksums were generated sometime after the 2345th call to Grid_releaseBlkPtr. Assume this entry is the first whose checksums do not match those of its counterpart in the other log. Then we know that divergence began somewhere between instances 1234 and 2345. So we set drift_verbose_inst = 1234 in the runtime parameters file of each simulation and run them both again. Drift will now run up to instance 1234 as before, printing only at calls to Driver_driftUnk, but starting with instance 1234 each call to Grid_releaseBlkPtr will also cause per-block checksums to be logged. The two new drift logs can then be compared to find the first difference, which should help in hunting down the cause of the bug.
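The first, non-verbose comparison pass can be automated. The following is a minimal Python sketch, not a tool shipped with FLASH. It assumes every Driver_driftUnk entry follows the layout of the excerpt above (an "unks inst=A to B" line followed by indented "name checksum" lines with 16 hexadecimal digits). The function names parse_entries and first_mismatch are hypothetical, and real drift logs may contain additional lines, which this parser simply ignores.

```python
import re

# Assumed line formats, taken from the log excerpt in this section.
INST_RE = re.compile(r"^unks inst=(\d+) to (\d+)$")
SUM_RE = re.compile(r"^(\w+)\s+([0-9A-Fa-f]{16})$")


def parse_entries(text):
    """Return a list of (first_inst, last_inst, {variable: checksum})."""
    entries = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        m = INST_RE.match(line)
        if m:
            # Start a new entry at each "unks inst=A to B" line.
            current = (int(m.group(1)), int(m.group(2)), {})
            entries.append(current)
            continue
        m = SUM_RE.match(line)
        if m and current is not None:
            # Attach "name checksum" lines to the most recent entry.
            current[2][m.group(1)] = m.group(2)
    return entries


def first_mismatch(log_a, log_b):
    """Return the instance range of the first differing entry, or None.

    Entries are paired in order. A pair counts as differing if the
    instance ranges or any logged checksum differ.
    """
    for a, b in zip(parse_entries(log_a), parse_entries(log_b)):
        if a != b:
            return a[0], a[1]
    return None
```

Given the contents of the two log files as strings, first_mismatch returns a pair such as (1234, 2345); the first number is the value to assign to drift_verbose_inst for the second pair of runs.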