FLASH User's Guide
Version 1.0
October 1999

Index:
2  Algorithmic overview
         2.1  Hydrodynamics
         2.2  Adaptive mesh refinement
         2.3  Equation of state
         2.4  Thermonuclear reactions


2  Algorithmic overview

FLASH provides a general simulation framework capable of incorporating several different types of physics module (e.g., hydrodynamics on structured or unstructured meshes, N-body solvers, reactive source terms, etc.) and initial or boundary conditions. However, it has been built to solve a particular class of problems, namely reactive, compressible flows in astrophysical environments. Thus, FLASH 1.0 is distributed with a particular set of algorithms to handle compressible hydrodynamics on a block-structured adaptive mesh, together with an appropriate nuclear equation of state and nuclear reaction network. We treat each of these elements in turn.

2.1  Hydrodynamics

The hydrodynamic module in FLASH 1.0 is based on the PROMETHEUS code (Fryxell, Müller, and Arnett 1989). This code solves Euler's equations for compressible gas dynamics in one, two, or three spatial dimensions. These equations can be written in conservative form as

r
t
+ Ñ ·rv
=
0
(1)
rv
t
+ Ñ ·rv v +Ñ P
=
rg
(2)
rE
t
+Ñ ·( rE + P ) v
=
rv ·g
(3)
where r is the fluid density, v is the fluid velocity, P is the pressure, E is the sum of the internal energy e and kinetic energy per unit mass,
E = e+ 1
2
v2 ,
(4)
g is the acceleration due to gravity, and t is the time coordinate. The pressure is obtained from the energy and density using the equation of state. For the case of an ideal gas equation of state, the pressure is given by
P = (g- 1) re,
(5)
where g is the ratio of specific heats. More general equations of state are discussed below.

For reactive flows, a separate advection equation must be solved for each chemical or nuclear species:

rXl
t
+ Ñ ·rXl v = 0 ,
(6)
where Xl is the mass fraction of the lth species, with the constraint that ål Xl = 1. The quantity rXl represents the partial density of the lth fluid. The code does not explicitly track interfaces between the fluids, so a small amount of numerical mixing can be expected during the course of a calculation.

The equations are solved using a modified version of the Piecewise-Parabolic Method (PPM), which is described in detail in Woodward and Colella (1984) and Colella and Woodward (1984). PPM is a higher-order version of the method developed by Godunov (1959). Godunov's method uses a finite-volume spatial discretization of the Euler equations together with an explicit forward time difference. Time-advanced fluxes at cell boundaries are computed using the analytic solution to Riemann's shock tube problem at each boundary. Initial conditions for each Riemann problem are determined by assuming the nonadvanced solution to be piecewise constant in each cell. Using the Riemann solution has the effect of introducing explicit nonlinearity into the difference equations and permits the calculation of sharp shock fronts and contact discontinuities without introducing significant nonphysical oscillations into the flow. Since the value of each variable in each cell is assumed to be constant, Godunov's method is limited to first-order accuracy in both space and time.

PPM improves on Godunov's method by representing the flow variables with piecewise parabolic functions. It also uses a monotonicity constraint rather than artificial viscosity to control oscillations near discontinuities, a feature shared with the MUSCL scheme of van Leer (1979). Although this could lead to a method which is accurate to third order, PPM is formally accurate only to second order in both space and time, as a fully third-order scheme proved not to be cost-effective. Nevertheless, PPM is considerably more accurate and efficient than most formally second-order algorithms.

PPM is particularly well-suited to flows involving discontinuities, such as shocks and contact discontinuities. The method also performs extremely well for smooth flows, although other schemes which do not perform the extra work necessary for the treatment of discontinuities might be more efficient in these cases. The high resolution and accuracy of PPM are obtained by the explicit nonlinearity of the scheme and through the use of intelligent dissipation algorithms, such as monotonicity enforcement, contact steepening, and interpolant flattening. These algorithms are described in detail by Colella and Woodward (1984).

A complete description of PPM is beyond the scope of this user's guide. However, for comparison with other codes, we note that the implementation of PPM in FLASH 1.0 uses the direct Eulerian formulation of PPM and the technique for allowing nonideal equations of state described by Colella and Glaz (1985). For multidimensional problems, FLASH 1.0 uses second-order operator splitting (Strang 1968). FLASH also implements PPM on a block-structured adaptive mesh, which is described in the following section.

2.2  Adaptive mesh refinement

We have used a package known as PARAMESH (MacNeice et al. 1999) for the parallelization and adaptive mesh refinement (AMR) portion of FLASH. PARAMESH consists of a suite of subroutines which handle refinement/derefinement, distribution of work to processors, guard cell filling, and flux conservation. In this section we briefly describe this package and the ways in which it has been modified for use with FLASH.

PARAMESH uses a block-structured adaptive mesh refinement scheme similar to others in the literature (e.g., Parashar 1999; Berger & Oliger 1984; Berger & Colella 1989; DeZeeuw & Powell 1993) as well as to schemes which refine on an individual cell basis (Khokhlov 1997). In block-structured AMR, the fundamental data structure is a block of uniform cells arranged in a logically Cartesian fashion. Each cell can be specified using a block identifier (processor number and local block number) and a coordinate triple (i,j,k), where i = 1¼nxb, j = 1¼nyb, and k = 1¼nzb. The complete computational grid consists of a collection of blocks with different physical cell sizes, related to each other in a hierarchical fashion using a tree data structure. The blocks at the root of the tree have the largest cells, while their children have smaller cells and are said to be refined. Two rules govern the establishment of refined child blocks in PARAMESH. First, the cells of a refined child block must be one-half as large as those of its parent block. Second, a block's children must be nested; that is, the child blocks must fit within their parent block and cannot overlap one another, and the complete set of children of a block must fill its volume. Thus, in d dimensions a given block can have at most 2d children.

Each block contains nxb×nyb×nzb interior cells and a set of guard cells. The guard cells contain boundary information needed to update the interior cells. These can be obtained from physically neighboring blocks, externally specified boundary conditions, or both. The number of guard cells needed depends upon the interpolation scheme and differencing stencil used for the hydrodynamics algorithm; for the explicit PPM algorithm distributed with FLASH, four guard cells are needed in each direction. PARAMESH handles the filling of guard cells with information from other blocks or a user-specified external boundary routine. If a block's neighbor has the same level of refinement, PARAMESH fills its guard cells using a direct copy from the neighbor's interior cells. If the neighbor has a different level of refinement, the neighbor's interior cells are used to interpolate guard cell values for the block. If the block and its neighbor are stored in the memory of different processors, PARAMESH handles the appropriate parallel communication (blocks are not split between processors). PARAMESH supports only linear interpolation for guard cell filling at jumps at refinement, but it is easily extended to allow other interpolation schemes; for example, for FLASH we have added a quadratic interpolation scheme. Once each block's guard cells are filled, it can be updated independently of the other blocks.

PARAMESH also enforces flux conservation at jumps in refinement, as described by Berger and Colella (1989). At jumps in refinement, the fluxes of mass, momentum, energy, and nuclear abundances across boundary cell faces are averaged across the fine cells at that face and provided to the corresponding coarse face on the neighboring block. These averaged fluxes are then used to update the values of the coarse block's local flow variables.

Each processor decides when to refine or derefine its blocks by computing a user-defined error estimator for each block. Refinement involves creation of between one and 2d refined child blocks, while derefinement involves deletion of a block. As child blocks are created, they are temporarily placed at the end of the processor's block list. After the refinements and derefinements are complete, the blocks are redistributed among the processors using a work-weighted Morton space-filling curve in a manner similar to that described by Warren and Salmon (1987) for a parallel treecode. During the distribution step each block is assigned a work value (an estimate of the relative amount of time required to update the block). The Morton number of the block is then computed by interleaving the bits of its integer coordinates as described by Warren and Salmon (1987); this determines its location along the space-filling curve. The Morton numbers of all of the blocks in the problem are then sorted using a parallel bitonic sort. Finally, the list of all blocks is partitioned among the processors using the block weights, equalizing the estimated workload of each processor. During the sorting step, only Morton numbers and not block data are communicated between processors; blocks are only moved (if necessary) after the sort. Changes in refinement are typically performed every ten timesteps.

The refinement criterion used by PARAMESH is adapted from Löhner (1987). Löhner's error estimator was originally developed for finite element applications and has the advantage that it uses an entirely local calculation. Furthermore, the estimator is dimensionless and can be applied with complete generality to any of the field variables of the simulation or any combination of them (by default, PARAMESH uses the density and pressure). Löhner's estimator is a modified second derivative, normalized by the average of the gradient over one computational cell. In one dimension on a uniform mesh it is given by

Ei = \mid ui+1 - 2ui + ui-1 \mid
\mid ui+1 - ui \mid + \mid ui - ui-1 \mid + e[ \mid ui+1 \mid - 2 \mid ui \mid + \mid ui-1 \mid ]
 ,
(7)
where ui is the refinement test variable's value in the ith cell. The last term in the denominator of this expression acts as a filter, preventing refinement of small ripples. The constant e is given a value of 10-4. Although PPM is formally second-order and its leading error terms scale as the third derivative, we have found the second derivative criterion to be very good at detecting discontinuities in the flow variable u. When extending this criterion to multidimensions, all cross derivatives are computed, and the following generalization of the above expresion is used:
Ei = é
ê
ê
ê
ê
ê
ê
ê
ê
ë

å
k,l 
æ
ç
è
2 u
xk xl
ö
÷
ø

é
ê
ë
(\mid u
xk
\midi+1 + \mid u
xk
\midi)/Dxl + e\mid 2 u
xk xl
\mid ù
ú
û
2

 
ù
ú
ú
ú
ú
ú
ú
ú
ú
û
1/2





 
 ,
(8)
where the sums are carried out over the coordinate directions. If the maximum value of Ei in a block is larger than an adjustable constant, CTORE, the block is marked for refinement. If the maximum value of Ei is less than a second constant (CTODE), the block is marked for derefinement. PARAMESH uses the default values CTORE = 0.8 and CTODE = 0.2. For each block, the error estimator is calculated for all interior cells plus two layers of guard cells, helping to ensure that blocks refine in advance of discontinuities.

2.3  Equation of state

It is common to call an equation of state routine more than 109 times when calculating two- and three-dimensional hydrodynamic models of stellar phenomena. Thus, it is very desirable to have an equation of state that is as efficient as possible, yet accurately represents the relevant physics. While FLASH is easily capable of including any equation of state, here we discuss three equation of state (henceforth EOS) routines which are supplied with FLASH 1.0: the ideal-gas or gamma-law EOS, the Nadyozhin EOS, and the Helmholtz EOS.

Each EOS routine takes as input the temperature T, density r, mean number of nucleons per isotope [`A], and mean charge per isotope [`Z]. Each routine then returns as output the scalar pressure (Ptot, specific thermal energy etot, and entropy Stot. In addition to these quantities, the EOS routines return partial derivatives of the pressure, specific thermal energy, and entropy with respect to the density and temperature:

Ptot
T
ê
ê
ê


r 
 , Ptot
r
ê
ê
ê


T 
 , etot
T
ê
ê
ê


r 
 , etot
r
ê
ê
ê


T 
 , Stot
T
ê
ê
ê


r 
 , Stot
r
ê
ê
ê


T 
 .
(9)
Quantities such as the specific heats and the adiabatic indices can be determined once these partial derivatives are known. When solving the energy equation, the temperature usually is obtained by iteration, and the thermodynamic derivatives must be available in order to implement efficient iteration schemes. Confirming that an EOS routine numerically satisfies thermodynamic consistency also demands that the derivatives be available.

An EOS is thermodynamically consistent if its outputs satisfy the Maxwell relations:

P  
=
 r2   e
r
ê
ê
ê


T 
  +  T   P
T
ê
ê
ê


r 
(10)
e
T
ê
ê
ê


r 
 
=
  T   S
T
ê
ê
ê


r 
(11)
- S
r
ê
ê
ê


T 
 
=
  1
r2
  P
T
ê
ê
ê


r 
 .
(12)
Thermodynamic inconsistency can manifest itself in the unphysical buildup (or decay) of the entropy (or temperature) during numerical simulations of what should be an adiabatic flow. Entropy-sensitive models (e.g., core-collapse supernovae) may suffer inaccuracies if thermodynamic consistency is significantly violated over sufficiently long time scales. The equations of state supplied with FLASH satisfy the Maxwell relations to a good degree of precision (in some cases to the limits of 64-bit arithmetic).

The gamma-law EOS provided with FLASH represents a simple ideal gas with constant adiabatic index g:

P = Na  k
_
A
rTe = 1
g- 1
  P
r
S = P/r+ e
T
 ,
(13)
where Na is the Avogadro number, and k is the Boltzmann constant. Because it requires no iteration in order to obtain the temperature, this EOS is quite inexpensive to evaluate.

More complex equations of state are necessary for realistic models of X-ray bursts, Type Ia supernovae, classical novae, and other stellar phenomena, where the electrons and positrons may be relativistic and/or degenerate and where radiation contributes significantly to the thermodynamic state. The Nadyozhin and Helmholtz EOS routines address this need. The Nadyozhin EOS, summarized by Nadyozhin (1974) and explained in detail by Blinnikov et al. (1996), was found by Timmes and Arnett (1999) in a comparison of five different analytic EOS schemes to give the best tradeoff among accuracy, thermodynamic consistency, and speed. The Helmholtz EOS (Timmes & Swesty 1999) uses interpolation from a two-dimensional table of the Helmholtz free energy F(r,T), obtaining comparable accuracy, better thermodynamic consistency, and about twice the speed of the Nadyozhin EOS.

For stellar phenomena, the pressure, specific thermal energy, and entropy contain contributions from several components:

Ptot
=
Prad + Pion + Pele + Ppos
(14)
etot
=
erad + eion +eele + epos
(15)
Stot
=
Srad + Sion + Sele + Spos .
(16)
Here the subscripts ``rad,'' ``ion,'' ``ele,'' and ``pos'' represent radiation, nuclei, electrons, and positrons, respectively. The radiation portion is that of a blackbody in local thermodynamic equilibrium:
Prad = a T4
3
erad = 3 Prad
r
Srad = (P/r+ e)
T
(17)
where a is the Stefan-Boltzmann energy constant, and c is the speed of light. The ion contribution is that of an ideal gas with g = 5/3. The electrons and positrons are treated as a non-interacting Fermi gas; the number densities of free electrons Nele and positrons Npos are given by
Nele
=
8 pÖ2
h3
  me3  c3  b3/2   [ F1/2(h,b)  +  F3/2(h,b) ]
(18)
Npos
=
8 pÖ2
h3
  me3  c3  b3/2 [ F1/2 ( -h- 2/b, b)  +  b  F3/2 ( -h- 2 /b, b) ] ,
(19)
where h is the Planck constant, me is the electron rest mass, b = k T / (me c2) is the relativity parameter, h = m/ k T is the normalized chemical potential energy m for electrons, and Fk(h,b) is the Fermi-Dirac integral
Fk(h,b) = ¥
ó
õ
0 
  xk  (1 + 0.5  b x)1/2  dx
exp(x - h) + 1
 .
(20)
Because the electron rest mass is not included in the chemical potential, the positron chemical potential must have the form hpos = -h-2/b. For complete ionization, the number density of free electrons in the matter is
Nele,matter =
_
Z

_
A
 Na  r = _
Z
 
 Nion ,
(21)
and charge neutrality requires
Nele,matter = Nele - Npos .
(22)
Solving this equation with a standard one-dimensional root-finding algorithm determines h. Once h is known, the Fermi-Dirac integrals can be evaluated, giving the pressure, specific thermal energy, and entropy due to the free electrons and positrons.

The Nadyozhin EOS routine uses polynomial or rational functions to evaluate the thermodynamic quantities. The temperature-density plane is decomposed into 5 regions (see Figure 12 of Blinnikov et al. 1996), with different expansions or fitting functions applied to each region. The methods used in the 5 regions are (1) a perfect gas approximation with the first-order corrections for degeneracy, (2) expansions of the half-integer Fermi-Dirac functions, (3) Chandrasekhar's (1939) expansion for a degenerate gas, (4) relativistic asymptotics, and (5) Gaussian quadrature for the thermodynamic quantities. The perturbation expansions, asymptotic relations, and fitting functions for the 5 regions are combined, with extraordinary care given to making sure that transitions between the regions are continuous, smooth, and thermodynamically consistent. All of the partial derivatives are obtained analytically.

The electron-positron Helmholtz free energy table used by the Helmholtz EOS was generated by using the Timmes EOS routine (Timmes & Arnett 1999). The Fermi-Dirac integrals, along with their derivatives with respect to h and b, were calculated to at least 18 significant figures using the quadrature schemes of Aparicio (1998). Newton-Raphson iteration was used to solve for h to at least 15 significant figures. The thermodynamic derivatives were computed analytically. Searches through the free energy table are avoided by computing hash indices from the values of any given (T,r[`Z]/[`A]) pair. No computationally expensive divisions are required in interpolating from the table; all of them can be computed and stored the first time the EOS routine is called.

2.4  Thermonuclear reactions

Modelling thermonuclear flashes typically requires the energy generation rate due to nuclear burning over a large range of temperatures, densities and compositions. The average energy generated or lost over a period of time is found by integrating a system of ordinary differential equations (the nuclear reaction network) for the abundances of important nuclei and the total energy release. In some contexts, such as Type II supernova models, the abundances themselves are also of interest. In either case, the coefficients that appear in the equations typically are extremely sensitive to temperature. The resulting stiffness of the system of equations requires the use of an implicit time integration scheme.

Timmes (1999) has reviewed several methods for solving stiff nuclear reaction networks, providing the basis for the reaction network solver included with FLASH. The scaling properties and behavior of three semi-implicit time integration algorithms (a traditional first-order accurate Euler method, a fourth-order accurate Kaps-Rentrop method, and a variable order Bader-Deuflhard method) and eight linear algebra packages (LAPACK, LUDCMP, LEQS, GIFT, MA28, UMFPACK, and Y12M) were investigated by running each of these 24 combinations on seven different nuclear reaction networks (hard-wired 13- and 19-isotope networks and table-lookup networks using 47, 76, 127, 200, and 489 isotopes). Timmes' analysis suggested that the best balance of accuracy, overall efficiency, memory footprint, and ease-of-use was provided by the choice of a 13- or 19-isotope reaction network, the variable order Bader-Deuflhard time integration method, and the MA28 sparse matrix package. The default FLASH distribution uses this combination of methods, which we describe in this section.

We begin by describing the equations solved by the nuclear burning module. We consider material which may be described by a density r and a single temperature T and contains a number of isotopes i, each of which has Zi protons and Ai nucleons (protons + neutrons). Let ni and ri denote the number and mass density, respectively, of the ith isotope, and let Xi denote its mass fraction, so that

Xi = ri/r = ni Ai/(rNA) ,
(23)
where NA is the Avogadro number. Let the molar abundance of the ith isotope be
Yi = Xi/Ai = ni/(rNA) .
(24)
Mass conservation is then expressed by
N
å
i = 1 
Xi = 1
(25)
At the end of each timestep, FLASH checks that the stored abundances satisfy equation (25) to machine precision in order to avoid the unphysical buildup (or decay) of the abundances or energy generation rate. Roundoff errors in this equation can lead to significant problems in some contexts (e.g., classical nova envelopes) where trace abundances are important.

The general continuity equation for the ith isotope is given in Lagrangian formulation by

dYi
dt
+ Ñ·( Yi Vi ) = .
Ri
 
 .
(26)
In this equation [(Ri)\dot] is the total reaction rate due to all binary reactions of the form i(j,k)l,
.
Ri
 
=
å
j,k 
Yl Yk lkj(l) - Yi Yj ljk(i) ,
(27)
where lkj and ljk are the reverse (creation) and forward (destruction) nuclear reaction rates, respectively. Contributions from three-body reactions, such as the triple-a reaction, are easy to append to equation (27). The mass diffusion velocities Vi in equation (26) are obtained from the solution of a multicomponent diffusion equation (Chapman & Cowling 1970; Burgers 1969; Williams 1988) and reflect the fact that mass diffusion processes arise from pressure, temperature, and/or abundance gradients as well as external gravitational or electrical forces.

The case Vi º 0 is important for two reasons. First, mass diffusion is often unimportant when compared to other transport process such as thermal or viscous diffusion (i.e., large Lewis numbers and/or small Prandtl numbers). Such a situation obtains, for example, in the study of laminar flame fronts propagating through the quiescent interior of a white dwarf. Second, this case permits the decoupling of the reaction network solver from the hydrodynamical solver through the use of operator splitting, greatly simplifying the algorithm. This is the method used by the default FLASH distribution. Setting Vi º 0 transforms equation (26) into

dYi
dt
= .
Ri
 
 ,
(28)
which may be written in the more compact and standard form
.
y
 
= f  (y) .
(29)
Stated another way, in the absence of mass diffusion or advection, any changes to the fluid composition are due to local processes.

Because of the highly nonlinear temperature dependence of the nuclear reaction rates, and because the abundances themselves often range over several orders of magnitude in value, the values of the coefficients which appear in equations (28) and (29) can vary quite significantly. As a result, the nuclear reaction network equations are ``stiff.'' A system of equations is stiff when the ratio of the maximum to the minimum eigenvalue of the Jacobian matrix [(J)\tilde] º f/y is large and imaginary. This means that at least one of the isotopic abundances changes on a much shorter timescale than another. Implicit or semi-implicit time integration methods are generally necessary to avoid following this short-timescale behavior, requiring the calculation of the Jacobian matrix.

It is instructive at this point to look at an example of how equation (28) and the associated Jacobian matrix are formed. Consider the 12C(a,g)16O reaction, which competes with the triple-a reaction during helium burning in stars. The rate R at which this reaction proceeds is critical for evolutionary models of massive stars since it determines how much of the core is carbon and how much of the core is oxygen after the initial helium fuel is exhausted. This reaction sequence contributes to the right-hand side equation (29) through the terms

.
Y
 
(4He)
= - Y(4He)  Y(12C)  R + ¼
.
Y
 
(12C)
= - Y(4He)  Y(12C)  R  + ¼
 ,
(30)
.
Y
 
(16O)
= + Y(4He)  Y(12C)  R  + ¼
where the ellipsis indicate additional terms coming from other reaction sequences. The minus signs indicate that helium and carbon are being destroyed, while the plus sign indicates that oxygen is being created. Each of these three expressions contributes two terms to the Jacobian matrix [(J)\tilde]=f/y:
J(4He,4He) = - Y(12C)  R  + ¼
J(4He,12C) = - Y(4He)  R  + ¼
J(12C,4He) = - Y(12C)  R  + ¼
J(12C,12C) = - Y(4He)  R  + ¼
 .
(31)
J(16O,4He) = + Y(12C)  R  + ¼
J(16O,12C) = + Y(4He)  R  + ¼
Entries in the Jacobian matrix represent the flow, in number of nuclei s-1, into (positive) or out of (negative) an isotope. All of the temperature and density dependence is included in the reaction rate R. The Jacobian matrices that arise from nuclear reaction networks are neither positive-definite nor symmetric since the forward and reverse reaction rates are generally not equal. However, the magnitudes of the matrix entries change as the abundances, temperature, or density change with time.

The default FLASH distribution includes two reaction networks. The 13-isotope a-chain plus heavy-ion reaction network is suitable for most multi-dimensional simulations of stellar phenomena where having a reasonably accurate energy generation rate is of primary concern. The 19-isotope reaction network has the same a-chain and heavy-ion reactions as the 13-isotope network, but it includes additional isotopes to accommodate some types of hydrogen burning (PP and CNO chains), along with some aspects of photodisintegration into 54Fe. This reaction network is described in Weaver, Zimmerman, & Woosley (1978). Both the networks supplied with FLASH are examples of ``hard-wired'' reaction networks, where each of the reaction sequences are carefully entered by hand. This approach is suitable for small networks when minimizing the CPU time required to run the reaction network is a primary concern, although it suffers the disadvantage of inflexibility.

The Jacobian matrices of nuclear reaction networks tend to be sparse, and they become more sparse as the number of isotopes increases. Implicit time integration schemes require the inverse of the Jacobian matrix, so we need to use a sparse linear algebra package to compute this inverse. To control the iteration count for a prespecified accuracy, FLASH uses a direct sparse matrix solver. Direct methods typically divide the solution of [(A)\tilde] ·x = b into a symbolic LU decomposition, numerical LU decomposition, and a backsubstitution phase. In the symbolic LU decomposition phase the pivot order of a matrix is determined, and a sequence of decomposition operations which minimize the amount of fill-in is recorded. Fill-in refers to zero matrix elements which become nonzero (e.g., a sparse matrix times a sparse matrix is generally a denser matrix). The matrix is not (usually) decomposed; only the steps to do so are stored. Since the nonzero pattern of a chosen nuclear reaction network does not change, the symbolic LU decomposition is a one-time initialization cost for reaction networks. In the numerical LU decomposition phase, a matrix with the same pivot order and nonzero pattern as a previously factorized matrix is numerically decomposed into its lower-upper form. This phase must be done only once for each staged set of linear equations. In the backsubstitution phase, a set of linear equations is solved with the factors calculated from a previous numerical decomposition. The backsubstitution phase may be performed with as many right-hand sides as needed, and not all of the right-hand sides need to be known in advance.

The MA28 linear algebra package used by FLASH is described by Duff, Erisman, & Reid (1986). It uses a combination of nested dissection and frontal envelope decomposition to minimize fill-in during the factorization stage. An approximate degree update algorithm that is much faster (asymptotically and in practice) than computing the exact degrees is employed. One continuous real parameter sets the amount of searching done to locate the pivot element. When this parameter is set to zero, no searching is done and the diagonal element is the pivot, while when set to unity, complete partial pivoting is done. Since the matrices generated by reaction networks are usually diagonally dominant, the routine is set in FLASH to use the diagonal as the pivot element. Several test cases showed that using partial pivoting did not make a significant accuracy difference, but were less efficient since a search for an appropriate pivot element had to be performed. MA28 accepts the nonzero entries of the matrix in the (i, j, ai,j) coordinate system, and typically uses uses 70-90% less storage than storing the full dense matrix.

The time integration method used by FLASH for evolving the reaction networks is the variable order Bader-Deuflhard method (e.g., Bader & Deuflhard 1983; Press et al. 1992). The reaction network is advanced over a large time step H from yn to yn+1 by the following sequence of matrix equations. First,

h
= H/m
( ~
1
 
- ~
J
 
) ·D0
= h f(yn)
(32)
y1
= yn + D0 .
Then from k = 1,2,¼,m-1
( ~
1
 
- ~
J
 
) ·x
= h f(yk) - Dk-1
Dk
= Dk-1 + 2 x
(33)
yk+1
= yk + Dk  ,
and closure is obtained by the last stage
( ~
1
 
- ~
J
 
) ·Dm
= h [ f (ym) - Dm-1 ]
yn+1
= ym + Dm  .
(34)
This staged sequence of matrix equations is executed at least twice with m=2 and m=6, yielding a fifth-order method. The sequence may be executed a maximum of seven times, which yields a fifteenth-order method. The exact number of times the staged sequence is executed depends on the accuracy requirements (set to one part in 106 in FLASH) and the smoothness of the solution. Estimates of the accuracy of an integration step are made by comparing the solutions derived from different orders. The minimum cost of this method - which applies for a single time step that met or exceeded the specified integration accuracy - is one Jacobian evaluation, eight evaluations of the right-hand side, two matrix decompositions, and ten backsubstitutions. This minimum cost can be increased at a rate of one decomposition (the expensive part) and m backsubstitutions (the inexpensive part) for every increase in the order 2k+1. The cost of increasing the order is compensated for, hopefully, by taking a correspondingly larger (but accurate) time step. The controls for order versus step size are a built-in part of the Bader-Deuflhard method. The cost per step of this integration method is at least twice as large as the cost per step of either a traditional first-order accurate Euler method or a fourth-order accurate Kaps-Rentrop (essentially an implicit Runge-Kutta) method, However, if the Bader-Deuflhard method can take accurate time steps that are at least twice as large, then this method will be more efficient globally. Timmes (1999) shows that this is typically the case. Note that in equations (32) - (34) not all of the right-hand sides are known in advance since the sequence of linear equations is staged. This staging feature of the integration method will make some matrix packages, such as MA28, a more efficient choice.

The energy generation rate is given by the sum

.
e
 

nuc 
= NA  
å
i 
  dYi
dt
 .
(35)
Note that a nuclear reaction network does not need to be evolved in order to obtain the instantaneous energy generation rate, since only the right hand sides of the ordinary differential equations need to be evaluated. It is more appropriate in the FLASH program to use the average nuclear energy generated over a time step
.
e
 

nuc 
= NA  
å
i 
  DYi
Dt
 .
(36)
In this case, the nuclear reaction network does need to be evolved. The energy generation rate, after subtraction of any neutrino losses, is returned to the FLASH program for use with the operator splitting technique.

Finally, the tabulation of Caughlan & Fowler (1988) is used in FLASH for most of the key nuclear reaction rates. Modern values for some of the reaction rates were taken from the reaction rate library of Hoffman (1999, priv. comm.). Nuclear reaction rate screening effects as implemented by Wallace, Woosley, & Weaver (1982), and decreases in the energy generation rate [(e)\dot]nuc due to neutrino losses as given by Itoh et al. (1985) are included in calculations done with FLASH.

Go on to Section 3.