FLASH User's Guide
Version 2.0
January 2002
ASC FLASH Center
University of Chicago
Acknowledgments

The FLASH Code Group is supported by the ASC FLASH Center at the
University of Chicago under U. S. Department of Energy contract B341495.
Some of the test calculations described here were performed using the
Origin 2000 computer at Argonne National Laboratory and the ASC Nirvana
computer at Los Alamos National Laboratory.
FLASH is a modular, adaptive-mesh, parallel simulation code capable of handling general compressible flow problems found in many astrophysical environments. FLASH is designed to allow users to configure initial and boundary conditions, change algorithms, and add new physics modules with minimal effort. It uses the PARAMESH library to manage a block-structured adaptive grid, placing resolution elements only where they are needed most. FLASH uses the Message-Passing Interface (MPI) library to achieve portability and scalability on a variety of different parallel computers.
The Center for Astrophysical Thermonuclear Flashes, or FLASH Center, was founded at the University of Chicago in 1997 under contract to the United States Department of Energy as part of its Accelerated Strategic Computing Initiative (ASC). The goal of the Center is to address several problems related to thermonuclear flashes on the surfaces of compact stars (neutron stars and white dwarfs), in particular X-ray bursts, Type Ia supernovae, and novae. Solving these problems requires the participants in the Center to develop new simulation tools capable of handling the extreme resolution and physical requirements imposed by conditions in these explosions, and to do so while making efficient use of the parallel supercomputers developed by the ASC project, the most powerful constructed to date.
The FLASH 2.0 code has been greatly expanded from the original FLASH 1.0 (Fryxell et al. 2000) and demonstrates significant progress toward the ASC FLASH Center's goal in the form of increased problem-solving capability, increased modularization, and further development of the code. FLASH 2.0 includes the following new features:
This user's guide is designed to enable individuals unfamiliar with the FLASH code to quickly get acquainted with its structure and to move beyond the simple test problems distributed with FLASH, customizing it to suit their own needs. Section 2 discusses how to get started quickly with FLASH, describing how to configure, build, and run the code with one of the included test problems, then examine the resulting output. Users familiar with the capabilities of FLASH who wish to quickly `get their feet wet' with the code should begin with this section.
Part II begins with an overview of the FLASH code architecture. It also includes a brief overview of the modules which are described individually and in more detail in Part III. Part II also explains how to use the driver-supplied and simulated services.
Part III describes in detail each of the modules included with the code, along with their submodules, runtime parameters, use with included solvers, and the equations and algorithms they use. Important note: we assume that the reader has some familiarity both with the basic physics involved and with numerical methods for solving partial differential equations. This familiarity is absolutely essential in using FLASH (or any other simulation code) to arrive at meaningful solutions to physical problems. The novice reader is directed to an introductory text on these subjects.
The advanced reader who wishes to know more specific information about a given module's algorithm is directed to the literature referenced in the algorithm section of the chapter in question.
Part IV describes the different test problems distributed with FLASH. Part V describes in more detail the use of the configuration and analysis tools distributed with FLASH. Finally, Part VI gives detailed instructions for extending FLASH's capabilities by adding new problem setups and describes how new solvers may be integrated into the code; Part VII lists publication references.
This section describes how to quickly get up and running with FLASH, showing how to configure and build it to solve the Sedov explosion problem, how to run it, and how to examine the output using IDL.
You should verify that you have the following:
FLASH has been tested on the following Unix-based platforms. In addition, it may work with others not listed (see Section ).
Next, configure the FLASH source tree for the Sedov explosion problem. Type
    ./setup sedov -auto

This configures FLASH for the sedov problem, using the default hydrodynamic solver, equation of state, mesh package, and I/O format defined for this problem. For the purpose of this quick start example, we will use the default I/O format, HDF. The source tree is configured to create a two-dimensional code by default.
From the FLASH root directory (i.e., the directory from which you ran setup), execute gmake. This will compile the FLASH code. If you have problems and need to recompile, `gmake clean' will remove all object files from the object/ directory, leaving the source configuration intact; `gmake realclean' will remove all files and links from object/. After `gmake realclean,' a new invocation of setup is required before the code can be built.
Assuming compilation and linking were successful, you should now find an executable named flashX in the object/ directory, where X is the major version number (e.g., 2 for X.Y = 2.0). You may wish to check that this is the case.
FLASH expects to find a flat text file named flash.par in the directory from which it is run. This file sets the values of various runtime parameters which determine the behavior of FLASH. If it is not present, FLASH will abort; flash.par must be created in order for the program to run. Here we will create a simple flash.par which sets a few parameters and allows the rest to take on default values. With your text editor, create flash.par in the main FLASH directory with the contents of Figure 1.
Figure 1: A simple flash.par for the quick start example.
This instructs FLASH to use up to four levels of adaptive mesh refinement (AMR) and to name the output files appropriately. We will not be starting from a checkpoint file (this is the default, but here it is set explicitly for purposes of illustration). Output files are to be written every 0.005 time units and will be created until t = 0.02 or 1000 timesteps have been taken, whichever comes first. The ratio of specific heats for the gas (gamma) is taken to be 1.4, and all four boundaries of the two-dimensional grid have outflow (zero-gradient, or Neumann) boundary conditions. Note the format of the file: each line is a comment (denoted by a hash mark, #), blank, or of the form variable = value. String values are enclosed in double quotes ("). Boolean values are indicated in the Fortran style, .true. or .false. Be sure to insert a newline after the last line of text. A full list of the parameters available for your current setup is contained in paramFile.html, which also includes brief comments for each parameter.
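Concretely, a flash.par implementing the settings described above might look like the following sketch. The parameter names follow FLASH 2.0 conventions, but you should check them against the paramFile.html generated for your setup:

```
# runtime parameters for the Sedov quick start
lrefine_max       = 4            # up to four levels of refinement
basenm            = "sedov_4_hdf_"
restart           = .false.      # do not start from a checkpoint
trstrt            = 0.005        # interval between output files
nend              = 1000         # maximum number of timesteps
tmax              = 0.02         # end time
gamma             = 1.4          # ratio of specific heats
xl_boundary_type  = "outflow"
xr_boundary_type  = "outflow"
yl_boundary_type  = "outflow"
yr_boundary_type  = "outflow"
```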
We are now ready to run FLASH. To run FLASH on N processors, type
    mpirun -np N object/flashX

remembering to replace N and X with the appropriate values. Some systems may require you to start MPI programs with a different command; use whichever command is appropriate for your system. The FLASH executable can take one command-line argument, the name of the runtime parameter file; the default parameter file name is flash.par. Note that passing command-line arguments to an MPI program is system-dependent and is not permitted on some machines (or by some MPI versions).
You should see a number of lines of output indicating that FLASH is initializing the Sedov problem, listing the initial parameters, and giving the timestep chosen at each step. After the run is finished, in the current directory you should find several files:
To use the FLASH IDL tools, first tell IDL where to find them:

    setenv XFLASH_DIR "$PWD/tools/idl"
    setenv IDL_PATH "${XFLASH_DIR}:$IDL_PATH"

If you get a message indicating that IDL_PATH is not defined, instead enter

    setenv IDL_PATH "$XFLASH_DIR":idl-root-path

where idl-root-path points to the directory in which IDL is installed. Now run IDL (idl) and enter xflash at the IDL> prompt. You should see a control panel widget as shown in Figure 2. The path entry should be filled in for you with the current directory. Enter sedov_4_hdf_chk_ as the base filename and enter 4 as the suffix. (xflash can generate output for a number of consecutive files, but if you fill in only the beginning suffix, only one file is read.) Click the `Discover Variables' button to scan the file and generate the variable list. Choose the image format (screen, Postscript, GIF) and the problem type (in our case, Sedov). Selecting the problem type is only important for choosing default ranges for our plot; plots for other problems can be generated by ignoring this setting and overriding the default values for the data range and the coordinate ranges. Select the desired plotting variable and colormap. Under `Options,' select whether to plot the logarithm of the desired quantity, and select whether to plot the outlines of the AMR blocks. For very highly refined grids the block outlines can obscure the data, but they are useful for verifying that FLASH is putting resolution elements where they are needed. Finally, click `Velocity Options' to overlay the velocity field. The `xskip' and `yskip' parameters enable you to plot only a fraction of the vectors so that they do not obscure the background plot.
When the control panel settings are to your satisfaction, click the `Plot' button to generate the plot. For Postscript and GIF output, a file is created in the current directory. The result should look something like Figure 2, although this figure was generated from a run with eight levels of refinement rather than the four used in the quick start example run. With fewer levels of refinement the Cartesian grid causes the explosion to appear somewhat diamond-shaped.
FLASH is intended to be customized by the user to work with interesting initial and boundary conditions. In the following sections we will cover in more detail the algorithms and structure of FLASH and the sample problems and tools distributed with it.
The FLASH source code is a collection of components called FLASH modules. FLASH modules can be combined in a variety of ways to form specific FLASH applications. Of course, not all available FLASH modules are necessarily used when solving any one particular problem. Thus, it is important to distinguish between the entire FLASH source code and a given FLASH application.
Most generally, a FLASH module represents some well-defined, top-level unit of functionality useful for a given class of problems. Its structure conforms to a small set of rules that facilitate its interactions with other modules in the creation of an application. Primary among these are the rules governing the retrieval and modification of data on the solution grid. A module must also announce a general set of requirements to the framework as well as publish a public interface of its services. Here we focus on the internal structure of a FLASH module appropriate for users wishing to extend the current FLASH functionality.
First, it is important to recall how a selected group of FLASH modules is combined to form a particular application. This process is carried out entirely by the FLASH setup tool, which uses configuration information provided by the modules and the problem setup to properly parse the source tree and isolate the source files needed to set up a specific problem. For performance reasons, setup ties modules together statically before the application is compiled.
Each FLASH module is divided into three principal components:
a) Configuration layer
b) ``Wrapper'' or ``interface'' layer
c) Algorithm
Additionally, a module may contain sub-modules which inherit from and override the functionality in the parent module. Each of these components is discussed in detail in the following sections.
Information about module dependencies, default sub-modules, runtime parameter definitions, library requirements, and so on is stored in plain text files named Config in the different module directories. These are parsed by setup when configuring the source tree and are used to create the code needed to register module variables, implement the runtime parameters, choose sub-modules when only a generic module has been specified, prevent mutually exclusive modules from being included together, and to flag problems when dependencies are not resolved by some included module. In the future they may contain additional information about module interrelationships.
3.1.1.1 Configuration file syntax
The syntax of the configuration files is as follows. Arbitrarily many spaces and/or tabs may be used, but all keywords must be in uppercase. Lines not matching an admissible pattern are ignored. (Someday soon they will generate a syntax error.)
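For illustration, a Config file for a hypothetical module might combine these keywords as follows. The module and parameter named here are invented for the example; consult the Config files distributed with FLASH for the authoritative keyword set and spellings:

```
# Config file for a hypothetical module (illustrative only)
REQUIRES driver
REQUIRES mesh

# sub-module chosen when the user does not name one explicitly
DEFAULT explicit

# grid variable to register with the framework
VARIABLE dens

# runtime parameter with a type and default value
PARAMETER my_parameter  REAL  1.0
#! my_parameter The purpose of this parameter is whatever
#! & This is a second line
```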
    #! SOME_PARAMETER The purpose of this parameter is whatever

If the parameter comment requires additional lines, the & is used as follows:
    #! SOME_PARAMETER The purpose of this parameter is whatever
    #! & This is a second line
Parameter comment lines are special because they are used by setup to build a formatted list of commented runtime parameters for a particular problem setup. This information is generated in the file paramFile.html in the $FLASH_HOME directory. A file paramFile.txt is also generated.
After the module Config and Makefile are written, the source code files that carry out the specific work of the modules must be added. These source files can be separated into two broad categories: what we term ``wrapper functions'' and ``algorithms.'' In this section we discuss how to construct wrapper functions.
When constructing a FLASH module, the designer must define a public interface of procedures that the module exposes to clients (i.e., other modules in an application). This is true regardless of the specific development language or syntactic features chosen to organize the procedures. These public functions are then defined in one or more source code files that form what we refer to as the interface layer.

Currently, there is no language-level formality in FLASH for enforcing the distinction between the public interface and the private module functions that the interface harnesses. The developer is certainly encouraged to implement this within the chosen development language: static functions in C, private class functions in C++, private subroutines in a Fortran module, etc. However, nothing in FLASH will force this distinction and carry out the associated name-hiding within an application.
The most important aspect of the interface/algorithm distinction is related to the rules for data access. Wrapper functions communicate directly with the FLASH database module to access grid data (see below). However, algorithms are not permitted to query the database module directly; instead, they must receive all data via a formal function argument list. Thus, when a module A wishes to request the services of module B, A calls one of B's public wrapper functions. A is not required to pass all necessary data to B through a procedure argument list; rather, B may ``pull'' the data it needs from the database, marshal it as necessary, call the module algorithm(s), receive the updated data, and update the database.
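A minimal sketch of this pattern follows. The module and routine names are hypothetical; only dBaseGetData/dBasePutData follow the interface documented below, and nxb, nyb, nzb, and total_blocks are assumed to be supplied by the surrounding framework:

```fortran
! Hypothetical public wrapper in module B.  It pulls grid data from
! the database, hands it to the private algorithm through an explicit
! argument list, and writes the updated data back.
subroutine b_wrapper ()
  real, dimension(nxb,nyb,nzb) :: density
  integer :: this_block

  do this_block = 1, total_blocks
     call dBaseGetData("dens", this_block, density)  ! pull from database
     call b_algorithm(density)                       ! algorithm sees only arguments
     call dBasePutData("dens", this_block, density)  ! push result back
  end do
end subroutine b_wrapper
```

This keeps b_algorithm free of any database dependence, so it can be tested and swapped in isolation.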
The following subsections describe the methods provided by the database module in more detail.
3.1.2.1 dBaseGetData/dBasePutData
Usage
    call dBaseGetData([variable, [direction, [q1, [q2, [q3,]]]]] block, storage)
    call dBasePutData([variable, [direction, [q1, [q2, [q3,]]]]] block, storage)
Description
Data exchange with grid variables. All variables registered with the framework via the VARIABLE keyword within a module Config file can be read/written with this pair of functions. The data is assumed to be composed of one or more structured blocks, each with an integer block ID in [1, num_blocks], where num_blocks is the total number of blocks on a given processor.
Example
Given a Config file with the following variable registration specification:
    VARIABLE dens

the variable dens can be accessed from within FLASH as, for example:
    real, dimension(nxb,nyb,nzb) :: density

    do this_block = 1, total_blocks
       call dBaseGetData("dens", this_block, density)
       call foo(density)
    end do
Arguments
character/integer | variable | Variable name; if given as a string it should match the Config description; if not present, all variables are exchanged
character/integer | direction | Specifies the shape of the data and how q1, q2, q3 should be interpreted; if not present, the whole variable is exchanged
integer | q1, q2, q3 | Coordinates of the data exchanged, in i-j-k order; for vectors, (q1,q2) = (i,j), (j,k), or (i,k)
integer | block | Block ID on a given processor specifying the patch of data to access
real | storage | Allocated storage to receive the result; the rank and shape of the storage array should match the rank and shape of the data gotten or put
Strings
direction = { ``xyPlane'', ``xzPlane'', ``yzPlane'', ``yxPlane'', ``zxPlane'', ``zyPlane'', ``xVector'', ``yVector'', ``zVector'', ``Point'' }
Integers
Integers for variables and directions are not publicly available,
but can be accessed through dBaseKeyNumber().
Note
The ``xyzCube'' keyword is obsolete.
When the direction keyword is not specified, the whole variable is exchanged. When, in addition, the variable keyword is omitted, all variables for the block are exchanged.
3.1.2.2 dBaseGetCoords/dBasePutCoords
Usage
    call dBaseGetCoords(variable, direction, [q,] block, storage)
    call dBasePutCoords(variable, direction, [q,] block, storage)
Description
Access global coordinate information for a given block, including ghost points.
Example
    real, DIMENSION(block_size) :: xCoords, data

    do i = 1, lnblocks
       call dBaseGetCoords("zn", "xCoord", i, xCoords)
       call dBaseGetData("dens", "xVector", 0, 0, i, data)
       call foo(data, xCoords)
    enddo
Arguments
character/integer | variable | Specifies the position of the coordinate relative to the block: left, right, or center (see below)
character/integer | direction | Specifies the x-, y-, or z-coordinate
integer | q | Allows a single point to be gotten/put
integer | block | Block ID on a given processor specifying the patch of data to access
real | storage | Allocated storage to receive the result
Strings
direction = { ``xCoord'', ``yCoord'', ``zCoord'' }

coord position = { ``znl'': left block boundary; ``zn'': block center; ``znr'': right block boundary; ``znl0'': ??; ``znr0'': ??; ``ugrid'': ?? }
Integers
Integers for variables and directions are not publicly available, but can
be accessed through dBaseKeyNumber().
3.1.2.3 dBaseGetBoundaryFluxes/dBasePutBoundaryFluxes
Usage
    call dBaseGetBoundaryFluxes(time_context, position, direction, block, storage)
    call dBasePutBoundaryFluxes(time_context, position, direction, block, storage)
Description
Access boundary fluxes for all flux variables on a specified block associated
with a given time-level. Currently, only the current and previous time-step
are supported.
Example
    real, dimension(nfluxes,ny,nz) :: xl_bound_fluxes, xr_bound_fluxes

    do this_block = 1, num_blocks
       call dBaseGetBoundaryFluxes(0, 0, "xCoord", this_block, xl_bound_fluxes)
       call dBaseGetBoundaryFluxes(0, 1, "xCoord", this_block, xr_bound_fluxes)
       call foo(xl_bound_fluxes, xr_bound_fluxes)
       call dBasePutBoundaryFluxes(0, 0, "xCoord", this_block, xl_bound_fluxes)
       call dBasePutBoundaryFluxes(0, 1, "xCoord", this_block, xr_bound_fluxes)
    end do
Arguments
integer | time_context | Specifies fluxes stored at current or previous time-step |
integer | position | Specifies left or right boundary in a given direction |
character | direction | Specifies x-, y-, or z-coordinate |
integer | block | Integer block ID on a given processor specifying the patch of data to access |
real | storage(:,:,:) | Return buffer of size nFluxes * dim1 * dim2 |
Strings
direction = { ``xCoord'', ``yCoord'', ``zCoord'' }
Integers
time_context = { -1 (previous time-step), 0 (current time-step) }

position = { 0 (left), 1 (right) }
3.1.2.4 dBaseKeyNumber
Usage
result = dBaseKeyNumber(keyname)
Description
For faster performance, dBase{Get,Put}{Data,Coords}
can be called with integer arguments instead of strings. Each of the string
arguments accepted by the Get/Put methods can be replaced by a corresponding
integer. However, these integers are not publicly available. To obtain them,
one must call dBaseKeyNumber().
Example
    integer :: idens, ixCoord

    idens = dBaseKeyNumber("dens")
    ixCoord = dBaseKeyNumber("xCoord")
    call dBasePutData(idens, ixCoord, block_no, data)
Arguments and return type
character | keyname | String with ``variable'' or ``direction'' name, as in get/put data/coords; names of variables must match Config description |
integer | dBaseKeyNumber | Integer assigned for the string key name |
Strings
keyname = { ``xyzCube'', ``xyPlane'', ``xzPlane'', ``yzPlane'', ``yxPlane'', ``zxPlane'', ``zyPlane'', ``xVector'', ``yVector'', ``zVector'', ``Point'' }
3.1.2.5 dBaseSpecies
Usage
result = dBaseSpecies(index)
Description
Maps species number (from one to maximum number of species) to variable number
(actual index in ``unk'' array).
Arguments and return type
integer | index | Species number from one to maximum number of species |
integer | dBaseSpecies | Actual index in ``unk'' array |
At present, all of the species are stored with adjacent indices in the solution array, thus one can find the index of the first isotope with a call to dBaseSpecies(1), and increment this value by 1 to get the next species.
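For example, a traversal of all species can exploit this adjacency (a sketch: ionmax, standing for the number of species, and process_species are hypothetical names; only dBaseSpecies and the adjacency guarantee come from the text above):

```fortran
! Sketch: visit every species, exploiting their adjacent storage in unk.
integer :: isp, ifirst

ifirst = dBaseSpecies(1)               ! unk index of the first isotope
do isp = 0, ionmax - 1                 ! ionmax: number of species (assumed)
   call process_species(ifirst + isp)  ! hypothetical per-species work
end do
```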
3.1.2.6 dBaseVarName
Usage
result = dBaseVarName(keynumber)
Description
Given a key number, return the associated variable name.
Example
    integer :: idens
    character(len=4) :: name

    idens = dBaseKeyNumber("dens")
    name = dBaseVarName(idens)   ! name now = "dens"
Arguments and return type
integer | keynumber | Variable keynumber obtained with call to dBaseKeyNumber |
character(len = 4) | dBaseVarName | String name of variable as defined in Config; if variable does not exist returns ``null'' |
3.1.2.7 dBasePropertyInteger
Usage
result = dBasePropertyInteger(property)
Description
Accessor methods for integer-valued scalar variables.
Arguments and return type
character | property | String with property name
integer | dBasePropertyInteger | Property value
3.1.2.8 dBasePropertyReal
Usage
result = dBasePropertyReal(property)
Description
Accessor methods for real-valued scalar variables.
Arguments and return type
character | property | String with property name
real | dBasePropertyReal | Property value
3.1.2.9 dBaseSetProperty
Usage
call dBaseSetProperty(property, value)
Description
Mutator methods for writeable scalar variables.
See dBasePropertyInteger/Real for documentation on property names.
Arguments
character | property | String with property name
integer/real | value | New property value
3.1.2.10 dBaseVarIndex
Usage
result = dBaseVarIndex(property)
Description
Returns pointer to array of integer keys of variables with specified property.
Arguments and return type
character | property | String with property name
integer, POINTER, DIMENSION(:) | dBaseVarIndex | Pointer to array of unk indices of variables with the given property
3.1.2.11 Various pointer-returning functions
Each of these functions allows FLASH developers to hook directly into an internal data structure in the database. In general these functions offer better performance than their corresponding dBaseGet/Put counterparts and require less memory overhead. However, the interfaces are more complicated, and the functions are less flexible and less safe, so it is suggested that developers strongly consider using dBaseGet/PutData when the performance difference is small.
Each function returns a Fortran 90 pointer to the solution vector on the specified block. If no block is specified, a pointer is returned to all blocks on the calling processor. Currently the array index layout is assumed to be (var,nx,ny,nz,block), with the variable index varying fastest. The scratch (unksm) array stores variables with no guardcells; this name should probably be changed in the future.
Argument and return type
integer | block |
real, DIMENSION( :,:,:,:,: ), POINTER | dBaseGetDataPtrAllBlocks |
real, DIMENSION( :,:,:,: ), POINTER | dBaseGetDataPtrSingleBlock |
real, DIMENSION( :,:,: ), POINTER | dBaseGetPtrToXCoords |
real, DIMENSION( :,:,: ), POINTER | dBaseGetPtrToYCoords |
real, DIMENSION( :,:,: ), POINTER | dBaseGetPtrToZCoords |
real, DIMENSION( :,:,:,:,: ), POINTER | dBaseGetScratchPtrAllBlocks |
real, DIMENSION( :,:,:,: ), POINTER | dBaseGetScratchPtrSingleBlock |
3.1.2.12 dBaseGetDataPtrAllBlocks()
Return an F90 pointer to the left-hand-side solution vector for all blocks on a given processor, arranged as (var,nx,ny,nz,block). dBaseKeyNumber must still be called to obtain the variable indices used to access the elements of the array.
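For example, a field can be modified in place through the pointer, avoiding the copies made by dBaseGetData/dBasePutData (a sketch; the scaling operation is purely illustrative, and num_blocks is assumed to come from the framework):

```fortran
! Sketch: scale the density field in place through a direct pointer.
real, pointer, dimension(:,:,:,:,:) :: solnVec
integer :: idens, blk

solnVec => dBaseGetDataPtrAllBlocks()
idens   =  dBaseKeyNumber("dens")
do blk = 1, num_blocks
   solnVec(idens,:,:,:,blk) = 2.0 * solnVec(idens,:,:,:,blk)
end do
```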
3.1.2.13 dBaseGetDataPtrSingleBlock(block_no)
Return an F90 pointer to the left-hand-side solution vector on a specified
block, arranged as (var,nx,ny,nz).
3.1.2.14 dBaseGetPtrToXCoords()
Return an F90 pointer to an array containing information on the x-coordinates
of the AMR blocks. The array returned is arranged as
(block_position, i, block_number),
where block_position values denote center, left, or right coordinates,
and are obtained by calling dBaseKeyNumber
with ``zn'', ``znl'', ``znr'' and using the corresponding index to access the
appropriate row in the array.
For example:
    real, pointer, dimension(:,:,:) :: xCoords
    real :: x, xl, xr
    integer :: izn, iznl, iznr

    izn  = dBaseKeyNumber("zn")
    iznl = dBaseKeyNumber("znl")
    iznr = dBaseKeyNumber("znr")
    xCoords => dBaseGetPtrToXCoords()
    do this_block = 1, num_blocks
       do i = 1, blocksize
          x  = xCoords(izn,  i, this_block)   ! center coordinate
          xl = xCoords(iznl, i, this_block)   ! left coordinate
          xr = xCoords(iznr, i, this_block)   ! right coordinate
          ...
       enddo
    enddo
3.1.2.15 dBaseGetPtrToYCoords()
See dBaseGetPtrToXCoords().
3.1.2.16 dBaseGetPtrToZCoords()
See dBaseGetPtrToXCoords().
3.1.2.17 dBaseGetScratchPtrAllBlocks()
Return an F90 pointer to a scratch array of size
(2, nxb, nyb, nzb, maxblocks).
3.1.2.18 dBaseGetScratchPtrSingleBlock(block_no)
Return an F90 pointer to a scratch array of size (2, nxb, nyb, nzb).
3.1.2.19 AMR tree interface functions

These functions enable FLASH developers to directly access the data structures used by PARAMESH to describe the adaptive mesh. In general they should not be needed by developers of physics modules. Also, they may not be available in future versions of FLASH.
Arguments and return types
integer | block |
integer, DIMENSION (mfaces) | dBaseNeighborBlockList
integer, DIMENSION (mfaces) | dBaseNeighborBlockProcList
integer, DIMENSION (mchild) | dBaseChildBlockList
integer, DIMENSION (mchild) | dBaseChildBlockProcList
integer | dBaseParentBlockList
integer | dBaseParentBlockProcList
integer | dBaseRefinementLevel
integer, DIMENSION (nfaces) | dBaseNeighborType
real, DIMENSION (mdim) | dBaseBlockCoord
real, DIMENSION (mdim) | dBaseBlockSize
integer | dBaseNodeType
logical | dBaseRefine
logical | dBaseDeRefine
dBaseNeighborBlockList (block)
Given a block ID, return an array of the block ID's of the specified block's neighbors. The returned array is of size max_faces = 6, but not all six elements will have meaningful values if the problem is run in fewer than three dimensions. Assuming the function is called as neigh = dBaseNeighborBlockList(block), the ordering is as follows: the neighbor on the lower x face is neigh(1), the neighbor on the upper x face is neigh(2), the lower y face neigh(3), the upper y face neigh(4), the lower z face neigh(5), and the upper z face neigh(6). If any of these values is -1 or lower, the block has no neighbor on that face at its refinement level; however, there may be a neighbor of this block's parent. If the value is -20 or lower, the face represents an external boundary, and the user is required to apply some boundary condition on it. The exact value below -20 can be used to distinguish between the different boundary conditions which the user may wish to implement.
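These conventions can be used, for example, to detect external boundaries (a sketch; handle_boundary is a hypothetical routine, and num_blocks is assumed to come from the framework):

```fortran
! Sketch: classify the lower-x neighbor of each block.
integer, dimension(6) :: neigh
integer :: blk

do blk = 1, num_blocks
   neigh = dBaseNeighborBlockList(blk)
   if (neigh(1) <= -20) then
      ! external boundary; the exact value selects the boundary condition
      call handle_boundary(blk, neigh(1))
   else if (neigh(1) <= -1) then
      ! no neighbor at this refinement level; the parent may have one
   end if
end do
```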
dBaseNeighborBlockProcList (block)
Given a block ID, return an array of max_faces = 6 elements containing the processor ID's identifying the processor on which each neighbor resides. Ordering is identical to dBaseNeighborBlockList().
dBaseChildBlockList (block)
Given a block ID, return an array of size max_child = 2^max_dim elements containing the block ID's of the child blocks of the specified block. The children of a parent are numbered according to the Fortran array ordering convention, i.e., child 1 is at the lower x, y, and z corner of the parent; child 2 at the higher x coordinate but lower y and z; child 3 at lower x, higher y, and lower z; child 4 at higher x and y and lower z; and so on.
dBaseChildBlockProcList (block)
Given a block ID, return an array of max_child elements containing the processor ID's of the children of the specified block. Ordering is identical to dBaseChildBlockList().
dBaseParentBlockList (block)
Given a block ID, return the ID of the block's parent block.
dBaseParentBlockProcList (block)
Given a block ID, return the processor ID upon which the block's parent resides.
dBaseRefinementLevel (block)
Given a block ID, return that block's integer level of refinement.
dBaseNodeType (block)
Given a block ID, return the block's node type. If 1 then the node is a leaf node, if 2 then the node is a parent but with at least 1 leaf child, otherwise it is set to 3 and it does not have any up-to-date data.
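To restrict work to leaf blocks (node type 1), a typical traversal looks like this sketch (do_physics_on is a hypothetical routine, and num_blocks is assumed to come from the framework):

```fortran
! Sketch: operate only on leaf blocks of the AMR tree.
integer :: blk

do blk = 1, num_blocks
   if (dBaseNodeType(blk) == 1) then
      call do_physics_on(blk)   ! hypothetical per-block work
   end if
end do
```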
dBaseNeighborType (block)
Given a block ID, return an array of size mfaces containing the node type ID's of the neighbors of the specified block. mfaces = mdim * 2, where mdim is the maximum possible dimensionality (3).
dBaseBlockCoord (block)
Given a block ID, return an array of size ndim containing the x,y,z coordinates of the center of the block.
dBaseBlockSize (block)
Given a block ID, return an array of size ndim containing the block size in the x, y, and z directions.
dBaseRefine (block)
Given a block ID, return .true. if that block is set for refinement in the next call to amr_refine_derefine(), and .false. otherwise.
dBaseDeRefine (block)
Given a block ID, return .true. if that block is set for derefinement in the next call to amr_refine_derefine(), and .false. otherwise.
Within each module are one or more procedures which perform the bulk of the computational work for the module. A principal strategy behind the FLASH architecture is to decouple these procedures as much as possible from the details of the framework in which they are embedded. This is accomplished by requiring that all module algorithms communicate data only through function argument lists. That is, algorithms may not query the database directly nor may they depend on the existence of externally defined or global variables. This design ensures that algorithms can be tested, developed, and interchanged in complete isolation from the larger, more complicated framework.
Thus, each algorithm in a module should have a well-defined argument list. It is up to the algorithm developer to make this as general or as restrictive as he or she sees fit. However, it is important to keep in mind that the more rigid the argument list, the less chance that another algorithm can share its interface; the consequence is that the developer would have to add an entirely new wrapper function for only slightly different functionality.
An abstract representation of the FLASH architecture appears in Figure 3. Each box in this figure represents a component (FLASH module), which publishes a small set of public methods to its clients. These public methods are expressed through virtual function definitions (stubs under Fortran 90), which are implemented by real functions supplied by sub-modules. Typically each component represents a different class of solver. For time-dependent problems, the driver uses time-splitting techniques to compose the different solvers. The solvers are divided into different classes on the basis of their ability to be composed in this fashion and upon natural differences in solution method (e.g., hyperbolic solvers for hydrodynamics, elliptic solvers for radiation and gravity, ODE solvers for source terms, etc.).
The adaptive mesh refinement module is treated in the same way as the solvers. The means by which the driver shares data with the solver objects is the primary way in which the architecture affects the overall performance of the code. Choices here, in order of decreasing performance and increasing flexibility, include global memory, argument-passing, and messaging. FLASH 2.0 has eliminated global variable access in favor of a well-defined set of accessor and mutator methods managed by the centralized database module. When done with an eye toward optimization, the effects on performance are tolerable, and the benefits for maintainability and extensibility are significant. This is discussed in greater detail below.
The structure of the FLASH source tree reflects the module structure of the code, as shown in Figure 4 for an older version of FLASH. The general plan is that source code is organized into one set of directories, while the code is built in a separate directory using links to the appropriate source files. The links are created by a source configuration script called setup, which makes the links using options selected by the user and then creates an appropriate makefile in the build directory. The user then builds the executable with a single invocation of gmake.
Source code for each of the different code modules is stored in subdirectories under source/. The code modules implement different physics, such as hydrodynamics, nuclear burning, and gravity, or different major program components, such as the main driver code and the input/output code. Each module directory contains source code files, makefile fragments indicating how to build the module, and a configuration file (see Section 6.1.2).
Each module subdirectory may also have additional sub-module directories underneath it. These contain code, makefiles, and configuration files specific to different variants of the module. For example, the hydro/ module directory can contain files which are generic to hydrodynamical solvers, while its explicit/ subdirectory contains files specific to explicit hydro schemes and its implicit/ subdirectory contains files specific to implicit solvers. Configuration files for other modules which need hydrodynamics can specify hydro as a requirement without mentioning a specific solver; the user can then choose one solver or the other when building the code (via the modules file; Section 6.1.3). When setup configures the source tree it treats each sub-module as inheriting all of the source code, configuration files, and makefiles in its parent module's directory, so generic code does not have to be duplicated. Sub-modules can themselves have sub-modules, so for example one might have hydro/explicit/split/ppm and hydro/implicit/ppm. Source files at a given level of the directory hierarchy override files with the same name at higher levels, whereas makefiles and configuration files are cumulative. This permits modules to supply stub routines that are treated as `virtual functions' to be overridden by specific sub-modules, and it permits sub-module directories to be self-contained.
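The inheritance-and-override rule can be sketched in a few lines of Python (hypothetical file names; the real setup script does considerably more):

```python
# Sketch of setup's source resolution: walking a module path from parent to
# sub-module, every level contributes its files, and a file at a deeper
# level overrides a same-named file inherited from a parent directory.
def resolve_sources(levels):
    """levels: list of {filename: source path} dicts, parent first."""
    links = {}
    for files in levels:
        links.update(files)          # deeper level wins on a name collision
    return links

# Hypothetical layout for the module path hydro/explicit/split/ppm:
hydro    = {"tstep.F90": "hydro/tstep.F90",
            "hydro_sweep.F90": "hydro/hydro_sweep.F90"}
explicit = {"tstep.F90": "hydro/explicit/tstep.F90"}
ppm      = {"hydro_sweep.F90": "hydro/explicit/split/ppm/hydro_sweep.F90"}

links = resolve_sources([hydro, explicit, ppm])
```

The stub-as-virtual-function behavior falls out of the same rule: a parent's stub file is linked only when no sub-module supplies a file of the same name.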
When a module is not explicitly included in the Modules file, setup does only one thing differently: sub-modules are not included, except for a null sub-module, if one is present. Most top-level modules should contain only stub files to be overridden by sub-modules, so this behavior allows the module to be `not included' without extra machinery (such as the special stub files and makefiles required by earlier versions of FLASH). In those cases in which the module's files are not appropriate for the `not included' case, the null sub-module allows one to override them with appropriate versions.
New solvers and new physics can be added. At the current stage of development of FLASH it is probably best to consult the authors of FLASH (see Section 7) for assistance in this. Some general guidelines for adding solvers to FLASH 2.0 may be found in Section 17.
The setups/ directory has a structure similar to that of source/. In this case, however, each of the "modules" represents a different initial model or problem, and the problems are mutually exclusive; only one is included in the code at a time. Also, the problem directories have no equivalent of sub-modules. A makefile fragment specific to a problem need not be present, but if it is, it is called Makefile. Section 16 describes how to add new problem directories.
The setup script creates a directory called object/ in which the executable is built. In this directory setup creates links to all of the source code files found in the specified module and sub-module directories as well as the specified problem directory. (A source code file has the extension .c, .C, .f, .f90, .F90, .F, .fh, or .h.) Because the problem setup directory and the machine-dependent directory are scanned last, links to files in these directories override the ``defaults'' taken from the source/ tree. Hence special variants of routines needed for a particular problem can be used in place of the standard versions by simply giving the files containing them the same names.
Using information from the configuration files in the specified module and problem directories, setup creates a file named init_global_parms.F90 to parse the runtime parameter file and initialize the runtime parameter database. It also creates a file named rt_parms.txt, which concatenates all of the PARAMETER statements found in the appropriate configuration files and so can be used as a ``master list'' of all of the runtime parameters available to the executable.
setup also creates makefiles in object/ for each of the included modules. Each copy is named Makefile.module, where module is driver, hydro, gravity, and so forth. Each of these files is constructed by concatenating the makefiles found in each included module path. So, for example, including hydro/explicit/split/ppm causes Makefile.hydro to be generated from files named Makefile in hydro/, hydro/explicit/, hydro/explicit/split/, and hydro/explicit/split/ppm/. If the module is not explicitly included, then only hydro/Makefile is used, under the assumption that the subroutines at this level are to be used when the module is not included. The setup script creates a master makefile (Makefile) in object/ which includes all of the different modules' makefile fragments together with the site- or operating system-dependent Makefile.h.
The master Makefile created by setup creates a temporary subroutine, buildstamp.F90, which echoes the date, time, and location of the build to the FLASH log file when FLASH is run. To ensure that this subroutine is regenerated each time the executable is linked, the Makefile deletes buildstamp.F90 immediately after compiling it.
The setup script can be run with the -portable option to create a directory with real files which can be collected together with tar and moved elsewhere for building. In this case the build directory is assigned the name object_problem/. Further information on the options available with setup may be found in Section 13.
Additional directories included with FLASH are tools/, which contains tools for working with FLASH and its output (see the section on the FLASH output comparison utility), and docs/, which contains documentation for FLASH (including this user's guide) and the PARAMESH library.
The current FLASH distribution comes with a set of core components that form the backbone of many common problems, namely: database, driver, hydro, io, mesh, particles, source_terms, gravity, and materials. A detailed discussion of the role of each of these modules is presented in Part IV. Here we give a brief overview of each.
Section 4 describes in detail the various driver modules which may be implemented with FLASH. In addition to the default driver, which controls the initialization, evolution, and output of a FLASH simulation, four new driver modules have been written to implement different explicit time advancement algorithms. Three are written in the delta formulation: euler1, rk3, and strang_delta. The fourth, strang_state, is written in the state-vector formulation. A subsection concerning simulation services, runtime parameters, and logfiles is also included in this section.
Section 5 describes the FLASH I/O modules, which control how FLASH data structures are stored on different platforms and in different formats. Discussed in this section are the main I/O module hdf4, which uses the Hierarchical Data Format (HDF) for storing simulation data, two major HDF5 modules (serial and parallel versions), and Fortran 77 (f77) modules.
Section 6 describes the mesh module, together with the PARAMESH package of subroutines for the parallelization and adaptive mesh refinement (AMR) portion of FLASH.
Section 7 describes the two hydrodynamic modules included in FLASH 2.x. The first is based on the PROMETHEUS code (Fryxell, Müller, and Arnett 1989); the second is based on Kurganov numerical methods.
Section 8 describes the magnetohydrodynamics module included with the FLASH code, which solves the equations of ideal MHD.
Section 9 discusses the material properties module, which handles the tracking of multiple fluids in FLASH simulations. It includes the equation of state module, which implements the EOS for the hydrodynamical and nuclear burning solvers; the composition submodule, which sets up the different compositions needed by FLASH; and the stellar conductivity module, which may be used for computing the opacity of stellar material. Section 10 describes the source terms module, including the nuclear burning module, which calculates the nuclear burning rate of a hydrodynamical simulation, and the stirring module, which adds a divergence-free, time-correlated `stirring' velocity at selected modes in a given hydrodynamical simulation.
Section 11 describes the gravity module, which computes gravitational potential or gravitational acceleration source terms for the code. It includes several sub-modules: the constant submodule, the plane parallel sub-module, the ptmass submodule, and the Poisson submodules.
The driver module controls the initialization, evolution, and output of a FLASH simulation. Initialization can be from scratch or from a stored checkpoint file. Evolution can use any of several different operator-splitting techniques to combine different physics operators and integrate them in time, or call a single operator if the problem of interest is not time-dependent. Output involves the production of checkpoint files, plot files, analysis data, and log file time stamps. In addition to these functions, the driver supplies important simulation services to the rest of the FLASH framework, including Fortran modules to handle runtime parameters, physical constants, memory usage reports, and log file management (these are discussed further below).
The initialization and termination routines and simulation services modules are common to both time-dependent and time-independent drivers and thus are included at the highest level of driver. The file flash.F90 contains the main FLASH program (equivalent to main() in C) and calls these routines as needed. The default flash.F90 is empty and is intended to be overridden by submodules of driver. At this time only time-dependent drivers are supplied with FLASH; these are submodules of the driver/time_dep module. The time_dep version of flash.F90 calls the FLASH initialization routine, loops over timesteps, and then calls the FLASH termination routine. During the time loop it computes new timesteps, calls an evolution routine (evolve()), and calls output routines as necessary.
The details of each available time integration method are completely determined by the version of evolve() supplied by that method. The default time update method is to call each physics module's update routine for two equal timesteps - thus, hydro, source terms, gravity, hydro, source terms, gravity. The hydrodynamics update routines take a ``sweep order'' argument in case they are directionally split; in this case, the first call uses the ordering x-y-z, and the second call uses z-y-x. Each of the update routines is assumed to directly modify the solution variables. At the end of each pair of timesteps, the condition for updating the mesh refinement pattern is tested, and a refinement update is carried out if required.
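The structure of the default evolve() can be sketched as follows (schematic Python; the actual FLASH routines and their interfaces differ):

```python
# Default time update: each physics operator is applied for two equal
# timesteps, with the directionally split hydro sweeps ordered x-y-z on
# the first pass and z-y-x on the second.
def evolve_pair(state, dt, hydro, source_terms, gravity):
    for sweep_order in ("xyz", "zyx"):
        state = hydro(state, dt, sweep_order)   # directionally split sweeps
        state = source_terms(state, dt)
        state = gravity(state, dt)
    return state    # the caller then tests the refinement condition
```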
The alternative ``delta formulation'' drivers (driver/time_dep/delta_form) modify a set of variables containing the change in the solution during the timestep. The change is only applied to the solution variables after all operators have been invoked. This technique permits more general time integration methods, such as Runge-Kutta methods, to be employed, and it provides a more flexible method for composing operators. However, only a few physics modules can make use of it as yet. More details on the delta formulation drivers appear in Section 4.1.
The driver module supplies certain runtime parameters regardless of which type of driver is chosen. These are described in Table 1.
New driver modules have been written to implement different explicit time advancement algorithms. This usage of the driver modules is slightly different from that of the default driver module, which does not directly implement a time advancement algorithm; the default driver and hydro modules each implement parts of the Strang splitting time advancement. First, a listing of the time advancement tasks common to all of the new drivers will be given. In following subsections, the details of each time advancement method will be described.
The three driver modules written in the delta formulation are euler1, rk3, and strang_delta. They make appropriate calls to the physics modules, and update the solution by calling functions provided by the formulation module. The strang_state driver is written in the state-vector formulation; it also calls the physics modules, but does not update the solution. To use the new modules, first choose the driver by including one of the following lines into the Modules file.
/driver/time_dep/delta_form/euler1
/driver/time_dep/delta_form/rk3
/driver/time_dep/delta_form/strang_delta
/driver/time_dep/delta_form/strang_state
The time advancement module determines which formulation module should be used; two instantiations are possible. For euler1, rk3, or strang_delta, specify
/formulation/delta_form
but for strang_state specify
/formulation
The services provided by the formulation module for the delta formulation are a superset of those provided for the state-vector formulation, which explains the directory structure used. For both instantiations, the formulation module contains (i) subroutines for updating the conserved and auxiliary variables locally (on a block or face of a block), given a local Lphysics(U), and (ii) a parameter which declares which formulation is being used. For the delta formulation, the module also (iii) declares the global DU array and contains subroutines for accessing it, and (iv) provides a subroutine to update the variables globally.
For delta formulation time advancements, the new driver modules use the formulation module to hold and access the global DU array, and to update the solution. In the state-vector formulation, the formulation module is not directly used by the driver; instead, the physics modules call the update subroutines the formulation module provides.
The new driver modules discretize the left-hand side of

dV/dt = L(V,W),   (3)

where V denotes the primary variables and W the auxiliary variables.
The distinction between V and W is that time-dependent differential equations are solved to determine the primary variables. The auxiliary variables are obtained from the primary variables through algebraic relations. Often the primary variables are the conserved variables, U, and in the rest of this section U will replace V. However, the time advancement algorithms implemented do not require this correspondence.
The time advancement algorithms are written generally, in that each differential equation is treated in the same way. The distinction between the equations (for example, between the x-momentum equation and the total energy equation) is expressed in the other physics modules. The time advancement algorithm does not need to know the identity of the variables on which it operates, except possibly to update the auxiliary variables from the primary variables; but this update is handled by a call to a subroutine provided by the formulation module.
The euler1 module implements the first-order, Euler explicit scheme:

Un+1 = Un + Dt L(Un).   (4)
At the beginning of a time step, DU is set to zero. Each of the physics modules is called, with Un as the initial state, and adds its contribution to DU. After all the physics modules have been called, the global DU array holds L(Un). Equation (4) yields Un+1. Finally, the auxiliary variables are updated from the conserved variables with a call to the global update subroutine provided by the formulation module.
Note that because all the physics modules start with the same initial state, the order in which the physics modules are called does not affect the results (except possibly through floating point roundoff differences when contributing to DU.)
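A minimal Python sketch of one delta-formulation Euler step (toy operators stand in for the physics modules):

```python
# Euler explicit step in the delta formulation: DU starts at zero, every
# operator adds its contribution evaluated at the SAME state u^n, and the
# solution is updated once at the end of the step.
def euler_step(u, dt, operators):
    du = [0.0] * len(u)
    for L in operators:                      # order-independent accumulation
        du = [d + c for d, c in zip(du, L(u))]
    return [ui + dt * di for ui, di in zip(u, du)]
```

Because every operator sees the same initial state, reordering the operators changes the result only through floating-point roundoff in the accumulation.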
The set of steps, consisting of calls to physics modules, updating the conserved variables, and updating the auxiliary variables, is often called a stage. The majority of the computational cost of a stage is in the calls to the physics modules; this component corresponds to a ``function evaluation'' in ordinary differential equation solvers. In the Euler explicit algorithm, there is one stage per time step.
Runge-Kutta schemes are a class of ordinary differential equation solvers appreciated for their high order of accuracy, ease of implementation, and relatively low storage requirements. There are many third-order Runge-Kutta methods; all require at least three stages. Most require at least three storage locations per primary variable, but the one implemented in the delta formulation in FLASH, derived by Williamson (J. Comp. Phys. 35:48, 1980), requires only two.
The two storage registers will be referred to as U and DU. The global solution vector, U, holds Un at the beginning of the time step, then intermediate solutions U(·) at the end of each stage. The manipulation of the global DU array is more complicated. DU accumulates contributions from the physics modules during a stage, but it also holds results from previous stages; it is important to distinguish between the results of the physics modules, L(U), and the quantity held in the global DU array. First the algorithm will be shown; then the usage of the global DU array will be discussed.
For the equation

dU/dt = L(U),   (5)

the scheme is

U(1) = Un + (1/3) Dt [ L(Un) ],   (6)

U(2) = U(1) + (15/16) Dt [ L(U(1)) - (5/9) L(Un) ],   (7)

Un+1 = U(2) + (8/15) Dt [ L(U(2)) - (153/128) ( L(U(1)) - (5/9) L(Un) ) ].   (8)
The algorithm is implemented using the following steps to attain the low storage. At the beginning of the time step, DU is set to zero. During the first stage each physics module contributes to DU, so after all have contributed, DU holds the bracketed term in eq. (6), L(Un). U(1) is then computed using eq. (6). Stage 1 is completed by multiplying DU by -5/9, which is required for the following stages. The process is repeated for stage 2: after the physics modules have contributed, DU holds the bracketed term in eq. (7); U(2) is computed by eq. (7) and stored in U; then DU is multiplied by -153/128. Stage 3 is similar, but ends after U(3) = Un+1 is computed and stored. It is critical that the only changes made to the DU array are those just listed; no physics module should change the value of DU, except to add its contribution, and since DU holds information from previous stages, it should not be reset to zero except at the beginning of the time step.
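The two-register scheme can be sketched in Python; a scalar test equation stands in for the accumulated physics contributions:

```python
# Williamson low-storage RK3: registers U and DU only.  At each stage DU is
# scaled by a carry-over coefficient (0, -5/9, -153/128), the physics
# contribution L(U) is added, and U is advanced by b_i * Dt * DU.
A = (0.0, -5.0/9.0, -153.0/128.0)
B = (1.0/3.0, 15.0/16.0, 8.0/15.0)

def rk3_step(u, dt, L):
    du = 0.0
    for a, b in zip(A, B):
        du = a * du + L(u)     # scale previous stages, accumulate L(U)
        u = u + b * dt * du    # stage update with c = b_i
    return u
```

For the linear test problem dU/dt = -U this reproduces the third-order accuracy of a conventional three-register scheme while storing only U and DU.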
No runtime parameters are defined for this module.
The second-order accurate splitting method (Strang 1968) is attractive because of its low memory requirements. The algorithm is based on the operator splitting approach, in which a set of subproblems is solved rather than a single complicated problem. Each subproblem typically accounts for one term in a system of partial differential equations, represents a particular type of physics, and has an appropriate (specialized) numerical method available for it. The basic operator splitting method is first-order accurate, but the Strang splitting scheme is second-order accurate over two time steps. In the first time step, the subproblems are solved in a given sequence; second-order accuracy is obtained by reversing the sequence in the second time step.
A key feature of the operator splitting approach is that the output of one subproblem is the input to the next subproblem. This allows an implementation that, globally, stores only the current solution, but can also cause problems including accuracy losses due to decoupling various physical effects (splitting errors) and difficulties implementing boundary conditions.
In practice it has been found that splitting errors are reduced when the subproblems are ordered in increasing stiffness, i.e. the stiffest subproblem is solved last in the sequence; this has recently been supported by numerical analysis (Sportisse 2000).
Two new driver modules implement an algorithm similar to the Strang splitting time advancement. Since the sequence is not exactly reversed in the second step compared to the first, the algorithm is not the true Strang splitting. However, the source terms include nuclear burning source terms which are very stiff, and there are sound arguments for computing them last. The strang_state module implements the algorithm in the state-vector formulation, and is recommended for ``production'' runs for its low memory requirements. The strang_delta driver is implemented in the delta formulation and is provided for testing and comparison. For both versions, one call to evolve (which implements the time advancement algorithm) advances the solution from tn to tn+2, i.e. over two time steps.
In the strang_state driver, the sequence of calls to physics modules in the first time step is
hydro(x-sweep)
hydro(y-sweep)
hydro(z-sweep)
gravity
source terms
In the second time step, only the order of the hydro calls is reversed:
hydro(z-sweep)
hydro(y-sweep)
hydro(x-sweep)
gravity
source terms
Mesh refinement and derefinement are executed only after the second step, not between the two steps; also the time step is held constant for the two steps. The y- and z-sweeps of hydro are not called unless that dimension is included in the simulation. The same algorithm is used in the strang_delta module, but after each call to a physics module, a call to a subroutine is necessary to update the solution. When the strang_state driver is used, these calls are made by each physics module.
No runtime parameters are defined for either module.
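The pairing of a forward and a reversed sweep can be illustrated with a small Python example; exactly solvable toy sub-operators (exponential growth at different rates) stand in for the physics modules:

```python
import math

# One pair of timesteps in a Strang-like splitting: sub-operators are applied
# in order for the first step and in reversed order for the second, and the
# output of each sub-step is the input to the next (state-vector style).
def sub_step(u, rate, dt):
    return u * math.exp(rate * dt)       # exact solution of one sub-problem

def split_pair(u, dt, rates):
    for r in rates:                      # first step: forward sequence
        u = sub_step(u, r, dt)
    for r in reversed(rates):            # second step: reversed sequence
        u = sub_step(u, r, dt)
    return u
```

These toy operators commute, so the splitting is exact here; the second-order cancellation obtained by reversing the sequence matters precisely when the sub-operators do not commute.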
4.1.3.1 New formulation modules
The purpose of this module class is to update the solution locally (on a block or a face of a block) or globally (on all blocks).
The services provided to delta formulation drivers are a superset of those provided to drivers in the state-vector formulation, and the directory structure is used to express that. The /formulation directory contains the local update subroutines and a version of formulation_Module suitable for the state-vector instantiation. formulation_Module defines a module in the Fortran 90 sense, as opposed to the FLASH hierarchy sense. The /formulation/delta_form directory contains the global update subroutine and the version of formulation_Module required for the delta instantiation.
When /formulation is specified in the Modules file, the local update functions and the first formulation_Module are built into the executable, as appropriate for drivers in the state-vector formulation; when /formulation/delta_form is specified in the Modules file, the local update functions, the global update function, and the second version of formulation_Module are used in the executable as required by drivers in the delta formulation. This use of the FLASH code framework and directory hierarchy allows static allocation of the global DU array when needed but saves that memory when not. At the same time it allows local update functions to be used by both state-vector and delta formulations without duplicating code.
Currently the update functions apply only to the particular variable sets described. The local update functions must be given the (old) conserved variables in the order r1, ..., rionmax, ru, rv, rw, rE, and they store in the database X1, ..., Xionmax, r, P, T, g, u, v, w, and E. The mapping from the conserved variables to the database variables is not general; it is specific to the variables just listed. Variables other than those specifically listed will not be updated, and their influence on the listed variables will be ignored. Development of more flexible update routines is underway. However, changes will most likely be internal to the local and global update functions, and the organization of these modules is not expected to change.
4.1.3.1.1 State-Vector Instantiation
In this subsection the local update functions, named du_update_block, du_update_xface, du_update_yface, and du_update_zface, are described. These subroutines accept local arrays of conserved variables and their changes as inputs, compute updated conserved variables, compute auxiliary variables from algebraic relations (with the aid of appropriate equation of state calls), and store the updated variables in the database.
These subroutines accept the block number, a local DU, local conserved variables U, the time step Dt, and a scalar factor c, all as passed arguments. The face update routines also accept an index specifying which grid plane to update. The conserved variables are those listed above. The conserved variables are updated by

Unew = U + c Dt DU.   (9)
From the updated conserved variables, all variables stored in the database are computed. The density, r, and species mass fractions, Xs, are obtained from the species densities, rs. The velocity components u, v, and w and the total specific energy E are computed from the momenta and the total energy per unit volume, respectively, by dividing by r. The internal energy, ei, is calculated by subtracting the kinetic energy, (u^2 + v^2 + w^2)/2, from E. The temperature, T, pressure, P, and ratio of specific heats, g, are obtained through a call to the equation of state, for which r, Xs, and ei are inputs.
Finally, the updated variables are stored in the variable database. The variables stored are Xs, r, P, T, g, u, v, w, and E. Only the interior cells of a block or face are updated; for all guard cells, zeros are stored for all updated variables. None of the calculations described above are executed for the guard cells.
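The sequence of operations just described can be sketched in Python, with a constant-gamma ideal gas standing in for the FLASH equation of state (the values of gamma and the gas constant R here are illustrative assumptions):

```python
# Recover database variables from updated conserved variables:
# species densities -> mass fractions, momenta -> velocities,
# total energy -> internal energy -> EOS call for P and T.
def conserved_to_database(rs, ru, rv, rw, rE, gamma=5.0/3.0, R=1.0):
    r = sum(rs)                           # total mass density
    X = [s / r for s in rs]               # species mass fractions
    u, v, w = ru / r, rv / r, rw / r      # velocities from momenta
    E = rE / r                            # total specific energy
    ei = E - 0.5 * (u*u + v*v + w*w)      # subtract kinetic energy
    P = (gamma - 1.0) * r * ei            # toy gamma-law EOS
    T = P / (r * R)                       # toy ideal-gas temperature
    return X, r, P, T, gamma, u, v, w, E
```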
For the state-vector formulation, there are only a few tasks for the Fortran 90 module formulation_Module. First, it defines a Fortran logical parameter delta_formulation to be `false'. This parameter is designed to be accessed by physics modules. When false, it indicates that each physics module should update the solution; while the local update routines just described are recommended for this purpose, there is no requirement that they be used. Second, formulation_Module defines several parameters for sizing arrays and a set of integers (indices) used to access the variable database; these are used by the local update subroutines.
In the state-vector instantiation, formulation_Module does not declare the global DU array. It does define some functions which are used to access that array, but in this instantiation they do not perform any operations - they are `stub' functions. The reason for defining them is as follows. If a physics module is written so that either the state-vector or delta formulation can be used, it must include calls to functions which access the global DU array. When the state-vector formulation is used these calls are not made, but some compilers might raise errors if these functions were not defined. By defining them in this instantiation of formulation_Module, such errors are avoided. The stub functions are contained, in the Fortran 90 sense, in formulation_Module. The local update functions are not contained in the formulation_Module, although they directly access the array sizing parameters and database indices therein.
In this section the global update subroutine du_update and the delta formulation version of formulation_Module are described. The global update routine is a wrapper to the local update subroutine du_update_block. Two arguments, c and Dt, are passed into du_update. For each block, it gets r, Xs, u, v, w, and E from the database; computes the (old) conserved variables from these; gets the DU for the block from the global DU array; and calls du_update_block. Recall that du_update_block computes the updated variables and stores them in the database.
For the delta instantiation, formulation_Module defines the same array-sizing parameters and database indices as in the state-vector instantiation. However, it defines the parameter delta_formulation to be `true', indicating to the physics modules that their contributions should be added to the global DU array. The delta instantiation of formulation_Module statically allocates the global DU array, and defines several functions to manipulate it. Each element of the global DU array is set to zero by du_zero. A physics module can add its local DU for a block to the global array by calling du_block_to_global; the subroutines du_xface_to_global, du_yface_to_global and du_zface_to_global do the same for faces (slices) of a block. These subroutines are contained in formulation_Module, and are the actual, working versions of the stub functions defined in the state-vector instantiation.
The global DU array is a public, module-scope variable in the Fortran 90 sense. The du_update subroutine is not contained in formulation_Module, but can access the array-sizing parameters and database indices in the module. It can also access the global DU array directly, and is the only subroutine not contained in formulation_Module allowed to do so.
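The bookkeeping around the global DU array can be mimicked with Python module-level state (the names echo, but do not reproduce, the FLASH routines):

```python
# Module-scope DU with the access discipline described above: zeroed once
# per step, accumulated into by physics modules, and consumed only by the
# update routine.
DU = {}

def du_zero(blocks, ncells):
    for b in blocks:
        DU[b] = [0.0] * ncells

def du_block_to_global(block, local_du):
    DU[block] = [g + l for g, l in zip(DU[block], local_du)]

def du_update(u, c, dt):
    """Return u advanced by c * dt * DU on every block."""
    return {b: [ui + c * dt * di for ui, di in zip(u[b], DU[b])] for b in u}
```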
The driver module provides a Fortran 90 module called runtime_parameters. The routines in this module maintain `parameter contexts,' essentially small databases of runtime parameters. Contexts can be created and destroyed, and runtime parameters can be added to them, have their values modified, and be queried as to their value or data type. These features allow a program to maintain several contexts for different code modules without having to declare and share the parameters explicitly. User-written subroutines (e.g., for initialization) should use the routines in this module to access the values of any runtime parameters they require.
An example of the application of this module is to use the read_parameters() routine (separately supplied) to parse a text-format input file containing parameter settings. The calling program declares a context, adds parameters to it, then calls read_parameters() to parse the input file. Finally, the context is queried to obtain the input values. Such a program might look like the following code fragment.
  program test
  use runtime_parameters
  type (parm_context_type) :: context
  real :: x_init
  ...
  call create_parm_context (context)
  call add_parm_to_context (context, "x_init", 4.)
  ...
  call read_parameters ("input.par", context)
  call get_parm_from_context (context, "x_init", x_init)
  ...
  end
Parameter names supplied as arguments to the routines are stored or compared in a case-insensitive fashion. Thus N_x and n_x refer to the same parameter, and
  integer n_x
  call add_parm_to_context (context, "N_x", 32)
  call get_parm_from_context (context, "n_X", n_x)
  write (*,*) n_x
  call set_parm_in_context (context, "n_x", 64)
  call get_parm_from_context (context, "N_X", n_x)
  write (*,*) n_x
prints
  32
  64
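The case-insensitive, type-checked semantics can be mimicked in a few lines of Python (a sketch of the behavior, not the Fortran interface):

```python
# Minimal parameter context: names are case-insensitive, and the type of a
# parameter is fixed by the initial value supplied when it is added.
class ParmContext:
    def __init__(self):
        self._parms = {}

    def add(self, name, value):
        self._parms[name.lower()] = value

    def set(self, name, value):
        key = name.lower()
        if key not in self._parms:
            raise KeyError("parameter not created: " + name)
        if type(value) is not type(self._parms[key]):
            raise TypeError("type must match the initial value")
        self._parms[key] = value

    def get(self, name):
        return self._parms[name.lower()]

ctx = ParmContext()
ctx.add("N_x", 32)
ctx.set("n_x", 64)
```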
The following routines, data types, and public constants are provided by this module. Note that the main FLASH initialization routine (init_flash()) and the initialization code created by setup already handle the creation of the database and the parsing of the parameter file, so users will mainly be interested in querying the database for parameter values.
Data type for contexts.
A parameter context available to all program units which use the runtime_parameters module. This is made available only for programs which share the rest of their data among routines via included common blocks. Programs which use modules should declare their own contexts within their modules.
Constants returned by get_parm_type_from_context().
Create context c.
Destroy context c, freeing up the memory occupied by its database.
Add a parameter named p to context c. p is a character string naming the parameter, and v is the default or initial value to assign to the parameter. v can be of type real, integer, string, or logical. The type of v sets the type of the parameter; subsequent sets or gets of the parameter must be of the same type, or an error message will be printed.
Set the value of parameter p in context c equal to v. p is a character string naming the parameter, which must already have been created by add_parm_to_context (else an error message is printed). The type of v must match the type of the initial value used to create the parameter, else an error message is printed.
Query context c for the value of parameter p and return this value in the variable v. Parameter p must already exist, and the type of v must match the type of the initial value used to create the parameter, else an error message is printed.
Query context c for the data type of parameter p. The result is returned in t, which must be of type integer. Possible return values are parm_real, parm_int, parm_str, parm_log, and parm_invalid. parm_invalid is returned if the named parameter does not exist.
Print (to I/O unit l) the names and values of all parameters associated with context c.
Broadcast the parameter names and values from a specified context c to all processors. p is the calling processor's rank, and r is the rank of the root processor (the one doing the broadcasting).
The driver supplies a Fortran 90 module called physical_constants, which maintains a centralized database of physical constants. The database can be queried by string name and optionally converted to any chosen system of units. The default system of units is CGS. This facility makes it easy to ensure that all parts of the code are using a consistent set of physical constant values and unit conversions and to update the constants used by the code as improved measurements become available.
For example, a program using this module might obtain the value of Newton's gravitational constant G in units of Mpc^3 Gyr^-2 Msun^-1 by calling

    call get_constant_from_db ("Newton", G, len_unit="Mpc", time_unit="Gyr", mass_unit="Msun")

In this example, the local variable G is set equal to the result, 4.4983×10^-15 (to five significant figures).
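The conversion in this example can be checked by hand. The short Python sketch below is not FLASH code; the CGS conversion factors are standard approximate values, and it reproduces the quoted result to within rounding.

```python
# Sketch of the unit conversion performed by get_constant_from_db.
# G is stored in CGS (cm^3 g^-1 s^-2); converting to Mpc^3 Gyr^-2 Msun^-1
# divides by the length unit cubed and multiplies by time^2 and mass.
G_CGS  = 6.67259e-8      # cm^3 g^-1 s^-2
MPC_CM = 3.0856776e24    # cm per megaparsec
GYR_S  = 3.15576e16      # s per gigayear
MSUN_G = 1.989e33        # g per solar mass

G_new = G_CGS * GYR_S**2 * MSUN_G / MPC_CM**3
print("%.4e" % G_new)    # close to the quoted 4.4983e-15
```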
Physical constants are taken from the 1998 Review of Particle Properties, Eur. Phys. J. C 3, 1 (1998), which in turn takes most of its values from Cohen, E. R. and Taylor, B. N., Rev. Mod. Phys. 59, 1121 (1987). The following routines are supplied by this module.
Return the value of the physical constant named n in the variable v. Optional unit specifications are used to convert the result. If the constant name or one or more unit names aren't recognized, a value of 0 is returned.
Add a physical constant to the database. n is the name to assign, and v is the value in CGS units. len, time, mass, temp, and chg are the exponents of the various base units used in defining the unit scaling of the constant. For example, a constant with units (in CGS) of cm3 s-2 g-1 would have len=3, time=-2, mass=-1, temp=0, and chg=0.
Add a unit of measurement to the database. t is the type of unit (``length,'' ``time,'' ``mass,'' ``charge,'' ``temperature''), n is the name of the unit, and v is its value in terms of the corresponding CGS unit. Compound units are not supported, but they can be created as physical constants.
Initialize the constants and units databases. Can be called by the user program but does not have to be, as it is called automatically when needed (i.e., if a ``get'' or ``add'' is called before initialization).
List the constants and units databases to the specified logical I/O unit.
Deallocate the memory used by the constants and units databases, requiring another initialization call before they can be accessed again.
FLASH includes a set of routines for monitoring performance. The routines start or stop a timer at the beginning or end of the routine(s) to be monitored and accumulate time in dynamically assigned accounting segments. At the completion of the program, the routines write out a performance summary. We note that these routines are not recommended for use in timing very short segments of code due to the overhead in accounting.
All of the source code for the performance monitoring can be found in the module file perfmon.F90. The list below contains the performance routines along with a short description of each. Many of the subroutines are overloaded to take either a module name or an integer index.
Initializes the performance accounting database. Calls system time routines to subtract out their initialization overhead.
Creates a timer and returns a unique integer index for the timer.
Subroutine that begins monitoring the code module named module, or the module associated with the index id. If module is not associated with a previously assigned accounting segment, the routine creates one; if id is not associated with one, nothing is done. The parameter module is specified with a string (max 30 characters). Calling timer_start on the same module more than once without first calling timer_stop resets the current timer for that module (the accumulated time in the corresponding accounting segment is not reset). Timing modules may be nested as many times as there are slots for accounting segments (see the MaxModules setting). The routine may be called with an integer index in place of the module name.
Stops accumulating time in the accounting segment associated with code module module. If timer_stop is called for a module which does not exist or for a module which is not currently being timed, nothing happens. Routine may be called with an integer index in addition to the name of the module.
Returns the current value of the timer for the accounting segment associated with the code module module or referenced by index id. If timer_value is called for a module which does not exist, 0. is returned.
Resets the accumulated time in the accounting segment corresponding to the specified code module. Routine may be called with an integer index in addition to the name of the module.
Function that, given a string module name, returns the corresponding integer index. The index can be used in any of the overloaded timer routines. If the timer name is not found, the function returns timer_invalid. Obtaining an integer index with this function and subsequently calling the routines by index rather than by string name is encouraged for performance reasons.
Subroutine that writes a performance summary of all current accounting segments to the file associated with logical unit number lun. Included is the average over n_period intervals (e.g., timesteps). The accounting database is not reinitialized. lun and n_period are of default integer type. Calling perf_summary stops all currently running timers.
Below is a very simple example of calling the performance routines.
    program example
      integer i
      call timer_init
      do i = 1, 1000
        call timer_start ('blorg')
        call blorg
        call timer_stop ('blorg')
        call timer_start ('gloob')
        call gloob
        call timer_stop ('gloob')
      enddo
      call perf_summary (6, 1000)
    end
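The accounting-segment behavior of timer_start/timer_stop can be illustrated with a small Python sketch. This is a conceptual model, not the FLASH API; the injectable fake clock just keeps the example deterministic.

```python
class PerfMon:
    """Sketch of accounting-segment timers: start/stop accumulate wall
    time per named module."""

    def __init__(self, clock):
        self.clock = clock
        self.started = {}    # module -> start time of the running timer
        self.accum = {}      # module -> accumulated time

    def timer_start(self, module):
        # Restarting a running timer resets its start time, but the
        # already-accumulated time for the segment is kept.
        self.started[module] = self.clock()

    def timer_stop(self, module):
        # Stopping a module that is not being timed does nothing.
        if module in self.started:
            dt = self.clock() - self.started.pop(module)
            self.accum[module] = self.accum.get(module, 0.0) + dt

    def timer_value(self, module):
        return self.accum.get(module, 0.0)

# A fake clock advancing one "second" per call keeps the demo reproducible.
ticks = iter(range(100))
mon = PerfMon(lambda: next(ticks))
mon.timer_start("blorg")   # t = 0
mon.timer_stop("blorg")    # t = 1 -> accumulates 1
mon.timer_start("blorg")   # t = 2
mon.timer_stop("blorg")    # t = 3 -> accumulates 1 more
print(mon.timer_value("blorg"))   # -> 2.0
```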
The driver supplies a Fortran 90 module called logfile to manage the FLASH log file, which contains various types of useful information, warnings, and error messages produced by a FLASH run. User-written routines may also make use of this module as needed. The logfile routines enable a program to open and close a log file, write time or date stamps to the file, and write arbitrary messages to the file. The file is kept closed and is only opened for appending when information is to be written, avoiding problems with unflushed buffers. For this reason, logfile routines should not be called within time-sensitive loops, as the routines will generate system calls.
An example program using the logfile module might appear as follows:
    program test
      use logfile
      integer :: i
      call create_logfile ("test.log", "test.par", .false.)
      call stamp_logfile ("beginning log file test...")
      do i = 1, 10
        call open_logfile
        write (log_lun,*) 'i = ', i
        call close_logfile
      enddo
      call stamp_logfile ("done with log file test.")
    end

The following routines, data types, and public constants are provided by this module.
Creates the named log file and writes some header information to it, including the build stamp and the values of all runtime parameters in the global parameter context. The name of the parameter file is taken as an input; it is echoed to the log file. If restart is .true., the file is opened in append mode.
Write a date stamp and a specified string to the log file.
Write a dated timestep stamp for step n, time t, timestep dt to the log file. n must be an integer, while t and dt must be reals.
Write a string to the log file without a date stamp.
Write a `break' (a separator row) to the log file.
Open the log file for writing, creating it first with a default name (logfile) if necessary. open_logfile() and close_logfile() should only be used if it is necessary to write something directly to the log file unit with some external routine.
Close the log file.
The logical unit number being used by the logfile module (to permit direct writes to the log file by external routines).
Currently FLASH can store simulation data in three basic output formats: Fortran 77 binary, Hierarchical Data Format (HDF, sometimes called HDF 4), and HDF5. In general, these formats are not compatible, but some tools exist for translating from one format to another. The formats determine how the binary data is stored on disk, how it is addressed, and how differences in data representation between platforms are handled. The mapping of FLASH data structures to records in these files is controlled by the FLASH I/O modules. These file formats have different strengths and weaknesses, and the data layout is different for each file type.

Different techniques can be used to write the data to disk: move all the data to a single processor for output, have each processor write to a separate file, or have all processors access a single file in parallel. In general, parallel access to a single file provides the best performance. On some platforms, such as Linux clusters, there may not be a parallel filesystem, in which case moving all the data to a single processor is the best solution.
The default I/O module in FLASH is hdf4, which uses the HDF format. The HDF format provides an application programming interface (API) for organizing data in a database fashion. In addition to the raw data, information about the data type and byte ordering (little or big endian), rank, and dimensions of the dataset is stored. This makes the HDF format extremely portable across platforms. Different packages can query the file for its contents, without knowing the details of the routine that generated the data.
HDF is limited to files < 2 GB in size. Furthermore, the official release of HDF does not support parallel I/O. To address these limitations, HDF5 was released. HDF5 is supported on a large variety of platforms and offers the same functionality as HDF, with the addition of large file support and parallel I/O via MPI-I/O. Information on the different versions of HDF can be found at http://hdf.ncsa.uiuc.edu. This section assumes that you already have the necessary HDF libraries installed on your machine.
The I/O modules in FLASH have two responsibilities: generating and restarting from checkpoint files, and generating plot files. A checkpoint file contains all the information needed to restart the simulation. The data is stored at the same precision (8-byte reals) as it is carried in the code and includes all of the variables. A plotfile contains all the information needed to interpret the tree structure maintained by FLASH, together with a user-defined subset of the variables. Furthermore, the data may be stored at reduced precision to conserve space.
The type of output you create will depend on what type of machine you are running on, the size of the resulting dataset, and what you plan on doing with the datafiles once created. Table 2 summarizes the different modules which come with FLASH 2.0.
It is strongly recommended that you use one of the HDF5 output formats
with FLASH. These are currently the best performing I/O modules in
FLASH. Furthermore, support for HDF5 exists for just about every
platform you are likely to encounter.
There are several parameters that control the frequency of output,
type of output, and name of the output files. These parameters are
the same for all the different modules, although not every module is
required to implement all parameters. Some of these parameters are
used in the top level I/O routines (initout.F90, output.F90,
and finalout.F90) to determine when to output, while others are
used to determine the resulting filename. Table 3
gives a description of the I/O parameters.
5.1 General parameters
Parameter | Type | Default value | Description |
rolling_checkpoint | INTEGER | 10000 | The number of checkpoint files to keep available at any point in the simulation. If a checkpoint number is > rolling_checkpoint, then the checkpoint number is reset to 0. There will be at most rolling_checkpoint checkpoint files kept. This parameter is intended to be used when disk space is at a premium. |
wall_clock_checkpoint | REAL | 43200 | The maximum amount of wall clock time to go between checkpoints. When the simulation is started, the current time is stored. If wall_clock_checkpoint seconds elapse over the course of the simulation, a checkpoint file is stored. This is useful to ensure that a checkpoint file is produced before a queue closes. |
basenm | STRING | ``chkpnt'' | The main part of the output filenames. The full filename consists of the basename, a series of three-character abbreviations indicating whether it is a plotfile or checkpoint file and which format it uses, and a 4-digit file number. See § 5.1.1 for a description of how FLASH output files are named. |
cpnumber | INTEGER | 10000 | The number of the current checkpoint file. This number is appended to the end of the basename when creating the filename. When restarting a simulation, this indicates which checkpoint file to use. |
ptnumber | INTEGER | 1 | The number of the current plotfile. This number is appended to the end of the basename when creating the filename. |
restart | BOOLEAN | .false. | A logical variable indicating whether the simulation is restarting from a checkpoint file (.TRUE.) or starting from scratch (.FALSE.). |
nrstrt | INTEGER | 10000 | The number of timesteps desired between subsequent checkpoint files. |
trstrt | REAL | 1 | The amount of simulation time desired between subsequent checkpoint files. |
tplot | REAL | 1 | The amount of simulation time desired between subsequent plotfiles. |
corners | BOOLEAN | .false. | A logical variable indicating whether to interpolate the data to cell corners before outputting. This only applies to plotfiles. |
plot_var_1, ... plot_var_8 | STRING | ``none'' | Name of the variables to store in a plotfile. Up to 8 variables can be selected for storage, and the standard 4-character variable name can be used to select them. |
In a typical production run, your simulation can be interrupted for a number of reasons: the machine crashes, the present queue window closes, the machine runs out of disk space, or (*gasp*) there is a bug in FLASH. Once the problem is fixed, you do not want to start over from the beginning of the simulation, but rather would like to pick up where you left off.

FLASH is capable of restarting from any of the checkpoint files it produces. You will want to make sure the file you wish to restart from is valid (i.e., the code did not stop while outputting). To tell FLASH to restart, set the restart runtime parameter to .TRUE. in your flash.par. You will also want to set cpnumber to the number of the file you wish to restart from. Finally, if you are producing plotfiles, you will want to set ptnumber to the number of the next plotfile you want FLASH to output. Sometimes several plotfiles may be produced after the last valid checkpoint file, so resetting ptnumber to the number of the first plotfile produced after the checkpoint you are restarting from will ensure that there are no gaps in your output.
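A restart block in flash.par might then look like the following; the file numbers here are hypothetical and should be replaced with the numbers from your own run.

```
restart  = .true.
cpnumber = 57
ptnumber = 122
```

Here FLASH would restart from checkpoint file 57 and number the next plotfile it produces 122.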
The HDF module writes the data to disk using the HDF 4.x library. This module is supported with HDF 4.1r2 or later. A single file containing the data on all processors is created if the total size of the dataset is < 2 GB. If there is more than 2 GB of data to be written, multiple files are created to store the data. The number of files used to store the dataset is contained in the number of files record. Each file contains a subset of the blocks (stored in the local blocks record) out of the total number of blocks in the simulation. The blocks are divided along processor boundaries.

The HDF module performs serial I/O: each processor's data is moved to the master processor to be written to disk. This has the advantage of producing a single file for the entire simulation, but is less efficient than if each processor wrote to disk directly. The plotfiles produced with the HDF module contain double precision data. Support for corner data is available with this module.
Machine Compatibility
HDF has been tested successfully on most machines, with the exception
of (ASC) Red. The HDF library will properly handle different byte
orderings across platforms. The IDL tools provided with FLASH will
read the FLASH HDF data.
Data Format
Table 4 summarizes the records stored in a FLASH HDF file. The
format of the plotfiles and checkpoint files are the same, with the
only difference being the number of unknowns stored. The records
contained in a HDF file can be found via the hdp command that
is part of the HDF distribution. The syntax is hdp dumpsds -h
filename. The contents of a record can be output by hdp dumpsds
-i N filename, where N is the index of the record.
As described above, HDF cannot produce files > 2 GB in size. This is overcome in the FLASH HDF module by splitting the dataset into multiple files when it would otherwise produce a file larger than 2 GB in size.
When reading a single unknown from a FLASH HDF file, you will suffer performance penalties, as there will be many non-unit-stride accesses on the first dimension, since all the variables are stored together in the file.
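The layout issue can be illustrated with a short Python sketch (illustrative only; the sizes below are made up, not FLASH defaults). Because the file stores the unknowns in Fortran column-major order as unk(nvar, nx, ny, nz, blocks), the variable index varies fastest, and successive values of a single variable are nvar elements apart.

```python
# Reading one variable from the unk(nvar, ...) layout touches every
# nvar-th element of the file, i.e. the access is non-unit-stride.
nvar, nx = 24, 8   # illustrative sizes, not FLASH defaults

def flat_offset(ivar, i, nvar):
    # Column-major offset of unk(ivar, i); the higher dimensions only
    # add larger strides and do not change the conclusion.
    return ivar + nvar * i

offsets = [flat_offset(0, i, nvar) for i in range(nx)]
print(offsets)   # the stride between consecutive reads is nvar
```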
5.3.2 HDF5

There are two major HDF5 modules: a serial and a parallel version. The format of the output file produced by these modules is identical; only the method by which it is written differs. It is possible to create a checkpoint file with the parallel routines and restart FLASH from that file using the serial routines. In each module, the plot data are written out in single precision to conserve space. These modules require HDF5 1.4.0 or later. At the time of this writing, the current version of HDF5 is 1.4.1.
5.3.2.1 Machine Compatibility
The HDF5 modules have been tested successfully on the three (ASC)
platforms and a Linux cluster. Performance varies widely across the
platforms, but the parallel version is usually faster than the serial
version. Experience performing parallel I/O on a Linux cluster using PVFS is reported in Ross et al. (2001). A shared object library distributed with FLASH gives IDL the ability to read the FLASH files.
5.3.2.2 Data Format
The data format FLASH uses for HDF5 output files is similar to that of
the HDF files, but there are a few differences that make a record-to-
record translation impossible. These changes were made to maximize
performance. Instead of putting all the unknowns in a single HDF
record, unk(nvar,nx,ny,nz,tot_blocks) as in the HDF file, each
variable is stored in its own record, labeled by the 4-character
variable name. A number of smaller records (time,
timestep, number of blocks, ...) are stored in a single
structure in the HDF5 file to reduce the number of writes required.
Finally, the two bounding box records in the HDF file are merged into
a single record in the HDF5 file. This allows for easier access to a
single variable when reading from the file. The HDF5 format is
summarized in table 5.
5.3.3 Fortran 77 (f77)

Unlike the HDF files, the f77 files do not have a record-based format. This means it is necessary to know the precise way the data was written to disk in order to construct the analogous read statement. In general, the data for a single block (tree, grid, and unknown information) is stored together in the file. The different f77 modules use different precision for the plotfiles, and this must be taken into account when reading the data.
The f77 modules with `_parallel' appended to the name create a separate file from each processor, instead of first moving the data to processor 0 for writing. This is much faster on most platforms, but it has the disadvantage of generating many files, all of which must be read in to recreate the dataset.
The f77_unf_parallel{_single} modules include a converter that will
read in the header file for the plotfiles, and create a single HDF5
plotfile containing all the data. This conversion routine is
unsupported, but provided for completeness.
5.4 Working with output files

The HDF output formats offer the greatest flexibility when visualizing
the data, as the visualization program does not have to know the
details of how the file was written, rather it can query the file to
find the datatype, rank, and dimensions. The HDF formats also avoid
difficulties associated with different platforms storing numbers
differently (big endian vs. little endian). IDL routines for
reading the FLASH HDF and HDF5 formats are provided in
FLASH 2.0/tools/fidlr/. These can be used interactively through the IDL command line (see § 15.6).
6 Mesh module

We have used a package known as PARAMESH (MacNeice et al. 1999) for
the parallelization and adaptive mesh refinement (AMR) portion of FLASH.
PARAMESH
consists of a suite of subroutines which handle refinement/derefinement,
distribution of work to processors, guard cell filling, and flux conservation.
In this section we briefly describe this package and the
ways in which it has been modified for use with FLASH.
6.1 Algorithm

The refinement criterion used by PARAMESH is adapted from Löhner (1987).
Löhner's error estimator was originally
developed for finite element applications and has the advantage that
it uses an entirely local calculation.
Furthermore, the estimator is
dimensionless and can be applied with complete generality to any of the
field variables of the simulation or any combination of them
(by default, PARAMESH uses the density and pressure).
Löhner's estimator is a modified second derivative, normalized by the average of the gradient over one computational cell. In one dimension on a uniform mesh, it takes the form given in equation (10) below.
PARAMESH uses a block-structured adaptive mesh refinement scheme similar
to others in the literature (e.g., Parashar 1999;
Berger & Oliger 1984; Berger & Colella 1989; DeZeeuw & Powell 1993) as well
as to schemes which refine on an individual cell basis (Khokhlov
1997). In block-structured AMR, the fundamental data structure is a block of
uniform cells arranged in a logically Cartesian fashion. Each cell can be
specified using a block identifier (processor number and local block number) and
a coordinate triple (i, j, k), where i = 1, ..., nxb, j = 1, ..., nyb, and k = 1, ..., nzb refer to the x, y, and z directions, respectively. The
complete computational grid consists of a collection of blocks with different
physical cell sizes, related to each other in a hierarchical fashion using
a tree data structure.
The blocks at the root of the tree have the largest cells, while their
children have smaller cells and are said to be refined.
Two rules govern the establishment of refined child blocks in PARAMESH.
First, the cells of a refined child block must be one-half as large as
those of
its parent block. Second, a block's children must be nested; that is,
the child blocks must fit within their parent block and
cannot overlap one another, and the complete set of children of a block
must fill its volume.
Thus, in d dimensions a given block has either zero or 2^d children. A
simple domain is shown in Figure 7.
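The nesting rules can be sketched in a few lines of Python (illustrative, not PARAMESH code): refining a block in d dimensions produces 2^d children, each covering half of the parent's extent in every dimension, so the child cells are half as large as the parent's.

```python
# Sketch of PARAMESH's nesting rules: a refined block in d dimensions
# has 2**d non-overlapping children that together fill its volume.
def refine(corner, width, d):
    """corner: lower corner of a cubical block; width: its extent per
    dimension. Returns (corner, width) pairs for the 2**d children."""
    half = width / 2.0
    children = []
    for n in range(2 ** d):
        # Bit p of n selects the lower or upper half in dimension p,
        # so the children tile the parent without overlapping.
        child = tuple(corner[p] + ((n >> p) & 1) * half for p in range(d))
        children.append((child, half))
    return children

kids = refine((0.0, 0.0), 1.0, 2)
print(len(kids))   # -> 4 children in two dimensions
```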
Each block contains nxb×nyb×nzb
interior cells and a set of guard cells (Figure 5).
The guard cells contain boundary information needed to update
the interior cells. These can be obtained from physically neighboring blocks,
externally specified boundary conditions, or both.
E_i = \frac{\left| u_{i+1} - 2 u_i + u_{i-1} \right|}{\left| u_{i+1} - u_i \right| + \left| u_i - u_{i-1} \right| + \epsilon \left[ \left| u_{i+1} \right| + 2 \left| u_i \right| + \left| u_{i-1} \right| \right]}   (10)

where u_i is the refinement test variable's value in the ith cell. The last term in the denominator of this expression acts as a filter, preventing refinement of small ripples. The constant \epsilon is given a value of 10^-4. Although PPM is formally second-order and its leading error terms scale as the third derivative, we have found the second derivative criterion to be very good at detecting discontinuities in the flow variable u. When extending this criterion to multidimensions, all cross derivatives are computed, and the following generalization of the above expression is used:

E_{i_1 i_2 i_3} = \left\{ \frac{\sum_{pq} \left( \frac{\partial^2 u}{\partial x_p \, \partial x_q} \, \Delta x_p \, \Delta x_q \right)^2}{\sum_{pq} \left[ \left( \left| \frac{\partial u}{\partial x_p} \right|_{i_p + 1/2} + \left| \frac{\partial u}{\partial x_p} \right|_{i_p - 1/2} \right) \Delta x_p + \epsilon \, \frac{\partial^2 |u|}{\partial x_p \, \partial x_q} \, \Delta x_p \, \Delta x_q \right]^2} \right\}^{1/2}   (11)

where the sums are carried out over coordinate directions, and where, unless otherwise noted, partial derivatives are evaluated at the center of the i_1 i_2 i_3-th zone.
6.2 Usage
The number of guard cells needed depends upon the interpolation scheme and differencing stencil used for the hydrodynamics algorithm; for the explicit PPM algorithm distributed with FLASH, four guard cells are needed in each direction, as illustrated in Figure 6.
PARAMESH handles the filling of guard cells with information from other blocks or from a user-specified external boundary routine. If a block's neighbor has the same level of refinement, PARAMESH fills its guard cells using a direct copy from the neighbor's interior cells. If the neighbor has a different level of refinement, the neighbor's interior cells are used to interpolate guard cell values for the block. If the block and its neighbor are stored in the memory of different processors, PARAMESH handles the appropriate parallel communication (blocks are never split between processors). PARAMESH supports only linear interpolation for guard cell filling at jumps in refinement, but it is easily extended to allow other interpolation schemes. In FLASH, several different interpolation methods can be chosen at setup time. Each interpolation scheme is stored in a subdirectory under /source/mesh/paramesh_nobarr. Once each block's guard cells are filled, it can be updated independently of the other blocks.

PARAMESH also enforces flux conservation at jumps in refinement, as described by Berger and Colella (1989). At jumps in refinement, the fluxes of mass, momentum, energy (total and internal), and species density in the fine cells across boundary cell faces are summed and passed to their parent. The parent and the neighboring cell are at the same level of refinement (because PARAMESH limits jumps in refinement to one level between blocks). The flux in the parent that was computed by the more accurate fine zones is taken as the correct flux through the interface and is passed to the corresponding coarse face on the neighboring block (see Figure 7). The summing allows a geometrical weighting to be implemented for non-Cartesian geometries, which ensures that the proper corrected flux is computed.
Each processor decides when to refine or derefine its blocks by computing a user-defined error estimator for each block. Refinement involves creation of either zero or 2^d refined child blocks, while derefinement involves deletion of a block and all its siblings (2^d blocks). As child blocks are created, they are temporarily placed at the end of the processor's block list. After the refinements and derefinements are complete, the blocks are redistributed among the processors using a work-weighted Morton space-filling curve in a manner similar to that described by Warren and Salmon (1987) for a parallel treecode. An example Morton curve is shown in Figure 8.
During the distribution step each block is assigned a work value (an estimate of the relative amount of time required to update the block). The Morton number of the block is then computed by interleaving the bits of its integer coordinates as described by Warren and Salmon (1987); this determines its location along the space-filling curve. Finally, the list of all blocks is partitioned among the processors using the block weights, equalizing the estimated workload of each processor.
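Bit interleaving can be sketched in a few lines of Python (illustrative; PARAMESH's actual implementation differs in detail). Blocks that are close in space receive nearby Morton keys, so partitioning the sorted key list gives each processor a spatially compact set of blocks.

```python
# Sketch of 2-d Morton ordering: interleave the bits of a block's
# integer coordinates to get its position along the space-filling curve.
def morton2d(x, y, nbits=16):
    key = 0
    for b in range(nbits):
        key |= ((x >> b) & 1) << (2 * b)        # x bits at even positions
        key |= ((y >> b) & 1) << (2 * b + 1)    # y bits at odd positions
    return key

# A 4 x 4 grid of blocks fills the keys 0..15 exactly.
keys = sorted(morton2d(x, y) for x in range(4) for y in range(4))
print(keys[:4])   # -> [0, 1, 2, 3]
```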
Dividing the domain is the first step in the mesh-generation process. This routine is responsible for creating the initial top-level block(s) and setting the neighbors of these blocks correctly. These initial blocks then form the top of the tree, and new blocks may be created by refining these top blocks.
By default, FLASH generates an initial mesh with only one top-level block. There are times when this is inconvenient; for instance, when simulating a domain longer in one dimension than the other while wanting equal spatial resolution in each dimension.
divide_domain() creates an initial mesh of nblockx * nblocky * nblockz top level blocks, where nblockx, nblocky, and nblockz are runtime parameters which default to 1. These blocks are all created on one processor, and thus the total number of these top-level blocks may not exceed the compiled-in parameter MAXBLOCKS.
Since divide_domain() is responsible for setting the neighbors of top-level blocks correctly (to either other top-level blocks or to external boundary conditions calculated by tot_bnd), this is also where the periodic boundary conditions are initially set up. If periodic boundary conditions are set in (for instance) the x-direction, the blocks that are first in the x-direction are set to have as their left-most neighbor the blocks that are last in the x-direction, and vice versa. Thus, when the guard cell filling is performed, the periodic boundary conditions are automatically maintained.
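The wrap-around can be sketched as simple modular arithmetic on the top-level block indices. This is a conceptual sketch, not the divide_domain() code.

```python
# Neighbor assignment for top-level blocks in the x-direction: with
# periodic boundaries the indices wrap; otherwise the edge blocks get
# an external boundary (represented here by None).
def x_neighbors(i, nblockx, periodic=True):
    left, right = i - 1, i + 1
    if periodic:
        left %= nblockx
        right %= nblockx
    else:
        left = left if left >= 0 else None
        right = right if right < nblockx else None
    return left, right

print(x_neighbors(0, 4))          # -> (3, 1): wraps to the last block
print(x_neighbors(3, 4))          # -> (2, 0)
print(x_neighbors(0, 4, False))   # -> (None, 1): external boundary
```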
In the maintenance of the tree structure during refinement or derefinement, many small messages must be sent between processors. On any system with a non-negligible latency for sending messages, communications costs can be significantly reduced by batching these many small messages into fewer large messages.
The routines in batchsend.F90 and batchsend_dbl.F90 do simple message buffering. In several AMR routines, all blocks need to send a small number of pieces of data, along with a tag, to their neighbors, as well as receive some data. The processors they are to send to and receive from are known ahead of time. The routines b_int_sendrecv(), b_logical_sendrecv(), and b_dbl_sendrecv() take as input arrays containing the messages, tags, and processors to send to or receive from, and batch them so that as few messages as possible pass between processors.
Because of the amount of copying and memory allocation involved in the process, this buffering does have a cost, and thus under some circumstances may produce a performance loss rather than gain. Thus, the message buffering may be turned on or off with the logical runtime parameter msgbuffer, which is .false. by default.
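The batching idea can be sketched in Python. This is a conceptual model only; the real routines pack typed buffers for MPI rather than Python lists.

```python
# Sketch of message batching: instead of one send per (message, tag)
# pair, group everything bound for the same processor into one buffer,
# so only one message per destination crosses the network.
def batch_messages(msgs):
    """msgs: list of (dest_proc, tag, payload) triples. Returns one
    combined buffer per destination, preserving the original order."""
    batches = {}
    for dest, tag, payload in msgs:
        batches.setdefault(dest, []).append((tag, payload))
    return batches

msgs = [(1, 10, "a"), (2, 11, "b"), (1, 12, "c")]
b = batch_messages(msgs)
print(len(b))      # -> 2: three small messages became two buffers
```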
By default, FLASH uses a Cartesian geometry when discretizing the computational domain. The only other geometry currently available in FLASH is 2-d cylindrical (r,z) coordinates. This coordinate system is useful for problems that are axisymmetric. In FLASH, it is assumed that the cylindrical radial coordinate is in the `x'-direction, and the cylindrical z-coordinate is in the `y'-direction. To run FLASH with cylindrical coordinates, the igeomX runtime parameters must be set properly:
igeomx = 1
igeomy = 0
igeomz = 3

These parameters are interpreted by the hydrodynamics solvers and add the necessary geometrical factors to the divergence terms.
As discussed in the AMR section, to ensure conservation at a jump in refinement, a flux correction step is taken. Here we use the fluxes leaving the fine zones adjacent to a coarse zone to make a more accurate flux entering the coarse zone.
Figure 9 shows a jump in refinement along the cylindrical
`z' direction. When performing the flux correction step at a jump in
refinement, we must take into account the area of the annulus that
each flux passes through to do the proper weighting. We define the
cross-sectional area the z-flux passes through as
| A_i = π ( r_{i+1/2}^2 − r_{i−1/2}^2 ) | (12) |
and the corrected coarse-zone flux as the area-weighted average of the fine-zone fluxes,
| f_coarse = ( A_1 f_1 + A_2 f_2 ) / ( A_1 + A_2 ) | (13) |
For fluxes in the radial direction, the cross-sectional area is independent of the height, z, so the corrected flux is simply taken as the average of the flux densities in the adjacent finer zones.
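The area weighting described above can be sketched in a few lines (illustrative Python, not FLASH code; the function and argument names are assumptions):

```python
import math

def annulus_area(r_left, r_right):
    """Cross-sectional area an axial (z) flux passes through: pi*(r_r^2 - r_l^2)."""
    return math.pi * (r_right**2 - r_left**2)

def corrected_coarse_flux(fine_fluxes, fine_edges):
    """Area-weighted average of fine-zone flux densities replacing the coarse flux.

    fine_fluxes: flux densities of the fine zones abutting the coarse face
    fine_edges:  radial edges of those fine zones, e.g. [r0, r1, r2] for two zones
    """
    areas = [annulus_area(fine_edges[i], fine_edges[i + 1])
             for i in range(len(fine_fluxes))]
    return sum(a * f for a, f in zip(areas, fine_fluxes)) / sum(areas)
```

Note that when the fine-zone flux densities are equal, the weighting returns that common value, as it must.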
By default, FLASH will run a problem on an adaptive mesh, keeping the level of refinement of a block between lrefine_min and lrefine_max. Sometimes it is useful to run a problem with a uniform mesh. While there is no explicitly uniform mesh module distributed with FLASH, the paramesh module can be run in a uniform `mode', which will have only slightly more overhead than a purely uniform mesh module. The basic steps to set this up are outlined below.
A typical problem in FLASH is set up with a single block at the top of the tree. As the refinement criteria are applied to the initial conditions, this block and any children are refined to create the initial mesh. If you are running on a uniform grid, there is no need to carry around the entire tree hierarchy; only the leaf blocks are needed. To get around this, we can use the divide_domain functionality (see section ??) to create as many top-level blocks as are needed to satisfy our resolution needs. This is accomplished by using the nblockx, nblocky, and nblockz runtime parameters to specify how many blocks to create in each direction.
Since you are placing the same resolution everywhere in the domain, it is no longer advantageous to use small blocks. Instead, the number of zones in a block can be increased, which will reduce the memory overhead (ratio of guardcells to interior zones in a single block). At present, the only way to do this in FLASH is to modify physicaldata.fh, and increase the values of nxb, nyb, and nzb. Please note, some sections of the code assume that these quantities are even.
The number of computation zones in each direction can be computed as:
| N_x = nblockx · nxb,   N_y = nblocky · nyb,   N_z = nblockz · nzb | (14) |
Since there will be no refinement in the problem, the next step is to instruct the code to no longer check for refinement. This is accomplished by setting nref to a very large number. nref is the frequency (in time steps) to check the refinement criteria, and defaults to 2.
Finally, since there will be no jumps in refinement, the flux conservation step is not necessary. This can be eliminated by commenting out the FLUX preprocessor definition in hydro_sweep. This will instruct the code to skip over the conservation step.
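Collecting the steps above, a uniform-mesh run might use a runtime-parameter file along these lines (the values shown are illustrative; the block sizes nxb, nyb, and nzb are still set in physicaldata.fh):

```
nblockx     = 8
nblocky     = 8
nblockz     = 1
lrefine_min = 1
lrefine_max = 1
nref        = 10000000
```

With lrefine_min = lrefine_max = 1, all resolution comes from the number of top-level blocks and the zones per block.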
The hydro module solves Euler's equations for compressible gas dynamics
in one, two, or three spatial dimensions. These equations can be
written
in conservative form as
| ∂ρ/∂t + ∇·(ρv) = 0 |
| ∂(ρv)/∂t + ∇·(ρvv) + ∇P = ρg | (18) |
| ∂(ρE)/∂t + ∇·[ (ρE + P) v ] = ρ v·g | (19) |
In regions where the kinetic energy greatly dominates the
total energy, computing the internal energy using
| ε = E − (1/2)|v|^2 | (20) |
can lead to unphysical values, primarily due to truncation error. For this reason, FLASH can also evolve the internal energy separately according to
| ∂(ρε)/∂t + ∇·[ (ρε + P) v ] − v·∇P = 0 | (21) |
If the internal energy falls below eint_switch times the kinetic energy, the value obtained from equation (21) is used to update the pressure; otherwise the difference in equation (20) is used.
For reactive flows, a separate advection equation must be solved
for each chemical or nuclear species:
| ∂(ρ X_l)/∂t + ∇·(ρ X_l v) = 0 | (22) |
All hydrodynamic modules, as well as the MHD module described in Section 8, supply the runtime parameters and solution variables described in Tables 6 and 7. Two hydrodynamic modules are included. The first, discussed in Section 7.1, is based on the directionally split Piecewise-Parabolic Method (PPM) and makes use of second-order Strang time splitting. The second, discussed in 7.2, is based on Kurganov methods and can make use of Strang splitting or Runge-Kutta time advancement. Explicit, directionally split solvers like the PPM solver make use of the additional runtime parameter described in Table 8.
Variable | Type | Default | Description |
eint_switch | real | 10^-4 | If ε < eint_switch · (1/2)|v|^2, use the internal energy equation to update the pressure |
irenorm | integer | 0 | If equal to one, renormalize multifluid abundances following a hydro update; else restrict their values to lie between smallx and 1. |
Variable | Attributes | Description |
dens | ADVECT NORENORM CONSERVE | density |
velx | ADVECT NORENORM NOCONSERVE | x-component of velocity |
vely | ADVECT NORENORM NOCONSERVE | y-component of velocity |
velz | ADVECT NORENORM NOCONSERVE | z-component of velocity |
pres | ADVECT NORENORM NOCONSERVE | pressure |
ener | ADVECT NORENORM NOCONSERVE | specific total energy (T+U) |
temp | ADVECT NORENORM NOCONSERVE | temperature |
Variable | Type | Default | Description |
cfl | real | 0.8 | Courant-Friedrichs-Lewy (CFL) factor; must be < 1 for stability in explicit schemes |
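The interplay between the total energy, the separately evolved internal energy, and eint_switch can be sketched as follows (illustrative Python; a paraphrase of the selection logic, not FLASH's actual code):

```python
def select_internal_energy(etot, ekin, eint_evolved, eint_switch=1e-4):
    """Choose between subtraction and the separately evolved internal energy.

    When the internal energy obtained by subtraction (etot - ekin) is smaller
    than eint_switch times the kinetic energy, truncation error dominates the
    difference, so the separately evolved internal energy is used instead.
    """
    eint_subtracted = etot - ekin
    if eint_subtracted < eint_switch * ekin:
        return eint_evolved
    return eint_subtracted
```

In a real update the chosen internal energy would then be passed to the EOS to obtain the pressure.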
FLASH includes a directionally split Piecewise-Parabolic Method (PPM) solver descended from the PROMETHEUS code (Fryxell, Müller, and Arnett 1989). PPM is described in detail in Woodward and Colella (1984) and Colella and Woodward (1984). It is a higher-order version of the method developed by Godunov (1959). Godunov's method uses a finite-volume spatial discretization of the Euler equations together with an explicit forward time difference. Time-advanced fluxes at cell boundaries are computed using the analytic solution to Riemann's shock tube problem at each boundary. Initial conditions for each Riemann problem are determined by assuming the nonadvanced solution to be piecewise constant in each cell. Using the Riemann solution has the effect of introducing explicit nonlinearity into the difference equations and permits the calculation of sharp shock fronts and contact discontinuities without introducing significant nonphysical oscillations into the flow. Since the value of each variable in each cell is assumed to be constant, Godunov's method is limited to first-order accuracy in both space and time.
PPM improves on Godunov's method by representing the flow variables with piecewise parabolic functions. It also uses a monotonicity constraint rather than artificial viscosity to control oscillations near discontinuities, a feature shared with the MUSCL scheme of van Leer (1979). Although this could lead to a method which is accurate to third order, PPM is formally accurate only to second order in both space and time, as a fully third-order scheme proved not to be cost-effective. Nevertheless, PPM is considerably more accurate and efficient than most formally second-order algorithms.
PPM is particularly well-suited to flows involving discontinuities, such as shocks and contact discontinuities. The method also performs extremely well for smooth flows, although other schemes which do not perform the extra work necessary for the treatment of discontinuities might be more efficient in these cases. The high resolution and accuracy of PPM are obtained by the explicit nonlinearity of the scheme and through the use of intelligent dissipation algorithms, such as monotonicity enforcement, contact steepening, and interpolant flattening. These algorithms are described in detail by Colella and Woodward (1984).
A complete description of PPM is beyond the scope of this user's guide. However, for comparison with other codes, we note that the implementation of PPM in FLASH 2.x uses the direct Eulerian formulation of PPM and the technique for allowing nonideal equations of state described by Colella and Glaz (1985). For multidimensional problems, FLASH 2.x uses second-order operator splitting (Strang 1968).
The hydro/explicit/split/ppm module supplies the runtime parameters described in Table 9.
Variable | Type | Default | Description |
epsiln | real | 0.33 | PPM shock detection parameter e |
omg1 | real | 0.75 | PPM dissipation parameter w1 |
omg2 | real | 10 | PPM dissipation parameter w2 |
igodu | integer | 0 | If set to 1, use the Godunov method (completely flatten all interpolants) |
vgrid | real | 0 | Scale factor for dissipative grid velocity |
nriem | integer | 10 | Max number of iterations to use in Riemann solver |
rieman_tol | real | 10^-5 | Convergence factor for Riemann solver |
cvisc | real | 0.1 | Artificial viscosity constant |
oldvisc | boolean | .true. | If .true., use the original PROMETHEUS artificial viscosity; else use a newer one developed by Müller and Plewa |
Any of several diffusive processes can be added to the Euler equations in the PPM module. All of these are treated explicitly in FLASH, and follow the same approach: a diffusive flux is calculated by assuming that the diffusive flux of a quantity is proportional to the gradient of the quantity. The gradient is calculated by finite difference. The fluxes are then calculated and added to the fluxes generated by the PPM module. This addition is done before any of the zones are updated in the hydro step. This ensures conservation, since the total flux (including the diffusive flux) will be corrected during the flux conservation step.
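The flux construction described above amounts to a finite-difference gradient multiplied by a transport coefficient. A minimal sketch (illustrative Python; the names are assumptions, not FLASH internals):

```python
import numpy as np

def diffusive_face_flux(q, sigma, dx):
    """Diffusive flux at zone faces: F = -sigma * dq/dx, by finite difference.

    q:     zone-centered quantity (e.g. temperature), length n
    sigma: transport coefficient (scalar, or length n-1 array at faces)
    Returns the n-1 face fluxes, to be added to the hydro fluxes before the
    zones are updated, so the flux conservation step also corrects them.
    """
    grad = (q[1:] - q[:-1]) / dx   # gradient at each interior face
    return -sigma * grad
```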
To include a diffusive process, you must modify your Modules
file to use
source/hydro/explicit/split/ppm/diffuse. Then the
logical runtime parameters
diffuse_therm, diffuse_visc, and
diffuse_species should be set to .TRUE. or .FALSE.
depending on whether you wish to include these diffusive terms or not
in your simulation.
The energy equation in the PPM module can be modified to include
thermal diffusion:
| (23) |
| (24) |
There are several conductivity modules available for use with this
routine in
source/materials/conductivity, and one of these must be
included in your Modules file. conductivity/stellar uses a
conductivity appropriate for the degenerate matter of stellar interiors.
In conductivity/constant the heat conductivity is assumed constant: σ is set equal to the runtime parameter conductivity_constant.
In conductivity/constant-diff, the thermal diffusivity (λ = σ/(ρ c_P)) is kept equal to the runtime parameter
diff_constant. This is equivalent to diffusing temperature directly,
e.g.
| ∂T/∂t = ∇·( λ ∇T ) | (25) |
With viscosity, it is velocity which is diffused:
| (26) |
The species diffusion diffuses number density of species. Thus,
there are ionmax fluxes updated, and they are controlled by
an assumed constant diffusivity diff_spec_D.
| ∂n_l/∂t = ∇·( D ∇n_l ) | (27) |
The two Kurganov schemes are implemented in the kurganov hydro module, which is compatible with all the new driver modules. The module is organized as follows. The subroutine hydro_3d is essentially a wrapper to the subroutines kurganov_block_x, kurganov_block_y, and kurganov_block_z. These three subroutines implement the reconstruction step for each of the spatial dimensions, and call kurganov_line, which calculates either the KT or KNP numerical fluxes.
The new hydro module provides two of Kurganov's high-resolution central schemes, each second-order accurate. These spatial discretization methods for the advective terms provide robust shock-capturing without characteristic decompositions, which makes them relatively inexpensive compared to other shock-capturing schemes. However, they may have a lower critical time step for stability and higher dissipation as a result.
Kurganov and his collaborators have developed a class of numerical methods for hyperbolic conservation laws. Compared to other methods for such systems, the Kurganov methods are simple and inexpensive because they do not rely on characteristic decompositions or Riemann solvers; the only information they require from the equations is the maximum and minimum signal propagation speeds. The Kurganov methods evolved from methods developed by Tadmor and his colleagues, but differ in that staggered grids are not used in the implementation, only as a device in the derivation of the schemes.
There are two parts to each Kurganov method, (i) reconstruction of the conserved variables, which provides cell interface values, and (ii) computation of the interface flux from those interface values. Various second- and third-order reconstruction algorithms have been developed; one-dimensional reconstructions can be extended dimension-by-dimension, but some multidimensional third-order reconstructions have been proposed. Two formulas are available for computing the fluxes from the interface values. For the first numerical flux, several wave speeds were represented by a single estimate, the maximum magnitude of the eigenvalues of the flux Jacobian (Kurganov and Tadmor 2000). Later two estimates (maximum and minimum eigenvalues) were used, resulting in an improved numerical flux with reduced numerical dissipation (Kurganov, Noelle, and Petrova 2001).
Next, the second-order reconstruction used in the new hydro module is described. The reconstruction is one-dimensional and is presented for an equispaced mesh; an arbitrary mesh cell is referred to by the subscript i. The reconstruction uses the cell-averaged conserved variables U_i at nearby cells to produce values of the conserved variables at the left and right sides of each cell interface, U^l_{i+1/2} and U^r_{i+1/2}, respectively. To update a given cell, fluxes at two interfaces must be computed, so the reconstruction presented here requires a five-point stencil.
The first step is to compute limited slopes, (Ux)i:
| (U_x)_i = minmod( θ (U_i − U_{i−1}), (U_{i+1} − U_{i−1})/2, θ (U_{i+1} − U_i) ),   1 ≤ θ ≤ 2 | (28) |
Once the slopes have been determined, the interface values are calculated by
| U^l_{i+1/2} = U_i + (U_x)_i / 2,   U^r_{i+1/2} = U_{i+1} − (U_x)_{i+1} / 2 |
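The reconstruction step can be sketched as follows (illustrative Python for a scalar variable; slopes are expressed as per-cell differences, so the mesh-spacing factors cancel):

```python
def minmod(*args):
    """Return the argument smallest in magnitude if all share a sign, else 0."""
    if all(a > 0 for a in args):
        return min(args)
    if all(a < 0 for a in args):
        return max(args)
    return 0.0

def limited_slope(um, u0, up, theta=2.0):
    """Theta-limited slope; theta=1 gives minmod, theta=2 monotonized centered."""
    return minmod(theta * (u0 - um), 0.5 * (up - um), theta * (up - u0))

def interface_values(u, i, theta=2.0):
    """Left/right states at interface i+1/2 from piecewise-linear reconstruction."""
    sl = limited_slope(u[i - 1], u[i], u[i + 1], theta)
    sr = limited_slope(u[i], u[i + 1], u[i + 2], theta)
    u_left = u[i] + 0.5 * sl        # left side of interface i+1/2
    u_right = u[i + 1] - 0.5 * sr   # right side of interface i+1/2
    return u_left, u_right
```

On smooth monotone data the two states agree; at a local extremum the limiter returns a zero slope, which is what suppresses spurious oscillations.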
The Kurganov-Noelle-Petrova (KNP) numerical flux is defined by
| H_{i+1/2} = [ a^+ F(U^l_{i+1/2}) − a^− F(U^r_{i+1/2}) ] / ( a^+ − a^− ) + [ a^+ a^− / ( a^+ − a^− ) ] ( U^r_{i+1/2} − U^l_{i+1/2} ) | (31) |
where the one-sided local speeds are
| a^+ = max( λ_max(U^l_{i+1/2}), λ_max(U^r_{i+1/2}), 0 ),   a^− = min( λ_min(U^l_{i+1/2}), λ_min(U^r_{i+1/2}), 0 ) |
The Kurganov-Tadmor (KT) numerical flux is
| H_{i+1/2} = (1/2) [ F(U^l_{i+1/2}) + F(U^r_{i+1/2}) ] − ( a_{i+1/2} / 2 ) ( U^r_{i+1/2} − U^l_{i+1/2} ) | (35) |
| a_{i+1/2} = max( max_k |λ_k(U^l_{i+1/2})|, max_k |λ_k(U^r_{i+1/2})| ) | (36) |
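Both flux formulas are short enough to state directly (illustrative Python for a scalar conservation law; the signal-speed arguments are assumed to be supplied by the caller). When the two one-sided speeds are symmetric, the KNP formula reduces to the KT formula:

```python
def knp_flux(f, ul, ur, a_plus, a_minus):
    """Kurganov-Noelle-Petrova numerical flux at an interface.

    f:       flux function of the conservation law
    ul, ur:  reconstructed left/right interface states
    a_plus:  max(local max signal speed, 0); a_minus: min(local min speed, 0)
    """
    if a_plus == a_minus:          # degenerate case: no wave spread
        return f(ul)
    return ((a_plus * f(ul) - a_minus * f(ur)) / (a_plus - a_minus)
            + (a_plus * a_minus) / (a_plus - a_minus) * (ur - ul))

def kt_flux(f, ul, ur, a):
    """Kurganov-Tadmor flux: a single local speed estimate a = max |lambda|."""
    return 0.5 * (f(ul) + f(ur)) - 0.5 * a * (ur - ul)
```

Using two speed estimates instead of one is what reduces the numerical dissipation of KNP relative to KT.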
The hydro_3d subroutine was written as generally as possible - with minimal modification, it can be used as a wrapper for most shock-capturing schemes which compute numerical fluxes. It accepts, as an argument, the spatial direction - x, y, z, or all - for which the fluxes should be computed. It can therefore be used with time advancement methods based on directional splitting. In order, the tasks handled by hydro_3d include:
The kurganov module accepts two runtime parameters, listed in table 10. The integer knp selects which numerical flux formula is to be used, KNP or KT. The real-valued lim_theta is q in eq. (28); lower values result in more damping, higher values in less, within the range listed in the table. The runtime parameter cfl is common to all hydro modules; table 10 lists only those specific to the Kurganov schemes. Because of the reconstruction, a five-point stencil is required at each mesh cell; consequently, two guard cells are required for a block. Corner guard cells are not required since the reconstruction is one-dimensional.
Para. name | Type | Default | Description |
knp | integer | 1 | Specifies KNP numerical flux is to be used; for KT, set knp=0 |
lim_theta | real | 2.0 | Adjusts slope limiter; 1.0 ≤ lim_theta ≤ 2.0; choose 1.0 for minmod limiter, 2.0 for monotonized centered limiter |
7.2.2.1 Interaction with delta and state-vector driver modules
New driver, formulation, and hydro modules have been implemented in a manner which maximizes flexibility. In this section the interaction between these three module classes is explained. More generally, though, the new hydro module represents all physics modules; it is an example of how other modules can be written so they can be used with the new driver and formulation modules. To use the new methods, the user should specify their inclusion through the Modules file, as described below.
The new hydro module, kurganov, can be used with drivers written in either delta or state-vector formulations. All physics modules can be written with this feature, and the kurganov module can be used as a guide for doing so.
On each block, the hydro module does the following. The contribution of the module, Lhydro(U), is computed. If a delta formulation time advancement has been specified, then Lhydro(U) is added to the global DU. If a state-vector formulation time advancement is being used, calls to update the solution on the block from Lhydro(U) are made. In both cases, the formulation module provides the subroutines for these actions. The line for the Modules file to specify the Kurganov scheme is:
/hydro/explicit/delta_form/kurganov

The directory names for some of the new modules are misleading. All the new time advancements are in a directory named delta_form, regardless of their formulation; this directory will be renamed multi_form in the near future. Similarly, /hydro/explicit/delta_form will be renamed /hydro/explicit/multi_form.
At present, the compatibility of the new modules with the rest of the FLASH code is limited. The new modules are incompatible with the default hydro module, which actually implements parts of the Strang splitting time advancement in addition to the PPM spatial discretization. The time advancements implemented in the delta formulation are not compatible with physics modules which update the variables, and those implemented in the state-vector formulation have not been tested with the other physics modules; these deficiencies are being addressed. Currently only Cartesian coordinates are supported by the new hydro module. The new modules have been extensively tested only for a nonreacting, single component gas (ionmax=1) using the gamma-law equation of state.
The magnetohydrodynamics module included with the FLASH code solves the equations of ideal MHD. As discussed in 8.1, the MHD module replaces the hydrodynamics module in simulations of magnetized fluids. The two modules are conceptually very similar, and they share the same algorithmic structure. In the current version of the FLASH code, the MHD module is a submodule of the hydrodynamics module. This hierarchy will change in future releases of the code, and hydro and MHD modules will be placed at the same level in the source module tree.
The currently released version of the MHD module uses directional splitting to evolve the equations of ideal magnetohydrodynamics. As the hydro module does, the MHD module (mhd.F90, mhd_3d.F90) makes three sweeps (functions mhd_x(), mhd_y() and mhd_z()) to advance physical variables from one time level to another one. In each sweep, the module uses AMR functionality to fill in guard cells and impose boundary conditions. Then it reconstructs characteristic variables (mhd_interpolate()) and uses these variables to compute time-averaged interface fluxes of conserved quantities. In order to enforce conservation at jumps in refinement the module calls amr_flux_conserve(), which redistributes affected fluxes using appropriate geometric area factors. Finally, the module updates the solution and calls eos3d() to ensure that the solution is thermodynamically consistent.
After all sweeps are completed the MHD module enforces magnetic field divergence cleaning. Two options are available: diffusive and elliptic projection cleaning. In order to select a particular method, the user must respectively specify either mhd/divb_diffuse or mhd/divb_project in his or her problem Config file. The default method is diffusive.
The interface of the MHD module is minimal. The module honors most of the hydro module interface functions and shares several common parameters. The most significant of them is the cfl number. At the time of this writing the module introduces only two MHD-specific runtime parameters:
The MHD module in the FLASH 2.0 code solves the equations of compressible magnetohydrodynamics in one, two, and three dimensions. Written in non-dimensional conservation form (hence without 4π or μ0 coefficients), these equations are
|
| (41) |
| (42) |
The above equations of ideal magnetohydrodynamics are solved using a high-resolution, finite-volume numerical scheme with MUSCL-type (van Leer 1979) limited gradient reconstruction. In order to maximize the accuracy of the solver the reconstruction procedure is applied to characteristic variables. Since this may cause certain variables such as density and pressure to fall out of physically meaningful bounds, extra care is taken in the limiting step to prevent this from happening. All other variables are calculated in the module from the interpolated characteristic variables.
In order to resolve discontinuous Riemann problems that occur at computational cell interfaces, the code employs a Roe-type solver derived in Powell et al. (1999). This solver provides full characteristic decomposition of the ideal MHD equations, and is therefore particularly useful for plasma flow simulations that feature complex wave interaction patterns. The time integration in the MHD module is done using a second-order, one-step method due to Hancock (Toro 1997). For linear systems with unlimited gradient reconstruction this method can be shown to coincide with the classic Lax-Wendroff scheme.
A difficulty particularly associated with solving the MHD equations numerically lies in the solenoidality of the magnetic field. The notorious ∇·B = 0 condition, a strict physical law, is very hard to satisfy in discrete computations. Being only an initial condition of the MHD equations, it enters the equations indirectly and is therefore not guaranteed to be satisfied in general unless special algorithmic provisions are made. Without discussing this issue in much detail, which goes well beyond the scope of this user's guide (for example, see Tóth (2000) and references therein), we note that there are three commonly accepted methods to enforce the ∇·B = 0 condition: the elliptic projection method (Brackbill and Barnes 1980), the constrained transport method (Evans and Hawley 1988), and the truncation-level error method (Powell et al. 1999). In the FLASH 2.0 code, the truncation-error and elliptic cleaning methods are implemented.
In the truncation-error method, the solenoidality of the magnetic field is enforced by including several terms proportional to ∇·B. This removes the effects of unphysical magnetic tension forces parallel to the field and stimulates passive advection of magnetic monopoles if these are spuriously created. In many applications, this method has been shown to be an efficient and sufficient way to generate solutions of high physical quality. However, it has also been shown (Tóth 2000) that this method can sometimes, for example in strongly discontinuous and stagnated flows, lead to an accumulation of magnetic monopoles whose strength is sufficient to corrupt the physical quality of computed solutions. In order to eliminate this deficiency, the FLASH 2.0 code also uses a simple yet very effective method originally due to Marder (1987) to destroy the magnetic monopoles on the scale on which they are generated. In this method, a diffusive operator proportional to ∇(∇·B) is added to the induction equation, so that the equation becomes
| ∂B/∂t = ∇×( v × B ) + η ∇( ∇·B ) | (43) |
In order to eliminate magnetic monopoles completely, the FLASH 2.0 code includes an elliptic projection method. In this method, the unphysical divergence of magnetic field can be removed to any desired level down to machine precision. This is achieved by solving a Poisson equation for a correcting scalar field whose gradient removes contaminated field components when subtracted from the magnetic field. The Poisson solver needed for this operation is the multigrid solver that is also used by the gravity module.
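The projection idea can be sketched on a 2-d periodic grid (illustrative Python using a spectral Poisson solve as a compact stand-in for FLASH's multigrid solver; not FLASH code):

```python
import numpy as np

def project_divergence_free(bx, by, dx):
    """Remove div(B) by elliptic projection on a 2-d periodic grid.

    Solves  laplacian(phi) = div(B)  spectrally, then subtracts grad(phi)
    from B, leaving the (spectral) divergence at machine precision.
    """
    n = bx.shape[0]
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=dx)
    kx, ky = np.meshgrid(k, k, indexing="ij")
    k2 = kx**2 + ky**2
    k2[0, 0] = 1.0                      # avoid dividing by zero for the mean mode

    bxh, byh = np.fft.fft2(bx), np.fft.fft2(by)
    div_h = 1j * kx * bxh + 1j * ky * byh
    phi_h = div_h / (-k2)               # -k^2 * phi_h = div_h
    phi_h[0, 0] = 0.0

    bx_new = np.real(np.fft.ifft2(bxh - 1j * kx * phi_h))
    by_new = np.real(np.fft.ifft2(byh - 1j * ky * phi_h))
    return bx_new, by_new
```

Subtracting the gradient of the correcting scalar field removes exactly the contaminated (curl-free) part of B while leaving the physical (divergence-free) part untouched.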
FLASH has the ability to track multiple fluids, each of which can have its own properties. The materials module handles these fluids, together with related physics such as equations of state, composition, and conductivities.
To access any of the fluid properties, you must use the multifluid database. This can be accomplished in any FLASH routine by including the line:
use multifluid_database

along with any of the other modules you have included. This module provides interface functions that can be used to set or query a fluid's properties. As with the other databases in FLASH, most of the properties have both a string name and an integer key that can be used in the database call. Calling the function with the integer key will be faster, since it avoids expensive string comparisons. The available properties are listed in table 11.
name | integer key | property | data type |
``name" | N/A | fluid name | string |
``short name" | N/A | chemical symbol | string |
``num total" | mf_prop_A | A | real |
``num positive" | mf_prop_Z | Z | real |
``num neutral" | mf_prop_N | N | real |
``num negative" | mf_prop_E | E | real |
``binding energy" | mf_prop_Eb | binding energy | real |
``adiabatic index" | mf_prop_gamma | gamma | real |
Once the multifluid database is initialized (usually by the materials/composition module function init_materials()), the integer nfluids is publicly available, giving the number of fluids carried by FLASH.
An example of using the multifluid database to define two fluids to be tracked by FLASH is provided by the example setup, discussed in 2.2.
We now briefly discuss the various interfaces to the multifluid database. Many of these functions are overloaded to accept either string or integer properties (as listed in the table above), or to include optional arguments. We only discuss the generic interface here.
A quick way to set a number of properties for an individual fluid in a single subroutine call. Looks for the next uninitialized fluid ( init_mfluid_db() sets the names of all fluids to UNINITIALIZED) and sets its properties according to the values specified in the subroutine call. Properties can be specified in the order A, Z, N, E, ... or by keyword. For example,
call add_fluid_to_db ("helium", "He", A=4., Z=2., N=2.)

Properties not specifically initialized are set to 0. The status parameter is an optional status variable.
This function call is usually used when initializing the fluids in FLASH. Each composition sets the properties for all the fluids in the routine init_materials().
Set the property p of fluid f to the value v. The fluid f can be specified either by its string name or by its index. p is either a string identifier or an integer key specifying the property. The value v can be real-valued or string-valued, depending on the property being modified. status is an optional variable which will be set to 0 if the operation was successful, -1 if not. Reasons why the operation can fail include: f out of bounds (if f is an index); f not found (if f is a string name); p is not a valid property identifier; v is not a valid value for the given property.
Like the setting version, but v now receives the value instead of setting it.
Return the value of the property p for all defined fluids. v must be an array of the correct type. status is an optional exit status variable. v is filled with values up to the minimum of (its size, number of fluids).
Find the database index i of a fluid named f. Errors, for example if init_mfluid_db() has not been called or if the fluid name is not found, are signaled by setting i to MFLUID_STATUS_FAIL.
Given a property name p and an array of weights w, return the weighted sum of the chosen property in v. w should be an array of length equal to the number of fluids in the database, or else a one-element array. Typically, the weights used are the mass fractions of each of the fluids in the database. If it is neither, or if the named property is invalid, the routine terminates. The optional status variable is set to MFLUID_STATUS_OK or MFLUID_STATUS_FAIL depending on whether the summing operation was successful.
Same as query_mfluid_sum(), but compute the weighted sum of the inverse of the chosen property.
For example, the average atomic mass of a collection of fluids is typically
defined as:
| 1/Ā = Σ_i X_i / A_i | (44) |
call query_mfluid_suminv(mf_prop_A, xn(:), abarinv, error)
abar = 1.e0 / abarinv

where xn(:) is an array of the mass fractions for each species in FLASH.
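A Python analogue of this weighted inverse sum (illustrative only, not the FLASH interface):

```python
def weighted_suminv(prop_values, weights):
    """Weighted sum of the inverse of a property: sum_i w_i / p_i.

    With mass fractions as weights and A as the property, the result
    is 1/abar, the inverse of the mean atomic mass.
    """
    return sum(w / p for p, w in zip(prop_values, weights))

# Half helium-4, half carbon-12 by mass:
abar = 1.0 / weighted_suminv([4.0, 12.0], [0.5, 0.5])
```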
Same as query_mfluid_sum(), but compute the weighted sum of the chosen property divided by the total number of particles (A).
Same as query_mfluid_sum(), but compute the weighted sum of the square of the chosen property.
Initialize the multifluid database. If this has not been called and one of the other routines is called, that routine terminates with an error. The typical FLASH user will never need to call this routine directly, as this call is part of the init_materials() call in the composition submodule.
List the contents of the multifluid database in a snappy table format. Output goes to the logical I/O unit indicated by the lun parameter.
The eos module implements the equation of state needed by the hydrodynamical and nuclear burning solvers. Interfaces are provided to operate on an entire block (eos3d), a one-dimensional vector (eos1d), or for a single zone (eos_fcn). Additionally, these functions can be used to find the thermodynamic quantities from either the density, temperature, and composition, or density, internal energy, and composition.
Three sub-modules are available in FLASH 2.0: gamma, which implements a perfect-gas equation of state; multigamma, which implements a perfect-gas equation of state with multiple fluids, each of which can have its own adiabatic index (γ); and helmholtz, which uses a fast Helmholtz free-energy table interpolation to handle degenerate/relativistic electrons and positrons, and includes radiation pressure and ions (via the perfect gas approximation). Full details of this equation of state are provided in Timmes & Swesty (1999).
As described above, FLASH evolves the Euler equations for compressible, inviscid flow. This system of equations must be closed by an additional equation that provides a relation between the thermodynamic quantities of the gas. This is known as the equation of state for the material, and its structure and properties depend on the composition of the gas.
It is common to call an equation of state (henceforth EOS) routine more than 10^9 times during the course of a simulation when calculating two- and three-dimensional hydrodynamic models of stellar phenomena. Thus, it is very desirable to have an EOS that is as efficient as possible, yet accurately represents the relevant physics, and considerable work can go into development of a robust and efficient EOS. While FLASH is capable of general equations of state, here we discuss the routines for three equations of state that are supplied with FLASH: an ideal-gas or gamma-law EOS, an EOS for a fluid composed of multiple gamma-law gases, and a tabular Helmholtz free energy EOS appropriate for stellar interiors. The gamma-law EOS consists of simple analytic expressions that make for a very fast EOS routine in both the case of a single gas or a mixture of gases. The Helmholtz EOS includes much more physics and relies on a table look-up scheme for performance. In this section we discuss the physics of these equations of state; the interfaces between the EOS routines and the codes are discussed in 9.2.
FLASH uses the method of Colella & Glaz (1985) to handle general
equations of state.
General equations of state contain 4 adiabatic indices (Chandrasekhar 1939), but
the method of Colella & Glaz parameterizes the EOS and requires only two of
the adiabatic indices. The first is necessary to calculate the adiabatic sound speed and
is given by
| γ_1 = (ρ/P) (∂P/∂ρ)_s | (45) |
The second relates the pressure to the internal energy:
| γ_4 = 1 + P/(ρε) | (46) |
The gamma-law EOS models a simple ideal gas with a constant
adiabatic index g. Here we have dropped the subscript on g
because for an ideal gas all adiabatic indices are equal.
The relationship among the pressure P, the density ρ, and the
specific internal energy ε is
| P = (γ − 1) ρ ε | (47) |
| (48) |
| (49) |
| (50) |
| (51) |
We note that the analytic expressions apply to both the forward (internal energy as a function of density, temperature, and composition) and backward (temperature as a function of density, internal energy and composition) relations. Because the backward relation requires no iteration in order to obtain the temperature, this EOS is quite inexpensive to evaluate. Despite its performance, use of the gamma-law EOS is limited due to its restricted range of applicability for astrophysical flash problems.
The Helmholtz EOS provided with the FLASH distribution contains more
physics and is appropriate for addressing astrophysical phenomena
in which electrons and positrons may be relativistic and/or degenerate
and in which radiation may significantly contribute to the thermodynamic
state. This EOS includes contributions from radiation,
completely ionized nuclei, and degenerate/relativistic electrons and positrons.
The pressure and internal energy are calculated as the sum over the components
| (52) |
| (53) |
The blackbody pressure and energy are calculated as
| (54) |
| (55) |
| (56) |
| (57) |
| (58) |
| (59) |
| (60) |
The above formalism requires many complex calculations to evaluate the thermodynamic quantities, and routines for these calculations typically are designed for accuracy and thermodynamic consistency at the expense of speed. The Helmholtz EOS in FLASH provides a table of the Helmholtz free energy (hence the name) and makes use of a thermodynamically consistent interpolation scheme to calculate the thermodynamic quantities, obviating the need to perform the complex calculations required by the above formalism during the course of a simulation. The interpolation scheme uses a bi-quintic Hermite interpolant, and the result is an accurate EOS that performs reasonably well.
| (61) |
| (62) |
We note that the Helmholtz free energy table is constructed only for the electron-positron plasma: it is a two-dimensional function of density and temperature, i.e. F(ρ,T), and it is made with Ā = Z̄ = 1 (pure hydrogen). One reason for not including contributions from photons and ions in the table is that these components of the Helmholtz EOS are very simple (Equations 54-55), and one doesn't need fancy table look-up schemes to evaluate simple analytical functions. A more important reason for constructing an electron-positron EOS table with Ye = 1 is that the two-dimensional table is then valid for any composition. Separate planes for each Ye are not necessary (or desirable), since simple multiplication by Ye in the appropriate places gives the desired composition scaling. If photons and ions were included in the table, this valuable composition independence would be lost, and a three-dimensional table would be necessary.
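To illustrate the flavor of the table interpolation, here is a minimal one-dimensional quintic Hermite sketch. The actual Helmholtz EOS uses the bi-quintic analogue in density and temperature, with tabulated derivatives of F; the basis functions below are the standard quintic Hermite polynomials, and all names are ours, not FLASH routines. Matching the function value, slope, and curvature at both ends of an interval is what makes thermodynamic quantities derived from the interpolated free energy smooth and mutually consistent.

```python
# 1-D quintic Hermite interpolation sketch (illustrative, not FLASH code).
# Basis functions on [0, 1]; each carries one datum at the left node:
def _h0(z): return 1 - 10*z**3 + 15*z**4 - 6*z**5             # value f
def _h1(z): return z - 6*z**3 + 8*z**4 - 3*z**5               # slope f'
def _h2(z): return 0.5*z**2 - 1.5*z**3 + 1.5*z**4 - 0.5*z**5  # curvature f''

def quintic_hermite(x, a, b, f0, d0, s0, f1, d1, s1):
    """Interpolate on [a, b] given value, first, and second derivative
    at both endpoints (f0, d0, s0 at a; f1, d1, s1 at b)."""
    h = b - a
    z = (x - a) / h
    # Right-node basis functions follow from symmetry z -> 1 - z.
    return (f0*_h0(z) + h*d0*_h1(z) + h*h*s0*_h2(z)
          + f1*_h0(1-z) - h*d1*_h1(1-z) + h*h*s1*_h2(1-z))
```

Because six data determine a unique quintic, the interpolant reproduces any polynomial up to fifth degree exactly when supplied with exact endpoint derivatives.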
The Helmholtz EOS has been subjected to considerable analysis and testing (Timmes & Swesty 2000), and particular care was taken to reduce the numerical error introduced by the thermodynamical models below the formal accuracy of the hydrodynamics algorithm (Fryxell et al. 2000; Timmes & Swesty 2000). The physical limits of the Helmholtz EOS are 10⁻¹⁰ < ρ < 10¹¹ (g/cm³) and 10⁴ < T < 10¹¹ (K). As with the gamma-law EOS, the Helmholtz EOS provides both forward and backward relations. In the case of the forward relation (ρ and T given along with the composition), the table look-up scheme and analytic formulae directly provide the relevant thermodynamic quantities. In the case of the backward relation (ρ, ε, and composition given), the routine performs a Newton-Raphson iteration to determine the temperature.
9.2.2.1 The block interface, eos3d
After each update from the hydrodynamics or burning, it is necessary to update the pressure and temperature for the entire block. The eos3d function is optimized for updating all of the zones in a single block. This function always takes the internal energy, density, and temperature (the latter as an initial guess) as input and uses them to find the temperature, pressure, and adiabatic indices for this thermodynamic state. This information is obtained through database calls for all of the zones in the block. eos3d takes two arguments:
eos3d(iblock, iflag)

where iblock is the block number to operate on, and iflag specifies which region of the block to update. Setting iflag to 0 updates all of the interior zones (i.e. excludes guardcells). A value of 7 updates all the zones in a block (interior zones + guardcells). Values between 1 and 6 update the individual regions of guardcells (the upper and lower regions in the three coordinate directions).
For some equations of state, it is necessary to perform a Newton-Raphson iteration to find the temperature and pressure corresponding to the internal energy, density, and composition, because the equation of state is more naturally stated in terms of temperature and density. In these cases, eos3d performs the necessary root finding to a tolerance defined in the function (typically 1×10⁻⁸).
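The root-finding step can be sketched as follows. This is a hedged illustration of the backward relation, not the FLASH routine: the linear-in-T energy function is only a stand-in for a real EOS call, and all names are ours.

```python
# Newton-Raphson inversion of e(T) for temperature (illustrative sketch).
def energy_and_deriv(temp):
    """Stand-in EOS: e(T) and de/dT for a monatomic hydrogen ideal gas."""
    cv = 1.5 * 6.022e23 * 1.381e-16   # specific heat, erg/(g K)
    return cv * temp, cv

def invert_for_temp(e_target, t_guess, tol=1e-8, max_iter=50):
    """Given internal energy, iterate on T until e(T) matches, as an
    eos3d-style routine must do for a general equation of state."""
    t = t_guess
    for _ in range(max_iter):
        e, dedt = energy_and_deriv(t)
        dt = (e_target - e) / dedt        # Newton-Raphson update
        t += dt
        if abs(dt) < tol * t:             # relative tolerance
            return t
    raise RuntimeError("Newton-Raphson iteration failed to converge")
```

For a genuinely nonlinear EOS the derivative de/dT comes from the table interpolant, and a good initial guess (the temperature from the previous step) keeps the iteration count low.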
9.2.2.2 The vector interface, eos1d
An alternate interface to the equation of state is provided by eos1d. This function operates on a vector of input, taking density, composition, and either internal energy or temperature as input, and returning pressure, γc, and either the temperature or the internal energy (whichever was not used as input).
In eos1d, all the input is taken from the argument list:
eos1d (input, kbegin, kend, rho, tmp, p, ei, gamc, xn, q, qn)

Here, input is an integer flag that specifies whether the temperature (input = 1) or the internal energy (input = 2) complements the density and composition as input. Two other integers, kbegin and kend, specify the beginning and ending indices of the input vectors to operate on. The arrays rho, tmp, p, ei, and gamc are of length q and contain the density, temperature, pressure, internal energy, and γc, respectively. The array xn(q,qn) contains the composition (for qn fluids) for all of the input zones.
This interface is useful when initializing a problem. The user has direct control over where the input comes from and where it is ultimately stored, since everything is passed through the argument list. It is also more efficient than calling the equation of state routine on a point-by-point basis, since pipelining can be exploited for better cache performance.
9.2.2.3 The point interface, eos
The eos interface provides the most information and flexibility. No assumptions about the layout of the data are made. This function simply takes density, composition, and either temperature or internal energy as input, and returns a host of thermodynamic quantities. Most of the information provided here is not provided anywhere else, such as the electron pressure, degeneracy parameter, and thermodynamic derivatives. The interface is:
eos(dens, temp, pres, ener, xn, abar, zbar, dpt, dpd, det, ded, & gammac, pel, ne, eta, input)

The arguments dens, temp, pres, and ener are the density, temperature, pressure, and internal energy, respectively. xn is a vector containing the composition (the length of this vector is ionmax, supplied by the common module). abar and zbar are the average atomic mass and proton number, which are returned at the end of the call. Four thermodynamic derivatives are provided: pressure with respect to temperature (dpt) and density (dpd), and energy with respect to temperature (det) and density (ded). Finally, γc (gammac), the electron pressure (pel), the electron number density (ne), and the electron degeneracy parameter (eta) are also returned. The integer input specifies whether the temperature (input = 1) or the internal energy (input = 2) is used together with the density and composition as input.
There are very few runtime parameters used with these equations of state. The gamma-law EOS takes only one parameter, the value of γ used to relate the internal energy and pressure (see Table 12).
Variable | Type | Default | Description |
gamma | real | 1.6667 | Ratio of specific heats for the gas (γ) |
The helmholtz module also takes a single runtime parameter, controlling whether or not to apply Coulomb corrections. In some regions of the ρ-T plane, the approximations made in the Coulomb corrections may be invalid and result in negative pressures. When the parameter coulomb_mult is set to zero, the Coulomb corrections are not applied (see Table 13).
Variable | Type | Default | Description |
coulomb_mult | real | 1.0 | Multiplication factor for Coulomb corrections. |
The helmholtz EOS requires an input file, helm_table.dat, which contains the table for the electron contributions. This table is currently in ASCII for portability purposes. When the table is first read in, a binary version called helm_table.bdat is created. This binary file can be used for subsequent restarts on the same machine but may not be portable across platforms.
The composition module sets up the different compositions needed by FLASH. In general, there is one composition for each of the burners located in source/source_terms/burn/, as well as a generic fuel and ash composition. You will only need to write your own module if you wish to carry different numbers or types of fluids than any of the predefined modules provide. These modules set up the names of the fluids (both a long name, recognized by the main FLASH database, and a short name that can be queried through the multifluid database) as well as their general properties. You are required to use the composition module corresponding to the burn module in your problem.
The Config file in each composition directory specifies the number of fluids. The general syntax is:
NUMSPECIES 2

This example sets up two fluids. This file is read by setup and used to initialize the ionmax parameter in FLASH. This parameter is publicly available in the variable database, and can be used to initialize arrays to the number of fluids tracked by FLASH.
Each composition directory also contains a file named init_mat.F90 that sets the properties of each fluid. This routine is called at the start of program execution by init_flash. The general syntax of this file is:
subroutine init_materials

  use multifluid_database, ONLY: init_mfluid_db, &
                                 add_fluid_to_db, n_fluids
  use common, ONLY: ionmax

  implicit none

  call init_mfluid_db (ionmax)

  if (n_fluids == 2) then
     call add_fluid_to_db ("fuel", "f", A=1., Z=1., Eb=1.)
     call add_fluid_to_db ("ash",  "a", A=1., Z=1., Eb=1.)
  else
     call abort_flash ("init_mat: fuel+ash requires two fluids!")
  endif

  return
end

Here we initialize the multifluid database through the call to init_mfluid_db. The value of ionmax is supplied through the common database. Next, each fluid is added to the database through the calls to add_fluid_to_db, specifying the full name and short name, the atomic mass, the proton number, and the binding energy. The atomic mass and proton number are used in the equations of state and are accessed via multifluid database calls.
The example setup discussed in Subsection 16.1 demonstrates how to set up a problem with two fluids (using the fuel+ash module). The same accessor methods that are used to store the solution data are used to store the fluid abundances in each zone.
Below we summarize the different compositions.
longname | short name | mass | charge | binding energy |
helium-4 | He4 | 4. | 2. | 28.29603 |
carbon-12 | C12 | 12. | 6. | 92.16294 |
oxygen-16 | O16 | 16. | 8. | 127.62093 |
neon-20 | Ne20 | 20. | 10. | 160.64788 |
magnesium-24 | Mg24 | 24. | 12. | 198.25790 |
silicon-28 | Si28 | 28. | 14. | 236.53790 |
sulfur-32 | S32 | 32. | 16. | 271.78250 |
argon-36 | Ar36 | 36. | 18. | 306.72020 |
calcium-40 | Ca40 | 40. | 20. | 342.05680 |
titanium-44 | Ti44 | 44. | 22. | 375.47720 |
chromium-48 | Cr48 | 48. | 24. | 411.46900 |
iron-52 | Fe52 | 52. | 26. | 447.70800 |
nickel-56 | Ni56 | 56. | 28. | 484.00300 |
longname | short name | mass | charge | binding energy |
hydrogen-1 | H1 | 1. | 1. | 0.00000 |
helium-3 | He3 | 3. | 2. | 7.71819 |
helium-4 | He4 | 4. | 2. | 28.29603 |
carbon-12 | C12 | 12. | 6. | 92.16294 |
nitrogen-14 | N14 | 14. | 7. | 104.65998 |
oxygen-16 | O16 | 16. | 8. | 127.62093 |
neon-20 | Ne20 | 20. | 10. | 160.64788 |
magnesium-24 | Mg24 | 24. | 12. | 198.25790 |
silicon-28 | Si28 | 28. | 14. | 236.53790 |
sulfur-32 | S32 | 32. | 16. | 271.78250 |
argon-36 | Ar36 | 36. | 18. | 306.72020 |
calcium-40 | Ca40 | 40. | 20. | 342.05680 |
titanium-44 | Ti44 | 44. | 22. | 375.47720 |
chromium-48 | Cr48 | 48. | 24. | 411.46900 |
iron-52 | Fe52 | 52. | 26. | 447.70800 |
iron-54 | Fe54 | 54. | 26. | 471.76960 |
nickel-56 | Ni56 | 56. | 28. | 484.00300 |
neutrons | n | 1. | 0. | 0.00000 |
protons | p | 1. | 1. | 0.00000 |
longname | short name | mass | charge | binding energy |
fuel | f | 1. | 1. | 1.0 |
ash | a | 1. | 1. | 1.0 |
iso7 provides a very minimal alpha-chain composition, useful for problems that do not have enough memory to carry a larger set of isotopes. It is the complement to the iso7 reaction network.
longname | short name | mass | charge | binding energy |
helium-4 | He4 | 4. | 2. | 28.29603 |
carbon-12 | C12 | 12. | 6. | 92.16294 |
oxygen-16 | O16 | 16. | 8. | 127.62093 |
neon-20 | Ne20 | 20. | 10. | 160.64788 |
magnesium-24 | Mg24 | 24. | 12. | 198.25790 |
silicon-28 | Si28 | 28. | 14. | 236.53790 |
nickel-56 | Ni56 | 56. | 28. | 484.00300 |
ppcno is a composition group suitable for pp/CNO reactions. It is required by the ppcno reaction network.
longname | short name | mass | charge | binding energy |
hydrogen-1 | H1 | 1. | 1. | 0.00000 |
hydrogen-2 | H2 | 2. | 1. | 2.22500 |
helium-3 | He3 | 3. | 2. | 7.71819 |
helium-4 | He4 | 4. | 2. | 28.29603 |
lithium-7 | Li7 | 7. | 3. | 39.24400 |
beryllium-7 | Be7 | 7. | 4. | 37.60000 |
boron-8 | B8 | 8. | 5. | 37.73800 |
carbon-12 | C12 | 12. | 6. | 92.16294 |
carbon-13 | C13 | 13. | 6. | 97.10880 |
nitrogen-13 | N13 | 13. | 7. | 94.10640 |
nitrogen-14 | N14 | 14. | 7. | 104.65998 |
nitrogen-15 | N15 | 15. | 7. | 115.49320 |
oxygen-15 | O15 | 15. | 8. | 111.95580 |
oxygen-16 | O16 | 16. | 8. | 127.62093 |
oxygen-17 | O17 | 17. | 8. | 131.76360 |
oxygen-18 | O18 | 18. | 8. | 139.80800 |
fluorine-17 | F17 | 17. | 9. | 128.22120 |
fluorine-18 | F18 | 18. | 9. | 137.37060 |
fluorine-19 | F19 | 19. | 9. | 147.80200 |
Internal energy may be transported from warm regions into colder material by collisional and radiative processes. At large densities and cold temperatures, thermal transport by conduction dominates over the radiative processes. At small densities and hot temperatures, radiative processes dominate the transport of thermal energy. At intermediate densities and temperatures, both conductive and radiative processes contribute. As such, both radiative and conductive transport processes need to be considered.
FLASH 2.0 provides one module for computing the opacity of stellar material (Timmes 2000; Timmes & Brown 2002). This module uses analytic fits from Iben (1975) and Christy (1966) for the radiative opacity when all processes other than electron scattering are considered. An approximation formula from Weaver et al. (1978) for the Compton opacity, which includes a cutoff for frequencies less than the plasma frequency, is then added to form the total radiative opacity. Analytic fits from Iben (1975) are used for the thermal conductivity in the non-degenerate regime. In the degenerate regime, the thermal conductivity formalism of Yakovlev & Urpin (1980) is used. A smooth and continuous interpolation function joins the thermal conductivity expressions in the degenerate and non-degenerate regimes in the transition regions. Contributions from ion-electron, electron-electron, and phonon-electron scattering are summed to form the total thermal conductivity. The approximation formula of Potekhin, Chabrier & Yakovlev (1997) for the electron-electron interaction integral J(y), which is more complete than the formula given by Timmes (1992), has been adopted. The radiative opacity is converted to an equivalent conductivity by σrad = 4acT³/(3ρκrad) before forming the total thermal conductivity.
The nuclear burning module uses a sparse-matrix semi-implicit ordinary
differential equation (ODE)
solver to calculate the nuclear burning rate and update the fluid
variables accordingly (Timmes 1999). The primary interface routines
for this module are init_burn(), which calls routines to set up the
nuclear isotope tables needed by the module; and burn(), which
calls the ODE solver and updates the hydrodynamical variables in a
single row of a single AMR block.
Variable | Type | Default | Description |
tnucmin | real | 1.1×10⁸ | Minimum temperature in K for burning to be allowed |
tnucmax | real | 1.0×10¹² | Maximum temperature in K for burning to be allowed |
dnucmin | real | 1.0×10⁻¹⁰ | Minimum density (g/cm³) for burning to be allowed |
dnucmax | real | 1.0×10¹⁴ | Maximum density (g/cm³) for burning to be allowed |
ni56max | real | 0.4 | Maximum Ni56 mass fraction for burning to be allowed |
For some physical processes (e.g. detonations), it is unphysical for burning to take place in shocks. The burner module includes a multidimensional shock-detection algorithm that can be used to prevent burning in shocks. If the shock_burning parameter is set to .FALSE., this algorithm is used to detect shocks in the burn_block function and switch off the burning in shocked zones.
Currently the shock detection supports Cartesian and 2-dimensional cylindrical coordinates. The basic algorithm is to compare the jump in pressure in the direction of compression (determined by looking at the velocity field) with a shock parameter (typically 1/3). If the total velocity divergence is negative and the relative pressure jump across the compression front is larger than the shock parameter, then a zone is marked as shocked.
This computation is done on a block-by-block basis. It is important that the velocity and pressure variables have up-to-date guardcells, so a guardcell fill is done for the burners only if we are detecting shocks (i.e. shock_burning = .FALSE.).
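The one-dimensional limit of the test described above can be sketched as follows. This is an illustrative sketch, not the FLASH burn_block code: the multidimensional FLASH version applies the same pressure-jump test along the direction of compression, and all names here are ours.

```python
# 1-D sketch of the shock-detection test (illustrative, not FLASH code).
SHOCK_PARAM = 1.0 / 3.0   # threshold on the relative pressure jump

def detect_shocks(p, vx, dx):
    """Flag zone i when the local velocity divergence is negative
    (compression) and the relative pressure jump across it exceeds
    SHOCK_PARAM."""
    n = len(p)
    shocked = [False] * n
    for i in range(1, n - 1):
        divv = (vx[i+1] - vx[i-1]) / (2.0 * dx)          # < 0: compression
        pjump = abs(p[i+1] - p[i-1]) / min(p[i+1], p[i-1])
        if divv < 0.0 and pjump > SHOCK_PARAM:
            shocked[i] = True
    return shocked
```

A burner using this mask would simply skip the network update in every flagged zone.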
Modelling thermonuclear flashes typically requires the energy generation rate due to nuclear burning over a large range of temperatures, densities, and compositions. The average energy generated or lost over a period of time is found by integrating a system of ordinary differential equations (the nuclear reaction network) for the abundances of important nuclei and the total energy release. In some contexts, such as Type II supernova models, the abundances themselves are also of interest. In either case, the coefficients that appear in the equations are typically extremely sensitive to temperature. The resulting stiffness of the system of equations requires the use of an implicit time integration scheme.
A user can choose between two implicit integration methods and two linear algebra packages in FLASH. The runtime parameter ode_steper controls which integration method is used in the simulation. The choice ode_steper = 1 is the default and invokes a Bader-Deuflhard scheme; the choice ode_steper = 2 invokes a Kaps-Rentrop scheme. The runtime parameter algebra controls which linear algebra package is used in the simulation. The choice algebra = 1 is the default and invokes the sparse matrix MA28 package; the choice algebra = 2 invokes the GIFT linear algebra routines. While any combination of the integration methods and linear algebra packages will produce correct answers, some combinations may execute more efficiently than others for certain types of simulations. No general rules have been found for which combination is the best for a given simulation. The most efficient combination depends on the timestep being taken, the spatial resolution of the model, the values of the local thermodynamic variables, and the composition. Experiment with the various combinations!
Timmes (1999) reviewed several methods for solving stiff nuclear reaction networks, providing the basis for the reaction network solvers included with FLASH. The scaling properties and behavior of three semi-implicit time integration algorithms (a traditional first-order accurate Euler method, a fourth-order accurate Kaps-Rentrop method, and a variable order Bader-Deuflhard method) and eight linear algebra packages (LAPACK, LUDCMP, LEQS, GIFT, MA28, UMFPACK, and Y12M) were investigated by running each of these 24 combinations on seven different nuclear reaction networks (hard-wired 13- and 19-isotope networks and soft-wired networks of 47, 76, 127, 200, and 489 isotopes). Timmes' analysis suggested that the best balance of accuracy, overall efficiency, memory footprint, and ease-of-use was provided by the two integration methods (Bader-Deuflhard and Kaps-Rentrop) and the two linear algebra packages (MA28 and GIFT) that are provided with the FLASH code.
We begin by describing the equations solved by the nuclear burning
module. We consider material which may be described by a density
ρ and a single temperature T and which contains a number of isotopes
i, each of which has Zi protons and Ai nucleons (protons +
neutrons). Let ni and ρi denote the number and mass density,
respectively, of the ith isotope, and let Xi denote its mass
fraction, so that
| (63) |
| (64) |
| (65) |
The general continuity equation for the ith isotope is given in
Lagrangian formulation by
| (66) |
| (67) |
The case Vi ≡ 0 is important for two reasons. First,
mass diffusion is often unimportant when compared to other transport
processes such as thermal or viscous diffusion (i.e., large Lewis
numbers and/or small Prandtl numbers). Such a situation obtains, for
example, in the study of laminar flame fronts propagating through the
quiescent interior of a white dwarf. Second, this case permits the
decoupling of the reaction network solver from the hydrodynamical
solver through the use of operator splitting, greatly simplifying the
algorithm. This is the method used by the default FLASH distribution.
Setting Vi ≡ 0 transforms Equation (66) into
| (68) |
| (69) |
Because of the highly nonlinear temperature dependence of the nuclear reaction rates, and because the abundances themselves often range over several orders of magnitude in value, the values of the coefficients which appear in Equations (68) and (69) can vary quite significantly. As a result, the nuclear reaction network equations are ``stiff.'' A system of equations is stiff when the ratio of the maximum to the minimum eigenvalue of the Jacobian matrix J̃ ≡ ∂f/∂y is large. This means that at least one of the isotopic abundances changes on a much shorter timescale than another. Implicit or semi-implicit time integration methods are generally necessary to avoid following this short-timescale behavior, requiring the calculation of the Jacobian matrix.
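A toy illustration of stiffness, under the assumption of a two-species system in which one abundance decays a million times faster than the other, makes the eigenvalue-ratio definition concrete (this is a generic numerical example, not a FLASH network):

```python
# Stiffness ratio of a toy 2-species system (illustrative only).
# dy1/dt = -1.0e6 * y1  (fast species)
# dy2/dt = -1.0   * y2  (slow species)
# The Jacobian is diagonal, so its eigenvalues are just -1.0e6 and -1.0.
def stiffness_ratio(eigenvalues):
    """Ratio of the largest to the smallest eigenvalue magnitude."""
    mags = [abs(ev) for ev in eigenvalues]
    return max(mags) / min(mags)

ratio = stiffness_ratio([-1.0e6, -1.0])
```

An explicit integrator must keep its step below ~1/|λmax| for stability even long after the fast species has decayed away, which is why implicit methods win for systems like this.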
It is instructive at this point to look at an example of how
Equation (68) and the associated Jacobian matrix are
formed. Consider the ¹²C(α,γ)¹⁶O reaction,
which competes with the triple-α reaction during helium burning
in stars. The rate R at which this
reaction proceeds is critical for
evolutionary models of massive stars, since it determines how much of
the core is carbon and how much of the core is oxygen after the
initial helium fuel is exhausted. This reaction sequence contributes
to the right-hand side of Equation (69) through the terms
|
|
The FLASH code distribution includes two reaction networks. The 13-isotope α-chain plus heavy-ion reaction network is suitable for most multi-dimensional simulations of stellar phenomena where a reasonably accurate energy generation rate is of primary concern. The 19-isotope reaction network has the same α-chain and heavy-ion reactions as the 13-isotope network, but it includes additional isotopes to accommodate some types of hydrogen burning (PP chains and steady-state CNO cycles), along with some aspects of photodisintegration into ⁵⁴Fe. This 19-isotope reaction network is described in Weaver, Zimmerman, & Woosley (1978). Both networks supplied with FLASH are examples of ``hard-wired'' reaction networks, where each of the reaction sequences is carefully entered by hand. This approach is suitable for small networks when minimizing the CPU time required to run the reaction network is a primary concern, although it suffers the disadvantage of inflexibility.
10.1.2.2 Two linear algebra packages
As we've seen in the previous section, the Jacobian matrices of nuclear reaction networks tend to be sparse, and they become more sparse as the number of isotopes increases. Since implicit or semi-implicit time integration schemes generally require solving systems of linear equations involving the Jacobian matrix, taking advantage of the sparsity can significantly reduce the CPU time required to solve the systems of linear equations.
The MA28 sparse matrix package used by FLASH is described by Duff, Erisman, & Reid (1986). The MA28 package, which has been described as the ``Coke classic'' of sparse linear algebra packages, uses a direct - as opposed to iterative - method for solving linear systems. Direct methods typically divide the solution of Ã x = b into a symbolic LU decomposition phase, a numerical LU decomposition phase, and a backsubstitution phase. In the symbolic LU decomposition phase, the pivot order of a matrix is determined, and a sequence of decomposition operations which minimizes the amount of fill-in is recorded. Fill-in refers to zero matrix elements which become nonzero (e.g., a sparse matrix times a sparse matrix is generally a denser matrix). The matrix is not decomposed; only the steps to do so are stored. Since the nonzero pattern of a chosen nuclear reaction network does not change, the symbolic LU decomposition is a one-time initialization cost for reaction networks. In the numerical LU decomposition phase, a matrix with the same pivot order and nonzero pattern as a previously factorized matrix is numerically decomposed into its lower-upper form. This phase must be done only once for each set of linear equations. In the backsubstitution phase, a set of linear equations is solved with the factors calculated from a previous numerical decomposition. The backsubstitution phase may be performed with as many right-hand sides as needed, and not all of the right-hand sides need to be known in advance.
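The decompose-once / backsubstitute-many pattern described above can be sketched with a dense Doolittle LU factorization without pivoting. This is a pedagogical stand-in only: MA28 itself is sparse, pivoted, and far more sophisticated, and the function names are ours.

```python
# Dense LU factor-once, solve-many sketch (illustrative, not MA28).
def lu_decompose(a):
    """In-place LU factorization of a small dense matrix (no pivoting).
    The multipliers (L) overwrite the strict lower triangle."""
    n = len(a)
    for k in range(n):
        for i in range(k + 1, n):
            a[i][k] /= a[k][k]                  # multiplier stored in L part
            for j in range(k + 1, n):
                a[i][j] -= a[i][k] * a[k][j]
    return a

def lu_solve(lu, b):
    """Backsubstitution with previously computed factors; b may vary
    from call to call while the factors are reused."""
    n = len(lu)
    y = b[:]
    for i in range(n):                          # forward sweep: L y = b
        for j in range(i):
            y[i] -= lu[i][j] * y[j]
    x = y[:]
    for i in range(n - 1, -1, -1):              # backward sweep: U x = y
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x
```

The factorization is O(n³) while each backsubstitution is only O(n²), which is exactly why reusing the factors across many right-hand sides pays off in a reaction network solver.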
MA28 uses a combination of nested dissection and frontal envelope decomposition to minimize fill-in during the factorization stage. An approximate degree update algorithm that is much faster (asymptotically and in practice) than computing the exact degrees is employed. One continuous real parameter sets the amount of searching done to locate the pivot element: when this parameter is set to zero, no searching is done and the diagonal element is the pivot, while when it is set to unity, complete partial pivoting is done. Since the matrices generated by reaction networks are usually diagonally dominant, the routine is set in FLASH to use the diagonal as the pivot element. Several test cases showed that using partial pivoting did not make a significant difference in accuracy, but was less efficient, since a search for an appropriate pivot element had to be performed. MA28 accepts the nonzero entries of the matrix in the (i, j, ai,j) coordinate format and typically uses 70-90% less storage than storing the full dense matrix.
GIFT is a program which generates Fortran subroutines for solving a system of linear equations by Gaussian elimination (Gustafson, Liniger, & Willoughby 1970; Müller 1997). The full matrix Ã is reduced to upper triangular form, and backsubstitution with the right-hand side b yields the solution to Ã x = b. GIFT-generated routines skip all calculations with matrix elements that are zero; in this restricted sense GIFT-generated routines are sparse, but the storage of a full matrix is still required. It is assumed that the pivot element is located on the diagonal and no row or column interchanges are performed, so GIFT-generated routines may become unstable if the matrices are not diagonally dominant. These routines must decompose the matrix for each right-hand side in a set of linear equations. GIFT writes out (in Fortran code) the sequence of Gaussian elimination and backsubstitution operations without any do-loop constructions on the matrix A(i,j). As a result, the routines generated by GIFT can be quite large; for the 489-isotope network discussed by Timmes (1999), GIFT generated ~5.0×10⁷ lines of code! Fortunately, for small reaction networks (fewer than about 30 isotopes), GIFT-generated routines are much smaller and generally faster than other linear algebra packages.
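A toy generator in the spirit of GIFT (not the actual program) makes the idea concrete: given only the nonzero pattern of a matrix, emit straight-line elimination statements, skipping every operation whose operand is structurally zero. The emitted pseudo-Fortran strings and all names here are our illustration.

```python
# Toy straight-line code generator for Gaussian elimination
# (illustrative sketch of the GIFT idea; no pivoting, no loops emitted).
def generate_elimination(pattern):
    """pattern[i][j] is True where a(i,j) may be nonzero.  Returns a
    list of generated statement strings for the elimination phase."""
    n = len(pattern)
    lines = []
    nz = [row[:] for row in pattern]           # working copy: tracks fill-in
    for k in range(n):
        for i in range(k + 1, n):
            if not nz[i][k]:
                continue                       # structural zero: emit nothing
            lines.append(f"m = a({i},{k}) / a({k},{k})")
            for j in range(k + 1, n):
                if nz[k][j]:
                    lines.append(f"a({i},{j}) = a({i},{j}) - m * a({k},{j})")
                    nz[i][j] = True            # fill-in becomes nonzero
            lines.append(f"b({i}) = b({i}) - m * b({k})")
    return lines
```

For a sparse pattern, far fewer statements are emitted than dense elimination would require, but since every statement is written out explicitly, the generated source grows rapidly with network size, as the 489-isotope example above illustrates.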
As discussed above, but which bears repeating: the FLASH runtime parameter algebra controls which linear algebra package is used in the simulation. The choice algebra = 1 is the default and invokes the sparse matrix MA28 package; the choice algebra = 2 invokes the GIFT linear algebra routines.
10.1.2.3 Two time integration methods
One of the time integration methods used by FLASH for evolving the
reaction networks is a fourth-order accurate Kaps-Rentrop method. In
essence, this method is an implicit Runge-Kutta algorithm. The
reaction network is advanced over a time step h according to
| (72) |
|
The bi, γ, aij, and cij in Equations (72) and (73) are fixed constants of the method. An estimate of the accuracy of the integration step is made by comparing a third-order solution with a fourth-order solution, which is a significant improvement over the basic Euler method. The minimum cost of this method - which applies for a single time step that meets or exceeds a specified integration accuracy - is one Jacobian evaluation, three evaluations of the right-hand side, one matrix decomposition, and four backsubstitutions. Note that Equation (73) represents a staged set of linear equations (Δ4 depends on Δ3 ... depends on Δ1), so not all of the right-hand sides are known in advance. This general feature of higher-order integration methods impacts the optimal choice of a linear algebra package. The fourth-order Kaps-Rentrop routine used in FLASH is a combination of the routine GRK4T given by Kaps & Rentrop (1979) and the routine STIFF given by Press et al. (1996).
Another time integration method used by FLASH for evolving the reaction
networks is the variable order Bader-Deuflhard method (e.g., Bader &
Deuflhard 1983; Press et al. 1992). The reaction network is advanced
over a large time step H from yn to yn+1 by the
following sequence of matrix equations. First,
|
|
|
As discussed above, but which bears repeating: the FLASH runtime parameter ode_steper controls which integration method is used in the simulation. The choice ode_steper = 1 is the default and invokes the variable-order Bader-Deuflhard scheme; the choice ode_steper = 2 invokes the fourth-order Kaps-Rentrop scheme.
10.1.2.4 Energy generation rates and reaction rates
The instantaneous energy generation rate is given by the sum
| (80) |
| (81) |
The tabulation of Caughlan & Fowler (1988) is used in FLASH for most of the key nuclear reaction rates. Modern values for some of the reaction rates were taken from the reaction rate library of Hoffman (2001, priv. comm.). A user can choose between two reaction rate evaluations in FLASH. The runtime parameter use_table controls which reaction rate evaluation method is used in the simulation. The choice use_table = 0 is the default and evaluates the reaction rates from analytical expressions; the choice use_table = 1 evaluates the reaction rates from table interpolation. The reaction rate tables are formed on-the-fly from the analytical expressions. Tests on one-dimensional detonations and hydrostatic burnings suggest there are no major differences in the abundance levels if tables are used instead of the analytic expressions; we find less than 1% differences at the end of long time-scale runs. Table interpolation is about 10 times faster than evaluating the analytic expressions, but the speedup to FLASH is more modest, a few percent at best, since reaction rate evaluation never dominates in a real production run.
Finally, nuclear reaction rate screening effects as formulated by Wallace et al. (1982), and decreases in the energy generation rate ε̇nuc due to neutrino losses as given by Itoh et al. (1996), are included in FLASH.
10.1.2.5 Temperature-based timestep limiting
The hydrodynamics methods implemented in FLASH are explicit, so a timestep
limiter must be used to ensure the stability of the numerical solution.
The standard CFL limiter is always used when a hydrodynamics module is
included in FLASH. This constraint does not allow any information to travel
more than one computational zone per timestep. The timestep is the minimum of
| Δt_CFL = C × min_zones [ Δx/(|vx|+cs), Δy/(|vy|+cs), Δz/(|vz|+cs) ] | (82) |
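A minimal sketch of this constraint follows; the safety factor C and the flat zone/direction layout are illustrative choices, not FLASH's internal data structures:

```python
# CFL condition: no signal (fluid speed plus sound speed) may cross more
# than one zone per step; C < 1 is the Courant safety factor.
def cfl_timestep(widths, velocities, sound_speeds, C=0.8):
    """widths, velocities, sound_speeds: parallel sequences, one entry per
    zone/direction pair; returns the CFL-limited timestep."""
    return C * min(dx / (abs(v) + cs)
                   for dx, v, cs in zip(widths, velocities, sound_speeds))

dt = cfl_timestep([0.01, 0.01], [1.0, -2.0], [1.0, 1.0])
```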
When burning is coupled with the hydrodynamics, the CFL timestep may be so large compared to the burning timescales that the nuclear energy release in a zone can exceed the existing internal energy in that zone. When this happens, the two operations (hydrodynamics and nuclear burning) become decoupled.
To help fix this problem, it is sometimes useful to step along at a
timestep determined by the change in temperature in a zone. FLASH
includes a temperature-based timestep limiter that tries to constrain
the change in temperature in a zone to be less than a user-defined
parameter. To use this limiter, set itemp_limit = 1 and
specify the fractional temperature change you are willing to tolerate
with temp_factor. There is no guarantee that the temperature change
will actually be smaller than this value, since the limiting timestep is
computed only after the step that produced the change has been taken;
nevertheless, this method succeeds in restoring coupling between
the hydrodynamics and burning operators. This timestep is computed as
| Δt_temp = temp_factor × min_zones ( T / δT ) × Δt_old | (83) |
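A sketch of the limiter's logic under the description above; the exact FLASH expression may differ, but temp_factor and the minimum over zones follow the text:

```python
# Temperature-based timestep limiter (illustrative sketch): shrink or grow
# the next step so that the fractional temperature change in the most
# restrictive zone is near temp_factor.
def temp_limited_dt(dt_old, temps, dtemps, temp_factor=0.1):
    """temps: zone temperatures; dtemps: temperature changes over the last
    step; returns the limited timestep for the next step."""
    tiny = 1e-30  # guard against zones whose temperature did not change
    return temp_factor * min(T / max(abs(dT), tiny)
                             for T, dT in zip(temps, dtemps)) * dt_old

# The zone with a 20% change (2e8 / 1e9) limits the step to 0.1/0.2 = 0.5
# of... times dt_old scaled by T/dT = 5, i.e. 5e-4 here.
dt_new = temp_limited_dt(1e-3, [1e9, 2e9], [2e8, 1e8])
```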
The addition of driving terms in a hydrodynamical simulation can be a useful feature, for example for generating turbulent flows or for simulating the addition of power on larger scales (e.g., supernova feedback in the interstellar medium). The stirring module directly adds a divergence-free, time-correlated `stirring' velocity at selected modes in the simulation.
The time correlation is important for modeling realistic driving forces. Most large-scale driving forces are time-correlated rather than white-noise; for instance, turbulent stirring from larger scales will be correlated on timescales related to the lifetime of an eddy on the scale of the simulation domain. This time correlation leads to coherent structures in the simulation which are absent with white-noise driving.
For each mode, at each timestep, six separate phases (real and imaginary in each of the three spatial dimensions) are evolved by an Ornstein-Uhlenbeck (OU) random process. The OU process is a zero-mean process which at each step `decays' the previous value by an exponential factor e^(-Δt/τ) and adds a Gaussian random variable with a given variance. For our purposes, since the OU process represents a velocity, the variance is chosen to be the square root of the specific energy input rate (set by the runtime parameter st_energy) divided by the decay time τ (st_decay).
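One OU update for a single phase can be sketched as follows. This is an interpretation of the text, not FLASH's implementation: the driving amplitude sqrt(st_energy/st_decay) is assumed to play the role of the stationary standard deviation, and the noise is scaled so that the process variance is preserved in steady state:

```python
import math
import random

def ou_step(phase, dt, tau, sigma, rng):
    """One Ornstein-Uhlenbeck step: decay the previous value by e^(-dt/tau)
    and add a Gaussian increment scaled so the stationary variance is
    sigma^2 (a standard exact-update form of the OU process)."""
    f = math.exp(-dt / tau)
    return phase * f + rng.gauss(0.0, sigma * math.sqrt(1.0 - f * f))

# Evolve one phase and check that it stays zero-mean with variance ~ sigma^2.
rng = random.Random(42)
sigma = math.sqrt(1.0 / 1.0)   # st_energy = 1, st_decay (tau) = 1 (assumed)
x = 0.0
samples = []
for _ in range(20000):
    x = ou_step(x, 0.1, 1.0, sigma, rng)
    samples.append(x)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```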
By evolving the phases of the stirring modes in Fourier space, imposing a divergence-free condition is relatively straightforward. At each timestep, the compressive component of the velocities is projected out, leaving only the solenoidal (non-compressional) modes to add to the velocities.
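The projection for a single Fourier mode can be sketched generically: removing the component of the amplitude along the wavevector k enforces k·a = 0, which is the Fourier-space statement of zero divergence (this is the standard construction, not FLASH's routine):

```python
import numpy as np

def project_divergence_free(k, a):
    """Remove the compressive (parallel-to-k) part of a Fourier amplitude a,
    leaving the divergence-free (solenoidal) component."""
    k = np.asarray(k, dtype=float)
    a = np.asarray(a, dtype=float)
    khat = k / np.linalg.norm(k)
    return a - khat * np.dot(khat, a)

a_proj = project_divergence_free([1.0, 2.0, 2.0], [3.0, 0.0, -1.0])
# The projected amplitude is orthogonal to k, so the mode carries no divergence.
```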
The velocities are then converted to physical space by a direct Fourier transform, i.e., by explicitly summing the sine and cosine terms. Since most driving fields involve a fairly small number of modes, this is more efficient than an FFT: the FFT would involve a large number of modes (equal to six times the number of cells in the domain), the vast majority of which would have zero amplitude.
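The direct sum at a single point can be sketched as follows (the mode data layout is illustrative):

```python
import numpy as np

def stirring_velocity(x, modes):
    """Direct Fourier sum over a few stirring modes.
    modes: list of (k, a_real, a_imag) triples; returns
    v(x) = sum over modes of a_real*cos(k.x) + a_imag*sin(k.x)."""
    v = np.zeros(3)
    for k, a_re, a_im in modes:
        phase = np.dot(k, x)
        v += np.asarray(a_re) * np.cos(phase) + np.asarray(a_im) * np.sin(phase)
    return v

# A single mode: at x = 0 only the cosine (real) amplitude contributes.
modes = [(np.array([2.0 * np.pi, 0.0, 0.0]),
          np.array([0.0, 1.0, 0.0]),
          np.array([0.0, 0.0, 0.5]))]
v0 = stirring_velocity(np.zeros(3), modes)
```

With M nonzero modes and N evaluation points the direct sum costs O(MN), which beats an FFT whenever M is much smaller than the mesh size, as the text argues.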
Variable | Type | Default | Description |
istir | integer | 1 | Do we `turn on' stirring for this run? 1=yes, 0=no. |
st_seed | integer | 2 | Seed for the random number generator. |
st_energy | real | .01 | (RMS) specific energy/time/mode stirred in. |
st_decay | real | .1 | Decay time for OU random numbers; correlation time of the stirring. |
st_stirmax | real | 62.8 | Wavenumber corresponding to the smallest physical scale on which stirring will occur. |
st_stirmin | real | 31.4 | Wavenumber corresponding to the largest scale on which stirring will occur. |
The gravity module supplied with
FLASH computes gravitational source terms for the code.
These source terms can take the form of the gravitational potential
φ(x) or the gravitational acceleration,
| g(x) = -∇φ(x) | (84) |
where the potential satisfies Poisson's equation,
| ∇²φ(x) = 4πGρ(x) | (85) |
As distributed, FLASH includes the following externally applied gravitational fields. Each provides the acceleration vector g(x) directly, without using the gravitational potential φ(x).
The self-gravity algorithms supplied with FLASH solve the Poisson equation
(85) for the gravitational potential φ(x).
The modules implementing these algorithms can also return the acceleration
field g(x); this is computed by finite-differencing the
potential using the expressions
| g_x;ijk = -(φ_i+1,j,k - φ_i-1,j,k) / (2Δx), and similarly for g_y and g_z | (86) |
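In one dimension the finite-difference step looks like the following sketch; this is a generic second-order centered difference on interior zones, not FLASH's routine:

```python
import numpy as np

def accel_from_potential(phi, dx):
    """g = -dphi/dx via second-order centered differences; boundary zones
    are left at zero (guard-cell handling is omitted in this sketch)."""
    g = np.zeros_like(phi)
    g[1:-1] = -(phi[2:] - phi[:-2]) / (2.0 * dx)
    return g

# For a quadratic potential phi = x^2/2 the centered difference is exact,
# recovering g = -x on the interior zones.
x = np.linspace(-1.0, 1.0, 21)
phi = 0.5 * x**2
g = accel_from_potential(phi, x[1] - x[0])
```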
11.1.2.1 Multipole Poisson solver
The multipole Poisson solver is appropriate for spherical or nearly-spherical mass distributions, and it only accepts isolated boundary conditions. At present it only works in three-dimensional Cartesian coordinates.
The multipole algorithm consists of the following steps.
First, find the center of mass xcm:
| x_cm = ∫ x ρ(x) d³x / ∫ ρ(x) d³x | (87) |
| (88) |
| (89) |
|
| (91) |
|
|
|
|
In practice the above procedure must take account of the fact that the
gravity module assumes both the density and the potential to be zone-averaged
quantities, discretized on a block-structured mesh with varying zone size.
Also, because of the radial dependence of the multipole moments of the
density, these moments must be tabulated as functions of distance from
x_cm, with an implied discretization. The solver allocates
storage for moment samples spaced a distance Δ apart in radius:
|
Because we use a Cartesian grid, determining the contribution of individual
zones to the tabulated moments requires some care. To reduce the error
caused by the grid geometry, in each zone ijk
we establish a subgrid consisting of N′ points at the locations
x′_i′j′k′, where
|
|
| (108) |
Another way to reduce Cartesian grid errors when using the multipole solver is to modify the AMR refinement criterion to refine all blocks containing the center of mass (in addition to other criteria that may be used, such as the second-derivative criterion supplied with PARAMESH). This ensures that the center-of-mass point is maximally refined at all times, further restricting the volume which contributes errors to the moments because r′ ~ Δx.
The default value of N′ is 2; note that large values of this parameter very quickly increase the amount of time required to evaluate the multipole moments (as N′³). In order to speed up the moment summations, the sines and cosines in equations (104) - (107) are evaluated using trigonometric recurrence relations, and the factorials are pre-computed and stored at the beginning of the run.
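The trigonometric recurrence is the standard angle-addition construction: cos(mφ) and sin(mφ) for every m up to the maximum moment are built from a single cos/sin evaluation. A generic sketch of the technique:

```python
import math

def trig_tables(phi, mmax):
    """Return lists c, s with c[m] = cos(m*phi) and s[m] = sin(m*phi)
    for m = 0..mmax, computed via the angle-addition recurrence so that
    only one cos and one sin call are needed."""
    c, s = [1.0], [0.0]
    c1, s1 = math.cos(phi), math.sin(phi)
    for _ in range(mmax):
        cm = c[-1] * c1 - s[-1] * s1   # cos((m+1)*phi)
        sm = s[-1] * c1 + c[-1] * s1   # sin((m+1)*phi)
        c.append(cm)
        s.append(sm)
    return c, s

c, s = trig_tables(0.7, 10)
```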
When computing the zone-average potential, we again employ a subgrid, but
here the subgrid points fall on zone boundaries to improve the continuity
of the result. Using N′′+1 subgrid points per dimension, we have
|
| (112) |
11.1.2.2 Multigrid Poisson solver
The multigrid Poisson solver is appropriate for general mass distributions and can solve problems with periodic or isolated boundary conditions. The algorithm distributed with FLASH is based on a multilevel refinement scheme described by Martin and Cartwright (1996). Isolated boundary conditions are implemented via a method based on James' (1978) algorithm.
Multilevel refinement algorithms (Brandt 1977; Trottenberg, Oosterlee, & Schüller 2001) solve elliptic equations such as the Poisson equation by accelerating the convergence of relaxation methods. The latter (e.g., Jacobi, Gauss-Seidel, SOR) are straightforward but converge very slowly because they accomplish the global coupling implied by an elliptic equation by a series of iterations that communicate information from one side of the grid to the other one zone at a time. Hence their convergence rate (fractional reduction in error per iteration) decreases with increasing grid size. Modal analysis shows that the longest-wavelength components of the error require the most iterations to decrease to a given level. By performing iterations on a sequence of increasingly coarse grids, multigrid algorithms bring all wavelengths into convergence at the same rate. This works because long wavelengths on a fine mesh appear to be short wavelengths on a coarse mesh.
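The idea can be illustrated with a compact one-dimensional V-cycle for u'' = f with homogeneous Dirichlet boundaries. This is a generic textbook sketch (damped Jacobi smoothing, full-weighting restriction, linear prolongation), not the Martin & Cartwright solver used in FLASH:

```python
import numpy as np

def relax(u, f, h, sweeps, w=2.0 / 3.0):
    """Damped Jacobi relaxation for u'' = f with u = 0 at both ends;
    efficient at damping short-wavelength error components."""
    for _ in range(sweeps):
        new = 0.5 * (u[:-2] + u[2:] - h * h * f[1:-1])
        u = u.copy()
        u[1:-1] = (1.0 - w) * u[1:-1] + w * new
    return u

def residual(u, f, h):
    """r = f - L u, where L is the standard 3-point Laplacian."""
    r = np.zeros_like(u)
    r[1:-1] = f[1:-1] - (u[:-2] - 2.0 * u[1:-1] + u[2:]) / (h * h)
    return r

def vcycle(u, f, h, nsmooth=4):
    if len(u) == 3:                      # coarsest grid: solve exactly
        u = u.copy()
        u[1] = -0.5 * h * h * f[1]
        return u
    u = relax(u, f, h, nsmooth)          # pre-smooth (kills short waves)
    r = residual(u, f, h)
    rc = np.zeros((len(u) + 1) // 2)     # full-weighting restriction
    rc[1:-1] = 0.25 * r[1:-2:2] + 0.5 * r[2:-1:2] + 0.25 * r[3::2]
    ec = vcycle(np.zeros_like(rc), rc, 2.0 * h, nsmooth)
    e = np.zeros_like(u)                 # linear prolongation of the correction
    e[::2] = ec
    e[1::2] = 0.5 * (ec[:-1] + ec[1:])
    return relax(u + e, f, h, nsmooth)   # post-smooth

# Solve u'' = -pi^2 sin(pi x), whose solution is sin(pi x).
n = 65
x = np.linspace(0.0, 1.0, n)
h = x[1] - x[0]
f = -np.pi**2 * np.sin(np.pi * x)
u = np.zeros(n)
r0 = np.linalg.norm(residual(u, f, h))
for _ in range(10):
    u = vcycle(u, f, h)
rfinal = np.linalg.norm(residual(u, f, h))
```

Each V-cycle reduces the residual norm by a roughly constant factor independent of grid size, which is the behavior the runtime-parameter discussion below relies on.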
Adaptive mesh refinement (AMR) provides many benefits in conjunction with a multigrid solver. Where errors are unlikely to have short-wavelength components it makes sense to avoid using fine grids, thus reducing storage requirements and the cost of relaxations on fine levels. The AMR package manages the multilevel mesh data structures and can handle all parallel communication, freeing the multigrid solver from such details. The AMR package supplies many of the basic functions required by multigrid algorithms in addition to the mesh data structure, including prolongation, restriction, and boundary condition updates. Therefore we use a mesh hierarchy defined by the AMR package. Note that, whereas with hydrodynamics it is preferable to refine regions containing fluid discontinuities, with gravity it is instead preferable to refine narrow peaks in the density field, since the Poisson Equation (85) requires the curvature of the solution (the potential) to undergo the largest small-scale fluctuations at such peaks. Discontinuities can be detected using the second-derivative criterion supplied with PARAMESH. However, when solving self-gravitating problems using the multigrid module, it may be desirable to add to this criterion one which refines blocks based on their mean density contrast with respect to a fixed reference density. This is illustrated by the jeans problem setup.
AMR does introduce complications, however. Because the mesh hierarchy contains jumps in refinement, it is necessary to interpolate when setting guard cell values for fine blocks adjoining coarser blocks. As Martin and Cartwright point out, this requires an interpolation scheme with at least the same order of accuracy as the finite differencing scheme used. Thus quadratic interpolants must be used with the Poisson equation. However, unless the first derivative of the solution is also matched across jumps in refinement, unphysical forces will be produced at such boundaries, and the multigrid solver will fail to converge. Since we regard the solution on the finer level as being of higher quality than the solution on the coarser level, in such situations we allow the fine grid to determine the value of the first derivative on the boundary.
Before describing the algorithm, let us first define some terms.
We work with approximations φ̃(x) to the solution
φ(x).
The residual is a measure of the error in φ̃(x);
it is given by
| R(x) ≡ 4πGρ(x) - ∇²φ̃(x) | (113) |
| (114) |
| (115) |
Difference operators approximating ∇² on each grid level are
defined for relaxation and for computing the residual.
On level l, which has zone spacings Δx_l,
Δy_l, and Δz_l in the x-, y-, and
z-directions, we use
|
|
|
Here are the steps in the multigrid algorithm:
The external boundary conditions accepted by the multigrid algorithm are Dirichlet given-value and periodic boundaries. However, isolated boundary conditions are often desired. These mean that the density ρ is assumed to be zero outside of the computational volume, and that the potential φ tends smoothly to zero at arbitrarily large distances. In order to accommodate this type of boundary condition we use a variant of James' (1978) method. The steps are as follows:
The gravitational field couples to the Euler equations only through the
momentum and energy equations. If we define the total energy density as
| ρE = ρε + ½ ρ|v|² | (128) |
|
|
To include the effects of gravity in your FLASH executable, include the line
INCLUDE gravity/sub-module[/algorithm] in your Modules file when you configure the code with setup. The available sub-modules include constant, planepar, poisson, and ptmass. If you are using the Poisson solver to compute the gravitational field, you may also specify an algorithm, currently multipole or multigrid. The function and usage of each of the gravity sub-modules are described in the following sections.
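For example, a Modules file requesting the multipole Poisson solver would contain a line such as the following (the sub-module and algorithm names are those listed above):

```text
INCLUDE gravity/poisson/multipole
```

A constant external field, by contrast, needs only INCLUDE gravity/constant, with no algorithm component.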
Note that to use any of the gravitational field routines in your code you must use-associate the module gravity. Most users will be concerned only with the following routines supplied by gravity:
j, k | (integer) | Row indices transverse to the sweep direction |
xyzswp | (integer) | The sweep direction (sweep_x, sweep_y, sweep_z) |
block_no | (integer) | The local block identifier |
ivar | (integer) | The solution variable database key to use as the potential, if applicable. |
grav(:) | (real) | Array to receive the component of the acceleration parallel to the sweep direction |
nzn8 | (integer) | The number of zones to update in grav() |
The constant sub-module implements a spatially and temporally constant gravitational field parallel to one of the coordinate axes. The magnitude and direction of this field are set at runtime using the parameters listed in Table 21.
Variable | Type | Default | Description |
gconst | real | -981 | Gravitational acceleration |
gdirec | string | ``x'' | Direction of acceleration vector (``x'', ``y'', ``z'') |
The planepar sub-module implements a time-constant gravitational field that is parallel to one of the coordinate axes and falls off with the square of the distance from a fixed location. The field is assumed to be generated by a point mass. A finite softening length may optionally be applied. This type of field is useful when the computational domain is large enough in the direction radial to the field source that the field is not approximately constant, but the domain's dimension perpendicular to the radial direction is small compared to the distance to the source, so that the angular variation of the field direction may be ignored. The planepar field is cheaper to compute than the ptmass field since no fractional powers of the distance are required. The runtime parameters describing this field are listed in Table 22.
Variable | Type | Default | Description |
ptmass | real | 1.E4 | Mass of field source |
ptxpos | real | 1 | Position of field source in direction ptdirn |
gravsoft | real | 0.0001 | Gravitational softening length |
ptdirn | integer | 1 | Direction of acceleration vector (1=x, 2=y, 3=z) |
The ptmass sub-module implements the gravitational field due to a point mass at a fixed location. A finite softening length may optionally be applied. The runtime parameters describing the field are listed in Table 23.
Variable | Type | Default | Description |
ptmass | real | 1.E4 | Mass of field source |
ptxpos | real | 1 | x-position of field source |
ptypos | real | -10 | y-position of field source |
ptzpos | real | 0 | z-position of field source |
gravsoft | real | 0.0001 | Gravitational softening length |
The poisson sub-module computes the gravitational field produced by the matter in a simulation. Currently only Newtonian gravity on Cartesian meshes is supported; the potential function produced by this sub-module satisfies Poisson's equation (85). Two different elliptic solvers are supplied with FLASH: a multipole solver, suitable for approximately spherical matter distributions, and a multigrid solver, which can be used with general matter distributions. The multipole solver accepts only isolated boundary conditions, whereas when FLASH 2.0 is formally released the multigrid solver will support both periodic and isolated boundary conditions (only periodic boundaries are supported in the pre-release version). Boundary conditions for the Poisson solver are specified using the grav_boundary_type parameter described in Table 24.
When using potential-based gravity modules it is strongly recommended that you use the second_order (quadratic) interpolants supplied by PARAMESH. This is because the gravitational acceleration is computed using finite differences. If the interpolants supplied by the mesh are not of at least the same order as the differencing scheme used, unphysical forces will be produced at refinement boundaries. Also, using constant or linear interpolants may cause the multigrid solver to fail to converge.
Variable | Type | Default | Description |
grav_boundary_type | string | ``isolated'' | Type of boundary conditions for potential (``isolated'', ``periodic'') |
The poisson sub-module supplies three solution variables, listed in Table 25 (the multigrid solver adds several to this total). See the discussion of solution variable attributes elsewhere in this guide for an explanation of their meaning.
Variable | Attributes | Description |
gpot | NOADVECT NORENORM NOCONSERVE | Gravitational potential at the current timestep |
gpol | NOADVECT NORENORM NOCONSERVE | Gravitational potential at the previous timestep |
dens | ADVECT NORENORM CONSERVE | Matter density used as the source of the field |
The poisson/multipole sub-module takes only one runtime parameter, the maximum multipole moment to compute. Beware that storage and CPU costs scale roughly as the square of this parameter, so it is best to use this module only for nearly spherical matter distributions. The parameter is described in Table 26.
Variable | Type | Default | Description |
mpole_lmax | integer | 10 | Maximum multipole moment |
The poisson/multigrid sub-module is appropriate for general mass distributions (i.e., not necessarily spherical). As of the pre-release version of FLASH 2.0 it supports only periodic boundary conditions, but the official release will also support isolated boundaries. All boundary conditions must be the same, though this is more a limitation of the interface than of the solver itself. The solver works in one, two, or three dimensions.
The runtime parameters which control the multigrid solver are summarized in Table 27. If you wish to increase the accuracy (and hence the execution time) of the solver, the first parameters to change are mgrid_max_residual_norm, which sets the termination condition for V-cycles, and mgrid_smooth_tol, which sets the termination condition for the coarse-grid iteration. Changing the other parameters from their default values is unlikely to help, and may increase execution time. Also, if changing only mgrid_max_iter_change changes the answers you obtain, then either you have set the maximum residual norm too low (comparable to roundoff error on your computer) or there is a problem with the multigrid solver. This is because each successive V-cycle (if implemented correctly) reduces the norm of the residual by roughly the same factor until roundoff is reached. The default settings should be suitable for most applications.
Note that the multigrid solver uses the contents of gpot on entry as the initial guess for the iteration. For self-gravitating hydrodynamics problems, the potential changes very little from timestep to timestep, particularly on large scales. Thus the potential calculation for the first timestep will likely take several times longer than subsequent steps, as the potential from each step is used for the initial guess in the next.
Variable | Type | Default | Description |
mgrid_max_residual_norm | real | 1.E-6 | Maximum ratio of the norm of the residual to that of the right-hand side |
mgrid_max_iter_change | real | 1.E-3 | Maximum change in the norm of the residual from one iteration to the next |
mgrid_max_vcycles | integer | 100 | Maximum number of V-cycles to take |
mgrid_nsmooth | integer | 4 | Number of smoothing iterations to perform on each level |
mgrid_smooth_tol | real | 1.E-6 | Convergence criterion for the smoother |
mgrid_jacobi_weight | real | 0.6666666666667 | Weighting factor for damped Jacobi iteration (1 yields plain Jacobi) |
mgrid_solve_max_iter | integer | 5000 | Maximum number of iterations for solution on coarse grid |
The multigrid solver requires several additional temporary solution variables, which are listed in Table 28. Most of these are of no interest to the end user, but it may occasionally be helpful to inspect mgw2, as this contains the residual on exit from the solver.
Variable | Attributes | Description |
mgw1 | NOADVECT NORENORM NOCONSERVE | Work array |
mgw2 | NOADVECT NORENORM NOCONSERVE | Work array |
mgw3 | NOADVECT NORENORM NOCONSERVE | Work array |
mgw4 | NOADVECT NORENORM NOCONSERVE | Work array |
mgw5 | NOADVECT NORENORM NOCONSERVE | Work array |
To verify that FLASH works as expected, and to debug changes in the code, we have created a suite of standard test problems. Most of these problems have analytical solutions which can be used to test the accuracy of the code. The remaining problems do not have analytical solutions, but they produce well-defined flow features which have been verified by experiments and are stringent tests of the code. The test suite configuration code is included with the FLASH source tree (in the setups/ directory), so it is easy to configure and run FLASH with any of these problems `out of the box.' Sample runtime parameter files are also included.
The Sod problem (Sod 1978) is an essentially one-dimensional flow discontinuity problem which provides a good test of a compressible code's ability to capture shocks and contact discontinuities with a small number of zones and to produce the correct density profile in a rarefaction. It also tests a code's ability to correctly satisfy the Rankine-Hugoniot shock jump conditions. When implemented at an angle to a multidimensional grid, it can also be used to detect irregularities in planar discontinuities produced by grid geometry or operator splitting effects.
We construct the initial conditions for the Sod problem by establishing a
planar interface at some angle to the x and y axes. The fluid is initially
at rest on either side of the interface, and the density and pressure jumps
are chosen so that all three types of flow discontinuity (shock, contact, and
rarefaction) develop. To the ``left'' and ``right'' of the interface we have
| (ρ_L, p_L, u_L) = (1, 1, 0)        (ρ_R, p_R, u_R) = (0.125, 0.1, 0) | (131) |
In FLASH, the Sod problem (sod) uses the runtime parameters listed in Table 29 in addition to the regular ones supplied with the code. For this problem we use the gamma equation of state module and set the value of the parameter gamma supplied by this module to 1.4. The default values listed in Table 29 are appropriate to a shock with normal parallel to the x-axis which initially intersects that axis at x=0.5 (halfway across a box with unit dimensions).
Variable | Type | Default | Description |
rho_left | real | 1 | Initial density to the left of the interface (ρL) |
rho_right | real | 0.125 | Initial density to the right (ρR) |
p_left | real | 1 | Initial pressure to the left (pL) |
p_right | real | 0.1 | Initial pressure to the right (pR) |
u_left | real | 0 | Initial velocity (perpendicular to interface) to the left (uL) |
u_right | real | 0 | Initial velocity (perpendicular to interface) to the right (uR) |
xangle | real | 0 | Angle made by interface normal with the x-axis (degrees) |
yangle | real | 90 | Angle made by interface normal with the y-axis (degrees) |
posn | real | 0.5 | Point of intersection between the interface plane and the x-axis |
Figure 10 shows the result of running the Sod problem with FLASH on a two-dimensional grid, with the analytical solution shown for comparison. The hydrodynamical algorithm used here is the directionally split piecewise-parabolic method (PPM) included with FLASH. In this run the shock normal is chosen to be parallel to the x-axis. With six levels of refinement, the effective grid size at the finest level is 256², so the finest zones have width 0.00390625. At t=0.2 three different nonlinear waves are present: a rarefaction between x ≈ 0.25 and x ≈ 0.5, a contact discontinuity at x ≈ 0.68, and a shock at x ≈ 0.85. The two sharp discontinuities are each resolved with approximately three zones at the highest level of refinement, demonstrating the ability of PPM to handle sharp flow features well. Near the contact discontinuity and in the rarefaction we find small errors of about 1-2% in the density and specific internal energy, with similar errors in the velocity inside the rarefaction. Elsewhere the numerical solution is exact; no oscillation is present.
Figure 11 shows the result of running the Sod problem on the same two-dimensional grid with different shock normals: parallel to the x-axis (θ = 0°) and along the box diagonal (θ = 45°). For the diagonal solution we have interpolated values of density, specific internal energy, and velocity to a set of 256 points spaced exactly as in the x-axis solution. This comparison shows the effects of the second-order directional splitting used with FLASH on the resolution of shocks. At the right side of the rarefaction and at the contact discontinuity the diagonal solution undergoes slightly larger oscillations (on the order of a few percent) than the x-axis solution. Also, the value of each variable inside the discontinuity regions differs between the two solutions by up to 10% in most cases. However, the location and thickness of the discontinuities is the same between the two solutions. In general shocks at an angle to the grid are resolved with approximately the same number of zones as shocks parallel to a coordinate axis.
Figure 12 presents a grayscale map of the density at t=0.2 in the diagonal solution together with the block structure of the AMR grid in this case. Note that regions surrounding the discontinuities are maximally refined, while behind the shock and discontinuity the grid has de-refined as the second derivative of the density has decreased in magnitude. Because zero-gradient outflow boundaries were used for this test, some reflections are present at the upper left and lower right corners, but at t=0.2 these have not yet propagated to the center of the grid.
This problem was originally used by Woodward and Colella (1984) to compare the performance of several different hydrodynamical methods on problems involving strong, thin shock structures. It has no analytical solution, but since it is one-dimensional, it is easy to produce a converged solution by running the code with a very large number of zones, permitting an estimate of the self-convergence rate when narrow, interacting discontinuities are present. For FLASH it also provides a good test of the adaptive mesh refinement scheme.
The initial conditions consist of two parallel, planar flow discontinuities.
Reflecting boundary conditions are used. The density in the left, middle,
and right portions of the grid (ρL, ρM, and
ρR, respectively) is unity; everywhere the velocity is zero.
The pressure is large to the left and right and small in the center:
| pL = 1000,   pM = 0.01,   pR = 100 | (132) |
Figure 13 shows the density and velocity profiles at several different times in the converged solution, demonstrating the complexity inherent in this problem. The initial pressure discontinuities drive shocks into the middle part of the grid; behind them, rarefactions form and propagate toward the outer boundaries, where they are reflected back onto the grid and interact with themselves. By the time the shocks collide at t=0.028, the reflected rarefactions have caught up to them, weakening them and making their post-shock structure more complex. Because the right-hand shock is initially weaker, the rarefaction on that side reflects from the wall later, so the resulting shock structures going into the collision from the left and right are quite different. Behind each shock is a contact discontinuity left over from the initial conditions (at x ≈ 0.50 and 0.73). The shock collision produces an extremely high and narrow density peak; in the Woodward and Colella calculation the peak density is slightly less than 30. Even with ten levels of refinement, FLASH obtains a value of only 18 for this peak. Reflected shocks travel back into the colliding material, leaving a complex series of contact discontinuities and rarefactions between them. A new contact discontinuity has formed at the point of the collision (x ≈ 0.69). By t=0.032 the right-hand reflected shock has met the original right-hand contact discontinuity, producing a strong rarefaction which meets the central contact discontinuity at t=0.034. Between t=0.034 and t=0.038 the slope of the density behind the left-hand shock changes as the shock moves into a region of constant entropy near the left-hand contact discontinuity.
Figure 14 shows the self-convergence of density
and pressure when FLASH is run on
this problem.
For several runs with different maximum refinement levels, we
compare the density, pressure, and total specific energy at t=0.038
to the solution obtained using FLASH with ten levels of refinement.
This figure plots the L1 error norm for
each variable u, defined using
| (133) |
Although PPM is formally a second-order method, one sees from this plot that, for the interacting blast-wave problem, the convergence rate is only linear. Indeed, in their comparison of the performance of seven nominally second-order hydrodynamic methods on this problem, Woodward and Colella found that only PPM achieved even linear convergence; the other methods were worse. The error norm is very sensitive to the correct position and shape of the strong, narrow shocks generated in this problem.
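The observed convergence order can be extracted from two such L1 errors by assuming E ~ C h^p, a standard self-convergence check (the error values below are illustrative, not measurements from this problem):

```python
import math

def observed_order(h_coarse, err_coarse, h_fine, err_fine):
    """Observed convergence order p from errors E ~ C h^p measured at two
    zone widths h_coarse > h_fine."""
    return math.log(err_coarse / err_fine) / math.log(h_coarse / h_fine)

# Halving the zone size halves a linearly convergent error (p = 1), while a
# formally second-order result would quarter it (p = 2).
p_linear = observed_order(0.1, 0.1, 0.05, 0.05)
p_quadratic = observed_order(0.1, 0.01, 0.05, 0.0025)
```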
The additional runtime parameters supplied with the 2blast problem are listed in Table 30. This problem is configured to use the perfect-gas equation of state (gamma), and it is run in a two-dimensional unit box with gamma set to 1.4. Boundary conditions in the y direction (transverse to the shock normals) are taken to be periodic.
Variable | Type | Default | Description |
rho_left | real | 1 | Initial density to the left of the left interface (ρL) |
rho_mid | real | 1 | Initial density between the interfaces (ρM) |
rho_right | real | 1 | Initial density to the right of the right interface (ρR) |
p_left | real | 1000 | Initial pressure to the left (pL) |
p_mid | real | 0.01 | Initial pressure in the middle (pM) |
p_right | real | 100 | Initial pressure to the right (pR) |
u_left | real | 0 | Initial velocity (perpendicular to interface) to the left (uL) |
u_mid | real | 0 | Initial velocity (perpendicular to interface) in the middle (uM) |
u_right | real | 0 | Initial velocity (perpendicular to interface) to the right (uR) |
xangle | real | 0 | Angle made by interface normal with the x-axis (degrees) |
yangle | real | 90 | Angle made by interface normal with the y-axis (degrees) |
posnL | real | 0.1 | Point of intersection between the left interface plane and the x-axis |
posnR | real | 0.9 | Point of intersection between the right interface plane and the x-axis |
The Sedov explosion problem (Sedov 1959) is another purely hydrodynamical
test in which we check the code's ability to deal with strong shocks
and non-planar symmetry. The problem involves the self-similar evolution
of a cylindrical or
spherical blast wave from a delta-function initial pressure perturbation
in an otherwise homogeneous medium. To initialize the code,
we deposit a quantity of energy E=1 into a
small region of radius dr at the center of the grid.
The pressure inside this volume, p0′, is given by
| p0′ = 3(γ-1)E / ((ν+1)π δr^ν) | (134) |
| (135) |
|
|
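The initialization can be checked with a short sketch. The expression used is the deposited-energy form implied above (energy E stored as internal energy, p/(γ-1), over a region of radius δr, with ν = 2 for cylindrical and ν = 3 for spherical geometry):

```python
import math

def sedov_initial_pressure(E, dr, gamma=1.4, nu=3):
    """Pressure inside the initially perturbed region of radius dr, obtained
    by depositing energy E as internal energy over the region's volume:
    3(gamma-1)E / ((nu+1) pi dr^nu)."""
    return 3.0 * (gamma - 1.0) * E / ((nu + 1.0) * math.pi * dr**nu)

# With the default E = 1 and dr = 0.05 this gives the (very large) initial
# pressure contrast against p_ambient = 1e-5.
p0_spherical = sedov_initial_pressure(1.0, 0.05, 1.4, 3)
p0_cylindrical = sedov_initial_pressure(1.0, 0.05, 1.4, 2)
```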
Figure 15 shows density, pressure, and velocity profiles in the two-dimensional Sedov problem at t=0.05. Solutions obtained with FLASH on grids with 2, 4, 6, and 8 levels of refinement are shown in comparison with the analytical solution. In this figure we have computed average radial profiles in the following way. We interpolated solution values from the adaptively gridded mesh used by FLASH onto a uniform mesh having the same resolution as the finest AMR blocks in each run. Then, using radial bins with the same width as the zones in the uniform mesh, we binned the interpolated solution values, computing the average value in each bin. At low resolutions, errors show up as density and velocity overestimates behind the shock, underestimates of each variable within the shock, and a numerical precursor spanning 1-2 zones in front of the shock. However, the central pressure is accurately determined, even for two levels of refinement; because the density goes to a finite value rather than its correct limit of zero, this corresponds to a finite truncation of the temperature (which should go to infinity as r → 0). As resolution improves, the artificial finite density limit decreases; by Nref=6 it is less than 0.2% of the peak density. Except for the Nref=2 case, which does not show a well-defined peak in any variable, the shock itself is always captured with about two zones. The region behind the shock containing 90% of the swept-up material is represented by four zones in the Nref=4 case, 17 zones in the Nref=6 case, and 69 zones for Nref=8. However, because the solution is self-similar, for any given maximum refinement level the shock will be four zones wide at a sufficiently early time. The behavior when the shock is underresolved is to underestimate the peak value of each variable, particularly the density and pressure.
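The radial averaging procedure described above can be sketched as follows; the interface and variable names are illustrative, not FLASH's:

```python
import numpy as np

def radial_profile(values, x, y, xctr, yctr, dr):
    """Average 'values' on a uniform 2-D mesh into radial bins of width dr
    about (xctr, yctr); empty bins come back as zero."""
    r = np.hypot(x - xctr, y - yctr)
    bins = (r / dr).astype(int)
    sums = np.bincount(bins.ravel(), weights=values.ravel())
    counts = np.bincount(bins.ravel())
    return sums / np.maximum(counts, 1)

# Sanity check: a uniform field averages to the same constant in every
# populated bin, including the bin containing the center point.
xx, yy = np.meshgrid(np.linspace(0.0, 1.0, 9), np.linspace(0.0, 1.0, 9))
prof = radial_profile(np.full_like(xx, 3.0), xx, yy, 0.5, 0.5, 0.1)
```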
Figure 16 shows the pressure field in the 8-level calculation at t=0.05 together with the block refinement pattern. Note that a relatively small fraction of the grid is maximally refined in this problem. Although the pressure gradient at the center of the grid is small, this region is refined because of the large temperature gradient there. This illustrates the ability of PARAMESH to refine grids using several different variables at once.
We have also run FLASH on the spherically symmetric Sedov problem in order to verify the code's performance in three dimensions. The results at t=0.05 using five levels of grid refinement are shown in Figure 17. In this figure we have plotted the root-mean-square (RMS) numerical solution values in addition to the average values. As in the two-dimensional runs, the shock is spread over about two zones at the finest AMR resolution in this run. The width of the pressure peak in the analytical solution is about 1 1/2 zones at this time, so the maximum pressure is not captured in the numerical solution. Behind the shock the numerical solution average tracks the analytical solution quite well, although the Cartesian grid geometry produces RMS deviations of up to 40% in the density and velocity in the derefined region well behind the shock. This behavior is similar to that exhibited in the two-dimensional problem at comparable resolution.
Variable | Type | Default | Description |
p_ambient | real | 10⁻⁵ | Initial ambient pressure (p0) |
rho_ambient | real | 1 | Initial ambient density (ρ0) |
exp_energy | real | 1 | Explosion energy (E) |
r_init | real | 0.05 | Radius of initial pressure perturbation (δr) |
xctr | real | 0.5 | x-coordinate of explosion center |
yctr | real | 0.5 | y-coordinate of explosion center |
zctr | real | 0.5 | z-coordinate of explosion center |
The additional runtime parameters supplied with the sedov problem are listed in Table 31. This problem is configured to use the perfect-gas equation of state (gamma), and it is run in a unit box with gamma set to 1.4.
In this problem we create a planar density pulse in a region of
uniform pressure p0 and velocity u0, with the velocity normal to
the pulse plane. The density pulse is defined via
| ρ(x̃) = ρ0 + (ρ1 − ρ0) f(x̃) | (138) |
where x̃ is the distance from the pulse midplane and the shape function f is either a square pulse,
| f(x̃) = 1 for |x̃| < w/2, 0 otherwise | (139) |
or a Gaussian,
| f(x̃) = exp(−x̃²/w²) | (140) |
The advection problem tests the ability of the code to handle planar geometry, as does the Sod problem. It is also like the Sod problem in that it tests the code's treatment of flow discontinuities which move at one of the characteristic speeds of the hydrodynamical equations. (This is difficult because noise generated at a sharp interface, such as the contact discontinuity in the Sod problem, tends to move with the interface, accumulating there as the calculation advances.) However, unlike the Sod problem it compares the code's treatment of leading and trailing contact discontinuities (for the square pulse), and it tests the treatment of narrow flow features (for both the square and Gaussian pulse shapes). Many hydrodynamical methods have a tendency to clip narrow features or to distort pulse shapes by introducing artificial dispersion and dissipation (Zalesak 1987).
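For concreteness, the two pulse profiles can be written as a short function of the distance x̃ from the pulse midplane. This is an illustrative sketch; the square-pulse and Gaussian width conventions here are assumptions consistent with the parameters listed below, not code from the advect setup.

```python
import math

def pulse_density(xt, rho_in, rho_out, w, pulse_fctn):
    """Density of the advected pulse at signed distance xt from the
    pulse midplane (illustrative reconstruction of the two shapes)."""
    if pulse_fctn == 1:                      # square wave
        f = 1.0 if abs(xt) < w / 2 else 0.0
    elif pulse_fctn == 2:                    # Gaussian
        f = math.exp(-(xt / w) ** 2)
    else:
        raise ValueError("pulse_fctn must be 1 or 2")
    return rho_out + (rho_in - rho_out) * f
```

With pulse_fctn=1 the profile jumps between the ambient density and the pulse density, exercising leading and trailing contact discontinuities; with pulse_fctn=2 it is smooth but narrow, exercising clipping and dispersion.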
Variable | Type | Default | Description |
rhoin | real | 1 | Characteristic density inside the advected pulse (ρ1) |
rhoout | real | 10⁻⁵ | Ambient density (ρ0) |
pressure | real | 1 | Ambient pressure (p0) |
velocity | real | 10 | Ambient velocity (u0) |
width | real | 0.1 | Characteristic width of advected pulse (w) |
pulse_fctn | integer | 1 | Pulse shape function to use: 1=square wave, 2=Gaussian |
xangle | real | 0 | Angle made by pulse plane with x-axis (degrees) |
yangle | real | 90 | Angle made by pulse plane with y-axis (degrees) |
posn | real | 0.25 | Point of intersection between pulse midplane and x-axis |
The additional runtime parameters supplied with the advect problem are listed in Table 32. This problem is configured to use the perfect-gas equation of state (gamma), and it is run in a unit box with gamma set to 1.4. (The value of γ does not affect the analytical solution, but it does affect the timestep.)
To demonstrate the performance of FLASH on the advection problem, we have performed tests of both the square and Gaussian pulse profiles with the pulse normal parallel to the x-axis (θ = 0°) and at an angle to the x-axis (θ = 45°) in two dimensions. The square pulse used ρ1=1, ρ0=10⁻³, p0=10⁻⁶, u0=1, and w=0.1. With six levels of refinement in the domain [0,1]×[0,1], this value of w corresponds to having about 52 zones across the pulse width. The Gaussian pulse tests used the same values of ρ1, ρ0, p0, and u0, but with w=0.015625. This value of w corresponds to about 8 zones across the pulse width at six levels of refinement. For each test we performed runs at two, four, and six levels of refinement to examine the quality of the numerical solution as the resolution of the advected pulse improves. The runs with θ = 0° used zero-gradient (outflow) boundary conditions, while the runs performed at an angle to the x-axis used periodic boundaries.
Figure 18 shows, for each test, the advected density profile at t=0.4 in comparison with the analytical solution. The upper two frames of this figure depict the square pulse with θ = 0° and θ = 45°, while the lower two frames depict the Gaussian pulse results. In each case the analytical density pulse has been advected a distance u0t=0.4, and in the figure the axis parallel to the pulse normal has been translated by this amount, permitting comparison of the pulse displacement in the numerical solutions with that of the analytical solution.
The advection results show the expected improvement with increasing AMR refinement level Nref. Inaccuracies appear as diffusive spreading, rounding of sharp corners, and clipping. In both the square pulse and Gaussian pulse tests, diffusive spreading is limited to about one zone on either side of the pulse. For Nref=2 the rounding of the square pulse and the clipping of the Gaussian pulse are quite severe; in the latter case the pulse itself spans about two zones, which is the approximate smoothing length in PPM for a single discontinuity. For Nref=4 the treatment of the square pulse is significantly better, but the amplitude of the Gaussian is still reduced by about 50%. In this case the square pulse discontinuities are still being resolved with 2-3 zones, but the zones are now a factor of 25 smaller than the pulse width. With six levels of refinement the same behavior is observed for the square pulse, while the amplitude of the Gaussian pulse is now 93% of its initial value. The absence of dispersive effects (i.e., oscillation) despite the high order of the PPM interpolants is due to the enforcement of monotonicity in the PPM algorithm.
The diagonal runs are consistent with the runs which were parallel to the x-axis, with the possibility of a slight amount of extra spreading behind the pulse. However, note that we have determined density values for the diagonal runs by interpolation along the grid diagonal. The interpolation points are not centered on the pulses, so the density does not always take on its maximum value (particularly in the lowest-resolution case).
These results are consistent with earlier studies of linear advection with PPM (e.g., Zalesak 1987). They suggest that, in order to preserve narrow flow features in FLASH, the maximum AMR refinement level should be chosen so that zones in refined regions are at least a factor 5-10 smaller than the narrowest features of interest. In cases in which the features are generated by shocks (rather than moving with the fluid), the resolution requirement is not as severe, as errors generated in the preshock region are driven into the shock rather than accumulating as it propagates.
The problem of a wind tunnel containing a step was first described by Emery (1968), who used it to compare several hydrodynamical methods which are only of historical interest now. Woodward and Colella (1984) later used it to compare several more advanced methods, including PPM. Although it has no analytical solution, this problem is useful because it exercises a code's ability to handle unsteady shock interactions in multiple dimensions. It also provides an example of the use of FLASH to solve problems with irregular boundaries.
The problem uses a two-dimensional rectangular domain three units wide
and one unit high. Between x=0.6 and x=3 along the x-axis is a
step 0.2 units high. The step is treated as a reflecting boundary, as
are the lower and upper boundaries in the y direction. For the right-hand
x boundary we use an outflow (zero gradient) boundary condition, while
on the left-hand side we use an inflow boundary. In the inflow boundary
zones we set the density to ρ0, the pressure to p0, and the
velocity to u0, with the latter directed parallel to the x-axis.
The domain itself is also initialized with these values. For the
Emery problem we use
| ρ0 = 1.4,  p0 = 1,  u0 = 3 | (141) |
With γ = 1.4, these values give an ambient sound speed c0 = (γp0/ρ0)^{1/2} = 1, so the inflow is a Mach 3 flow.
The additional runtime parameters supplied with the windtunnel problem are listed in Table 33. This problem is configured to use the perfect-gas equation of state (gamma) with gamma set to 1.4. We also set xmax=3, ymax=1, Nblockx=15, and Nblocky=4 in order to create a grid with the correct dimensions. The version of divide_domain supplied with this problem adds three top-level blocks along the lower left-hand corner of the grid to cover the region in front of the step. Finally, we use xlboundary=-23 (user boundary condition) and xrboundary=-21 (outflow boundary) to instruct FLASH to use the correct boundary conditions in the x direction. Boundaries in the y direction are reflecting (-20).
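For reference, the corresponding fragment of a runtime-parameter file might look like the following (illustrative only; the parameter names are those given in the text, and the name = value and comment syntax is assumed):

```
xmax       = 3.
ymax       = 1.
Nblockx    = 15
Nblocky    = 4
xlboundary = -23    # user boundary condition (inflow)
xrboundary = -21    # outflow
ylboundary = -20    # reflecting
yrboundary = -20    # reflecting
```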
Until t=12 the flow is unsteady, exhibiting multiple shock reflections and interactions between different types of discontinuity. Figure 19 shows the evolution of density and velocity between t=0 and t=4 (the period considered by Woodward and Colella). Immediately a shock forms directly in front of the step and begins to move slowly away from it. Simultaneously the shock curves around the corner of the step, extending farther downstream and growing in size until it strikes the upper boundary just after t=0.5. The corner of the step becomes a singular point, with a rarefaction fan connecting the still gas just above the step to the shocked gas in front of it. Entropy errors generated in the vicinity of this singular point produce a numerical boundary layer about one zone thick along the surface of the step. Woodward and Colella reduce this effect by resetting the zones immediately behind the corner to conserve entropy and the sum of enthalpy and specific kinetic energy through the rarefaction. However, we are less interested here in reproducing the exact solution than in verifying the code and examining the behavior of such numerical effects as resolution is increased, so we do not apply this additional boundary condition. The errors near the corner result in a slight overexpansion of the gas there and a weak oblique shock where this gas flows back toward the step. At all resolutions we also see interactions between the numerical boundary layer and the reflected shocks which appear later in the calculation.
By t=1 the shock reflected from the upper wall has moved downward and has almost struck the top of the step. The intersection between the primary and reflected shocks begins at x ≈ 1.45 when the reflection first forms at t ≈ 0.65, then moves to the left, reaching x ≈ 0.95 at t=1. As it moves, the angle between the incident shock and the wall increases until t=1.5, at which point it exceeds the maximum angle for regular reflection (40° for γ = 1.4) and begins to form a Mach stem. Meanwhile the reflected shock has itself reflected from the top of the step, and here too the point of intersection moves leftward, reaching x ≈ 1.65 by t=2. The second reflection propagates back toward the top of the grid, reaching it at t=2.5 and forming a third reflection. By this time in low-resolution runs we see a second Mach stem forming at the shock reflection from the top of the step; this results from the interaction of the shock with the numerical boundary layer, which causes the angle of incidence to increase faster than in the converged solution. Figure 20 compares the density field at t=4 as computed by FLASH using several different maximum levels of refinement. Note that the size of the artificial Mach reflection diminishes as resolution improves.
The shear zone behind the first (``real'') Mach stem produces another interesting numerical effect, visible at t=3 and t=4: Kelvin-Helmholtz amplification of numerical errors generated at the shock intersection. The wave thus generated propagates downstream and is refracted by the second and third reflected shocks. This effect is also seen in the calculations of Woodward and Colella, although their resolution was too low to capture the detailed eddy structure we see. Figure 21 shows the detail of this structure at t=3 on grids with several different levels of refinement. The effect does not disappear with increasing resolution, for two reasons. First, the instability amplifies numerical errors generated at the shock intersection, no matter how small. Second, PPM captures the slowly moving, nearly vertical Mach stem with only 1-2 zones on any grid, so as it moves from one column of zones to the next, artificial kinks form near the intersection, providing the seed perturbation for the instability. This effect can be reduced by using a small amount of extra dissipation to smear out the shock, as discussed by Colella and Woodward (1984). This tendency of physical instabilities to amplify numerical noise vividly demonstrates the need to exercise caution when interpreting features in supposedly converged calculations.
Finally, we note that in high-resolution runs with FLASH we also see some Kelvin-Helmholtz rollup at the numerical boundary layer along the top of the step. This is not present in Woodward and Colella's calculation, presumably because their grid resolution is lower (corresponding to two levels of refinement for us) and because of their special treatment of the singular point.
Variable | Type | Default | Description |
p_ambient | real | 1 | Ambient pressure (p0) |
rho_ambient | real | 1.4 | Ambient density (ρ0) |
wind_vel | real | 3 | Inflow velocity (u0) |
The Shu-Osher problem (Shu and Osher, 1989) tests a shock-capturing scheme's ability to resolve small-scale flow features. It gives a good indication of the numerical (artificial) viscosity of a method. Since it is designed to test shock-capturing schemes, the equations of interest are the one-dimensional Euler equations for a single-species perfect gas.
In the problem, a (nominally) Mach 3 shock wave propagates into a sinusoidal density field. As the shock advances, two sets of density features appear behind the shock. One set has the same spatial frequency as the unshocked perturbations, but for the second set the frequency is doubled. Furthermore, the second set follows more closely behind the shock. None of these features is spurious. The test of the numerical method is to accurately resolve the dynamics and strengths of the oscillations behind the shock.
The shu_osher problem is initialized as follows.
On the domain −4.5 ≤ x ≤ 4.5, the shock is at x=xs at
t=0.0.
On either side of the shock,
| (ρ, p, u) = (ρL, pL, uL) for x ≤ xs;  (ρR + aρ sin(fρ x), pR, uR) for x > xs | (142) |
The problem is strictly one-dimensional; building 2d or 3d executables should give the same results along each x-direction grid line. For this problem, special boundary conditions are applied. The initial conditions should not change at the boundaries, and if they do, errors at the boundaries can contaminate the results. To avoid this possibility, a boundary condition subroutine was written to set the boundary values to their initial values.
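The initial state can be sketched in a few lines of NumPy (an illustrative translation, using the default parameter values listed below; the array names are ours):

```python
import numpy as np

# Default parameter values for the shu_osher problem
posn, a_rho, f_rho = -4.0, 0.2, 5.0
rho_l, p_l, u_l = 3.857143, 10.33333, 2.629369
rho_r, p_r, u_r = 1.0, 1.0, 0.0

# One-dimensional mesh on [-4.5, 4.5]
x = np.linspace(-4.5, 4.5, 400)
left = x <= posn

# Uniform post-shock state on the left; sinusoidal density on the right
rho = np.where(left, rho_l, rho_r + a_rho * np.sin(f_rho * x))
p   = np.where(left, p_l, p_r)
u   = np.where(left, u_l, u_r)
```

The same profile is simply replicated along every x-direction grid line in a 2-d or 3-d build.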
Variable | Type | Default | Description |
posn | real | -4.0 | Initial shock location (xs) |
rho_left | real | 3.857143 | Initial density to the left of the shock (ρL) |
rho_right | real | 1.0 | Nominal initial density to the right (ρR) |
p_left | real | 10.33333 | Initial pressure to the left (pL) |
p_right | real | 1.0 | Initial pressure to the right (pR) |
u_left | real | 2.629369 | Initial velocity to the left (uL) |
u_right | real | 0.0 | Initial velocity to the right (uR) |
a_rho | real | 0.2 | Amplitude of the density perturbations |
f_rho | real | 5.0 | Frequency of the density perturbations |
The purpose of the tests is to determine how much resolution, in terms of mesh cells per feature, a particular method requires to accurately represent small-scale flow features. Therefore all computations are carried out on equispaced meshes without adaptive refinement. Solutions are obtained at t=1.8. The reference solution, using 3200 mesh cells, is shown in Fig. . This solution was computed using PPM and Strang splitting (the default hydro and driver modules) at a CFL number of 0.8. Note the shock located at x ≈ 2.4, and the high-frequency density oscillations just to the left of the shock. When the grid resolution is insufficient, shock-capturing schemes underpredict the amplitude of these oscillations and may distort their shape.
Figure 24 shows the density field for the same scheme at 400 mesh cells and at 200 mesh cells. With 400 cells, the amplitudes are only slightly reduced compared to the reference solution; however, the shapes of the oscillations have been distorted. The slopes are steeper and the peaks and troughs are broader, which is most likely the result of overcompression from the contact-steepening part of the PPM algorithm. For the solution on 200 mesh cells, the amplitudes of the high-frequency oscillations are significantly underpredicted.
The Brio-Wu MHD shock tube problem (Brio and Wu 1988) is a coplanar magnetohydrodynamic counterpart of the hydrodynamic Sod problem (section 12.1.1). The initial left and right states are, respectively, ρl=1, ul=vl=0, pl=1, (By)l=1; and ρr=0.125, ur=vr=0, pr=0.1, (By)r=−1. In addition, Bx=0.75 and γ = 2. This is a good problem to test wave properties of a particular MHD solver, because it involves two fast rarefaction waves, a slow compound wave, a contact discontinuity, and a slow shock wave.
The conventional 800 point solution to this problem computed with the FLASH 2.0 code is presented in Figures 25, 26, 27, 28, and 29. The figures show the distribution of density, normal and tangential velocity components, tangential magnetic field component, and pressure at t=0.1 (in non-dimensional units). As can be seen, the code accurately and sharply resolves all waves present in the solution. There is a small undershoot in the solution at x ≈ 0.44, which results from a discontinuity-enhancing monotonized centered gradient limiting function (LeVeque 1997). This undershoot can easily be removed if a less aggressive limiter, for example the minmod or van Leer limiter, is used instead. This, however, will degrade the sharp resolution of other discontinuities.
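To make the limiter trade-off concrete, here are sketches of the minmod and monotonized centered (MC) slope limiters in their standard textbook forms (illustrative code, not taken from the FLASH MHD module); a and b are the one-sided slope estimates in a zone.

```python
def minmod(a, b):
    """Minmod limiter: zero at extrema, otherwise the smaller slope.
    The most diffusive of the common choices."""
    if a * b <= 0.0:
        return 0.0
    s = 1.0 if a > 0 else -1.0
    return s * min(abs(a), abs(b))

def mc(a, b):
    """Monotonized centered limiter: the centered slope, clipped to at
    most twice either one-sided slope. Steeper than minmod, which
    sharpens discontinuities at the cost of occasional undershoots."""
    if a * b <= 0.0:
        return 0.0
    s = 1.0 if a > 0 else -1.0
    return s * min(2.0 * abs(a), 2.0 * abs(b), 0.5 * abs(a + b))
```

For the same slope pair the MC limiter returns a value at least as steep as minmod, which is why switching to minmod removes the undershoot but smears the other discontinuities.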
The linear instability of self-gravitating fluids was first explored by Jeans (1902) in connection with the problem of star formation. The nonlinear phase of the instability is of greatest astrophysical interest nowadays, but the linear instability provides a very useful test of the coupling of gravity to hydrodynamics in FLASH.
The jeans problem allows one to examine the behavior of sinusoidal,
adiabatic
density perturbations in both the pressure-dominated and gravity-dominated
limits. This problem uses periodic boundary conditions.
The equation of state is that of a perfect gas.
The initial conditions at t=0 are
| ρ(x) = ρ0 [1 + δ cos(k·x)] | (144) |
| p(x) = p0 [1 + γδ cos(k·x)] | (145) |
with zero initial velocity. Perturbations with wavenumber k smaller than the Jeans wavenumber kJ are unstable, while those with k > kJ oscillate with frequency ω given by the dispersion relation
| ω² = c0²(k² − kJ²),  kJ ≡ (4πGρ0)^{1/2}/c0 | (146) |
where c0 = (γp0/ρ0)^{1/2} is the unperturbed sound speed.
We checked the dispersion relation (146) for stable perturbations with γ = 5/3 by fixing ρ0 and p0 and performing several runs with different k. We followed each case for roughly five oscillation periods using a uniform grid in the box [0, L]². We used ρ0 = 1.5×10⁷ g cm⁻³ and p0 = 1.5×10⁷ dyn cm⁻², yielding kJ = 2.747 cm⁻¹. The perturbation amplitude δ was fixed at 10⁻³. The box size L is chosen so that kJ is smaller than the smallest nonzero wavenumber which can be resolved on the grid:
| kJ < kmin = 2π/L | (147) |
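As an illustrative check, the quoted Jeans wavenumber follows directly from the parameter values above:

```python
import math

G = 6.674e-8           # gravitational constant, cm^3 g^-1 s^-2
gamma = 5.0 / 3.0
rho0 = 1.5e7           # g cm^-3
p0 = 1.5e7             # dyn cm^-2

c0 = math.sqrt(gamma * p0 / rho0)               # adiabatic sound speed
kJ = math.sqrt(4.0 * math.pi * G * rho0) / c0   # Jeans wavenumber, cm^-1
```

With these values c0 ≈ 1.291 cm s⁻¹ and kJ ≈ 2.747 cm⁻¹, in agreement with the number quoted in the text.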
The resulting kinetic, thermal, and potential energies as functions of time
for one choice of k are shown in Figure 30
together with the analytic solution, which is given in two dimensions by
| Ekin(t) = (ρ0 δ²ω²L²/4k²) sin²ωt | (148) |
| ΔEpot(t) = −(πGρ0²δ²L²/k²) cos²ωt | (149) |
with the perturbation to the thermal energy making up the difference required by total energy conservation at second order in δ.
The additional runtime parameters supplied with the jeans problem are listed in Table 35. This problem is configured to use the perfect-gas equation of state (gamma), and it is run in a two-dimensional unit box with gamma set to 1.67. The refinement marking routine (ref_marking.F90) supplied with this problem refines blocks whose mean density exceeds a given threshold. Since the problem is not spherically symmetric, the multigrid Poisson solver should be used.
Variable | Type | Default | Description |
rho0 | real | 1 | Initial unperturbed density (ρ0) |
p0 | real | 1 | Initial unperturbed pressure (p0) |
amplitude | real | 0.01 | Perturbation amplitude (δ) |
lambdax | real | 1 | Perturbation wavelength in x direction (λx = 2π/kx) |
lambday | real | 1 | Perturbation wavelength in y direction (λy = 2π/ky) |
lambdaz | real | 1 | Perturbation wavelength in z direction (λz = 2π/kz) |
delta_ref | real | 0.1 | Refine a block if the maximum density contrast relative to ρref is greater than this |
delta_deref | real | 0.1 | Derefine a block if the maximum density contrast relative to ρref is less than this |
reference_density | real | 1 | Reference density for grid refinement (ρref) |
Density contrast is used to determine which blocks to refine; it is defined as
| C ≡ maxblock |ρ − ρref| / ρref | (150) |
The homologous dust collapse problem is used to test the ability of the code to solve self-gravitating problems in which the flow geometry is spherical and gas pressure is negligible. The problem was first described by Colgate and White (1966) and has been used by Mönchmeyer and Müller (1989) to test hydrodynamical schemes in curvilinear coordinates. As the Poisson solvers currently included with FLASH do not yet work in curvilinear coordinates, we solve this problem using a 3D Cartesian grid.
The initial conditions consist of a uniform sphere of radius r0 and
density ρ0 at rest. The pressure p0 is taken to be constant and
very small:
| p0 ≪ Gρ0²r0² | (151) |
The collapse of the dust sphere is self-similar; the cloud should remain
spherical and uniform as it collapses. The radius of the cloud, r(t),
should satisfy
| (8πGρ0/3)^{1/2} t = [(r/r0)(1 − r/r0)]^{1/2} + cos⁻¹[(r/r0)^{1/2}] | (152) |
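The homologous collapse relation (152) can be checked numerically. The sketch below uses the standard parametric solution for pressureless collapse from rest, r/r0 = cos²θ, and verifies that complete collapse occurs at the usual free-fall time (illustrative code; the function name is ours):

```python
import math

G = 6.674e-8  # gravitational constant, cm^3 g^-1 s^-2

def collapse_time(x, rho0):
    """Time for a uniform pressureless sphere of initial density rho0,
    starting at rest, to collapse to a fraction x = r/r0 of its
    initial radius, via the parametric form r/r0 = cos^2(theta)."""
    theta = math.acos(math.sqrt(x))
    return math.sqrt(3.0 / (8.0 * math.pi * G * rho0)) * \
        (theta + 0.5 * math.sin(2.0 * theta))
```

Setting x → 0 gives θ → π/2, so the total collapse time is (3π/32Gρ0)^{1/2}, the free-fall time of the initial sphere.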
Results of a dust_coll run using FLASH 1.62 appear in Figure 32. This run used 4³ top-level blocks and seven levels of refinement, for an effective resolution of 2048³ at the center of the grid. The multipole Poisson solver was used with a maximum multipole moment l = 0. The initial conditions used ρ0 = 10⁹ g cm⁻³ and r0 = 6.5×10⁸ cm. In Figure 32a, the density, pressure, and velocity are scaled by 2.43×10⁹ g cm⁻³, 2.08×10¹⁷ dyn cm⁻², and 7.30×10⁹ cm s⁻¹, respectively. In Figure 32b they are scaled by 1.96×10¹¹ g cm⁻³, 2.08×10¹⁷ dyn cm⁻², and 2.90×10¹⁰ cm s⁻¹. Note that within the cloud the profiles are very isotropic, as indicated by the small dispersion in each profile. Significant anisotropy is only present for `fluff' material flowing in through the Cartesian boundaries. In particular, it is encouraging that the velocity field remains isotropic all the way into the center of the grid; this shows the usefulness of refining spherically symmetric problems near r=0. However, as material flows inward past refinement boundaries, small ripples develop in the density profile due to interpolation errors. These remain spherically symmetric but increase in amplitude as they are compressed. Nevertheless, they are still only a few percent in relative magnitude by the second frame. The other numerical effect of note is a slight spreading at the edge of the cloud. This does not appear to worsen significantly with time. If one takes the radius at which the density drops to one-half its central value as the radius of the cloud, then the observed collapse factor agrees with our expectation from equation (152). Overall our results, including the numerical effects, agree well with those of Mönchmeyer and Müller (1989).
The additional runtime parameters supplied with the dust_coll problem are listed in Table 36. This problem is configured to use the perfect-gas equation of state (gamma), and it is run in a three-dimensional box with gamma set to 1.67. The refinement marking routine (ref_marking.F90) supplied with this problem refines blocks containing the center of the cloud. Since the problem is spherically symmetric, either the multigrid or multipole solvers can be used.
Variable | Type | Default | Description |
rho0 | real | 1 | Initial cloud density (ρ0) |
R_init | real | 0.05 | Initial cloud radius (r0) |
T_ambient | real | 1 | Initial ambient temperature |
xctr | real | 0.5 | x-coordinate of cloud center |
yctr | real | 0.5 | y-coordinate of cloud center |
zctr | real | 0.5 | z-coordinate of cloud center |
The poistest problem tests the convergence properties of
the multigrid Poisson solver on a multidimensional, highly (locally) refined
grid. This problem is described by Huang and Greengard (2000).
The source function consists of a sum of thirteen two-dimensional
Gaussians:
| f(x, y) = Σi=1..13 exp[−σi ((x − xi)² + (y − yi)²)] | (153) |
where the centers (xi, yi) and widths σi are given in the table below.
i | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
xi | 0 | -1 | -1 | 0.28125 | 0.5 | 0.3046875 | 0.3046875 |
yi | 0 | 0.09375 | 1 | 0.53125 | 0.53125 | 0.1875 | 0.125 |
σi | 0.01 | 4000 | 20000 | 80000 | 16 | 360000 | 400000 |
i | 8 | 9 | 10 | 11 | 12 | 13 | |
xi | 0.375 | 0.5625 | -0.5 | -0.125 | 0.296875 | 0.5234375 | |
yi | 0.15625 | -0.125 | -0.703125 | -0.703125 | -0.609375 | -0.78125 | |
σi | 2000 | 18200 | 128 | 49000 | 37000 | 18900 |
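For reference, the source function (153) can be evaluated directly from the tabulated centers and widths (an illustrative Python transcription of the table):

```python
import math

# Centers (x_i, y_i) and widths sigma_i from the table above
xi = [0.0, -1.0, -1.0, 0.28125, 0.5, 0.3046875, 0.3046875,
      0.375, 0.5625, -0.5, -0.125, 0.296875, 0.5234375]
yi = [0.0, 0.09375, 1.0, 0.53125, 0.53125, 0.1875, 0.125,
      0.15625, -0.125, -0.703125, -0.703125, -0.609375, -0.78125]
sigma = [0.01, 4000.0, 20000.0, 80000.0, 16.0, 360000.0, 400000.0,
         2000.0, 18200.0, 128.0, 49000.0, 37000.0, 18900.0]

def source(x, y):
    """Sum of the thirteen Gaussians defining the poistest source term."""
    return sum(math.exp(-s * ((x - cx)**2 + (y - cy)**2))
               for cx, cy, s in zip(xi, yi, sigma))
```

The widely varying σi make some Gaussians extremely narrow, which is what drives the strongly localized refinement this test is designed to exercise.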
The poistest problem uses no additional runtime parameters beyond those required by the rest of the code.
Frequently when doing simulations, one needs to initialize the computational domain with a one-dimensional model from a stellar evolution (or other) code. A simple framework for accomplishing this task is provided by the sample_map problem. This is intended to be a template for users to modify to suit their needs.
This problem is composed of two main routines, init_1d and the familiar init_block. init_1d reads the initial model from disk, determines which variables are present and how they map into the variables defined in FLASH, and stores the initial model in arrays that are then used by init_block. The general format of an initial model file is a single comment line, a line giving the number of variables contained in the initial model, the 4-character names of each variable (one per line), followed by the data (spatial coordinate first), with the variables in the same order as the list of names. An example of this format follows.
# sample 1-d model
number of variables = 7
dens
pres
ener
gamc
game
fuel
ash
0.01 10. 100. 25. 1.4 1.4 1.0 0.0
0.02 9.5 95. 25. 1.4 1.4 0.99 0.0
:
In the above sample file, we define seven variables. The first zone starts with the coordinate of the zone center (0.01) and then lists the density (10.), pressure (100.), and so forth, with one entry for each variable per line. The next zone of the initial model is listed immediately below this line. init_1d will continue to read in zones for the initial model until it encounters the end of the file.
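A minimal reader for this format might look like the following sketch (read_initial_model is a hypothetical name for illustration, not the actual init_1d routine):

```python
def read_initial_model(lines):
    """Parse the 1-d initial model format described above: a comment
    line, a variable count, one variable name per line, then rows of
    data with the spatial coordinate first."""
    it = iter(lines)
    next(it)                                  # skip the comment line
    nvar = int(next(it).split("=")[1])        # "number of variables = 7"
    names = [next(it).strip() for _ in range(nvar)]
    coords, data = [], []
    for line in it:                           # read zones until EOF
        vals = [float(v) for v in line.split()]
        if not vals:
            continue
        coords.append(vals[0])
        data.append(vals[1:])
    return names, coords, data
```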
FLASH contains more variables than the seven defined in this input file, and it will initialize any variables not specified in the input file to zero. Additionally, sometimes a variable is specified in the input file, but there is no corresponding variable defined in FLASH. In this case, init_1d will produce a warning, listing the variables it does not know about. Finally, there is no need for the variables to be listed in the same order as they are stored in the FLASH data structures; they will be sorted as each zone is read from the initial model.
The initial model is stored in two data structures: xzn(N1D_MAX) contains the coordinates of the initial model zone centers, and model_1d(N1D_MAX, nvar) contains the values of the variables defined in the initial model. These are stored in the same order as the variables in the solution array unk maintained by FLASH. N1D_MAX is a parameter specifying the maximum number of zones in the initial model (currently set to 2048).
These data structures are passed to the init_block function, which loops over all of the zones in the current block, determines the x-, y-, and z-coordinates of each zone, and performs an interpolation to find the values of the initial variables in that zone. This interpolation attempts to construct a zone average from the values of the initial model at the zone edges and center.
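The mapping step can be sketched as follows; this illustrative helper (map_model_to_zones is a hypothetical name) uses simple linear interpolation in place of the zone-averaging construction described above:

```python
import numpy as np

def map_model_to_zones(zone_centers, xzn, model_1d):
    """Interpolate each model variable onto the given zone-center
    coordinates. Linear interpolation stands in for the actual
    zone-averaging; model_1d has shape (nzones_model, nvar)."""
    model_1d = np.asarray(model_1d, dtype=float)
    return np.column_stack([np.interp(zone_centers, xzn, model_1d[:, j])
                            for j in range(model_1d.shape[1])])
```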
There are two parameters for this problem. model_file is a string giving the name of the file from which to read the initial model. imap_dir is an integer that specifies the direction along which to map the initial model: imap_dir = 1 maps along the x direction, 2 maps along the y direction, and 0 maps it in a circle in the x-y plane.
The setup script, found in the FLASH root directory, provides the primary command-line interface to the FLASH source code. It configures the source tree for a given problem and target machine and creates files needed to parse the runtime parameter file and make the FLASH executable. More description of what setup does may be found in Section 3. Here we describe its basic usage.
Running setup without any options prints a message describing the available options:
[sphere 5:09pm] % ./setup
usage: setup <problem-name> [options]
problems: see /home/user/FLASH2.0/setups/
options: -verbose -portable -auto -[123]d -maxblocks=<#>
[-site=<site> | -ostype=<ostype>] [-debug | -test] -preprocess -report -objdir=<relative obj directory>
For compatibility with older versions of setup, the syntax
setup <problem-name> <ostype> [options]
is also accepted, so long as <ostype> does not begin with a dash (-). In this case the -site and -ostype options cannot be used. Available values for the mandatory option (the name of the problem to configure) are determined by scanning the setups/ directory.
A ``problem'' consists of a set of initial and boundary conditions, possibly additional physics (e.g., a subgrid model for star formation), and a set of adjustable parameters. The directory associated with a problem contains source code files which implement the initial conditions and, in a few cases, the boundary conditions and extra physics, together with a configuration file, read by setup, which contains information on required physics modules and adjustable parameters.
setup determines site-dependent configuration information by looking in source/sites/ for a directory with the same name as the output of the hostname command; failing this, it looks in source/sites/Prototypes/ for a directory with the same name as the output of the uname command. The site and operating system type can be overridden with the -site and -ostype command-line options. Only one of these options can be used. The directory for each site or operating system type contains a makefile fragment (Makefile.h) that sets command names, compiler flags, and library paths, and any replacement or additional source files needed to compile FLASH for that machine type.
setup uses the problem and site/OS type, together with a user-supplied file called Modules which lists the code modules to include, to generate a directory called object/ which contains links to the appropriate source files and makefile fragments. It also creates the master makefile (object/Makefile) and several Fortran include files that are needed by the code in order to parse the runtime parameter file. After running setup, the user can make the FLASH executable by running gmake in the object/ directory (or from the FLASH root directory, if the -portable option is not used with setup). The optional command-line modifiers have the following interpretations:
-verbose | Normally setup echoes to the standard output summary messages indicating what it is doing. Including the -verbose option causes it to also list the links it creates. |
-portable | This option creates a portable build directory. Instead of object/, setup creates object_problem/, and it copies instead of linking to the source files in source/ and setups/. The resulting build directory can be placed into a tar archive and sent to another machine for building (use the Makefile created by setup in the tar file). |
-auto | This modifier replaces -defaults, which is still present in the code but has been deprecated. Normally setup requires that the user supply a plain text file called Modules (in the FLASH root directory) which specifies which code modules to include. A sample Modules file appears in Figure 34. Each line is either a comment (preceded by a hash mark (#)) or a module include statement of the form INCLUDE module. Sub-modules are indicated by specifying the path to the sub-module in question; in the example, the sub-module gamma of the eos module is included. If a module has a default sub-module, but no sub-module is specified, setup automatically selects the default using the module's configuration file. |
The -auto option enables setup to generate a ``rough draft'' of a Modules file for the user. The configuration file for each problem setup specifies a number of code module requirements; for example, a problem may require the perfect-gas equation of state (materials/eos/gamma) and an unspecified hydro solver (hydro). With -auto, setup creates a Modules file by converting these requirements into module include statements. In addition, it checks the configuration files for the required modules and includes any of their required modules, eliminating duplicates. Most users configuring a problem for the first time will want to run setup with -auto to generate a Modules file, then edit Modules directly to specify different sub-modules. After editing Modules in this way, re-run setup without -auto to incorporate the changes into the code configuration. | |
-[123]d | By default setup creates a makefile which produces a FLASH executable capable of solving two-dimensional problems (equivalent to -2d). To generate a makefile with options appropriate to three-dimensional problems, use -3d. To generate a one-dimensional code, use -1d. These options are mutually exclusive and cause setup to add the appropriate compilation option to the makefile it generates. |
-maxblocks=# | This option is also used by setup in constructing the makefile compiler options. It determines the amount of memory allocated at runtime to the adaptive mesh refinement (AMR) block data structure. For example, to allocate enough memory on each processor for 500 blocks, use -maxblocks=500. If the default block buffer size is too large for your system, you may wish to try a smaller number here (the defaults are currently defined in source/driver/physicaldata.fh). Alternatively, you may wish to experiment with larger buffer sizes if your system has enough memory. |
-debug | The default Makefile built by setup will use the optimized setting for compilation and linking. Using -debug will force setup to use the flags relevant for debugging (e.g., including -g in the compilation line). |
-test | When FLASH is tested by the automated test suite, -test chooses the proper compilation arguments for the test executable. |
-preprocess | This option will preprocess all of the files before compilation. This is useful for machines whose compilers do not support preprocessing. |
-report | Causes setup to list all of the modules used by the current configuration. |
-objdir | Overrides the default object directory with one whose name is specified by this parameter. |
Figure 34: sample Modules file, beginning ``# Modules file constructed for rt problem by setup -auto''.
To set runtime parameters to values other than the defaults, create a runtime parameter file named flash.par in the directory from which FLASH is to be run. The format of this file is described briefly in Section , and in more detail in Section 16.3.
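A flash.par file is simply a list of ``name = value'' assignments, one per line, with comments introduced by a hash mark (#). As a sketch (the parameter names shown are common FLASH runtime parameters, but the values here are purely illustrative):

```
# illustrative flash.par fragment
basenm      = "sedov_2d_6lev_"   # base name for output files
lrefine_max = 6                  # maximum AMR refinement level
nend        = 1000               # maximum number of timesteps
```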
Setup also creates two functions that are used by FLASH. buildstamp takes a file logical unit number and outputs the date and time at which the current FLASH executable was set up, along with the platform information. flash_release returns a character string containing the full version number (including the minor version number) of the present build of FLASH.
sfocu is intended as a replacement for focu (available in previous versions of FLASH) and is mainly used as part of an automated testing suite called flash_test.
sfocu is a serial utility which examines two FLASH checkpoint files and decides whether or not they are ``equal'' to ensure that any changes made to FLASH do not adversely affect subsequent simulation output. By ``equal,'' we mean that:
Thus, sfocu ignores information such as the particular numbering of the blocks, the timestamp, the build information, and so on.
sfocu can read both HDF4 and HDF5 FLASH checkpoint files. It does not support f77 checkpoint files, and has not been tested with FLASH checkpoints that span multiple files. Although sfocu is a serial program, it is able to do comparisons on the output of large parallel simulations. sfocu has been used on IRIX, Linux, AIX, and OSF1.
The process is entirely manual, although Makefiles for certain machines have been provided. There are a few compile-time options which you set via the following preprocessor definitions in the Makefile:
There are no command line options. Simply run the command sfocu <file1> <file2>. Sample output follows:
Comparing: a/windtunnel_4lev_hdf_chk_0001 b/windtunnel_4lev_hdf_chk_0001
Norm used: d(a,b) = abs(2(a-b)) / max(abs(a+b), 1e-99)
Total leaf blocks compared: 1032

Var    Bad Blocks    Min Error    Max Error
===================================================
pres   868           0            6.59229e-11
temp   880           0            1.95468e-10
gamc   0             0            0
dens   867           0            5.98321e-11
velx   866           0            1.87844e-11
vely   1032          0            2773.88
velz   0             0            0
ener   867           0            3.41799e-11
game   0             0            0
1      885           0            1.95472e-10

FAILURE
"Bad Blocks" is the number of leaf blocks where the data was found to differ between datasets; "Min Error" and "Max Error" are the minimum and maximum of the norm defined in the output. The last line makes it easier for other programs to parse sfocu output: when the files are identical, the line will instead read SUCCESS.
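Because the verdict always appears alone on the last line, a test harness can check it without parsing the full error table. A minimal Bourne-shell sketch (the sfocu output here is simulated so the snippet is self-contained; a real harness would capture the output of sfocu <file1> <file2>):

```shell
# Simulated sfocu output; in practice this would come from
# running `sfocu file1 file2` and capturing its stdout.
sfocu_output='Total leaf blocks compared: 1032
pres 868 0 6.59229e-11
FAILURE'

# The verdict (SUCCESS or FAILURE) is always the last line.
verdict=$(printf '%s\n' "$sfocu_output" | tail -n 1)
if [ "$verdict" = "SUCCESS" ]; then
    echo "checkpoints match"
else
    echo "checkpoints differ"
fi
```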
If the SHORT_REPORT compile-time option isn't used, sfocu will also report the maximum, minimum and sum of the variables in the two files, to give you an idea of the scales involved. Note that the error norm is dimensionless.
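The dimensionless norm reported above is easy to evaluate by hand for a single pair of values. A small sketch using awk (the two sample values are arbitrary):

```shell
# Evaluate sfocu's dimensionless error norm,
#   d(a,b) = |2(a-b)| / max(|a+b|, 1e-99),
# for one pair of sample values.
a=1.0
b=1.000001
norm=$(awk -v a="$a" -v b="$b" 'BEGIN {
    num = 2 * (a - b); if (num < 0) num = -num;   # |2(a-b)|
    den = a + b;       if (den < 0) den = -den;   # |a+b|
    if (den < 1e-99) den = 1e-99;                 # guard against division by zero
    printf "%.3e", num / den
}')
echo "$norm"
```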
It is possible for sfocu to miss machine-precision variations in the data on certain machines because of compiler or library issues, or possibly even bugs. This has only been observed on one platform, where the compiler produced code that ignored IEEE rules until the right flag was found.
fidlr is a set of routines, written in IDL, that can read and plot data files produced by FLASH. The routines include programs which can be run from the IDL command line to read 1D, 2D, or 3D FLASH datasets, interactively analyze datasets, and interpolate them onto uniform grids. Other IDL programs (xflash and cousins) provide a graphical interface to these routines, enabling users to read and plot AMR datasets.
Currently, the fidlr routines support 1, 2, and 3-dimensional datasets in the FLASH HDF or HDF5 formats. Both plotfiles and checkpoint files are supported, as they differ only in the number of variables stored, and possibly the numerical precision. Since IDL does not directly support HDF5, the call_external function is used to interface with a set of C routines that read in the HDF5 data (see section 5). Using these routines requires that the HDF5 library be installed on your system and that the shared-object library be compiled before reading in data. Since the call_external function is used, the demo version of IDL will not run the routines.
For basic plotting operations, the easiest way to generate plots of FLASH data is to use one of the widget interfaces: xflash1d for 1d data, xflash for 2d data, and xflash3d for 3d data. For more advanced analysis, the read routines can be used to read the data into IDL, and it can then be accessed through the common blocks. Examples of using the fidlr routines from the command line are provided in § 15.6. Additionally, some scripts demonstrating how to analyze FLASH data using the fidlr routines are described in Section 15.5 (see for example radial.pro).
The driver routines provided with fidlr visualize the AMR data by first converting it to a uniform mesh. This allows for ease of plotting and manipulation, including contour plotting, but it is less efficient than plotting the native AMR structure. Analysis can still be performed directly on the native data through the command line.
fidlr is distributed with FLASH and contained in the tools/fidlr/ directory. In addition to the IDL procedures, a README file is present which contains up-to-date information on changes and installation requirements.
These routines were written and tested using IDL v5.3 for IRIX. They should work without difficulty on any UNIX machine with IDL installed; any functionality of fidlr under Windows is purely coincidental. Later versions of IDL no longer support GIF files due to copyright difficulties, so outputting to an image file may not work. At present, we do not have access to this version, but it should be possible to replace the GIF functionality with PNG images. It is possible to run IDL in a `demo' mode if there are not enough licenses available. Unfortunately, some parts of fidlr will not function in this mode, as certain features of IDL are disabled. This guide assumes that you are running the full version of IDL.
Installation of fidlr requires defining some environment variables, and compiling the support for HDF5 files. These procedures are described below.
The FLASH IDL routines are located in the tools/fidlr/ subdirectory of the FLASH root directory. To use them you must set two environment variables. First set the value of XFLASH_DIR to the location of the FLASH IDL routines; for example, under csh, use
setenv XFLASH_DIR flash-root-path/tools/fidlr
where flash-root-path is the absolute path of the FLASH root directory. This variable is used in the plotting routines to find the customized color table for xflash, as well as to identify the location of the shared-object libraries compiled for HDF5 support.
Next, make sure you have an IDL_DIR environment variable set. This should point to the directory in which the IDL distribution is installed. For example, if IDL is installed in idl-root-path, then you would define:
setenv IDL_DIR idl-root-path
Finally, you need to tell IDL where to find the fidlr routines. This is accomplished through the IDL_PATH environment variable:
setenv IDL_PATH ${XFLASH_DIR}:${IDL_DIR}:${IDL_DIR}/lib
If you already have an IDL_PATH environment variable defined, just add XFLASH_DIR to the beginning of it. You may wish to include these commands in your .cshrc or .profile file (depending on your shell) to avoid having to reissue them every time you log in.
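For Bourne-type shells (sh, bash), the equivalent commands use export; the two paths below are placeholders for your actual FLASH and IDL installation locations:

```shell
# Bourne-shell equivalents of the csh setenv commands above.
# Replace the two paths with your actual FLASH and IDL locations.
XFLASH_DIR=/path/to/flash/tools/fidlr
IDL_DIR=/path/to/idl
IDL_PATH=${XFLASH_DIR}:${IDL_DIR}:${IDL_DIR}/lib
export XFLASH_DIR IDL_DIR IDL_PATH
echo "$IDL_PATH"
```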
For fidlr to read HDF5 files, you need to install the HDF5 library on your machine and compile the wrapper routines. The HDF5 libraries can be obtained in either source or binary form for most Unix platforms from http://hdf.ncsa.uiuc.edu. The Makefile supplied with fidlr will create the shared-object library on an SGI, but will need to be edited for other platforms. The compiler flags for a shared-object library for different versions of Unix can be found in
${IDL_DIR}/external/call_external/C/callext_unix.txt
The compilation flags in the Makefile should be modified according to the instructions in that file. The Makefile also needs to know where the HDF5 library is installed; this is set through the HDF5path definition in the Makefile.
It is important that you compile the shared-object to conform to the same application binary interface (ABI) that IDL was compiled with. On an SGI, IDL version 5.2.1 and later use the n32 ABI, while versions before this are o32. The HDF library will also need to be compiled in the same format. You can check the format of the HDF5 library and your version of IDL with the UNIX file command.
IDL interacts with external programs through the call_external function. Any arguments are passed by reference through the standard C command line argument interface, argc and argv. These are recast into pointers of the proper type in the C routines. The C wrappers call the appropriate HDF functions to read the data and return it through the argc pointers.
fidlr uses 8-bit color tables for all of its plotting. On displays with higher color depths, it may be necessary to use color overlays to get the proper colors on your display. For SGI machines, launching IDL with the start.pro script will enable 8-bit pseudocolor overlays. For Linux boxes, setting the X color depth to 24 bits per pixel and launching IDL with the start_linux.pro script usually produces proper colors.
The fidlr routines access the data contained in the FLASH output files through a series of common blocks. For the most part, the variable names are consistent with the names used in FLASH. It is assumed that the user is familiar with the block structured AMR format employed by FLASH (refer to § 5 for full details of the output format). All of the unknowns are stored together in the unk data structure, which mirrors that contained in FLASH itself. The list of variable names is contained in the string array unk_names. A utility function var_index is provided to convert between a 4-character string name and the index into unk. For example,
unk[var_index('dens'),*,*,*,*]
selects the density from the unk array. Recall that IDL uses zero-based indexing for arrays.
fidlr will attempt to create derived variables from the variables stored in the output file if it can. For example, the total velocity can be created from the components of the velocity, if they are all present. A separate data structure varnames contains the list of variables in the output file plus any additional variables fidlr knows how to derive.
At present, the global common blocks defined by fidlr are:
common size, tot_blocks, nvar, nxb, nyb, nzb, ntopx, ntopy, ntopz
common time, time, dt
common tree, lrefine, nodetype, gid, coord, size, bnd_box
common vars, unk, unk_names
common particles, numParticles, particles

These can be defined on the command line by executing the def_common script provided in the fidlr directory. This allows for analysis of the FLASH data through the command line with any of IDL's built-in functions. For example, to find the minimum density in the leaf blocks of the file filename_hdf_chk_0000, you could execute:

IDL> .run def_common
IDL> read_amr, 'filename_hdf_chk_0000'
IDL> leaf_blocks = where(nodetype EQ 1)
IDL> print, min(unk[var_index('dens'),*,*,*,leaf_blocks])

Here, in addition to the unk data structure, we used the nodetype array, which is set to 1 if a block is a leaf (as it is in FLASH).
The main interface to the fidlr routines for plotting 2-dimensional data sets is xflash. xflash produces colormap plots of FLASH data with optional velocity vectors, contours, and the AMR block structure overlaid. The basic operation of xflash is to specify a file or range of files, probe the file for the list of variables it contains (through the discover variables button described below), and then specify the remaining plot options.
xflash can output to the screen, postscript, or an image file (GIF). If the data is significantly higher resolution than the output device, then xflash (through xplot_amr.pro) will sub-sample the image by one or more levels of refinement before plotting.
Once the image is plotted, the query button will become active. Pressing query and then clicking anywhere in the domain will pop up a window containing the values of all the FLASH variables in the zone nearest the cursor. The query function uses the actual FLASH data, not the interpolated, uniformly gridded data generated for the plots.
Typing xflash at the IDL command prompt will launch the main xflash widget, shown in Figure 35(a). The widget is broken into several sections, with some features initially disabled. These sections are explained below.
File Options
The file to be visualized is specified by the path, the basename (the same base name used in the flash.par), any file type information appended to it (e.g., 'hdf_chk_'), and the range of suffixes to loop through. An optional parameter, step, controls how many files to skip when looping over the files. By default, xflash sets the path to the working directory from which IDL was started. xflash will spawn gzip to decompress the file if it is stored compressed on disk and the file is gzipped box is checked. After the file is read, it will be recompressed.
Once the file options are set, clicking on discover variables will read the variable list from the first file specified, and use the variable names to populate the variable selection box. xflash will automatically determine if the file is an HDF or HDF5 file, and read the `unknown names' record to get the variable list. This will work for both plotfiles and checkpoint files generated by FLASH.
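The filename composition described above can be sketched in shell; the base name, type string, and suffix range below are arbitrary examples (with a step of 2):

```shell
# Compose the checkpoint filenames xflash would loop over for a
# hypothetical basename "sedov_2d_6lev_", type string "hdf_chk_",
# suffix range 0 to 4, and step 2 (suffixes are zero-padded to 4 digits).
base="sedov_2d_6lev_"
filetype="hdf_chk_"
files=$(for i in 0 2 4; do
    printf '%s%s%04d\n' "$base" "$filetype" "$i"
done)
echo "$files"
```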
Output Options
A plot can be output to the screen (default), a Postscript file, or a GIF file. The output filenames are composed from the basename + variable name + suffix. For outputs to the screen or GIF, the plot size options allow you to specify the image size in pixels. For Postscript output, xflash chooses portrait or landscape orientation depending on the aspect ratio of the domain.
Problem
The problem dropbox allows you to select one of the predefined problem defaults. This is provided solely for convenience, as users frequently want to plot the same problem using the same data ranges. This will load the options (data ranges, velocity parameters, and contour options) for the problem as specified in the xflash_defaults procedure. When xflash is started, xflash_defaults is executed to read in the known problem names. The data ranges and velocity defaults are then updated. To add a problem to xflash, only the xflash_defaults procedure needs to be modified. The details of this procedure are provided in the comment header in xflash_defaults. It is not necessary to add a problem in order to plot a dataset, as all default values can be overridden through the widget.
Variables
The variables dropbox lists the variables stored in the `unknown names' record in the data file, and any derived variables that xflash knows how to construct from these variables (e.g., sound speed). This allows you to choose the variable to be plotted. By default, xflash reads all the variables in a file, so switching the variable to plot can be done without rereading. At present, there is no easy way to add a derived variable. Both the widget routine (xflash.pro) and the plotting backend (xplot_amr.pro) will need to be told about any new derived variables. Users wishing to add derived variables should look at how the total velocity (tot_vel) is computed.
Colormap
The colormap dropbox lists the available colormaps to xflash. These colormaps are stored in flash_colors.tbl in the fidlr directory, and differ from the standard IDL colormaps. The first 12 colors in the colormaps are reserved by xflash to hold the primary colors used for different aspects of the plotting. Additional colormaps can be created by using the xpalette function in IDL. It is suggested that new colormaps use one of the existing colormaps as a template, to preserve the primary FLASH colors.
Options
The options block allows you to toggle various options on/off (Table 39).
log | Plot the log of the variable. |
max | When looping over a sequence of files, plot the max of the variable in each zone over all the files. |
annotate | Toggle the title and time information. |
abs value | Plot the absolute value of the dataset. This operation is performed before taking the log. |
show blocks | Draw the block boundaries on the plot. |
colorbar | Plot the colorbar legend for the data range. |
show ticks | Show the axis/tick marks on the plot. |
Data Range
These fields allow you to specify the range of the variable to plot. Data outside of the range will be scaled to the minimum and maximum values of the colormap respectively.
Contour Options
This launches a dialog box that allows you to select up to 4 contour lines to overplot on the data (see figure 36). The variable, value, and color are specified for each reference contour. To plot a contour, select the check box next to the contour number. This will allow you to set the variable to make the contour from, the value of the contour, and the color.
Velocity Options
This launches a dialog box that allows you to set the velocity options used to plot velocity vectors on the plot (see figure 37). The plotting of velocity vectors is controlled by the partvelvec.pro procedure. xskip and yskip allow you to thin out the arrows. typical velocity sets the velocity to scale the vectors to, and minimum velocity and maximum velocity specify the range of velocities to plot vectors for.
Particle Options
If a FLASH output file contains particle data, the particle options button will be made sensitive. This button launches a dialog box that allows you to set the particle options used to plot the particles on a plot (see figure 38). The plotting of particles is controlled by the partvelvec.pro procedure. The position of a particle is marked with a circle, whose size can be controlled by the symbol size slider. The typical velocity field allows you to adjust the scaling of the particle velocity vectors.
Zoom
The zoom options allow you to set the domain limits for the plot. A value of -1 uses the actual limit of the domain.
Plot
Create the plot. The status of the plot will appear on the status bar at the bottom.
Query
The query button becomes active once the plot is created. Clicking on
query and then clicking somewhere in the domain lists the
data values at the cursor location (see Figure 39).
xflash3d provides an interface for plotting slices of 3-dimensional FLASH datasets. It is written to conserve memory: only a single variable is read in at a time, and only the slice to be plotted is put on a uniform grid. The merging of the AMR data onto a uniform grid is accomplished by merge3d_amr.pro, which can take a variety of arguments controlling what volume of the domain is to be put onto a uniform grid.
Figure 40 shows the main xflash3d widget. The interface is similar to that of xflash, so the above description from that section still applies. The major difference is in the zoom section, which controls the slice plane.
The slice plane button group controls the plane in which to plot the data. For a given plane, the orthogonal direction will have only one active text box to enter zoom information; this controls the position of the slice plane in the normal direction. For the directions in the slice plane, the zoom boxes allow you to select a sub-domain in that plane to plot.
Table 40 lists all of the fidlr routines, grouped by function. Most of these routines rely on the common blocks to get the tree structure necessary to interpret a FLASH dataset. Command line analysis of FLASH data requires that the common blocks be defined, usually by executing def_common.pro.
FLASH data readers | |
read_amr.pro | Read in FLASH data in HDF 4 format. This routine takes the filename and an optional variable argument and returns the tree, grid, and unknown information in the fidlr common blocks. |
read_amr_hdf5.pro | The HDF5 version of read_amr.pro. This routine uses the call_external function to access C wrappers of the HDF functions stored in h5_wrappers.so. The shared library must be compiled before using this routine. |
file_information.pro | Dump out some basic information about the file, such as the number of variables stored, the runtime comment, the precision of the data, etc. |
Driver routines | |
xflash.pro | The main driver for 2d datasets. xflash provides a widget interface to select the variable, data range, contour options, output type, etc. This routine uses xflash_defaults.pro to define some default problem types and their options. Once the options are selected, xplot_amr.pro is used to create the plot. |
xflash1d.pro | The main driver for 1d datasets. This widget accepts options and passes them onto xplot1d_amr.pro. |
xflash3d.pro | The main driver for 3d datasets. This widget allows you to select the cut plane (x-y, x-z, y-z) and set the data options. xplot3d_amr.pro is used to create the plots. |
xflash_defaults.pro | The problem default initialization file. Standard problems are given an entry in this file, defining the default values for the plot options. This file is read in by xflash, xflash1d, and xflash3d when the problem name is changed. |
xcontour.pro | The contour options widget. This allows you to select up to 4 reference contours to be overplotted on a 2d plot. This widget is launched by xflash.pro. |
xvelocity.pro | The velocity options widget. This allows you to select the minimum, maximum, and typical velocities, and the number of zones to skip when thinning out the vector field. This widget is launched by xflash.pro. |
xparticle.pro | The particle options widget. This widget is launched by xflash.pro. |
batch.pro | An example script showing how to convert a 3d FLASH dataset to a uniformly gridded single byte block of data suitable for 3d visualization packages. The block of data is written in Fortran binary format. |
radial.pro | An example script showing how to read in 2d data and plot a variable as a function of radius from a given point. |
compare.pro | A script showing how to loop over a set of files and plot a series of 1d slices of a variable vs distance. |
flame_profile.pro | A script that reads in a 2-d FLASH dataset and writes out a 1d slice of data to an ASCII file. |
flame_profile_1d.pro | A script that reads in a 1-d FLASH dataset and writes out a 1d slice of data to an ASCII file. The format of the output file is identical to that required by the sample_map setup. |
flame_speed.pro | Read in two FLASH files and compute the speed of a planar front by differencing. |
HDF routines | |
hdf_read.pro | Wrapper around IDL HDF 4 routines to read in a dataset given the file handle and dataset name. |
determine_file_type.pro | IDL procedure to determine if a file is in HDF 4 format (return 1), HDF5 format (return 2), or neither (return -1). This routine uses the built in IDL HDF 4 implementation and some of the HDF5 wrappers in h5_wrappers.so. |
Makefile | Makefile for compiling the HDF5 support on an SGI IRIX. Other machines should behave similarly, but some of the compilation flags may differ. |
h5_file_interface.c | HDF5 file open and close routines from the FLASH serial HDF5 implementation. |
h5_read.c | Part of the serial HDF5 FLASH routines, used to read in the header information from the data file. |
h5_wrappers.c | Wrappers around the HDF5 library to read in the different records from the FLASH HDF5 file. |
h5_wrappers.so | Shared-object library produced from the above routines. The IDL routines interface with this object through the call external function. |
hdf5_idl_interface.h | Header file for the C wrappers. |
Merging routines | |
merge3d_amr.pro | Merge routine for 3d block structured AMR data. This routine will resample the FLASH data to put it on a uniform resolution. If the range keywords are used, a uniform slice can be created. |
merge3d_amr_crn.pro | As above, but operates on data stored at the zone corners. This simply averages the data to the zone centers before proceeding. |
merge_amr.pro | 2d merging routine, put a dataset onto a uniform grid. |
Plotting routines | |
draw_blocks.pro | Draw the AMR block boundaries on the plot. This routine is called from xflash. |
vcolorbar.pro | Create a vertical colorbar given the data range and color bounds. |
colorbar2.pro | Create a horizontal colorbar. |
partvelvec.pro | Overplot the velocity vectors, for velocities that fall within a specified minimum and maximum velocity. The vectors are scaled to a typical velocity. This routine also handles the plotting of the particle data. |
xplot1d_amr.pro | Back end to xflash1d; creates a plot of a 1d FLASH dataset using the options selected in the widget. |
xplot3d_amr.pro | Back end to xflash3d; plots a slice through a 3d FLASH dataset. |
xplot_amr.pro | Back end to xflash; plots a 2d dataset. |
Utility routines | |
add_var.pro | add_var is used to add a derived variable to the list of variables recognized by the xflash routines. |
color.pro | color returns the index into the color table of a color specified by a string name. |
color_gif.pro | Create a gif of the current plot window. |
courant.pro | Loop over the blocks and return the block number where the Courant condition is set. |
def_common.pro | Define the common blocks that hold the variables read in from the read routines. This routine can be used on the IDL command line so the FLASH data can be analyzed interactively. |
flash_colors.tbl | Replacement color table with the standard FLASH colormaps. |
nolabel.pro | A hack used to plot an axis w/o numbers. |
query.pro | A widget routine called by xflash that displays the data in a cell of the current plot. |
query1d.pro | Query routine for the 1d data, called from xflash1d. |
scale3d_amr.pro | Scale a uniformly gridded 3d dataset into a single byte. |
scale_color.pro | Scale a dataset into a single byte. |
sci_notat.pro | Print a number out in scientific notation. |
start.pro | A script used to initialize IDL on the SGIs. |
tvimage.pro | Replacement for tv that will write to postscript or the screen in a device-independent manner. |
undefine.pro | Free up the memory used by a variable. |
var_index.pro | Return the index into the unk array of the variable label passed as an argument. |
write_brick_f77.pro | Write a block of data out to a file in f77 binary format. |
Most of the fidlr routines can be used directly from the command line to perform analysis not offered by the different widget interfaces. This section provides an example of using the fidlr routines.
Example. Report on the basic information about a FLASH data file.
IDL> file_information, 'sedov_2d_6lev_hdf_chk_0000'
-------------------------------------------------------------------------------
file                 = sedov_2d_6lev_hdf_chk_0000
FLASH version:         FLASH 2.0.20010802
file format:           HDF 4
execution date:        08-03-2001 12:59.20
run comment:           2D Sedov explosion, from t=0 with r_init = 3.5dx_min
dimension:             2
geometry detected:     Cartesian (assumed)
type of file:          checkpoint
number of variables:   12
variable precision:    DFNT_FLOAT64
number of particles:   0
nxb, nyb, nzb:         8 8 1
corners stored:        no
IDL>
Every problem that is run with FLASH requires a directory in FLASH2.0/setups. This is where the FLASH setup script looks to find the problem-specific files. The FLASH distribution includes a number of pre-written setups. However, most new FLASH users will begin by defining a new problem, so it is important to understand the technique for adding a customized problem setup.
Each setups directory contains the routines that initialize the FLASH grid. The directory also includes configuration files that setup uses to select the proper physics modules from the FLASH source tree. When the user runs setup, the proper source files are selected and linked into the object directory, as described in the discussion of the setup script above.
There are two files that must be included in the setup directory for any problem. These are
Config | lists the modules required for the setup and defines additional runtime parameters. |
init_block.F90 | Fortran routine for setting initial conditions in a single block. |
We will look in detail at these files for an example setup. This is a simple setup that creates a domain with hot ash inside a circle of radius radius centered at (xctr, yctr, zctr). The density is uniformly set at rho_ambient and the temperature is t_perturb inside the circle and t_ambient outside.
To create a new setup, we first create the new directory and then add the Config and init_block.F90 files. The easiest way to construct these files is to use files from another setup as a template.
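Assuming the new setup is to be called example, the steps look like the following sketch. The FLASH root directory name and the template setup (sedov) are assumptions here; consult the setup script documentation for the exact setup invocation and options.

```shell
# Sketch: create the new setup directory and seed it from an existing setup.
# The FLASH root "FLASH2.0" and the template setup "sedov" are assumptions;
# adjust to your installation.
mkdir -p FLASH2.0/setups/example

# Seed the two required files from a template setup, e.g.:
#   cp FLASH2.0/setups/sedov/Config          FLASH2.0/setups/example/
#   cp FLASH2.0/setups/sedov/init_block.F90  FLASH2.0/setups/example/

# Then configure and build from the FLASH root, e.g.:
#   cd FLASH2.0 && ./setup example -auto && gmake
```

The copied files are then edited to describe the new problem, as shown below.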
The simplest way to construct a Config file is to copy one from another setup that incorporates the same physics as the new problem. Config serves two principal purposes: (1) to specify the required modules and (2) to register runtime parameters. The Config file for the example problem contains the following:
# configuration file for our example problem

REQUIRES driver/time_dep
REQUIRES materials/eos/gamma
REQUIRES materials/composition/fuel+ash
REQUIRES io
REQUIRES mesh
REQUIRES hydro
These lines define the FLASH modules used by the setup. We are going to carry two fluids (fuel and ash), so we load the composition module fuel+ash. At runtime, this module will initialize the multifluid database to carry the two fluids and set up their properties. We rely on the defaults for I/O, meshing, and hydrodynamics, and select the simple gamma-law equation of state (materials/eos/gamma) for this problem.
Additional modules may be added to the Config file using this syntax. For example, to use the 13 isotope alpha-chain network, add the line:
REQUIRES source_terms/burn/aprox13

To add constant gravity, add the line:

REQUIRES gravity/constant

After defining the modules, the Config file lists any runtime parameters specific to this problem:
# runtime parameters

PARAMETER rho_ambient  REAL 1.
PARAMETER t_ambient    REAL 1.
PARAMETER t_perturb    REAL 5.
PARAMETER radius       REAL 0.2
PARAMETER xctr         REAL 0.5
PARAMETER yctr         REAL 0.5
PARAMETER zctr         REAL 0.5
Here we define the ambient density (rho_ambient), the ambient and perturbed temperatures (t_ambient, t_perturb), the radius of the perturbed region (radius), and the coordinates of the center of the perturbation (xctr, yctr, zctr). All of these parameters are floating point numbers. We also give the default values for each parameter.
The routine init_block (or any other FLASH function) can access any of these variables through a simple database call. The default value of any parameter (such as rho_ambient) can be overridden at runtime by specifying a different value in the file flash.par with a line such as

rho_ambient = 100.

All parameters required for initialization of the new problem should be added to Config.
The routine init_block is called by the framework to initialize data in each AMR block. The framework first forms the grid at the lowest level of refinement and calls init_block to initialize the data in each block. The code then checks the refinement criteria in each block, refines where the criteria demand it, and calls init_block to initialize the newly created blocks. This process repeats until the maximum refinement level is reached in the areas marked for refinement.
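Schematically, the initialization proceeds as in the pseudocode below (only init_block is a real routine; the other names describe the framework's actions, not actual FLASH calls):

```
! build the grid at the lowest refinement level and initialize it
do block_no = 1, number_of_coarse_blocks
   call init_block(block_no)
enddo

! refine repeatedly until lrefine_max is reached where needed
do level = lrefine_min+1, lrefine_max
   mark blocks whose refinement criteria are exceeded
   create the child blocks of the marked blocks
   do each newly created block_no
      call init_block(block_no)
   enddo
enddo
```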
The basic structure of the routine init_block is described below. The routine begins with its header and the modules it uses:
subroutine init_block(block_no)
!
! sample init_block -- initialize a circle with high temperature fuel
! surrounded by ash.
!
  use multifluid_database, ONLY: find_fluid_index

  use runtime_parameters, ONLY: get_parm_from_context, GLOBAL_PARM_CONTEXT

  use dBase, ONLY: nxb, nyb, nzb, nguard, ionmax, &
                   k2d, k3d, ndim, &
                   dBasePropertyInteger, &
                   dBaseKeyNumber, dBaseSpecies, &
                   dBaseGetCoords, dBasePutData
Next come the local declarations. In this example, there are loop indices, one-dimensional scratch arrays, integer keys that will be used in the database calls, and other scratch variables needed for the initialization.
  implicit none

  integer :: i, j, k, block_no, n

  logical, save :: firstCall = .TRUE.

  real, save :: smallx

! variables needed for the eos call
  real :: temp_zone
  real :: pel, eel, ptot, eint, abar, zbar
  real :: dpt, dpd, ded, det, gamma, xalfa, xxni, xxne, xxnp

  integer, save :: iXvector, iYvector, iZvector
  integer, save :: iXcoord, iYcoord, iZcoord
  integer, save :: iPoint
  integer, save :: izn

  real :: dist

  integer, save :: idens, itemp, ipres, iener, igame, igamc
  integer, save :: ivelx, ively, ivelz, inuc_begin
  integer, save :: ifuel, iash

! save the parameters that describe this initialization
  real, save :: rho_ambient, t_ambient, t_perturb
  real, save :: radius
  real, save :: xctr, yctr, zctr

! compute the maximum length of a vector in each coordinate direction
! (including guardcells)
  integer, parameter :: q = max(nxb+2*nguard, &
                                nyb+2*nguard*k2d, &
                                nzb+2*nguard*k3d)

  real, dimension(q) :: x, y, z
  real :: xx, yy, zz

  real, dimension(q) :: rho, p, t, game, gamc, vx, vy, vz, e

  real, dimension(ionmax) :: xn

  integer, save :: MyPE, MasterPE
Note that FLASH promotes all floating point variables to double precision at compile time for maximum portability; we therefore declare all floating point variables with real in the source code. Note also that many of these variables are explicitly saved. These variables do not change during the simulation; they include the runtime parameters defined above and the keys used in the database calls (e.g. idens).
The variable firstCall is .TRUE. the first time init_block is entered, when the saved variables are filled; it is then set to .FALSE. for subsequent entries into init_block.
Next, the routine makes the database calls needed to initialize the domain. In addition to fetching the runtime parameters and any physical constants, we also create integer keys that will be used in the variable database calls. Most of the database calls are overloaded to accept either a string or an integer key to select a variable. String comparisons are expensive, so we make them once, when getting the key, and save the result for later use.
  if (firstCall) then

     MyPE     = dBasePropertyInteger('MyProcessor')
     MasterPE = dBasePropertyInteger('MasterProcessor')

!-----------------------------------------------------------------------------
! grab the parameters relevant for this problem
!-----------------------------------------------------------------------------
     call get_parm_from_context(GLOBAL_PARM_CONTEXT, 'smallx', smallx)

     call get_parm_from_context(GLOBAL_PARM_CONTEXT, 'rho_ambient', rho_ambient)

     call get_parm_from_context(GLOBAL_PARM_CONTEXT, 't_ambient', t_ambient)
     call get_parm_from_context(GLOBAL_PARM_CONTEXT, 't_perturb', t_perturb)

     call get_parm_from_context(GLOBAL_PARM_CONTEXT, 'radius', radius)

     call get_parm_from_context(GLOBAL_PARM_CONTEXT, 'xctr', xctr)
     call get_parm_from_context(GLOBAL_PARM_CONTEXT, 'yctr', yctr)
     call get_parm_from_context(GLOBAL_PARM_CONTEXT, 'zctr', zctr)
It is sometimes useful to have the init_block routine print some output, such as echoing runtime parameters to the screen. This is best done in the firstCall block.
     if (MyPE == MasterPE) then
        print *, 'Initializing the example setup'
     endif
It is also useful to perform some error checking to make sure the code was set up the way you intended when init_block was written. The function abort_flash prints an error message and aborts the code.
     if (ionmax /= 2) then
        call abort_flash('Error: ionmax /= 2 in init_block')
     endif
Next we get integer keys for the different database calls we will be making. Most of the database calls are overloaded to accept a string or an integer to specify which variable is being stored, the coordinate direction, etc. We do the string to integer conversion here, so it is only executed once each time FLASH is run.
!-----------------------------------------------------------------------------
! get the pointers into the solution vector
!-----------------------------------------------------------------------------
     idens = dBaseKeyNumber('dens')
     ivelx = dBaseKeyNumber('velx')
     ively = dBaseKeyNumber('vely')
     ivelz = dBaseKeyNumber('velz')
     iener = dBaseKeyNumber('ener')
     ipres = dBaseKeyNumber('pres')
     itemp = dBaseKeyNumber('temp')
     igame = dBaseKeyNumber('game')
     igamc = dBaseKeyNumber('gamc')

     inuc_begin = dBaseSpecies(1)

     call find_fluid_index('fuel', ifuel)
     call find_fluid_index('ash',  iash)

     iXvector = dBaseKeyNumber('xVector')
     iYvector = dBaseKeyNumber('yVector')
     iZvector = dBaseKeyNumber('zVector')

     iPoint = dBaseKeyNumber('Point')

     iXcoord = dBaseKeyNumber('xCoord')
     iYcoord = dBaseKeyNumber('yCoord')
     iZcoord = dBaseKeyNumber('zCoord')

     izn = dBaseKeyNumber('zn')

     firstCall = .FALSE.

  endif
The next part of the routine involves setting up the initial conditions. This could be code for interpolating a given set of initial conditions, constructing some analytic model, or reading in a table of initial values.
In the present example, we begin by getting the coordinates for the zones in the current block. This is done by a set of calls to dBaseGetCoords. The key izn, defined above in the lookup of ``zn'', tells the database that we want the coordinates of the zone centers. We select the direction with iXcoord, iYcoord, and iZcoord, which we also set in the lookups above. The results are stored in the vectors x, y, and z.
  x(:) = 0.0
  y(:) = 0.0
  z(:) = 0.0

  if (ndim == 3) call dBaseGetCoords(izn, iZcoord, block_no, z)
  if (ndim >= 2) call dBaseGetCoords(izn, iYcoord, block_no, y)
  call dBaseGetCoords(izn, iXcoord, block_no, x)
Next comes a set of loops (one for each dimension) over all the interior zones in the block. The loops make use of the k2d parameter, which is equal to 1 for 2- and 3-dimensional simulations and 0 otherwise, and the k3d parameter, which is equal to 1 only for 3-dimensional simulations. This provides a convenient way to construct a general set of loops that works regardless of the dimensionality. Inside these loops, the values of the density, velocities, abundances, and other variables are set. We also usually make a call to the equation of state to ensure that these quantities are thermodynamically consistent.
!-----------------------------------------------------------------------------
! loop over all the zones in the current block and set the temperature,
! density, and thermodynamics variables.
!-----------------------------------------------------------------------------
  do k = nguard*k3d+1, nguard*k3d+nzb
     zz = z(k)

     do j = nguard*k2d+1, nguard*k2d+nyb
        yy = y(j)

        do i = nguard+1, nguard+nxb
           xx = x(i)
For the present problem, we are making a hot circular region of fuel. We compute the distance of the current zone from the center of the circular region, test whether we are inside the circle, and set the temperature and composition accordingly. Recall that the values of the runtime parameters defined in the Config file are known here from the calls to get_parm_from_context made above.
!-----------------------------------------------------------------------------
! compute the distance from the center -- handle this specially for 1, 2, and
! 3 dimensions.
!-----------------------------------------------------------------------------
           if (ndim == 1) then
              dist = xx - xctr
           elseif (ndim == 2) then
              dist = sqrt((xx-xctr)**2 + (yy-yctr)**2)
           else
              dist = sqrt((xx-xctr)**2 + (yy-yctr)**2 + (zz-zctr)**2)
           endif

           if (dist <= radius) then
              temp_zone = t_perturb

              xn(ifuel) = smallx
              xn(iash)  = 1.0 - smallx
           else
              temp_zone = t_ambient

              xn(ifuel) = 1.0 - smallx
              xn(iash)  = smallx
           endif
We now have the density, composition, and temperature for the current zone. We can find the pressure, internal energy, and gamma corresponding to these values from a call to the equation of state.
!-----------------------------------------------------------------------------
! get the pressure and internal energy corresponding to the ambient density
! and perturbed temperature
!-----------------------------------------------------------------------------
           call eos(rho_ambient, temp_zone, ptot, eint, xn, &
                    abar, zbar, dpt, dpd, det, ded, gamma, pel, xxne, &
                    xalfa, 1)

           rho(i) = rho_ambient
           t(i)   = temp_zone

           vx(i) = 0.0
           vy(i) = 0.0
           vz(i) = 0.0

           p(i) = ptot
           e(i) = eint + 0.5*(vx(i)**2 + vy(i)**2 + vz(i)**2)

           game(i) = p(i)/(eint*rho(i)) + 1.0
           gamc(i) = gamma
Note that the energy stored by FLASH is the total energy (internal plus kinetic), so we add the kinetic energy contribution to the internal energy returned from the EOS call. In the present case the kinetic energy is zero, since all of the velocities are 0; the step is shown for completeness.
Now that we have the correct state for the current zone we want to put these values back into the database. We show two methods here. First, the composition is stored one point at a time, using a call to dBasePutData. We use the key inuc_begin which we looked up above as the starting key for the composition variables. We use the fact that the composition variables have contiguous keys to create a loop over all species.
We exit the inner loop (over the x-coordinate) and store the remaining variables a vector at a time. This is also done with the dBasePutData function, but this time using the iXvector key instead of iPoint.
!-----------------------------------------------------------------------------
! finally, fill the solution array
!-----------------------------------------------------------------------------
           do n = 1, ionmax
              call dBasePutData(inuc_begin-1+n, iPoint, &
                                i, j, k, block_no, xn(n))
           enddo

        enddo

        call dBasePutData(idens, iXvector, j, k, block_no, rho)
        call dBasePutData(iener, iXvector, j, k, block_no, e)
        call dBasePutData(itemp, iXvector, j, k, block_no, t)
        call dBasePutData(ipres, iXvector, j, k, block_no, p)
        call dBasePutData(ivelx, iXvector, j, k, block_no, vx)
        call dBasePutData(ively, iXvector, j, k, block_no, vy)
        call dBasePutData(ivelz, iXvector, j, k, block_no, vz)
        call dBasePutData(igame, iXvector, j, k, block_no, game)
        call dBasePutData(igamc, iXvector, j, k, block_no, gamc)

     enddo
  enddo

  return
end subroutine init_block
When init_block returns, the database contains the values of the initial model for the current block. init_block is called for every block that is created as the code refines the initial model.
We encourage you to run the example setup to see this code in action. This setup can be used as the basis for a much more complicated problem. For a demonstration of how to initialize the domain with a one-dimensional initial model, look at the sample_map setup.
More generally, a setup also may include customized versions of some of the FLASH routines or other routines. Examples of FLASH routines that may be customized for a particular problem are
init_1d.F90 | a routine that reads in a 1-d initial model file. |
init_mat.F90 | Fortran routine for initializing the materials module. |
Makefile | The Make include file for the setup. This file is the Makefile for any problem-specific routines that are not part of the standard FLASH distribution (like init_1d above). |
mark_ref.F90 | Fortran routine for marking blocks to be refined, modified for this specific problem. |
An additional file required to run the code is flash.par, which contains flags and parameter settings for the run. Copies of flash.par may be kept in the setup directory. The file flash.par is read at runtime; the values listed there override the default values of the runtime parameters.
The flash.par file for the example setup is
# Parameters for the example setup
rho_ambient = 1.0
t_ambient   = 1.0
t_perturb   = 10.
radius      = .2

# for starting a new run
restart  = .false.
cpnumber = 0
ptnumber = 0

# dump checkpoint files every trstrt seconds
trstrt = 4.0e-4

# dump plot files every tplot seconds
tplot = 5.0e-5

# go for nend steps or tmax seconds, whichever comes first
nend = 1000
tmax = 1.0e5

# initial, minimum, and maximum timesteps
dtini = 1.0e-16
dtmin = 1.0e-20
dtmax = 1.0e2

# Grid geometry
igeomx = 0
igeomy = 0
igeomz = 0

# Size of computational volume
xmin = 0.0
xmax = 1.0
ymin = 0.0
ymax = 1.0
zmin = 0.0
zmax = 1.0

# Boundary conditions
xl_boundary_type = "outflow"
xr_boundary_type = "outflow"
yl_boundary_type = "outflow"
yr_boundary_type = "outflow"
zl_boundary_type = "outflow"
zr_boundary_type = "outflow"

# Variables for refinement test
refine_var_1 = "dens"
refine_var_2 = "pres"
refine_var_3 = "none"
refine_var_4 = "none"

# Refinement levels
lrefine_max = 3
lrefine_min = 1

# Number of lowest-level blocks
nblockx = 1
nblocky = 1
nblockz = 1

# Hydrodynamics parameters
cfl = 0.8

# Simulation-specific parameters
basenm      = "example_3lev_"
run_number  = "001"
run_comment = "A simple FLASH 2.0 example"
log_file    = "flash_example.log"
In this example, flags are set for a ``cold start'' of the simulation, the grid geometry, the boundary conditions, and the refinement. Parameters are also set for the ambient temperature and density, as well as for details of the run, such as the intervals between checkpoint and plot files and the initial, minimum, and maximum timesteps.
Each time setup is run, it produces files named paramFile.txt and paramFile.html. These files list all available runtime parameters and the values to which they are initially set, along with a brief description of each parameter.
Running the example setup with the defaults option and the flash.par provided will produce five checkpoint files and 29 plot files. The initial temperature distribution, as visualized by the magic of the fidlr tools, appears in Figure 41.
Adding new solvers (either for new or existing physics)
to FLASH is similar in some ways to adding a problem
configuration. In general one creates a subdirectory for the solver, placing
it under the source subdirectory for the parent module if the solver implements
currently supported physics, or creating a new module subdirectory if it
does not. Put the source files required by the solver into this directory,
then create the following files:
Makefile: The make include file for the module should set a macro with the name of the module equal to a list of the object files in the module. Optionally (recommended), add a list of dependencies for each of the source files in the module. For example, the source_terms module's make include file is
# Makefile for source term solvers

source_terms = source_termsModule.o burn.o heat.o cool.o init_burn.o init_heat.o \
               init_cool.o tstep_burn.o tstep_heat.o tstep_cool.o init_src.o

source_termsModule.o : source_termsModule.F90 dBase.o
burn.o : burn.F90
heat.o : heat.F90
cool.o : cool.F90
init_src.o : init_src.F90 dBase.o
init_burn.o : init_burn.F90
init_heat.o : init_heat.F90
init_cool.o : init_cool.F90
tstep_burn.o : tstep_burn.F90
tstep_heat.o : tstep_heat.F90
tstep_cool.o : tstep_cool.F90

Sub-module make include files use macro concatenation to add to their parent modules' make include files. For example, the source_terms/burn sub-module has the following make include file:
# Makefile for the nuclear burning sub-module

source_terms += burn_block.o net_auxillary.o net_integrate.o sparse_ma28.o \
                gift.o net.o shock_detect.o

burn_block.o : burn_block.F90 dBase.o network_common.fh eos_common.fh net.o
net_auxillary.o : net_auxillary.F90 network_common.fh eos_common.fh
net_integrate.o : net_integrate.F90 network_common.fh
sparse_ma28.o : sparse_ma28.F90
net.o : net.F90 network_common.fh eos_common.fh
gift.o : gift.F90
shock_detect.o : shock_detect.F90 dBase.o

network_common.fh : network_size.fh
	touch network_common.fh

# Additional dependencies
burn.o : dBase.o network_common.fh net.o
init_burn.o : dBase.o network_common.fh net.o

Note that the sub-module's make include file only makes reference to files provided at its own level of the directory hierarchy. If the sub-module provides special versions of routines to override those supplied by the parent module, they do not need to be mentioned again in the object list, because the sub-module's Makefile is concatenated with its parent's. However, if these special versions have additional dependencies, they can be specified as shown. Of course, any files supplied by the sub-module that are not part of the parent module should be mentioned in the sub-module's object list, and their dependencies should be included.
A module should provide a make include file even if it supplies no source files at its own level; in that case the file simply initializes an empty macro:

my_module =

If sub-modules of my_module exist and are requested, their Makefiles will be appended to this base.
This is all that is necessary to add a module or sub-module to the code. However, this alone is not sufficient to have the module's routines called by the code! If you are creating a new solver for an existing physics module, the module itself should provide the interface layer to the rest of the code; as long as your sub-module provides the routines expected by the interface layer, the sub-module should be ready to work. However, if you are adding a new module (or if your sub-module has externally visible routines - a no-no for the future), you will need to add calls to your externally visible routines. It is difficult to give completely general guidance; here we simply note a few things to keep in mind.
If you wish to be able to turn your module on or off without recompiling the code, create a new runtime parameter (e.g., use_module) in the driver module. You can then test the value of this parameter before calling your externally visible routines from the main code. For example, the burn module routines are only called if (iburn .eq. 1). (Of course, if the burn module is not included in the code, setting iburn to 1 will result in empty subroutine calls.)
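For example, a switch for a hypothetical module could be registered in a Config file with a line like the following (the parameter name and default are illustrative, not part of the FLASH distribution):

```
# runtime switch for the hypothetical my_module solver
PARAMETER use_my_module BOOLEAN FALSE
```

The main code would then guard the calls with a test such as if (use_my_module), in the same way the iburn test guards the burn module calls.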
You will need to add use dBase if you wish to have access to the global AMR data structure. Since this is the only mechanism for operating on the grid data (which is presumably what you want to do) in FLASH 1.x, you will probably want to do this. An alternative, if your module uses a pre-existing data structure, is to create an interface layer which converts the PARAMESH-inspired tree data structure used by FLASH into your data structure, then calls your routines. This will probably have some performance impact, but it will enable you to quickly get things working.
You may wish to create an initialization routine for your module which is called before anything (e.g., setting initial conditions) is done. In this case you should call the routine init_module() and place a call to it (without any arguments) in the main initialization routine, init_flash.F90, which is part of the driver module. Be sure this routine has a stub available.
If your solver introduces a constraint on the timestep, you should create a routine named tstep_module() which computes this constraint. Add a call to this routine in timestep.F90 (part of the driver/time_dep module), using your global switch parameter if you have created one. See this file for examples. Your routine should operate on a single block and take three parameters: the timestep variable (a real variable which you should set to the smaller of itself and your constraint before returning), the minimum timestep location (an integer array with five elements), and the block identifier (an integer). Returning anything for the minimum location is optional, but the other timestep routines interpret it in the following way. The first three elements are set to the coordinates within the block of the zone contributing the minimum timestep. The fourth element is set to the block identifier, and the fifth is set to the current processor identifier (MyPE). This information tags along with the timestep constraint when blocks and solvers are compared across processors, and it is printed on stdout by the master processor along with the timestep information as FLASH advances.
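The interface described above can be sketched as follows. This is a schematic only: the argument list follows the description in the text, but the body and the values placed in the location array are illustrative.

```fortran
subroutine tstep_module(dt_check, dt_minloc, block_no)

! sketch of a timestep-constraint routine for a single block
!   dt_check  -- on exit, min(dt_check, this module's constraint)
!   dt_minloc -- (i, j, k, block, processor) of the zone setting the minimum
!   block_no  -- identifier of the block to examine

  implicit none

  real    :: dt_check
  integer :: dt_minloc(5)
  integer :: block_no

  real :: dt_local   ! constraint computed from this block's data

! ... loop over the zones of block_no and compute dt_local ...

  if (dt_local < dt_check) then
     dt_check = dt_local
     ! record the limiting zone, block, and processor (values illustrative)
     dt_minloc = (/ 1, 1, 1, block_no, 0 /)
  endif

  return
end subroutine tstep_module
```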
If your solver is time-dependent, you will need to add a call to your solver in the evolve() routine (driver/time_dep/evolve.F90). If it is time-independent, add the call to driver/steady/flash.F90. evolve() implements second-order Strang time splitting within the time loop maintained by driver/time_dep/flash.F90. The steady version of flash.F90 simply calls each operator once and then exits.
Try to limit the number of entry points to your module. This will make it easier to update it when the next version of FLASH is released. It will also help to keep you sane.
Porting FLASH to new architectures should be fairly straightforward for most Unix or Unix-like systems. For systems which look nothing like Unix, or which have no ability to interpret the setup script or makefiles, extensive reworking of the meta-code which configures and builds FLASH would be necessary. We do not treat such systems here; rather than do so, it would be simpler for us to do the port ourselves. The good news in such cases is that, assuming that you can get the source tree configured on your system and that you have a Fortran 90 compiler and the other requirements discussed in Section 2, you should be able to compile the code without making too many changes to the source. We have generally tried to stick to standard Fortran 90, avoiding machine-specific features.
For Unix-like systems, you should make sure that your system has csh, a gmake which permits included makefiles, awk, sed, and python. You will need to create a directory in source/sites/ with the name of your site (or a directory in source/sites/Prototypes/ with the name of your operating system). At a minimum this directory should contain a makefile fragment named Makefile.h. The best way to start is to copy one of the existing makefile fragments to your new directory and modify that. Makefile.h sets macros which define the names of your compilers, the compiler and linker flags, the names of additional object files needed for your machine but not included in the standard source distribution, and additional shell commands (such as file renaming and deletion commands) needed for processing the master makefile.
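A minimal Makefile.h might look like the following sketch. All of the macro names and flags shown here are illustrative; copy a real Makefile.h from an existing site directory for the authoritative set of macros your FLASH version expects.

```
# Sketch of a site Makefile.h -- macro names and flags are illustrative
FCOMP   = f90           # Fortran 90 compiler
CCOMP   = cc            # C compiler
FFLAGS  = -c -O2 -r8    # Fortran compile flags (promote reals to 64 bits)
CFLAGS  = -c -O2        # C compile flags
LINK    = f90           # linker
LFLAGS  = -o            # link flags
LIB     = -lmpi         # extra libraries (e.g., MPI)
MACHOBJ =               # machine-dependent object files
MV      = mv -f         # file-handling commands used by the master makefile
RM      = rm -f
```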
For most Unix systems this will be all you need to do. However, in addition to Makefile.h you may need to create machine-specific subroutines which override the defaults included with the main source code. As long as the files containing these routines duplicate the existing routines' filenames, they do not need to be added to the machine-dependent object list in Makefile.h; setup will automatically find the special routine in the system-specific directory and link to it rather than to the general routine in the main source directories.
An example of such a routine is getarg(), which returns command-line arguments and is used by FLASH to read the name of the runtime parameter file from the command line. This routine is not part of the Fortran 90 standard, but it is available on many Unix systems without the need to link to a special library. However, it is not available on the Cray T3E; instead, a routine named pxfgetarg() provides the same functionality. Therefore we have encapsulated the getarg() functionality in a routine named get_arguments(), which is part of the driver module in a file named getarg.F90. The default version simply calls getarg(). For the T3E a replacement getarg.F90 calling pxfgetarg() is supplied. Since this file overrides a default file with the same name, getarg.o does not need to be added to the machine-dependent object list in source/sites/Prototypes/UNICOS/Makefile.h.
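The two versions might look like the sketch below. The wrapper bodies and argument list are assumptions based on the description above; pxfgetarg's trailing length and error arguments follow the POSIX Fortran binding.

```fortran
! default version (driver/getarg.F90)
subroutine get_arguments(i, arg)
  implicit none
  integer          :: i
  character(len=*) :: arg
  call getarg(i, arg)                  ! common Unix extension
end subroutine get_arguments

! Cray T3E version (overrides the default with the same filename)
subroutine get_arguments(i, arg)
  implicit none
  integer          :: i
  character(len=*) :: arg
  integer          :: ilen, ierr
  call pxfgetarg(i, arg, ilen, ierr)   ! POSIX Fortran equivalent
end subroutine get_arguments
```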
FLASH is still under active development, so many desirable features have not yet been included. Also, you will most likely encounter bugs in the distributed code. A mailing list has been established for reporting problems with the distributed FLASH code. To report a bug, send email to
flash-bugs@flash.uchicago.edu

giving your name and contact information and a description of the procedure you were following when you encountered the error. Information about the machine, operating system (uname -a), and compiler you are using is also extremely helpful. Please do not send checkpoint files, as these can be very large. If the problem can be reproduced with one of the standard test problems, send a copy of your runtime parameter file and your setup options; this limits the amount of back-and-forth communication needed for us to reproduce the problem. We cannot be responsible for problems which arise with a physics module or initial model you have developed yourself, but we will generally try to help with these if you can narrow the problem down to an easily reproducible effect which occurs in a short program run and if you are willing to supply your added source code.
We have also established a mailing list for FLASH users. This is a moderated mailing list and is intended both for FLASH users to communicate with each other about general usage issues (other than bugs) and for FLASH developers to announce the availability of new releases. The address for this mailing list is
flash-users@flash.uchicago.edu
Documentation (including this user's guide and a FAQ) and support information for FLASH can be obtained on the World Wide Web at
http://flash.uchicago.edu/flashcode/