Fortuna 2.5 User's Guide
This page describes in detail how to set up and optimize a global model run of GEOS-5 Fortuna 2.5 on NCCS discover and NAS pleiades and generally make the model do what you want. It assumes that you have already run the model as described in Fortuna 2.5 Quick Start.
Compiling the Model
Most of the time for longer runs you will be using a release version of the model, perhaps compiled with a different version of one or more of the model's gridded components, defined by subdirectories in the source code. This process starts with checking out the stock model from the repository using the command
cvs co -r TAGNAME -d DIRECTORY Fortuna
where TAGNAME is the model "tag" (version). A tag in cvs marks the versions of the source files in the repository that together make up a particular version of the model. A sample release tag is Fortuna-2_5_p6, indicating the latest patch of version Fortuna 2.5. DIRECTORY is the directory in which the source code tree will be created. If you are using a stock model tag, it is reasonable to give the directory the same name as the tag. This directory, presumably in your own space, determines which build of the model a particular experiment uses. Some scripts use the environment variable ESMADIR, which should be set to the absolute (full) pathname of this directory.
When a modified version of some component of the model is saved to the repository, the tag it uses (different from the standard model tag) is supposed to be applied only to the directories with modified files. This means that if you need to use a variant tag of a gridded component, you will have to cd to that directory and update to the variant tag. For example, to apply updates to the SatSim gridded component, you would cd several levels down to the directory GEOSsatsim_GridComp and run
cvs upd -r VARIANT_TAGNAME
The source code will then incorporate the tag's modifications.
Once the checkout from the repository is complete, you are ready to compile. cd to the src directory at the top of the source code directory tree and, from a csh shell, run source g5_modules. This loads the appropriate modules and creates the environment needed for compiling and running. It is tailored to the individual systems that GEOS-5 usually runs on, so it probably won't work elsewhere. After that you can run make install, which creates the necessary executables in the directory ARCH/bin, where ARCH is the local architecture (most often Linux).
Setting up a Global Model Run
The following describes how to set up a global model run. The procedure for setting up a single column model run is described in Fortuna 2.5 Single Column Model.
Using gcm_setup
The setup script for global runs, gcm_setup, is in the directory src/Applications/GEOSgcm_App. The following is an example of a session with the setup script, with commentary:
Enter the Experiment ID:
Enter a name and hit return. For this example we'll set the experiment ID to "myexp42". An experiment ID must contain no whitespace and must not start with a digit, since it is used as the prefix of job names and PBS imposes certain limits on job names.
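The two naming rules above can be checked before running the setup script. The following is a minimal sketch in POSIX sh; the function name valid_expid is hypothetical and is not part of gcm_setup, and it checks only the two rules stated here, not any further limits that PBS may impose.

```shell
#!/bin/sh
# Hypothetical helper (not part of gcm_setup): returns 0 if the ID has
# no whitespace and does not start with a digit, 1 otherwise.
valid_expid() {
    case "$1" in
        "")             return 1 ;;  # empty string
        [0-9]*)         return 1 ;;  # starts with a digit
        *[[:space:]]*)  return 1 ;;  # contains whitespace
        *)              return 0 ;;
    esac
}

# Try a few candidate IDs.
for id in myexp42 42myexp "my exp"; do
    if valid_expid "$id"; then
        echo "ok:       $id"
    else
        echo "rejected: $id"
    fi
done
```

Of the three candidates, only myexp42 passes; the other two break one rule each.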
Enter a 1-line Experiment Description:
This should be short but descriptive, since it will be used to label plots. It can have spaces, though the string will be stored with underscores for the spaces. Provide a description and hit return.
Enter the Lat/Lon Horizontal Resolution: IM JM or ..... the Cubed-Sphere Resolution: cNN
The lat/lon option allows four resolutions: 144x91, 288x181, 576x361 or 1152x721 corresponding roughly to 2, 1, 1/2 and 1/4 degree resolutions. Enter a resolution like so:
144 91
and hit enter.
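To see how the IM JM pairs relate to the nominal resolutions, the grid spacing can be worked out with a little arithmetic. This sketch assumes, which this guide does not state, that the JM latitude rows include both poles, so that there are JM-1 latitude intervals across 180 degrees and IM longitude intervals across 360 degrees. The function name grid_spacing is made up for illustration.

```shell
#!/bin/sh
# Assumption (not stated in this guide): latitude rows include both poles,
# so latitude spacing is 180/(JM-1) and longitude spacing is 360/IM.
grid_spacing() {
    awk -v im="$1" -v jm="$2" \
        'BEGIN { printf "%g %g\n", 180 / (jm - 1), 360 / im }'
}

# Print "lat lon" spacing in degrees for the four lat/lon resolutions.
for res in "144 91" "288 181" "576 361" "1152 721"; do
    set -- $res
    echo "$1 x $2: $(grid_spacing "$1" "$2")"
done
```

Under that assumption the latitude spacings come out as 2, 1, 0.5 and 0.25 degrees, matching the rough figures quoted above, while the longitude spacing is somewhat coarser (2.5 degrees for 144 91).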
Enter the Model Vertical Resolution: LM (Default: 72)
The current standard is 72 levels, and unless you know what you are doing you should stick with that.
Do you wish to run the COUPLED Ocean/Sea-Ice Model? (Default: NO or FALSE)
You probably don't, so hit enter.
Do you wish to run GOCART? (Default: NO or FALSE)
GOCART is the interactive chemistry package, as opposed to prescribed chemistry. It incurs a significant performance cost, so unless you know you want it, you should go with the default. The following assumes that you have entered "y". Otherwise, skip two steps to "Enter the tag..."
Enter the GOCART Emission Files to use: "CMIP" (Default), "PIESA", or "OPS":
Select your favorite emission files here.
Enter the AERO_PROVIDER: GOCART (Default) or PCHEM:
Here you choose again between interactive (GOCART) and prescribed (PCHEM) aerosols.
Enter the tag or directory (/filename) of the HISTORY.AGCM.rc.tmpl to use (To use HISTORY.AGCM.rc.tmpl from current build, Type: Current ) ------------------------------------------------------------------------- Hit ENTER to use Default Tag/Location: (Current)
This provides a default HISTORY.rc (output specification) file. The initial default will be the tag of the build in which you are running gcm_setup. The idea is that you can save a custom HISTORY.rc to the repository and have it checked out for your experiments.
Enter Desired Location for HOME Directory (to contain scripts and RC files) Hit ENTER to use Default Location: ---------------------------------- Default: /discover/nobackup/aeichman/myexp42
This option determines where the experiment's home directory is located, that is, where the basic job scripts and major RC files (AGCM.rc, CAP.rc and HISTORY.rc) will be kept. The first time you run the script, it defaults to a subdirectory named geos5 under your account's home directory. The script remembers your choice (in ~/.HOMDIRroot) and uses it as the default on subsequent runs. This initial default is fine, though another possibility is to enter your nobackup space, as shown here. That keeps the experiment's HOME directory files together with the rest of the experiment's files.
Enter Desired Location for EXPERIMENT Directory (to contain model output and restart files) Hit ENTER to use Default Location: ---------------------------------- Default: /discover/nobackup/aeichman/myexp42
This determines the experiment directory, where restart files and various job output are stored. These are the storage-intensive parts and so default to the nobackup space.
Enter Location for Build directory containing: src/ Linux/ etc... Hit ENTER to use Default Location: ---------------------------------- Default: /discover/nobackup/aeichman/Fortuna-2_5_p6
This determines which of your local builds is used to create the experiment. It defaults to the build of the script you are running, which is generally a good idea.
Current GROUPS: g0620 Enter your GROUP ID for Current EXP: (Default: g0620)
This is used by the job accounting system. If you are not in the default group, you will probably have been informed.
sending incremental file list
GEOSgcm.x
sent 50848492 bytes received 31 bytes 33899015.33 bytes/sec
total size is 50842191 speedup is 1.00
Creating gcm_run.j for Experiment myexp42 ...
Creating gcm_post.j for Experiment myexp42 ...
Creating gcm_plot.tmpl for Experiment myexp42 ...
Creating gcm_archive.j for Experiment myexp42 ...
Creating gcm_regress.j for Experiment myexp42 ...
Creating AGCM.rc for Experiment myexp42 ...
Creating CAP.rc for Experiment myexp42 ...
Creating HISTORY.rc for Experiment myexp42 ...
Done!
-----
Build Directory: /discover/nobackup/aeichman/Fortuna-2_5_p6
----------------
The following executable has been placed in your Experiment Directory:
----------------------------------------------------------------------
/discover/nobackup/aeichman/Fortuna-2_5_p6/Linux/bin/GEOSgcm.x
You must now copy your Initial Conditions into:
-----------------------------------------------
/discover/nobackup/aeichman/myexp42
discover28: /discover/nobackup/aeichman/Fortuna-2_5_p6/src/Applications/GEOSgcm_App >
And the experiment is set up. After you copy initial condition files (aka restarts) to the experiment directory, you can submit your job.
Do not copy old experiments
When creating related experiments, you will be tempted to copy the experiment directory tree of an older experiment. Do not copy old experiments; run gcm_setup instead. The run scripts that gcm_setup creates from templates refer to experiment-specific directories in numerous places, and they will wreak subtle and pervasive havoc if executed in an unexpected environment. This warning applies especially across model versions. A useful and relatively safe exception to this rule is to copy a previously used HISTORY.rc. However, you need to change the lines labeled EXPID and EXPDSC to the values in your automatically generated HISTORY.rc, or the plotting will fail.
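As an illustration of adapting a copied HISTORY.rc, the sketch below rewrites the two labeled lines with sed. It assumes that each line has the form "LABEL: value" at the start of the line; check your own generated file to confirm the layout. The file names, the stand-in file contents, and the WORK directory are placeholders so the sketch can be tried without touching a real experiment.

```shell
#!/bin/sh
# Assumption: the lines look like "EXPID:  value" and "EXPDSC: value".
# WORK and the stand-in file below are placeholders for trying this out.
WORK=${WORK:-$(mktemp -d)}
printf 'EXPID:  oldexp\nEXPDSC: old_description\n# other lines are left untouched\n' \
    > "$WORK/HISTORY.rc.old"

# Use the values from the HISTORY.rc that gcm_setup generated for you.
# (Values containing "/" would need a different sed delimiter.)
NEW_EXPID=myexp42
NEW_EXPDSC=my_test_experiment

sed -e "s/^EXPID:.*/EXPID:  $NEW_EXPID/" \
    -e "s/^EXPDSC:.*/EXPDSC: $NEW_EXPDSC/" \
    "$WORK/HISTORY.rc.old" > "$WORK/HISTORY.rc"

# Show the two rewritten lines.
grep -E '^(EXPID|EXPDSC):' "$WORK/HISTORY.rc"
```

Writing to a new file rather than editing in place keeps the copied original available for comparison.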
Using restart files
Restart files provide the initial conditions for a run, and a set needs to be copied into a fresh experiment directory before running. This includes the file cap_restart, which provides the model starting date and time as text. Restart files themselves are resolution-specific and sometimes change between model versions. As of the current model version, they are flat binary files with no metadata, so they tend to be stored together with restarts of the same provenance, with the date either embedded in the filename or given in an accompanying cap_restart, typically under a directory indicating the model version.
A cleanly completed model run will leave a set of restarts and the corresponding cap_restart in its experiment directory. Another source is /archive/u/aeichman/restarts. During runs, restarts are also left in date-labeled tarballs in the restarts directory under the experiment directory before being transferred to the user's /archive space. You may have to create the cap_restart yourself; it is simply one line of text with the date of the restart files in the format YYYYMMDD HHMMSS (with a space).
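Creating a cap_restart by hand might look like the following sketch. The date and time are placeholder values, and EXPDIR is a placeholder for your experiment directory; it defaults to a scratch directory here so the example can be tried safely.

```shell
#!/bin/sh
# EXPDIR is a placeholder for the experiment directory; it defaults to a
# scratch directory so this sketch does not touch a real experiment.
EXPDIR=${EXPDIR:-$(mktemp -d)}

# Placeholder values: use the date and time of your restart files.
RESTART_DATE=20000414   # YYYYMMDD
RESTART_TIME=210000     # HHMMSS

# One line of text: date, a single space, then time.
printf '%s %s\n' "$RESTART_DATE" "$RESTART_TIME" > "$EXPDIR/cap_restart"
cat "$EXPDIR/cap_restart"
```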
Failing the above sources, you can convert restarts from different resolutions and model versions, including MERRA, as described in Regridding Restarts for Fortuna 2.5.
What Happens During a Run
When the script gcm_run.j starts running, it creates a directory called scratch and copies or links into it the model executable, rc files, restarts and boundary conditions necessary to run the model. It also creates a directory for each of the output collections (in the default setup, those with the prefix geosgcm_) in two places: in the directory holding, for output awaiting post-processing, and in the experiment directory, for output that has been post-processed. It also tars the restarts and moves the tarball to the restarts directory.
Then the executable GEOSgcm.x is run in the scratch directory, starting with the date in cap_restart and running for the length of a segment. A segment is the length of model time that the model integrates before returning, letting gcm_run.j do some housekeeping and then run another segment. A model job will typically run a number of segments before trying to resubmit itself, hopefully before the allotted wallclock time of the job runs out.
The processing that the various batch jobs perform is illustrated below:
Each time a segment ends, gcm_run.j submits a post-processing job before starting a new segment or exiting. The post-processing job moves the model output from the scratch directory to the respective collection directory under holding. Then it determines whether there is enough output to create a monthly or seasonal mean. If so, it creates the means, moves them to the collection directories in the experiment directory, tars up the daily output and submits an archiving job. The archiving job tries to move the tarred daily output, the monthly and seasonal means and any tarred restarts to the user's space in the archive filesystem. The post-processing script also determines (assuming the default settings) whether enough output exists to create plots; if so, a plotting job is submitted to the queue. The plotting script produces a number of pre-determined plots as .gif files in the plot_CLIM directory in the experiment directory.
As explained above, the contents of the cap_restart file determine the start of the model run in model time, which determines the boundary conditions and the time stamps of the output. The end time may be set in CAP.rc with the property END_DATE (format YYYYMMDD HHMMSS, with a space), though integration is usually leisurely enough that one can just kill the job or rename the run script gcm_run.j so that it is not resubmitted to the job queue.
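Renaming the run script to stop resubmission can be as simple as the sketch below. HOMDIR is a placeholder for the experiment's HOME directory (where the job scripts live), and the .hold suffix is an arbitrary choice. The stand-in script is created only so the example can be tried in a scratch directory.

```shell
#!/bin/sh
# HOMDIR is a placeholder for the experiment's HOME directory; it defaults
# to a scratch directory containing an empty stand-in for gcm_run.j.
HOMDIR=${HOMDIR:-$(mktemp -d)}
[ -e "$HOMDIR/gcm_run.j" ] || : > "$HOMDIR/gcm_run.j"

# Rename the run script so the running job cannot resubmit it.
# The ".hold" suffix is arbitrary.
mv "$HOMDIR/gcm_run.j" "$HOMDIR/gcm_run.j.hold"

# To resume later, rename it back:
#   mv "$HOMDIR/gcm_run.j.hold" "$HOMDIR/gcm_run.j"
ls "$HOMDIR"
```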
Tuning a run
Most of the other properties in CAP.rc are discussed elsewhere, but two that are important for understanding how the batch jobs work are JOB_SGMT, the length of the segment, and NUM_SGMT, the number of segments that the job tries to run before resubmitting itself and exiting. JOB_SGMT is in the format YYYYMMDD HHMMSS (but usually expressed in days) and NUM_SGMT is an integer, so the product of the two is the total model time that a job will attempt to run. It may be tempting to just run one long segment, but much housekeeping is done between segments, such as saving state in the form of restarts and spawning archiving jobs that keep your account from running over disk quota. So to tune for the maximum number of segments in a job, it is usually best to manipulate JOB_SGMT.
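The relationship between the two properties can be worked through with a small calculation. This sketch uses placeholder values rather than values read from a real CAP.rc, and it handles only the common case where the segment length is expressed purely in days (the year, month and time-of-day fields are all zero).

```shell
#!/bin/sh
# Placeholder values; the real ones are set in CAP.rc.
JOB_SGMT="00000008 000000"   # segment length: 8 days
NUM_SGMT=4                   # segments per job

ymd=${JOB_SGMT%% *}          # date part, e.g. "00000008"
hms=${JOB_SGMT##* }          # time part, e.g. "000000"

# This sketch only handles segments given purely in days.
case "$ymd $hms" in
    "000000"??" 000000") ;;
    *) echo "sketch handles day-only segments" >&2; exit 1 ;;
esac

days=${ymd#000000}           # two-digit day field, e.g. "08"
days=${days#0}               # drop a leading zero so it is not read as octal

total_days=$((days * NUM_SGMT))
echo "Each job attempts $total_days days of model time"
```

With these placeholder values a job attempts 32 days of model time, with housekeeping (restarts, post-processing and archiving jobs) happening after every 8-day segment.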