GEOS GCM Quick Start

Latest revision as of 08:51, 13 October 2022

This page describes the minimum steps required to build and run GEOS GCM on NCCS discover and NAS pleiades. You should successfully complete the steps in these instructions before doing anything more complicated. Also, it is helpful to read this page in its entirety before starting.

If you have any issues or questions, please email the GMAO SI Team at siteam_AT_gmao.gsfc.nasa.gov

Back to Documentation for GEOS GCM v10

How to build GEOS GCM

Preliminary Steps

Load Build Modules

In your .bashrc or .tcshrc or other rc file add a line:

NCCS

module use -a /discover/swdev/gmao_SIteam/modulefiles-SLES12

NAS

module use -a /nobackup/gmao_SIteam/modulefiles

GMAO Desktops

On the GMAO desktops, the SI Team modulefiles should automatically appear when you run module avail, but if not, they are in:

module use -a /ford1/share/gmao_SIteam/modulefiles

Also do this in any interactive window you have. This allows you to get module files needed to correctly checkout and build the model.

Now load the GEOSenv module:

module load GEOSenv

which obtains the latest git, CMake, and mepo modules.
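Putting the preliminary steps together, the additions to an rc file on NCCS discover, for example, would look like the following (a configuration fragment; the module use path is site-specific, so substitute the NAS or GMAO desktop path from above as appropriate):

```shell
# ~/.bashrc additions (NCCS discover path shown; see above for NAS/desktop paths)
module use -a /discover/swdev/gmao_SIteam/modulefiles-SLES12

# Then, in the rc file or any interactive shell:
module load GEOSenv
```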

Cloning the Model

GEOS is now hosted on GitHub. The first thing to do is to create a GitHub account and add your SSH key to it.

You can then clone the model with:

git clone -b v10.17.0 git@github.com:GEOS-ESM/GEOSgcm.git

where -b v10.17.0 refers to a release tag of GEOS GCM. Information on the various releases can be found on the Releases page.

HTTPS Access

GEOS can also be cloned via https with:

 git clone -b v10.17.0 https://github.com/GEOS-ESM/GEOSgcm.git

Building GEOS

Single Step Building of the Model

If all you wish is to build the model, you can run parallel_build.csh from a head node. Doing so will checkout all the external repositories of the model and build it. When done, the resulting model build will be found in build/ and the installation will be found in install/ with setup scripts like gcm_setup and fvsetup in install/bin.

Develop Version of GEOS GCM

parallel_build.csh provides a special flag for checking out the development branches of GEOSgcm_GridComp and GEOSgcm_App. If you run:

parallel_build.csh -develop

then mepo will run:

mepo develop GEOSgcm_GridComp GEOSgcm_App

Debug Version of GEOS GCM

To obtain a debug version, you can run parallel_build.csh -debug which will build with debugging flags. This will build in build-Debug/ and install into install-Debug/.

Multiple Steps for Building the Model

The steps detailed below are essentially those that parallel_build.csh performs for you. Either method should yield identical builds.

Mepo

The GEOS GCM comprises a set of sub-repositories. These are managed by a tool called mepo. To clone all the sub-repos, you can run mepo clone inside the fixture:

cd GEOSgcm
mepo clone

The mepo clone command fetches and assembles all the sub-repositories according to components.yaml.

Checking out develop branches of GEOSgcm_GridComp and GEOSgcm_App

To get the development branches of GEOSgcm_GridComp and GEOSgcm_App (the equivalent of the -develop flag to parallel_build.csh), run the corresponding mepo command. Since mepo knows (via components.yaml) what the development branch of each sub-repository is, the command is simply:

mepo develop GEOSgcm_GridComp GEOSgcm_App

This must be done after mepo clone as it is running a git command in each sub-repository.

Build the Model

Load Compiler, MPI Stack, and Baselibs

On tcsh:

source @env/g5_modules

or on bash:

source @env/g5_modules.sh

Create Build Directory

We currently do not allow in-source builds of GEOSgcm. So we must make a directory:

mkdir build

The advantage of this is that you can build both Debug and Release versions from the same clone if desired.

Run CMake

CMake generates the Makefiles needed to build the model.

cd build
cmake .. -DBASEDIR=$BASEDIR/Linux -DCMAKE_Fortran_COMPILER=ifort -DCMAKE_INSTALL_PREFIX=../install

This will install to a directory parallel to your build directory. If you prefer to install elsewhere change the path in:

-DCMAKE_INSTALL_PREFIX=<path>

and CMake will install there.

Build and Install with Make
make -j6 install
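Putting the multi-step path together, the whole sequence for a bash user reads as follows. This is just a transcript of the steps above (it assumes the tagged clone from earlier and only works where BASEDIR and the GEOS modules are available, i.e. on discover or pleiades):

```shell
cd GEOSgcm
mepo clone                          # assemble sub-repositories
source @env/g5_modules.sh           # compiler, MPI stack, and Baselibs (bash)
mkdir build && cd build
cmake .. -DBASEDIR=$BASEDIR/Linux \
         -DCMAKE_Fortran_COMPILER=ifort \
         -DCMAKE_INSTALL_PREFIX=../install
make -j6 install                    # gcm_setup etc. end up in ../install/bin
```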

Running GEOS GCM

Passwordless Logins

First of all, to run jobs on the cluster you will need to set up passwordless ssh (which operates within the cluster, between the nodes running the job). To do so, run the following from your ~/.ssh directory on discover:

 cat id_rsa.pub >> authorized_keys

Similarly, transferring the daily output files (in monthly tarballs) requires passwordless authentication from discover to dirac. While in ~/.ssh on discover, run

 ssh-copy-id -i id_rsa.pub dirac

Then, log into dirac and cut and paste the contents of the id_rsa.pub file on discover into the ~/.ssh/authorized_keys file on dirac. Problems with ssh should be referred to NCCS support.
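The append step can be sketched on throwaway files to show the mechanics (the key below is a fabricated placeholder; on discover the real files live in ~/.ssh, and sshd also requires that authorized_keys not be group- or world-writable):

```shell
# Demonstration on stand-in files; on discover, work in ~/.ssh instead
mkdir -p /tmp/ssh-demo && cd /tmp/ssh-demo
rm -f authorized_keys
echo "ssh-rsa AAAAB3...placeholder user@discover" > id_rsa.pub
cat id_rsa.pub >> authorized_keys    # the append step from the text
chmod 600 authorized_keys            # restrictive permissions, required by sshd
wc -l < authorized_keys              # one key line present
```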

DSA Keys

Note: Due to evolving security requirements, it is recommended not to use DSA keys. NAS currently doesn't allow them, and RSA and ED25519 keys are considered "better" anyway.

Setting up a model run

Once the model has built successfully, you will have an install/ directory in your checkout. To run gcm_setup go to the install/bin/ directory and run it there:

cd install/bin
./gcm_setup

The gcm_setup script asks you to provide an experiment name:

Enter the Experiment ID:

Your experiment name (later called EXPID) should be one word with no spaces, not starting with a numeral. Then the script will ask for a description:

Enter a 1-line Experiment Description:

Spaces are ok here. Next it will ask if you wish to CLONE an experiment. If yes, you can point it to another experiment and the setup script will try to duplicate all the RC and related files. For now, though, choose NO to create a new experiment.

Do you wish to CLONE an old experiment? (Default: NO or FALSE)

It will now ask you for the atmospheric model resolution, expecting the code for one of the displayed resolutions.

Enter the Atmospheric Horizontal Resolution code:
--------------------------------------
            Cubed-Sphere
--------------------------------------
   c48  --  2   deg
   c90  --  1   deg
   c180 -- 1/2  deg (56-km)
   c360 -- 1/4  deg (28-km)
   c720 -- 1/8  deg (14-km)
   c1440 - 1/16 deg ( 7-km)
             DYAMOND Grids
   c768 -- 1/8  deg (12-km)
   c1536 - 1/16 deg ( 6-km)
   c3072 - 1/32 deg ( 3-km)

For your first time out you will probably want to enter c48 (corresponding to ~2 degree resolution with the cubed sphere).

Next it will ask you about the vertical resolution:

Enter the Atmospheric Model Vertical Resolution: LM (Default: 72)

The next question is about using IOSERVER:

Do you wish to IOSERVER? (Default: NO or FALSE)

The "default" answer to this will change depending on the resolution you choose. For now, just accept the default.

Next is a question that asks what processor you wish to run on. For example, on discover at NCCS:

Enter the Processor Type you wish to run on:
   hasw (Haswell) (default)
   sky  (Skylake)

NOTE: At present you need access to special queues to use the Skylake, so choosing Haswell is usually a better option.

After this are questions involving the ocean model:

Do you wish to run the COUPLED Ocean/Sea-Ice Model? (Default: NO or FALSE)

Enter the Data_Ocean Horizontal Resolution code: o1 (1  -deg,  360x180  Reynolds) Default
                                                 o2 (1/4-deg, 1440x720  MERRA-2)
                                                 o3 (1/8-deg, 2880x1440 OSTIA)
                                                 CS (Cubed-Sphere OSTIA)

Then Land model:

Enter the choice of  Land Surface Boundary Conditions using: 1 (Default: Icarus), 2 (Latest Icarus-NL)

Then the aerosols:

Do you wish to run GOCART with Actual or Climatological Aerosols? (Enter: A (Default) or C)

Enter the GOCART Emission Files to use: MERRA2 (Default), PIESA, CMIP, NR, MERRA2-DD or OPS:

After this are some questions about various setups in the model. The default is often your best bet.

Enter the tag or directory (/filename) of the HISTORY.AGCM.rc.tmpl to use
(To use HISTORY.AGCM.rc.tmpl from current build, Type:  Current         )
-------------------------------------------------------------------------
Hit ENTER to use Default Tag/Location: (Current)

NOTE: Many things are easier if your HOME and EXPERIMENT directories are in the same place. For the next two prompts, look carefully at the defaults and make sure they both point to the same nobackup location.

Enter Desired Location for the HOME Directory (to contain scripts and RC files)
Hit ENTER to use Default Location:
----------------------------------
Default:  ~USER/geos5/EXPID
/discover/nobackup/USER/EXPID 

Enter Desired Location for the EXPERIMENT Directory (to contain model output and restart files)
Hit ENTER to use Default Location:
----------------------------------
Default:  /discover/nobackup/USER/EXPID

Enter Location for Build directory containing:  src/ Linux/ etc...
Hit ENTER to use Default Location:
----------------------------------
Default:  /discover/nobackup/USER/GEOSgcm/install

After these it will ask you for a group ID -- the default for this writer is g0620 (GMAO modeling group). Enter whatever is appropriate, as necessary.

 Current GROUPS: g0620 gmaoint
Enter your GROUP ID for Current EXP: (Default: g0620)
-----------------------------------

The script will produce some messages and create an experiment directory (EXPDIR) in your space as /discover/nobackup/USERID/EXPID, which contains the files and sub-directories:

  • AGCM.rc -- resource file with specifications of boundary conditions, initial conditions, parameters, etc.
  • archive/ -- contains job script for archiving output
  • CAP.rc -- resource file with run job parameters
  • convert -- contains job script that converts restarts (initial condition files) from older model versions
  • ExtData.rc -- sample resource file for external data, not used
  • forecasts/ -- contains scripts used for data assimilation mode
  • fvcore_layout.rc -- resource file with layout settings for the finite-volume (FV) dynamical core
  • gcm_run.j -- run script
  • GEOSgcm.x -- model executable
  • HISTORY.rc -- resource file specifying the fields in the model that are output as data
  • plot/ -- contains plotting job script template and .rc file
  • post/ -- contains the script template and .rc file for post-processing model output
  • RC/ -- contains resource files for various components of the model
  • regress/ -- contains scripts for doing regression testing of model
  • src -- directory with a tarball of the model version's source code

The post-processing script will generate the archiving and plotting scripts as it runs. The setup script that you ran also creates an experiment home directory (HOMEDIR), either in ~USERID/geos5/EXPID (if you accepted the default) or in /discover/nobackup/USERID/EXPID (if you followed the above advice), containing the run scripts and GEOS resource (.rc) files.

Running GEOS

Before running the model, there is some more setup to be completed. The run scripts need some environment variables set in ~/.cshrc (regardless of which login shell you use -- the GEOS scripts use csh). Here are the minimum contents of a .cshrc:

umask 022
unlimit
limit stacksize unlimited
set arch = `uname`

The umask 022 is not strictly necessary, but it will make the various files readable to others, which will facilitate data sharing and user support. Your home directory ~USERID is also inaccessible to others by default; running chmod 755 ~ is helpful.

Copy the restart (initial condition) files and associated cap_restart into EXPDIR. You can get an arbitrary set of restarts by copying the contents of the directory /discover/nobackup/mathomp4/Restarts-J10/nc4/Reynolds/c48, containing 2-degree cubed sphere restarts from April 14, 2000, and their corresponding cap_restart.

The script you submit, gcm_run.j, is in HOMEDIR. It should be ready to go as is. The parameter END_DATE in CAP.rc can be set to the date you want the run to stop. Submit the job with sbatch gcm_run.j. You can keep track of it with squeue or squeue -u USERID, or follow stdout with tail -f EXPDIR/slurm-JOBID.out, JOBID being returned by sbatch and displayed with squeue. Jobs can be killed with scancel JOBID.

Output and Plots

During a normal run, the gcm_run.j script will run the model for the segment length (the current default is 15 days of model time). The model creates output files (with an nc4 extension), also called collections (of output variables), in the EXPDIR/scratch directory. After each segment, the script moves the output to EXPDIR/holding and spawns a post-processing batch job which partitions and moves the output files within the holding directory to their own distinct collection directories, which are again partitioned into the appropriate year and month. The post-processing script then checks to see if a full month of data is present. If not, the post-processing job ends. If there is a full month, the script will run the time-averaging executable to produce a monthly mean file in EXPDIR/geosgcm_*. The post-processing script then spawns a new batch job which will archive the data onto the mass-storage drives (/archive/u/USERID/GEOS5.0/EXPID).

If a monthly average file was made, the post-processing script will also check to see if it should spawn a plot job. Currently, our criteria for plotting are: 1) if the month created was February or August, AND 2) there are at least 3 monthly average files, then a plotting job for the seasons DJF or JJA will be issued. The plots are created as gifs in EXPDIR/plots_CLIM.

The post-processing script can be found in: GEOSagcm/src/GMAO_Shared/GEOS_Util/post/gcmpost.script. The nc4 output files can be opened and plotted with grads -- see http://www.iges.org/grads/gadoc/tutorial.html for a tutorial, but use sdfopen instead of open.

The contents of the output files (including which variables get saved) may be configured in the HOMEDIR/HISTORY.rc -- a good description of this file may be found at http://modelingguru.nasa.gov/clearspace/docs/DOC-1190 .

What Happens During a Run

When the script gcm_run.j starts running, it creates a directory called scratch and copies or links into it the model executable, rc files, restarts and boundary conditions necessary to run the model. It also creates a directory for each of the output collections (named with the geosgcm_ prefix in the default setup), both in the holding directory (for output awaiting post-processing) and in the experiment directory (for post-processed output). It also tars the restarts and moves the tarball to the restarts directory.

Then the executable GEOSgcm.x is run in the scratch directory, starting with the date in cap_restart and running for the length of a segment. A segment is the length of model time that the model integrates before returning, letting gcm_run.j do some housekeeping and then running another segment. A model job will typically run a number of segments before trying to resubmit itself, hopefully before the allotted wallclock time of the job runs out.

The processing that the various batch jobs perform (run, post-processing, archiving, and plotting) is described below.


Each time a segment ends, gcm_run.j submits a post-processing job before starting a new segment or exiting. The post-processing job moves the model output from the scratch directory to the respective collection directory under holding. Then it determines whether there is enough output to create a monthly or seasonal mean, and if so, creates them and moves them to the collection directories in the experiment directory, and then tars up the daily output and submits an archiving job. The archiving job tries to move the tarred daily output, the monthly and seasonal means and any tarred restarts to the user's space in the archive filesystem. The post-processing script also determines (assuming the default settings) whether enough output exists to create plots; if so, a plotting job is submitted to the queue. The plotting script produces a number of pre-determined plots as .gif files in the plot_CLIM directory in the experiment directory.

You can check on jobs in the queue with squeue (on NCCS) or qstat (on NAS). The jobs associated with the run will be named with the experiment name appended with the type of job it is: RUN, POST, ARCH or PLT.

As explained above, the contents of the cap_restart file determine the start of the model run in model time, which determines boundary conditions and the times stamps of the output. The end time may be set in CAP.rc with the property END_DATE (format YYYYMMDD HHMMSS, with a space), though integration is usually leisurely enough that one can just kill the job or rename the run script gcm_run.j so that it is not resubmitted to the job queue.
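For example (hypothetical date), to stop a run at 0z on 1 May 2000, CAP.rc would contain a line like this, with the space between the YYYYMMDD and HHMMSS parts:

```
END_DATE: 20000501 000000
```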

Tuning a run

Most of the other properties in CAP.rc are discussed elsewhere, but two that are important for understanding how the batch jobs work are JOB_SGMT, the length of a segment, and NUM_SGMT, the number of segments that the job tries to run before resubmitting itself and exiting. JOB_SGMT is in the format YYYYMMDD HHMMSS (but usually expressed in days) and NUM_SGMT is an integer, so the product of the two is the total model time that a job will attempt to run. It may be tempting to just run one long segment, but much housekeeping is done between segments, such as saving state in the form of restarts and spawning archiving jobs that keep your account from running over disk quota. So to tune for the maximum amount of model time in a job, it is usually best to manipulate JOB_SGMT.
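As a concrete (hypothetical) illustration of JOB_SGMT times NUM_SGMT: with 15-day segments and 4 segments per job, each batch job advances the model 60 days before resubmitting itself:

```shell
# Hypothetical CAP.rc settings:
#   JOB_SGMT: 00000015 000000   (15-day segments)
#   NUM_SGMT: 4                 (segments per batch job)
SGMT_DAYS=15
NUM_SGMT=4
echo "model days per job: $((SGMT_DAYS * NUM_SGMT))"
```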

Determining Output: HISTORY.rc

The contents of the file HISTORY.rc (in your experiment HOME directory) tell the model what and how to output its state and diagnostic fields. The default HISTORY.rc provides many fields as is, but you may want to modify it to suit your needs.

File format

The top of a default HISTORY.rc will look something like this:

EXPID:  myexp42
EXPDSC: this_is_my_experiment
  
 
COLLECTIONS: 'geosgcm_prog'
             'geosgcm_surf'
             'geosgcm_moist'
             'geosgcm_turb'

[....]

The attribute EXPID must match the name of the experiment HOME directory; this is only an issue if you copy the HISTORY.rc from a different experiment. The EXPDSC attribute is used to label the plots. The COLLECTIONS attribute contains a list of strings naming the output collections to be created. The contents of the individual collections are specified after this list. Individual collections can be "turned off" by commenting out the relevant line with a #.
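For instance, to turn off the geosgcm_turb collection from the default list shown above while keeping the rest, only a # changes:

```
COLLECTIONS: 'geosgcm_prog'
             'geosgcm_surf'
             'geosgcm_moist'
#            'geosgcm_turb'
```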

The following is an example of a collection specification:

  geosgcm_prog.template:  '%y4%m2%d2_%h2%n2z.nc4',
  geosgcm_prog.archive:   '%c/Y%y4',
  geosgcm_prog.format:    'CFIO',
  geosgcm_prog.frequency:  060000,
  geosgcm_prog.resolution: 144 91,
  geosgcm_prog.vscale:     100.0,
  geosgcm_prog.vunit:     'hPa',
  geosgcm_prog.vvars:     'log(PLE)' , 'DYN'          ,
  geosgcm_prog.levels:     1000 975 950 925 900 875 850 825 800 775 750 725 700 650 600 550 500 450 400 350 300 250 200 150 100 70 50 40 30 20 10 7 5 4 3 2 1 0.7 0.5 0.4 0.3 0.2
0.1 0.07 0.05 0.04 0.03 0.02,
  geosgcm_prog.fields:    'PHIS'     , 'AGCM'         ,
                          'T'        , 'DYN'          ,
                          'PS'       , 'DYN'          ,
                          'ZLE'      , 'DYN'          , 'H'   ,
                          'OMEGA'    , 'DYN'          ,
                          'Q'        , 'MOIST'        , 'QV'  ,
                          ::

The individual collection attributes are described below, but what users modify most is the fields attribute. This determines which exports are saved in the collection. Each field record is a string with the name of an export from the model followed by a string with the name of the gridded component that exports it, separated by a comma. The entries with a third column determine the name under which that export is saved in the collection file when it differs from the name of the export.

There is a good description of available collection options at Modeling Guru: https://modelingguru.nasa.gov/docs/DOC-1190

What exports are available?

To add export fields to the HISTORY.rc you will need to know what fields the model provides, which gridded component provides them, and their name. The most straightforward way to do this is to use PRINTSPEC. The setting for PRINTSPEC is in the file CAP.rc. By default the line looks like so:

PRINTSPEC: 0  # (0: OFF, 1: IMPORT & EXPORT, 2: IMPORT, 3: EXPORT)

Setting PRINTSPEC to 3 will make the model send to standard output a list of exports available to HISTORY.rc in the model's current configuration, and then exit without integrating. The list includes each export's gridded component and short name (both necessary to include in HISTORY.rc), long (descriptive) name, units, and number of dimensions. Note that run-time options can affect the exports available, so see to it that you have those set as you intend. The other PRINTSPEC values are useful for debugging.
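The change itself is one line in CAP.rc; here is a sketch of the edit, performed on a stand-in copy so the commands are self-contained (your real CAP.rc lives in the experiment directory):

```shell
# Make a stand-in CAP.rc and flip PRINTSPEC from 0 to 3
printf 'PRINTSPEC: 0  # (0: OFF, 1: IMPORT & EXPORT, 2: IMPORT, 3: EXPORT)\n' > /tmp/CAP.rc.demo
sed -i 's/^PRINTSPEC: 0/PRINTSPEC: 3/' /tmp/CAP.rc.demo
grep '^PRINTSPEC' /tmp/CAP.rc.demo   # now reads PRINTSPEC: 3 ...
```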

While you can set PRINTSPEC, submit gcm_run.j with sbatch, and get the export list in the job's standard output, there are quicker ways of obtaining the list. One way is to run the model as a single column model on a single processor, as explained in Jason Single Column Model. Another way is to run it in an existing experiment. In the scratch directory of an experiment that has already run, change PRINTSPEC in CAP.rc as above. Then, in the file AGCM.rc, change the values of NX and NY (near the beginning of the file) to 1. Then, from an interactive job (one processor will suffice), run the executable GEOSgcm.x in scratch. You will need to run source src/g5_modules in the model's build tree to set up the environment. The model executable will simply output the export list to stdout.

Outputting Derived Fields

In addition to writing export fields created by model components (we will refer to these as model fields), the user may specify new fields that will be evaluated using the MAPL parser. These will be referred to as derived fields in the following discussion. The derived fields are evaluated using an expression that involves other fields in the collection as variables. The expression is evaluated element by element to create a new field. Derived fields are specified like a regular field from a gridded component in a history collection with 3 comma separated strings. The difference is now that in place of a variable name string, an expression string that will be evaluated is entered. Following this comes the string specifying the gridded component. You MUST put a string here, which should be the name of a gridded component. Finally a string MUST be entered which is the name of the new variable. This will be the name of the variable in the output file. In general the expression entered will involve variables, functions, and real numbers. The derived fields are evaluated before time and spatial (vertical and horizontal) averaging.

Here are some rules about expressions

  1. Fields in expression can only be model fields.
  2. If the model field has an alias you must use the alias in the expression.
  3. You can not mix center and edge fields in an expression. You can mix 2D and 3D fields if the 3D fields are all center or all edge. In this case each level of the 3D field is combined with the 2D field. Another way to think of this is that in an expression involving a 2D and a 3D field, the 2D field gets promoted to a 3D field with the same data in each level.
  4. When parsing an expression the parser first checks if the fields in an expression are part of the collection. Any model field in a collection can be used in an expression in the same collection. However, there might be cases where you wish to output an expression but not the model fields used in the expression. In this case if the parser does not find the field in the collection it checks the gridded component name after the expression for the model field. If the field is found in the gridded component it can use it in the expression. Note that if you have an expression with two model fields from different gridded components you can not use this mechanism to output the expression without outputting either field. One of them must be in the collection.
  5. The alias of an expression can not be used in a subsequent expression.

Here are the rules for the expressions themselves. The following can appear in the expression string:

  1. The function string can contain the following mathematical operators: +, -, *, /, ^ and ()
  2. Variable names - Parsing of variable names is case sensitive.
  3. The following single argument fortran intrinsic functions and user defined functions are implemented: exp, log10, log, sqrt, sinh, cosh, tanh, sin, cos, tan, asin, acos, atan, heav (the Heaviside step function). Parsing of functions is case insensitive.
  4. Integers or real constants. To be recognized as explicit constants these must conform to the format [+|-][nnn][.nnn][e|E|d|D][+|-][nnn] where nnn means any number of digits. The mantissa must contain at least one digit before or following an optional decimal point. Valid exponent identifiers are 'e', 'E', 'd' or 'D'. If they appear they must be followed by a valid exponent.
  5. Operations are evaluated in the order
    1. expressions in brackets
    2. -X unary minus
    3. X^Y exponentiation
    4. X*Y X/Y multiplication and division
    5. X+Y X-Y addition and subtraction

In the following example we create a collection that has three derived fields: the magnitude of the wind, the temperature in Fahrenheit, and the temperature cubed:

  geosgcm_prog.template:  '%y4%m2%d2_%h2%n2z.nc4',
  geosgcm_prog.archive:   '%c/Y%y4',
  geosgcm_prog.format:    'CFIO',
  geosgcm_prog.frequency:  060000,
  geosgcm_prog.resolution: 144 91,
  geosgcm_prog.fields:    'U'             , 'DYN'          ,
                          'V'             , 'DYN'          ,
                          'T'             , 'DYN'          ,
                          'sqrt(U*U+V*V)' , 'DYN'          , 'Wind_Magnitude'   ,
                          '(T-273.15)*1.8+32.0' , 'DYN'    , 'TF' ,
                          'T^3'           , 'DYN',         'T3' ,
                          ::
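As a sanity check on what the parser computes element by element, the three derived expressions above can be evaluated at one hypothetical grid point (U=3 and V=4 m/s, T=300 K) with awk, whose operator syntax (including ^) matches the rules listed earlier:

```shell
awk 'BEGIN {
  U = 3; V = 4; T = 300                                  # hypothetical point values
  printf "Wind_Magnitude = %g\n", sqrt(U*U + V*V)        # 5
  printf "TF             = %g\n", (T - 273.15)*1.8 + 32  # 80.33
  printf "T3             = %g\n", T^3                    # 2.7e+07
}'
```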



Back to Documentation for GEOS GCM v10

If you have any issues or questions, please email the GMAO SI Team at siteam_AT_gmao.gsfc.nasa.gov