GEOS GCM Quick Start: Difference between revisions

(12 intermediate revisions by the same user not shown)

Line 1:

This page describes the minimum steps required to build and run GEOS GCM on NCCS discover and NAS pleiades. '''You should successfully complete the steps in these instructions before doing anything more complicated. Also, it is helpful to read this page in its entirety before starting.'''

If you have any issues or questions, please email the GMAO SI Team at siteam_AT_gmao.gsfc.nasa.gov

'''Back to [[Documentation for GEOS GCM v10]]'''

== How to ~~Obtain~~ GEOS GCM ~~and Compile Source Code~~ ==

= How to build GEOS GCM =

== Preliminary Steps ==

=== Load Build Modules ===

In your <code>.bashrc</code> or <code>.tcshrc</code> or other rc file add a line:

==== NCCS ====

module use -a /discover/swdev/gmao_SIteam/modulefiles-SLES12

There are two options for obtaining the model source code: from the CVS repository on the NCCS progress server, and from the SVN "public" repository on the trac server. Since the code on progress is more current, eligible users are strongly encouraged to obtain accounts from NCCS and use the progress repository.

==== NAS ====

~~=== Using the NCCS CVS code repository ===~~

module use -a /nobackup/gmao_SIteam/modulefiles

The following assumes that you know your way around Unix, have successfully logged into your cluster account and have an account on the source code repository with the proper <code>ssh</code> configuration -- see the NCCS repository quick start pages at: https://www.nccs.nasa.gov/trac/admin/wiki/QuickStart. The link requires your NCCS username and password. The recommended SSH config setup for CVS on discover is:

==== GMAO Desktops ====

~~Host cvsacldirect~~

On the GMAO desktops, the SI Team modulefiles should automatically be part of running <code>module avail</code> but if not, they are in:

~~HostName cvsacl.nccs.nasa.gov~~

~~Port 22223~~

That's it. Progress is not needed unless you specifically know you need it. It won't hurt to add it, but at present it isn't needed. Also, you'll need to generate RSA or ED25519 keys and upload them (this is mentioned in the quick start page above) to https://www.nccs.nasa.gov/~~keyupload~~/~~. (NOTE: DSA keys are not recommended as some sites, e.g., NAS, have started disallowing them.) The usual way of doing this is to go to your <tt>.ssh<~~/~~tt> directory and run <tt>ssh-keygen<~~/~~tt>, making a key with no password:~~

module use -a /ford1/share/gmao_SIteam/modulefiles

~~$ cd $HOME/~~.~~ssh~~

Also do this in any interactive window you have open. This gives you access to the module files needed to check out and build the model correctly.

~~$ ssh-keygen -o -a 100 -b 3072 -t rsa~~

~~The commands below assume that your shell is <code>csh</code>. Since the scripts to build and run GEOS tend to be written in~~ the ~~same, you shouldn't bother trying to import too much into an alternative shell. If you prefer a different shell, it is easiest just to open a~~ <code>~~csh~~</code> ~~process to build the model and your experiment.~~

Now load the <code>GEOSenv</code> module:

~~Furthermore, model builds should be created in your space under <code>/discover/nobackup</code>, as creating them under your home directory will quickly wipe out your disk quota.~~

module load GEOSenv

~~Set~~ the ~~following three environment variables:~~

which obtains the latest <code>git</code>, <code>CMake</code>, and <code>mepo</code> modules.

~~setenv CVS_RSH ssh~~

== Cloning the Model ==

~~setenv CVSROOT :ext:''USERID''@cvsacldirect:/cvsroot/esma~~

~~where ''USERID''~~ is~~, of course,~~ your ~~repository username, which should be the same as~~ your ~~NASA and NCCS username~~. ~~Then, issue the command:~~

GEOS is now hosted on GitHub. The first thing to do is to [https://github.com/join create a GitHub account] and [https://help.github.com/en/github/authenticating-to-github/adding-a-new-ssh-key-to-your-github-account add your SSH key] to it.

~~cvs co -r Jason-3_0 GEOSagcm~~

You can then clone the model with:

~~This should check out the latest stable version of the model from the repository and create a directory called <code>GEOSagcm<~~/~~code>~~.

git clone -b v10.17.0 git@github.com:GEOS-ESM/GEOSgcm.git

~~==== CVS Errors ====~~

where <code>-b v10.17.0</code> refers to a release tag of GEOS GCM. Information on the various releases can be found on the [https://github.com/GEOS-ESM/GEOSgcm/releases Releases page].

~~If the CVS checkout doesn't work for you, there are many possibilities.~~

=== HTTPS Access ===

~~===== Keyupload =====~~

GEOS can also be cloned via https with:

~~If the error says something about "keyupload", then the key upload either hasn't taken hold yet (can take 5~~-~~10 minutes to work after uploading the key), or, perhaps, the wrong key was uploaded~~.

git clone -b v10.17.0 https://github.com/GEOS-ESM/GEOSgcm.git

==~~=== Access denied for this host ===~~==

== Building GEOS ==

~~If you see something like "access denied for this host", then your best bet is to contact NCCS. Per a response from NCCS to a user that had something similar happen, they need to add~~ the ~~CVS hosts to an LDAP entry.~~

=== Single Step Building of the Model ===

~~===== Failed~~ to ~~create lock~~/~~permission denied =====~~

If all you wish to do is build the model, you can run <code>parallel_build.csh</code> from a head node. Doing so will check out all the external repositories of the model and build it. When done, the resulting model build will be found in <code>build/</code> and the installation will be found in <code>install/</code> with setup scripts like <code>gcm_setup</code> and <code>fvsetup</code> in <code>install/bin</code>.

~~If you see something like:~~

==== Develop Version of GEOS GCM ====

~~cvs checkout: failed to create lock directory for `/cvsroot/esma/CVSROOT' (/cvsroot/esma/CVSROOT/#cvs.history~~.~~lock): Permission denied~~

<code>parallel_build.csh</code> provides a special flag for checking out the development branches of GEOSgcm_GridComp and GEOSgcm_App. If you run:

~~cvs checkout: failed to obtain history lock in repository `~~/~~cvsroot/esma'~~

~~cvs checkout: Updating src~~

~~cvs checkout: failed to create lock directory~~ for ~~`/cvsroot/esma/esma/src/Applications/GEOSdas' (/cvsroot/esma/esma/src/Applications/GEOSdas/#cvs~~.~~lock): Permission denied~~

~~cvs checkout: failed to obtain dir lock in repository `/cvsroot/esma/esma/src/Applications/GEOSdas'~~

~~cvs [checkout aborted]~~: ~~read lock failed - giving up~~

~~this means you don't have a home directory on progress~~. ~~Try doing~~:

<pre>parallel_build.csh -develop</pre>

then <code>mepo</code> will run:

~~$ ssh progress.nccs.nasa.gov~~

<pre>mepo develop GEOSgcm_GridComp GEOSgcm_App</pre>

==== Debug Version of GEOS GCM ====

~~You'll enter your PASSCODE and password and then it'll seem like the terminal is "stuck"~~. ~~Just hit Ctrl~~-C. ~~Now try the CVS command again~~.

To obtain a debug version, you can run <code>parallel_build.csh -debug</code> which will build with debugging flags. This will build in <code>build-Debug/</code> and install into <code>install-Debug/</code>.

=== ~~Compiling the Model~~ ===

~~First~~, you ~~need to set~~ <code>~~ESMADIR~~</code>. ~~For example, if your~~ <code>~~src~~/</code> ~~directory is:~~

~~/discover/nobackup/mathomp4/Models/Jason-3_0/GEOSagcm/src~~

=== Multiple Steps for Building the Model ===

~~then~~ you should ~~set:~~

The steps detailed below are essentially those that <code>parallel_build.csh</code> performs for you. Either method should yield identical builds.

~~setenv ESMADIR /discover/nobackup/mathomp4/Models/Jason-3_0/GEOSagcm~~

==== Mepo ====

~~Next~~, ~~we need to source~~ <code>~~g5_modules~~</code> ~~with~~:

GEOS GCM is composed of a set of sub-repositories. These are managed by a tool called [https://github.com/GEOS-ESM/mepo mepo]. To clone all the sub-repos, you can run <code>mepo clone</code> inside the fixture (the top-level <code>GEOSgcm</code> directory):

~~source $ESMADIR~~/~~src~~/~~g5_modules~~

<pre>cd GEOSgcm

mepo clone</pre>

The first command moves into the fixture, and the second one clones and assembles all the sub-repositories according to <code>components.yaml</code>.

~~This will set up the build environment. If you then type~~

==== Checking out develop branches of GEOSgcm_GridComp and GEOSgcm_App ====

~~module list~~

To get the development branches of GEOSgcm_GridComp and GEOSgcm_App (a la the <code>-develop</code> flag for <code>parallel_build.csh</code>), one needs to run the equivalent <code>mepo</code> command. Since mepo itself knows (via <code>components.yaml</code>) what the development branch of each sub-repository is, the equivalent of <code>-develop</code> for <code>mepo</code> is to check out the development branches of GEOSgcm_GridComp and GEOSgcm_App:

~~you should see:~~

<pre>mepo develop GEOSgcm_GridComp GEOSgcm_App</pre>

This must be done ''after'' <code>mepo clone</code> as it is running a git command in each sub-repository.

==== Build the Model ====

~~Currently Loaded Modulefiles:~~

===== Load Compiler, MPI Stack, and Baselibs =====

~~1) other/comp/gcc-6.3~~

~~2) comp/intel-18.0.1.163~~

~~3) mpi/sgi-mpt-2.17~~

~~4) lib/mkl-18.0.1.163~~

~~5) other/SIVO-PyD/spd_1.25.0_gcc-6.3_mkl-17.0.4.196~~

~~If this all worked, then type~~:

On tcsh:

~~cd $ESMADIR~~/~~src~~

<pre>source @env/g5_modules

~~gmake install~~

</pre>

or on bash:

~~This will build the model. It will take about 30 minutes. If this works, it should create a directory under <code>GEOSagcm</code> called~~ <~~code~~>~~Linux/bin<~~/~~code>~~. ~~In here you should find the executable: <code>GEOSgcm.x~~</~~code~~> .

<pre>source @env/g5_modules.sh

</pre>

===== Create Build Directory =====

~~== Setting up~~ a ~~Run ==~~

We currently do not allow in-source builds of GEOSgcm. So we must make a directory:

=== Passwordless Logins ===

<pre>mkdir build

</pre>

The advantage of this is that you can build both a Debug and a Release version from the same clone if desired.

===== Run CMake =====

CMake generates the Makefiles needed to build the model.

<pre>cd build

cmake .. -DBASEDIR=$BASEDIR/Linux -DCMAKE_Fortran_COMPILER=ifort -DCMAKE_INSTALL_PREFIX=../install

</pre>

This will install to a directory parallel to your <code>build</code> directory. If you prefer to install elsewhere change the path in:

<pre>-DCMAKE_INSTALL_PREFIX=<path>

</pre>

and CMake will install there.

===== Build and Install with Make =====

<pre>make -j6 install

</pre>

= Running GEOS GCM =

== Passwordless Logins ==

First of all, to run jobs on the cluster you will need to set up passwordless <code>ssh</code> (which operates within the cluster, between the nodes running the job). To do so, run the following from your '''discover''' home directory:

Line 112:

Line 143:

Then, log into '''dirac''' and cut and paste the contents of the <code>id_rsa.pub</code> file on '''discover''' into the <code>~/.ssh/authorized_keys</code> file on '''dirac'''. Problems with <code>ssh</code> should be referred to NCCS support.

==== DSA Keys ====

=== DSA Keys ===

Note: As security practices have evolved, DSA keys are no longer recommended. NAS currently does not allow them, and RSA and ED25519 keys are considered "better" anyway.

=== Setting up a model run ===

== Setting up a model run ==

~~To set~~ the model ~~up to~~ run~~, cd~~ to <code>~~GEOSagcm/src~~/~~Applications~~/~~GEOSgcm_App~~</code> and run:

Once the model has built successfully, you will have an <code>install/</code> directory in your checkout. To run <code>gcm_setup</code> go to the <code>install/bin/</code> directory and run it there:

cd install/bin

./gcm_setup

Line 130:

Line 162:

Enter a 1-line Experiment Description:

Spaces are ok here. It will ~~then~~ ask ~~for the source code version tag~~ to ~~associate with the model --~~ you ~~should hit enter for~~ the ~~default:~~

Spaces are ok here. Next it will ask if you wish to CLONE an experiment. If yes, you can point this to another experiment and the setup script will try and duplicate all the RC, etc. files. For now, though, choose NO to create a new experiment.

~~Enter an Experiment Source Tag for History (Default: Jason-3_0):~~

~~Hit enter for~~ the ~~default~~ to ~~the next question:~~

Do you wish to CLONE an old experiment? (Default: NO or FALSE)

It will ~~also~~ ask you for the atmospheric model resolution, expecting the code for one of the displayed resolutions.

It will now ask you for the atmospheric model resolution, expecting the code for one of the displayed resolutions.

Enter the Atmospheric Horizontal Resolution code:

<nowiki>Enter the Atmospheric Horizontal Resolution code:

~~---------------------~~--------------------------------------

--------------------------------------

~~Lat/Lon~~ Cubed-Sphere

Cubed-Sphere

~~---------------------~~--------------------------------------

--------------------------------------

~~b -- 2 deg~~ c48 -- 2 deg

c48 -- 2 deg

~~c -- 1 deg~~ c90 -- 1 deg

c90 -- 1 deg

d -- 1/2 deg ~~c180~~ -- 1/2 deg (56-km)

c180 -- 1/2 deg (56-km)

e -- 1/4 deg (35-km) ~~c360 -~~- 1/4 deg (28-km)

c360 -- 1/4 deg (28-km)

~~c720~~ -- 1/8 deg (14-km)

c720 -- 1/8 deg (14-km)

~~c1440~~ - 1/16 deg ( 7-km)

c1440 - 1/16 deg ( 7-km)

DYAMOND Grids

c768 -- 1/8 deg (12-km)

c1536 - 1/16 deg ( 6-km)

c3072 - 1/32 deg ( 3-km)</nowiki>

For your first time out you will probably want to enter <code>c48</code> (corresponding to ~2 degree resolution with the cubed sphere). ~~On the next eight questions, hitting enter to accept the default~~ will ~~let~~ you ~~run a PChem run~~:

For your first time out you will probably want to enter <code>c48</code> (corresponding to ~2 degree resolution with the cubed sphere).

Next it will ask you about the vertical resolution:

Enter the Atmospheric Model Vertical Resolution: LM (Default: 72)

The next question is about using IOSERVER:

Do you wish to IOSERVER? (Default: NO or FALSE)

The "default" answer to this will change depending on the resolution you choose. For now, just accept the default.

Next is a question that asks what processor you wish to run on. For example, on discover at NCCS:

Enter the Processor Type you wish to run on:

hasw (Haswell) (default)

sky (Skylake)

NOTE: At present you need access to special queues to use the Skylake nodes, so choosing Haswell is usually a better option.

After this are questions involving the ocean model:

Do you wish to run the COUPLED Ocean/Sea-Ice Model? (Default: NO or FALSE)

Line 160:

Line 210:

o2 (1/4-deg, 1440x720 MERRA-2)

o3 (1/8-deg, 2880x1440 OSTIA)

CS (Cubed-Sphere OSTIA)

~~Do you wish to run GOCART with Actual or Climatological Aerosols? (~~Enter: A (Default) ~~or C~~)

Then Land model:

Enter the choice of Land Surface Boundary Conditions using: 1 (Default: Icarus), 2 (Latest Icarus-NL)

Then the aerosols:

Do you wish to run GOCART with Actual or Climatological Aerosols? (Enter: A (Default) or C)

Enter the GOCART Emission Files to use: MERRA2 (Default), PIESA, CMIP, NR, MERRA2-DD or OPS:

After this are some questions about various setups in the model. The default is often your best bet.

Enter the tag or directory (/filename) of the HISTORY.AGCM.rc.tmpl to use

Line 176:

Line 234:

Hit ENTER to use Default Location:

----------------------------------

Default: /discover/nobackup/''USER''/''EXPID''

Default: ~''USER''/geos5/''EXPID''

/discover/nobackup/''USER''/''EXPID''

Enter Desired Location for the EXPERIMENT Directory (to contain model output and restart files)

Hit ENTER to use Default Location:

----------------------------------

Default: ~~~''USER''/geos5/''EXPID''~~

Default: /discover/nobackup/''USER''/''EXPID''

/discover/nobackup/''USER''/''EXPID''

Enter Location for Build directory containing: src/ Linux/ etc...

Hit ENTER to use Default Location:

----------------------------------

Default: /discover/nobackup/''USER''/~~GEOSagcm~~

Default: /discover/nobackup/''USER''/GEOSgcm/install

After these it will ask you for a group ID -- the default for this writer is g0620 (GMAO modeling group). Enter whatever is appropriate, as necessary.

Line 194:

Line 252:

Enter your GROUP ID for Current EXP: (Default: g0620)

-----------------------------------

The script will produce some messages and create an experiment directory (''EXPDIR'') in your space as <code>/discover/nobackup/''USERID''/''EXPID''</code>, which contains the files and sub-directories:

Line 213:

Line 270:

*<code>regress/</code> -- contains scripts for doing regression testing of model

*<code>src</code> -- directory with a tarball of the model version's source code

The post-processing script will generate the archiving and plotting scripts as it runs. The setup script that you ran also creates an experiment home directory (''HOMEDIR''), either in <code>~''USERID''/geos5/''EXPID''</code> (if you accepted the default) or in <code>/discover/nobackup/''USERID''/''EXPID''</code> (if you followed the above advice), containing the run scripts and GEOS resource (<code>.rc</code>) files.

== Running GEOS-5 ==

== Running GEOS ==

Before running the model, there is some more setup to be completed. The run scripts need some environment variables set in <code>~/.cshrc</code> (regardless of which login shell you use -- the GEOS-5 scripts use <code>csh</code>). Here are the minimum contents of a <code>.cshrc</code>:

Before running the model, there is some more setup to be completed. The run scripts need some environment variables set in <code>~/.cshrc</code> (regardless of which login shell you use -- the GEOS scripts use <code>csh</code>). Here are the minimum contents of a <code>.cshrc</code>:

umask 022

Line 228:

Line 284:

The <code>umask 022</code> is not strictly necessary, but it will make the various files readable to others, which will facilitate data sharing and user support. Your home directory <code>~''USERID''</code> is also inaccessible to others by default; running <code>chmod 755 ~</code> is helpful.

Copy the restart (initial condition) files and associated <code>cap_restart</code> into ''EXPDIR''. You can get an arbitrary set of restarts by copying the contents of the directory <code>/discover/nobackup/mathomp4/Restarts-~~I30~~/nc4/Reynolds/c48</code>, containing 2-degree cubed sphere restarts from April 14, 2000, and their corresponding <code>cap_restart</code>.

Copy the restart (initial condition) files and associated <code>cap_restart</code> into ''EXPDIR''. You can get an arbitrary set of restarts by copying the contents of the directory <code>/discover/nobackup/mathomp4/Restarts-J10/nc4/Reynolds/c48</code>, containing 2-degree cubed sphere restarts from April 14, 2000, and their corresponding <code>cap_restart</code>.

The script you submit, <code>gcm_run.j</code>, is in ''HOMEDIR''. It should be ready to go as is. The parameter END_DATE in <code>CAP.rc</code> can be set to the date you want the run to stop. Submit the job with <code>sbatch gcm_run.j</code>. You can keep track of it with <code>squeue</code> or <code>squeue -u ''USERID''</code>, or follow stdout with <code>tail -f ''EXPDIR''/slurm-''JOBID''.out</code>, ''JOBID'' being returned by <code>sbatch</code> and displayed with <code>squeue</code>. Jobs can be killed with <code>scancel ''JOBID''</code>.

Line 247:

Line 303:

The contents of the output files (including which variables get saved) may be configured in the <code>''HOMEDIR''/HISTORY.rc</code> -- a good description of this file may be found at http://modelingguru.nasa.gov/clearspace/docs/DOC-1190 .

== What Happens During a Run ==

When the script <code>gcm_run.j</code> starts running, it creates a directory called <code>scratch</code> and copies or links into it the model executable, rc files, restarts and boundary conditions necessary to run the model. It also creates a directory for each of the output collections (in the default setup, those with the prefix <code>geosgcm_</code>), both in the directory <code>holding</code>, for output awaiting post-processing, and in the experiment directory, for output that has been post-processed. It also tars the restarts and moves the tarball to the <code>restarts</code> directory.

Then the executable <code>GEOSgcm.x</code> is run in the <code>scratch</code> directory, starting with the date in <code>cap_restart</code> and running for the length of a segment. A segment is the length of model time that the model integrates before returning, letting <code>gcm_run.j</code> do some housekeeping and then running another segment. A model job will typically run a number of segments before trying to resubmit itself, hopefully before the allotted wallclock time of the job runs out.

The processing that the various batch jobs perform is illustrated below:

[[Image:F2.5-job-diagram002.png]]

Each time a segment ends, <code>gcm_run.j</code> submits a post-processing job before starting a new segment or exiting. The post-processing job moves the model output from the <code>scratch</code> directory to the respective collection directory under <code>holding</code>. Then it determines whether there is enough output to create a monthly or seasonal mean, and if so, creates them and moves them to the collection directories in the experiment directory, and then tars up the daily output and submits an archiving job. The archiving job tries to move the tarred daily output, the monthly and seasonal means and any tarred restarts to the user's space in the <code>archive</code> filesystem. The post-processing script also determines (assuming the default settings) whether enough output exists to create plots; if so, a plotting job is submitted to the queue. The plotting script produces a number of pre-determined plots as <code>.gif</code> files in the <code>plot_CLIM</code> directory in the experiment directory.

You can check on jobs in the queue with <code>qstat</code> or, as noted above, <code>squeue</code>. The jobs associated with the run will be named with the experiment name followed by the type of job: RUN, POST, ARCH or PLT.

As explained above, the contents of the <code>cap_restart</code> file determine the start of the model run in model time, which determines boundary conditions and the time stamps of the output. The end time may be set in <code>CAP.rc</code> with the property <code>END_DATE</code> (format ''YYYYMMDD HHMMSS'', with a space), though integration is usually leisurely enough that one can just kill the job or rename the run script <code>gcm_run.j</code> so that it is not resubmitted to the job queue.
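As a rough illustration of how the start date in <code>cap_restart</code>, <code>END_DATE</code> and the segment length relate, the following Python sketch counts how many segments fit between the two dates. It assumes that <code>cap_restart</code> holds a single date in the same ''YYYYMMDD HHMMSS'' form as <code>END_DATE</code> and that a segment is a whole number of days. The function name and the sample dates are hypothetical; the real bookkeeping is done by <code>gcm_run.j</code> and the model itself.

```python
import math
from datetime import datetime

# 'YYYYMMDD HHMMSS', with a space, as used for END_DATE in CAP.rc.
DATE_FORMAT = "%Y%m%d %H%M%S"


def segments_remaining(cap_restart, end_date, segment_days):
    """Estimate how many segments are needed to get from the restart date to END_DATE.

    Assumption: cap_restart uses the same date format as END_DATE.
    """
    start = datetime.strptime(cap_restart.strip(), DATE_FORMAT)
    end = datetime.strptime(end_date.strip(), DATE_FORMAT)
    remaining_days = (end - start).total_seconds() / 86400.0
    # A partial segment still needs one more pass through the run loop.
    return max(0, math.ceil(remaining_days / segment_days))


# Hypothetical example: 30 days of model time in 15-day segments.
print(segments_remaining("20000414 210000", "20000514 210000", 15))
```

An estimate like this can help when choosing an <code>END_DATE</code> that lines up with a whole number of segments.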

=== Tuning a run ===

Most of the other properties in <code>CAP.rc</code> are discussed elsewhere, but two that are important for understanding how the batch jobs work are <code>JOB_SGMT</code>, the length of the segment, and <code>NUM_SGMT</code>, the number of segments that the job tries to run before resubmitting itself and exiting. <code>JOB_SGMT</code> is in the format of ''YYYYMMDD HHMMSS'' (but usually expressed in days) and <code>NUM_SGMT</code> is an integer, so the product of the two is the total model time that a job will attempt to run. It may be tempting to just run one long segment, but much housekeeping is done between segments, such as saving state in the form of restarts and spawning archiving jobs that keep your account from running over disk quota. So to tune for the maximum number of segments in a job, it is usually best to manipulate <code>JOB_SGMT</code>.
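The arithmetic can be sketched in Python as follows. This is only an illustration: the function names and the sample values (15-day segments, 4 segments per job) are hypothetical, and the sketch handles only segment lengths expressed in days, hours, minutes and seconds, since a length given in months or years does not map onto a fixed number of days.

```python
from datetime import timedelta


def parse_job_sgmt(value):
    """Parse a 'YYYYMMDD HHMMSS' segment length into a timedelta.

    Only the day, hour, minute and second parts are supported in this sketch.
    """
    date_part, time_part = value.split()
    years, months, days = int(date_part[:4]), int(date_part[4:6]), int(date_part[6:8])
    if years or months:
        raise ValueError("this sketch only handles segments of days or less")
    hours, minutes, seconds = int(time_part[:2]), int(time_part[2:4]), int(time_part[4:6])
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


def model_time_per_job(job_sgmt, num_sgmt):
    """Total model time one batch job attempts: segment length times segment count."""
    return parse_job_sgmt(job_sgmt) * num_sgmt


# Hypothetical settings: JOB_SGMT of 15 days, NUM_SGMT of 4.
print(model_time_per_job("00000015 000000", 4).days)
```

Comparing this total against the wallclock time a segment actually takes gives a starting point for choosing <code>JOB_SGMT</code>.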

== Determining Output: <code>HISTORY.rc</code> ==

The contents of the file <code>HISTORY.rc</code> (in your experiment <code>HOME</code> directory) tell the model what and how to output its state and diagnostic fields. The default <code>HISTORY.rc</code> provides many fields as is, but you may want to modify it to suit your needs.

===File format===

The top of a default <code>HISTORY.rc</code> will look something like this:

<pre>

EXPID: myexp42

EXPDSC: this_is_my_experiment

COLLECTIONS: 'geosgcm_prog'

'geosgcm_surf'

'geosgcm_moist'

'geosgcm_turb'

</pre>

[....]

The attribute <code>EXPID</code> must match the name of the experiment <code>HOME</code> directory; this is only an issue if you copy the <code>HISTORY.rc</code> from a different experiment. The <code>EXPDSC</code> attribute is used to label the plots. The <code>COLLECTIONS</code> attribute contains a list of strings indicating the output collections to be created. The content of the individual collections is determined after this list. Individual collections can be "turned off" by commenting out the relevant line with a <code>#</code>.
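To check which collections are currently switched on, a small Python sketch such as the one below can list the uncommented entries. It is not part of GEOS: the function name is hypothetical, and it assumes the <code>COLLECTIONS</code> list is terminated by a <code>::</code> line, as the collection specifications are.

```python
def active_collections(history_text):
    """Return the collection names in COLLECTIONS that are not commented out."""
    names = []
    in_list = False
    for raw_line in history_text.splitlines():
        line = raw_line.strip()
        if line.startswith("COLLECTIONS:"):
            in_list = True
            line = line[len("COLLECTIONS:"):].strip()
        elif not in_list:
            continue
        if line.startswith("::"):  # assumed end-of-list marker
            break
        if not line or line.startswith("#"):  # blank or "turned off"
            continue
        names.append(line.strip("', "))
    return names


SAMPLE = """
EXPID: myexp42
EXPDSC: this_is_my_experiment
COLLECTIONS: 'geosgcm_prog'
             'geosgcm_surf'
#            'geosgcm_moist'
             'geosgcm_turb'
             ::
"""

print(active_collections(SAMPLE))
```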

The following is an example of a collection specification:

<pre>

geosgcm_prog.template: '%y4%m2%d2_%h2%n2z.nc4',

geosgcm_prog.archive: '%c/Y%y4',

geosgcm_prog.format: 'CFIO',

geosgcm_prog.frequency: 060000,

geosgcm_prog.resolution: 144 91,

geosgcm_prog.vscale: 100.0,

geosgcm_prog.vunit: 'hPa',

geosgcm_prog.vvars: 'log(PLE)' , 'DYN' ,

geosgcm_prog.levels: 1000 975 950 925 900 875 850 825 800 775 750 725 700 650 600 550 500 450 400 350 300 250 200 150 100 70 50 40 30 20 10 7 5 4 3 2 1 0.7 0.5 0.4 0.3 0.2

0.1 0.07 0.05 0.04 0.03 0.02,

geosgcm_prog.fields: 'PHIS' , 'AGCM' ,

'T' , 'DYN' ,

'PS' , 'DYN' ,

'ZLE' , 'DYN' , 'H' ,

'OMEGA' , 'DYN' ,

'Q' , 'MOIST' , 'QV' ,

::

</pre>

The individual collection attributes are described below, but what users modify the most is the <code>fields</code> attribute. This determines which exports are saved in the collection. Each field record is a string with the name of an export from the model followed by a string with the name of the gridded component which exports it, separated by a comma. The entries with a third column determine the name by which that export is saved in the collection file when the name is different from that of the export.
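The structure of a field record can be made concrete with a short Python sketch that splits each line into export name, gridded component and output name. This is an illustration only and is not how the model itself reads the file; the function name is hypothetical and the sample block is abbreviated from the example above.

```python
import re

FIELDS_BLOCK = """
geosgcm_prog.fields: 'PHIS' , 'AGCM' ,
                     'T' , 'DYN' ,
                     'ZLE' , 'DYN' , 'H' ,
                     'Q' , 'MOIST' , 'QV' ,
::
"""


def parse_fields(block):
    """Return (export, component, output_name) for each field record in the block."""
    records = []
    for line in block.splitlines():
        quoted = re.findall(r"'([^']*)'", line)
        if len(quoted) < 2:
            continue  # not a field record (blank line or '::')
        export, component = quoted[0], quoted[1]
        # Without a third column the export is saved under its own name.
        output_name = quoted[2] if len(quoted) > 2 else export
        records.append((export, component, output_name))
    return records


for record in parse_fields(FIELDS_BLOCK):
    print(record)
```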

There is a good description of available collection options at Modeling Guru: https://modelingguru.nasa.gov/docs/DOC-1190
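As one example of a collection option, the <code>template</code> attribute shown earlier controls the date stamp in the output file names. The Python sketch below expands such a template for a given model time. It assumes the tokens follow the GrADS-style convention (<code>%y4</code> year, <code>%m2</code> month, <code>%d2</code> day, <code>%h2</code> hour, <code>%n2</code> minute), which is not spelled out on this page; the function name is hypothetical.

```python
from datetime import datetime

# Assumed token meanings (GrADS-style), mapped to strftime codes.
TOKENS = {"%y4": "%Y", "%m2": "%m", "%d2": "%d", "%h2": "%H", "%n2": "%M"}


def expand_template(template, when):
    """Replace each date token in the template with the matching part of 'when'."""
    for token, strftime_code in TOKENS.items():
        template = template.replace(token, when.strftime(strftime_code))
    return template


print(expand_template("%y4%m2%d2_%h2%n2z.nc4", datetime(2000, 4, 14, 21, 0)))
```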

===What exports are available?===

To add export fields to the <code>HISTORY.rc</code> you will need to know what fields the model provides, which gridded component provides them, and their name. The most straightforward way to do this is to use <code>PRINTSPEC</code>. The setting for <code>PRINTSPEC</code> is in the file <code>CAP.rc</code>. By default the line looks like so:

PRINTSPEC: 0 # (0: OFF, 1: IMPORT & EXPORT, 2: IMPORT, 3: EXPORT)

Setting <code>PRINTSPEC</code> to 3 will make the model send to standard output a list of exports available to <code>HISTORY.rc</code> in the model's current configuration, and then exit without integrating. The list includes each export's gridded component and short name (both necessary to include in <code>HISTORY.rc</code>), long (descriptive) name, units, and number of dimensions. Note that run-time options can affect the exports available, so see to it that you have those set as you intend. The other <code>PRINTSPEC</code> values are useful for debugging.
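If you switch <code>PRINTSPEC</code> on and off often, the edit can be scripted. The following Python sketch rewrites the value in the text of a <code>CAP.rc</code> while leaving the trailing comment alone. The function name is hypothetical, and the sketch assumes the line has the form shown above; editing the file by hand works just as well.

```python
import re


def set_printspec(cap_text, value):
    """Return cap_text with the PRINTSPEC value replaced by 'value' (0 to 3)."""
    if value not in (0, 1, 2, 3):
        raise ValueError("PRINTSPEC must be 0, 1, 2 or 3")
    new_text, count = re.subn(
        r"^(PRINTSPEC:\s*)\d+",
        rf"\g<1>{value}",
        cap_text,
        flags=re.MULTILINE,
    )
    if count == 0:
        raise ValueError("no PRINTSPEC line found")
    return new_text


SAMPLE = "PRINTSPEC: 0  # (0: OFF, 1: IMPORT & EXPORT, 2: IMPORT, 3: EXPORT)\n"
print(set_printspec(SAMPLE, 3), end="")
```

Remember to set the value back to 0 before a production run, since the model exits without integrating when the list is printed.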

While you can set <code>PRINTSPEC</code>, submit <code>sbatch gcm_run.j</code>, and get the export list as part of the batch job's standard output, there are quicker ways of obtaining the list. One way is to run it as a single column model on a single processor, as explained in [[Jason Single Column Model]]. Another way is to run it in an existing experiment. In the <code>scratch</code> directory of an experiment that has already run, change <code>PRINTSPEC</code> in <code>CAP.rc</code> as above. Then, in the file <code>AGCM.rc</code>, change the values of <code>NX</code> and <code>NY</code> (near the beginning of the file) to 1. Then, from an interactive job (one processor will suffice), run the executable <code>GEOSgcm.x</code> in <code>scratch</code>. You will need to run <code>source src/g5_modules</code> in the model's build tree to set up the environment. The model executable will simply output the export list to <code>stdout</code>.

===Outputting Derived Fields===

In addition to writing export fields created by model components (referred to here as model fields), the user may specify new fields that are evaluated using the MAPL parser; these are referred to as derived fields in the following discussion. A derived field is evaluated from an expression that uses other fields in the collection as variables, and the expression is evaluated element by element to create the new field. Derived fields are specified like a regular field from a gridded component in a history collection, with three comma-separated strings. The difference is that an expression string to be evaluated is entered in place of the variable name string. Next comes the string specifying the gridded component; you MUST put a string here, and it should be the name of a gridded component. Finally, a string MUST be entered giving the name of the new variable, which is the name the variable will have in the output file. In general the expression will involve variables, functions, and real numbers. Derived fields are evaluated before time and spatial (vertical and horizontal) averaging.

Here are some rules about expressions

#Fields in expression can only be model fields.

#If the model field has an alias you must use the alias in the expression.

#You cannot mix center and edge fields in an expression. You can mix 2D and 3D fields if the 3D fields are all center or all edge. In this case each level of the 3D field is combined with the 2D field. Another way to think of this is that in an expression involving a 2D and a 3D field, the 2D field gets promoted to a 3D field with the same data in each level.

#When parsing an expression the parser first checks if the fields in an expression are part of the collection. Any model field in a collection can be used in an expression in the same collection. However, there might be cases where you wish to output an expression but not the model fields used in the expression. In this case if the parser does not find the field in the collection it checks the gridded component name after the expression for the model field. If the field is found in the gridded component it can use it in the expression. Note that if you have an expression with two model fields from different gridded components you can not use this mechanism to output the expression without outputting either field. One of them must be in the collection.

#The alias of an expression can not be used in a subsequent expression.
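The promotion of a 2D field to 3D described in the rules above can be pictured with a few lines of Python. This is a conceptual sketch using plain lists, not the MAPL implementation: a 3D field is represented as a list of levels, each level being a list of horizontal points, and the function name is hypothetical.

```python
def multiply_3d_by_2d(field3d, field2d):
    """Multiply every level of a 3D field, point by point, with the same 2D field."""
    return [
        [value3d * value2d for value3d, value2d in zip(level, field2d)]
        for level in field3d
    ]


# Two levels with two horizontal points each, combined with one 2D field.
field3d = [[1.0, 2.0], [3.0, 4.0]]
field2d = [10.0, 100.0]
print(multiply_3d_by_2d(field3d, field2d))
```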

Here are the rules for the expressions themselves

The following can appear in the expression string

#The function string can contain the following mathematical operators +, -, *, /, ^ and ()

#Variable names - Parsing of variable names is case sensitive.

#The following single-argument Fortran intrinsic functions and user-defined functions are implemented: exp, log10, log, sqrt, sinh, cosh, tanh, sin, cos, tan, asin, acos, atan, heav (the Heaviside step function). Parsing of functions is case insensitive.

#Integers or real constants. To be recognized as explicit constants these must conform to the format [+|-][nnn][.nnn][e|E|d|D][+|-][nnn] where nnn means any number of digits. The mantissa must contain at least one digit before or following an optional decimal point. Valid exponent identifiers are 'e', 'E', 'd' or 'D'. If they appear they must be followed by a valid exponent!

#Operations are evaluated in the order

##expressions in brackets

##-X unary minus

##X^Y exponentiation

##X*Y X/Y multiplication and division

##X+Y X-Y addition and subtraction
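The constant format in the list above can be checked with a regular expression. The Python sketch below is one reading of the stated rule (optional sign, a mantissa with at least one digit before or after an optional decimal point, and an optional exponent that must carry digits); it is not taken from the parser source, and the function name is hypothetical.

```python
import re

# Optional sign, mantissa with at least one digit, optional e/E/d/D exponent.
CONSTANT = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)([eEdD][+-]?\d+)?")


def is_constant(text):
    """Return True if text matches the explicit-constant format described above."""
    return CONSTANT.fullmatch(text) is not None


for candidate in ["273.15", "-2.5d-3", ".5", "1e", "e5"]:
    print(candidate, is_constant(candidate))
```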

In the following example we create a collection that has three derived fields: the magnitude of the wind, the temperature in Fahrenheit, and the temperature cubed:

<pre>
  geosgcm_prog.template:  '%y4%m2%d2_%h2%n2z.nc4',
  geosgcm_prog.archive:   '%c/Y%y4',
  geosgcm_prog.format:    'CFIO',
  geosgcm_prog.frequency:  060000,
  geosgcm_prog.resolution: 144 91,
  geosgcm_prog.fields:    'U'                   , 'DYN' ,
                          'V'                   , 'DYN' ,
                          'T'                   , 'DYN' ,
                          'sqrt(U*U+V*V)'       , 'DYN' , 'Wind_Magnitude' ,
                          '(T-273.15)*1.8+32.0' , 'DYN' , 'TF' ,
                          'T^3'                 , 'DYN' , 'T3' ,
                          ::
</pre>
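
To make the element-by-element evaluation concrete, here is a small Python sketch of what the three expression strings in the example compute. The sample values are invented, and plain Python lists stand in for the gridded model fields; this only mimics the arithmetic and is not how the MAPL parser is implemented.

```python
import math

# Hypothetical sample values; real model fields are gridded arrays.
U = [3.0, 0.0, -6.0]        # eastward wind, m/s
V = [4.0, 5.0, 8.0]         # northward wind, m/s
T = [273.15, 300.0, 250.0]  # temperature, K

# 'sqrt(U*U+V*V)' -> Wind_Magnitude
wind_magnitude = [math.sqrt(u * u + v * v) for u, v in zip(U, V)]

# '(T-273.15)*1.8+32.0' -> TF
tf = [(t - 273.15) * 1.8 + 32.0 for t in T]

# 'T^3' -> T3 (the ^ of the expression string corresponds to ** in Python)
t3 = [t ** 3 for t in T]
```

Each output element depends only on the input elements at the same position, which is the sense in which a derived field is evaluated element by element.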

----

'''Back to [[Documentation for GEOS GCM v10]]'''

If you have any issues or questions, please email the GMAO SI Team at siteam_AT_gmao.gsfc.nasa.gov

@@ Line 1: / Line 1: @@
 This page describes the minimum steps required to build and run GEOS GCM on NCCS discover and NAS pleiades.  '''You should successfully complete the steps in these instructions before doing anything more complicated.  Also, it is helpful to read this page in its entirety before starting.'''
+If you have any issues or questions, please email the GMAO SI Team at siteam_AT_gmao.gsfc.nasa.gov
 '''Back to [[Documentation for GEOS GCM v10]]'''
-== How to Obtain GEOS GCM and Compile Source Code ==
+= How to build GEOS GCM =
+== Preliminary Steps ==
+=== Load Build Modules ===
+In your <code>.bashrc</code> or <code>.tcshrc</code> or other rc file add a line:
+==== NCCS ====
+ module use -a /discover/swdev/gmao_SIteam/modulefiles-SLES12
-There are two options for obtaining the model source code: from the CVS repository on the NCCS progress server, and from the SVN "public" repository on the trac server.  Since the code on progress is more current, elgible users are strongly encouraged to obtain accounts from NCCS and use the progress repository.
+==== NAS ====
-=== Using the NCCS CVS code repository ===
+ module use -a /nobackup/gmao_SIteam/modulefiles
-The following assumes that you know your way around Unix, have successfully logged into your cluster account and have an account on the source code repository with the proper <code>ssh</code> configuration -- see the NCCS repository quick start pages at: https://www.nccs.nasa.gov/trac/admin/wiki/QuickStart. The link requires your NCCS username and password. The recommend SSH config setup for CVS on discover is:
+==== GMAO Desktops ====
- Host cvsacldirect
+On the GMAO desktops, the SI Team modulefiles should automatically be part of running <code>module avail</code> but if not, they are in:
-     HostName cvsacl.nccs.nasa.gov
-     Port 22223
-That's it. Progress is not needed unless you specifically know you need it. It won't hurt to add it, but at present it isn't needed. Also, you'll need to generate RSA or ED25519 keys and upload them (this is mentioned in the quick start page above) to https://www.nccs.nasa.gov/keyupload/. (NOTE: DSA keys are not recommended as some sites, e.g., NAS, have started disallowing them.) The usual way of doing this is to go to your <tt>.ssh</tt> directory and run <tt>ssh-keygen</tt>, making a key with no password:
+ module use -a /ford1/share/gmao_SIteam/modulefiles
- $ cd $HOME/.ssh
+Also do this in any interactive window you have. This allows you to get module files needed to correctly checkout and build the model.
- $ ssh-keygen -o -a 100 -b 3072 -t rsa
-The commands below assume that your shell is <code>csh</code>.  Since the scripts to build and run GEOS tend to be written in the same, you shouldn't bother trying to import too much into an alternative shell.  If you prefer a different shell, it is easiest just to open a <code>csh</code> process to build the model and your experiment.
+Now load the <code>GEOSenv</code> module:
-Furthermore, model builds should be created in your space under <code>/discover/nobackup</code>, as creating them under your home directory will quickly wipe out your disk quota.
+ module load GEOSenv
-Set the following three environment variables:
+which obtains the latest <code>git</code>, <code>CMake</code>, and <code>mepo</code> modules.
- setenv CVS_RSH ssh
+== Cloning the Model ==
- setenv CVSROOT :ext:''USERID''@cvsacldirect:/cvsroot/esma
-where ''USERID'' is, of course, your repository username, which should be the same as your NASA and NCCS username.  Then, issue the command:
+GEOS is now hosted on GitHub. The first thing to do is to [https://github.com/join create a GitHub account] and [https://help.github.com/en/github/authenticating-to-github/adding-a-new-ssh-key-to-your-github-account add your SSH key] to it.
- cvs co -r Jason-3_0 GEOSagcm
+You can then clone the model with:
-This should check out the latest stable version of the model from the repository and create a directory called <code>GEOSagcm</code>.
+ git clone -b v10.17.0 git@github.com:GEOS-ESM/GEOSgcm.git
-==== CVS Errors ====
+where <code>-b v10.17.0</code> refers to a release tag of GEOS GCM. Information on the various releases can be found on the [https://github.com/GEOS-ESM/GEOSgcm/releases Releases page].
-If the CVS checkout doesn't work for you, there are many possibilities.
+=== HTTPS Access ===
-===== Keyupload =====
+GEOS can also be cloned via https with:
-If the error says something about "keyupload", then the key upload either hasn't taken hold yet (can take 5-10 minutes to work after uploading the key), or, perhaps, the wrong key was uploaded.
+  git clone -b v10.17.0 https://github.com/GEOS-ESM/GEOSgcm.git
-===== Access denied for this host =====
+== Building GEOS ==
-If you see something like "access denied for this host", then your best bet is to contact NCCS. Per a response from NCCS to a user that had something similar happen, they need to add the CVS hosts to an LDAP entry.
+=== Single Step Building of the Model ===
-===== Failed to create lock/permission denied =====
+If all you wish is to build the model, you can run <code>parallel_build.csh</code> from a head node. Doing so will check out all the external repositories of the model and build it. When done, the build will be found in <code>build/</code> and the installation in <code>install/</code>, with setup scripts like <code>gcm_setup</code> and <code>fvsetup</code> in <code>install/bin</code>.
-If you see something like:
+==== Develop Version of GEOS GCM ====
- cvs checkout: failed to create lock directory for `/cvsroot/esma/CVSROOT' (/cvsroot/esma/CVSROOT/#cvs.history.lock): Permission denied
+<code>parallel_build.csh</code> provides a special flag for checking out the development branches of GEOSgcm_GridComp and GEOSgcm_App. If you run:
- cvs checkout: failed to obtain history lock in repository `/cvsroot/esma'
- cvs checkout: Updating src
- cvs checkout: failed to create lock directory for `/cvsroot/esma/esma/src/Applications/GEOSdas' (/cvsroot/esma/esma/src/Applications/GEOSdas/#cvs.lock): Permission denied
- cvs checkout: failed to obtain dir lock in repository `/cvsroot/esma/esma/src/Applications/GEOSdas'
- cvs [checkout aborted]: read lock failed - giving up
-this means you don't have a home directory on progress. Try doing:
+<pre>parallel_build.csh -develop</pre>
+then <code>mepo</code> will run:
- $ ssh progress.nccs.nasa.gov
+<pre>mepo develop GEOSgcm_GridComp GEOSgcm_App</pre>
+==== Debug Version of GEOS GCM ====
-You'll enter your PASSCODE and password and then it'll seem like the terminal is "stuck". Just hit Ctrl-C. Now try the CVS command again.
+To obtain a debug version, you can run <code>parallel_build.csh -debug</code> which will build with debugging flags. This will build in <code>build-Debug/</code> and install into <code>install-Debug/</code>.
-=== Compiling the Model ===
-First, you need to set <code>ESMADIR</code>. For example, if your <code>src/</code> directory is:
- /discover/nobackup/mathomp4/Models/Jason-3_0/GEOSagcm/src
+=== Multiple Steps for Building the Model ===
-then you should set:
+The steps detailed below are essentially those that <code>parallel_build.csh</code> performs for you. Either method should yield identical builds.
- setenv ESMADIR /discover/nobackup/mathomp4/Models/Jason-3_0/GEOSagcm
+==== Mepo ====
-Next, we need to source <code>g5_modules</code> with:
+GEOS GCM is composed of a set of sub-repositories, managed by a tool called [https://github.com/GEOS-ESM/mepo mepo]. To clone all the sub-repos, you can run <code>mepo clone</code> inside the fixture:
- source $ESMADIR/src/g5_modules
+<pre>cd GEOSgcm
+mepo clone</pre>
+The first command moves into the cloned fixture and the second clones and assembles all the sub-repositories according to <code>components.yaml</code>.
-This will set up the build environment.  If you then type
+==== Checking out develop branches of GEOSgcm_GridComp and GEOSgcm_App ====
- module list
+To get the development branches of GEOSgcm_GridComp and GEOSgcm_App (a la the <code>-develop</code> flag for <code>parallel_build.csh</code>), run the equivalent <code>mepo</code> command. Because mepo knows (via <code>components.yaml</code>) what the development branch of each sub-repository is, the equivalent of <code>-develop</code> is:
-you should see:
+<pre>mepo develop GEOSgcm_GridComp GEOSgcm_App</pre>
+This must be done ''after'' <code>mepo clone</code> as it is running a git command in each sub-repository.
+==== Build the Model ====
- Currently Loaded Modulefiles:
+===== Load Compiler, MPI Stack, and Baselibs =====
-) other/comp/gcc-6.3
-) comp/intel-18.0.1.163
-) mpi/sgi-mpt-2.17
-) lib/mkl-18.0.1.163
-) other/SIVO-PyD/spd_1.25.0_gcc-6.3_mkl-17.0.4.196
-If this all worked, then type:
+On tcsh:
- cd $ESMADIR/src
+<pre>source @env/g5_modules
- gmake install
+</pre>
+or on bash:
-This will build the model.  It will take about 30 minutes.  If this works, it should create a directory under <code>GEOSagcm</code> called <code>Linux/bin</code>.  In here you should find the executable: <code>GEOSgcm.x</code> .
+<pre>source @env/g5_modules.sh
+</pre>
+===== Create Build Directory =====
-== Setting up a Run ==
+We currently do not allow in-source builds of GEOSgcm. So we must make a directory:
-=== Passwordless Logins ===
+<pre>mkdir build
+</pre>
+The advantage of this is that you can build both a Debug and a Release version from the same clone if desired.
+===== Run CMake =====
+CMake generates the Makefiles needed to build the model.
+<pre>cd build
+cmake .. -DBASEDIR=$BASEDIR/Linux -DCMAKE_Fortran_COMPILER=ifort -DCMAKE_INSTALL_PREFIX=../install
+</pre>
+This will install to a directory parallel to your <code>build</code> directory. If you prefer to install elsewhere change the path in:
+<pre>-DCMAKE_INSTALL_PREFIX=&lt;path&gt;
+</pre>
+and CMake will install there.
+===== Build and Install with Make =====
+<pre>make -j6 install
+</pre>
+= Running GEOS GCM =
+== Passwordless Logins ==
 First of all, to run jobs on the cluster you will need to set up passwordless <code>ssh</code> (which operates within the cluster, between the nodes running the job).  To do so, run the following from your '''discover''' home directory:
@@ Line 112: / Line 143: @@
 Then, log into  '''dirac''' and cut and paste the contents of the <code>id_rsa.pub</code> file on '''discover''' into the  <code>~/.ssh/authorized_keys</code> file on   '''dirac'''.  Problems with <code>ssh</code> should be referred to NCCS support.
-==== DSA Keys ====
+=== DSA Keys ===
 Note: Due to the evolution of security practices, DSA keys are not recommended. NAS currently does not allow them, and RSA and ED25519 keys are considered "better" anyway.
-=== Setting up a model run ===
+== Setting up a model run ==
-To set the model up to run, cd to <code>GEOSagcm/src/Applications/GEOSgcm_App</code> and run:
+Once the model has built successfully, you will have an <code>install/</code> directory in your checkout. To run <code>gcm_setup</code> go to the <code>install/bin/</code> directory and run it there:
+ cd install/bin
   ./gcm_setup
@@ Line 130: / Line 162: @@
   Enter a 1-line Experiment Description:
-Spaces are ok here.  It will then ask for the source code version tag to associate with the model -- you should hit enter for the default:
+Spaces are ok here.  Next it will ask if you wish to CLONE an experiment. If yes, you can point this to another experiment and the setup script will try and duplicate all the RC, etc. files. For now, though, choose NO to create a new experiment.
- Enter an Experiment Source Tag for History (Default: Jason-3_0):
-Hit enter for the default to the next question:
   Do you wish to CLONE an old experiment? (Default: NO or FALSE)
-It will also ask you for the atmospheric model resolution, expecting the code for one of the displayed resolutions.
+It will now ask you for the atmospheric model resolution, expecting the code for one of the displayed resolutions.
-  Enter the Atmospheric Horizontal Resolution code:
+  <nowiki>Enter the Atmospheric Horizontal Resolution code:
- -----------------------------------------------------------
+--------------------------------------
-      Lat/Lon                     Cubed-Sphere
+            Cubed-Sphere
- -----------------------------------------------------------
+--------------------------------------
-    b --  2  deg                c48  --  2   deg
+   c48  --  2   deg
-    c --  1  deg                c90  --  1   deg
+   c90  --  1   deg
-    d -- 1/2 deg                c180 -- 1/2  deg (56-km)
+   c180 -- 1/2  deg (56-km)
-    e -- 1/4 deg (35-km)        c360 -- 1/4  deg (28-km)
+   c360 -- 1/4  deg (28-km)
-                                c720 -- 1/8  deg (14-km)
+   c720 -- 1/8  deg (14-km)
-                                c1440 - 1/16 deg ( 7-km)
+   c1440 - 1/16 deg ( 7-km)
+             DYAMOND Grids
+   c768 -- 1/8  deg (12-km)
+   c1536 - 1/16 deg ( 6-km)
+   c3072 - 1/32 deg ( 3-km)</nowiki>
-For your first time out you will probably want to enter <code>c48</code> (corresponding to ~2 degree resolution with the cubed sphere).  On the next eight questions, hitting enter to accept the default will let you run a PChem run:
+For your first time out you will probably want to enter <code>c48</code> (corresponding to ~2 degree resolution with the cubed sphere).
+Next it will ask you about the vertical resolution:
   Enter the Atmospheric Model Vertical Resolution: LM (Default: 72)
+The next question is about using IOSERVER:
+  Do you wish to IOSERVER? (Default: NO or FALSE)
+The "default" answer to this will change depending on the resolution you choose. For now, just accept the default.
+Next is a question that asks what processor you wish to run on. For example, on discover at NCCS:
+ Enter the Processor Type you wish to run on:
+    hasw (Haswell) (default)
+    sky  (Skylake)
+NOTE: At present you need access to special queues to use the Skylake, so choosing Haswell is usually a better option.
+After this are questions involving the ocean model:
   Do you wish to run the COUPLED Ocean/Sea-Ice Model? (Default: NO or FALSE)
@@ Line 160: / Line 210: @@
                                                    o2 (1/4-deg, 1440x720  MERRA-2)
                                                    o3 (1/8-deg, 2880x1440 OSTIA)
+                                                  CS (Cubed-Sphere OSTIA)
-  Do you wish to run GOCART with Actual or Climatological Aerosols? (Enter: A (Default) or C)
+Then Land model:
+  Enter the choice of  Land Surface Boundary Conditions using: 1 (Default: Icarus), 2 (Latest Icarus-NL)
+Then the aerosols:
+ Do you wish to run GOCART with Actual or Climatological Aerosols? (Enter: A (Default) or C)
   Enter the GOCART Emission Files to use: MERRA2 (Default), PIESA, CMIP, NR, MERRA2-DD or OPS:
+After this are some questions about various setups in the model. The default is often your best bet.
   Enter the tag or directory (/filename) of the HISTORY.AGCM.rc.tmpl to use
@@ Line 176: / Line 234: @@
   Hit ENTER to use Default Location:
   ----------------------------------
-  Default:  /discover/nobackup/''USER''/''EXPID''
+  Default:  ~''USER''/geos5/''EXPID''
+  /discover/nobackup/''USER''/''EXPID''
   Enter Desired Location for the EXPERIMENT Directory (to contain model output and restart files)
   Hit ENTER to use Default Location:
   ----------------------------------
-  Default:  ~''USER''/geos5/''EXPID''
+  Default:  /discover/nobackup/''USER''/''EXPID''
-  /discover/nobackup/''USER''/''EXPID''
   Enter Location for Build directory containing:  src/ Linux/ etc...
   Hit ENTER to use Default Location:
   ----------------------------------
-  Default:  /discover/nobackup/''USER''/GEOSagcm
+  Default:  /discover/nobackup/''USER''/GEOSgcm/install
 After these it will ask you for a group ID -- the default for this writer is g0620 (GMAO modeling group).  Enter whatever is appropriate, as necessary.
@@ Line 194: / Line 252: @@
   Enter your GROUP ID for Current EXP: (Default: g0620)
   -----------------------------------
 The script will produce some messages and create an experiment directory (''EXPDIR'') in your space as <code>/discover/nobackup/''USERID''/''EXPID''</code>, which contains the files and sub-directories:
@@ Line 213: / Line 270: @@
 *<code>regress/</code> -- contains scripts for doing regression testing of model
 *<code>src</code> -- directory with a tarball of the model version's source code
 The post-processing script will generate the archiving and plotting scripts as it runs.  The setup script that you ran also creates an experiment home directory (''HOMEDIR''), either in <code>~''USERID''/geos5/''EXPID''</code> (if you accepted the default) or in <code>/discover/nobackup/''USERID''/''EXPID''</code> (if you followed the above advice), containing the run scripts and GEOS resource (<code>.rc</code>) files.
-== Running GEOS-5 ==
+== Running GEOS ==
-Before running the model, there is some more setup to be completed.  The run scripts need some environment variables set in <code>~/.cshrc</code> (regardless of which login shell you use -- the GEOS-5 scripts use <code>csh</code>).  Here are the minimum contents of a <code>.cshrc</code>:
+Before running the model, there is some more setup to be completed.  The run scripts need some environment variables set in <code>~/.cshrc</code> (regardless of which login shell you use -- the GEOS scripts use <code>csh</code>).  Here are the minimum contents of a <code>.cshrc</code>:
   umask 022
@@ Line 228: / Line 284: @@
 The <code>umask 022</code> is not strictly necessary, but it will make the various files readable to others, which will facilitate data sharing and user support.  Your home directory <code>~''USERID''</code> is also inaccessible to others by default; running <code>chmod 755 ~</code> is helpful.
-Copy the restart (initial condition) files and associated <code>cap_restart</code> into ''EXPDIR''.  You can get an arbitrary set of restarts by copying the contents of the directory <code>/discover/nobackup/mathomp4/Restarts-I30/nc4/Reynolds/c48</code>, containing 2-degree cubed sphere restarts from April 14, 2000, and their corresponding <code>cap_restart</code>.
+Copy the restart (initial condition) files and associated <code>cap_restart</code> into ''EXPDIR''.  You can get an arbitrary set of restarts by copying the contents of the directory <code>/discover/nobackup/mathomp4/Restarts-J10/nc4/Reynolds/c48</code>, containing 2-degree cubed sphere restarts from April 14, 2000, and their corresponding <code>cap_restart</code>.
 The script you submit, <code>gcm_run.j</code>, is in ''HOMEDIR''.  It should be ready to go as is.  The parameter END_DATE in <code>CAP.rc</code> can be set to the date you want the run to stop.  Submit the job with <code>sbatch gcm_run.j</code>.  You can keep track of it with <code>squeue</code> or <code>squeue -u ''USERID''</code>, or follow stdout with <code>tail -f ''EXPDIR''/slurm-''JOBID''.out</code>, ''JOBID'' being returned by <code>sbatch</code> and displayed with <code>squeue</code>.  Jobs can be killed with <code>scancel ''JOBID''</code>.
@@ Line 247: / Line 303: @@
 The contents of the output files (including which variables get saved) may be configured in the  <code>''HOMEDIR''/HISTORY.rc</code> -- a good description of this file may be found at http://modelingguru.nasa.gov/clearspace/docs/DOC-1190 .
+== What Happens During a Run ==
+When the script <code>gcm_run.j</code> starts running, it creates a directory called <code>scratch</code> and copies or links into it the model executable, rc files, restarts and boundary conditions necessary to run the model.  It also creates a directory for each of the output collections (in the default setup, those with the prefix <code>geosgcm_</code>), both in the directory <code>holding</code> for output awaiting post-processing and in the experiment directory for post-processed output.  It also tars the restarts and moves the tarball to the <code>restarts</code> directory.
+Then the executable  <code>GEOSgcm.x</code> is run in the <code>scratch</code> directory, starting with the date in  <code>cap_restart</code> and running for the length of a segment.  A segment is the length of model time that the model integrates before returning, letting <code>gcm_run.j</code> do some housekeeping and then running another segment.  A model job will typically run a number of segments before trying to resubmit itself, hopefully before the allotted wallclock time of the job runs out.
+The processing that the various batch jobs perform is illustrated below:
+[[Image:F2.5-job-diagram002.png]]
+Each time a segment ends, <code>gcm_run.j</code> submits a post-processing job before starting a new segment or exiting.  The post-processing job moves the model output from the <code>scratch</code> directory to the respective collection directory under <code>holding</code>.  It then determines whether there is enough output to create a monthly or seasonal mean; if so, it creates the means, moves them to the collection directories in the experiment directory, tars up the daily output and submits an archiving job.  The archiving job tries to move the tarred daily output, the monthly and seasonal means and any tarred restarts to the user's space in the <code>archive</code> filesystem.  The post-processing script also determines (assuming the default settings) whether enough output exists to create plots; if so, a plotting job is submitted to the queue.  The plotting script produces a number of pre-determined plots as <code>.gif</code> files in the <code>plot_CLIM</code> directory in the experiment directory.
+You can check on jobs in the queue with <code>qstat</code>.  The jobs associated with the run will be named with the experiment name appended with the type of job it is: RUN, POST, ARCH or PLT.
+As explained above, the contents of the <code>cap_restart</code> file determine the start of the model run in model time, which determines the boundary conditions and the time stamps of the output.  The end time may be set in <code>CAP.rc</code> with the property <code>END_DATE</code> (format ''YYYYMMDD HHMMSS'', with a space), though integration is usually leisurely enough that one can just kill the job or rename the run script <code>gcm_run.j</code> so that it is not resubmitted to the job queue.
+=== Tuning a run ===
+Most of the other properties in <code>CAP.rc</code> are discussed elsewhere, but two that are important for understanding how the batch jobs work are <code>JOB_SGMT</code>, the length of the segment, and <code>NUM_SGMT</code>, the number of segments that the job tries to run before resubmitting itself and exiting.  <code>JOB_SGMT</code> is in the format ''YYYYMMDD HHMMSS'' (but usually expressed in days) and <code>NUM_SGMT</code> is an integer, so the product of the two is the total model time that a job will attempt to run.  It may be tempting to just run one long segment, but much housekeeping is done between segments, such as saving state in the form of restarts and spawning archiving jobs that keep your account from running over disk quota.  So to tune for the maximum number of segments in a job, it is usually best to manipulate <code>JOB_SGMT</code>.
+== Determining Output: <code>HISTORY.rc</code> ==
+The contents of the file <code>HISTORY.rc</code> (in your experiment <code>HOME</code> directory) tell the model what and how to output its state and diagnostic fields.  The default <code>HISTORY.rc</code> provides many fields as is, but you may want to modify it to suit your needs.
+===File format===
+The top of a default <code>HISTORY.rc</code> will look something like this:
+<pre>
+EXPID:  myexp42
+EXPDSC: this_is_my_experiment
+COLLECTIONS: 'geosgcm_prog'
+             'geosgcm_surf'
+             'geosgcm_moist'
+             'geosgcm_turb'
+</pre>
+[....]
+The attribute <code>EXPID</code> must match the name of the experiment <code>HOME</code> directory; this is only an issue if you copy the <code>HISTORY.rc</code> from a different experiment.  The <code>EXPDSC</code> attribute is used to label the plots.  The <code>COLLECTIONS</code> attribute contains a list of strings indicating the output collections to be created.  The contents of the individual collections are specified after this list.  Individual collections can be "turned off" by commenting out the relevant line with a <code>#</code>.
+The following is an example of a collection specification:
+<pre>
+  geosgcm_prog.template:  '%y4%m2%d2_%h2%n2z.nc4',
+  geosgcm_prog.archive:   '%c/Y%y4',
+  geosgcm_prog.format:    'CFIO',
+  geosgcm_prog.frequency:  060000,
+  geosgcm_prog.resolution: 144 91,
+  geosgcm_prog.vscale:     100.0,
+  geosgcm_prog.vunit:     'hPa',
+  geosgcm_prog.vvars:     'log(PLE)' , 'DYN'          ,
+  geosgcm_prog.levels:     1000 975 950 925 900 875 850 825 800 775 750 725 700 650 600 550 500 450 400 350 300 250 200 150 100 70 50 40 30 20 10 7 5 4 3 2 1 0.7 0.5 0.4 0.3 0.2 0.1 0.07 0.05 0.04 0.03 0.02,
+  geosgcm_prog.fields:    'PHIS'     , 'AGCM'         ,
+                          'T'        , 'DYN'          ,
+                          'PS'       , 'DYN'          ,
+                          'ZLE'      , 'DYN'          , 'H'   ,
+                          'OMEGA'    , 'DYN'          ,
+                          'Q'        , 'MOIST'        , 'QV'  ,
+                          ::
+</pre>
+The individual collection attributes are described below, but what users modify the most is the <code>fields</code> attribute.  This determines which exports are saved in the collection.  Each field record is a string with the name of an export from the model followed by a string with the name of the gridded component which exports it, separated by a comma.  Entries with a third column give the name under which that export is saved in the collection file when that name differs from the name of the export.
+There is a good description of available collection options at Modeling Guru: https://modelingguru.nasa.gov/docs/DOC-1190
+===What exports are available?===
+To add export fields to the <code>HISTORY.rc</code> you will need to know what fields the model provides, which gridded component provides them, and their name.  The most straightforward way to do this is to use <code>PRINTSPEC</code>.  The setting for  <code>PRINTSPEC</code> is in the file <code>CAP.rc</code>.  By default the line looks like so:
+ PRINTSPEC: 0  # (0: OFF, 1: IMPORT & EXPORT, 2: IMPORT, 3: EXPORT)
+Setting <code>PRINTSPEC</code> to  3 will make the model send to standard output a list of exports available to <code>HISTORY.rc</code> in the model's current configuration, and then exit without integrating. The list includes each export's gridded component and short name (both necessary to include in <code>HISTORY.rc</code>), long (descriptive) name, units, and number of dimensions.  Note that run-time options can affect the exports available, so see to it that you have those set as you intend.  The other <code>PRINTSPEC</code> values are useful for debugging.
+While you can set  <code>PRINTSPEC</code>, submit <code>sbatch gcm_run.j</code>, and get the export list as part of the batch job's standard output, there are quicker ways of obtaining the list.  One way is to run it as a single column model on a single processor, as explained in [[Jason Single Column Model]].  Another way is to run it in an existing experiment.  In the <code>scratch</code> directory of an experiment that has already run, change <code>PRINTSPEC</code> in  <code>CAP.rc</code> as above.  Then, in the file <code>AGCM.rc</code>, change the values of <code>NX</code> and <code>NY</code> (near the beginning of the file) to 1.  Then, from an interactive job (one processor will suffice), run the executable <code>GEOSgcm.x</code> in <code>scratch</code>.  You will need to run <code>source src/g5_modules</code> in the model's build tree to set up the environment.  The model executable will simply output the export list to <code>stdout</code>.
+===Outputting Derived Fields===
+In addition to writing export fields created by model components (we will refer to these as model fields), the user may specify new fields that will be evaluated using the MAPL parser. These will be referred to as derived fields in the following discussion. The derived fields are evaluated using an expression that involves other fields in the collection as variables. The expression is evaluated element by element to create a new field. Derived fields are specified like a regular field from a gridded component in a history collection with 3 comma separated strings. The difference is now that in place of a variable name string, an expression string that will be evaluated is entered. Following this comes the string specifying the gridded component. You MUST put a string here, which should be the name of a gridded component. Finally a string MUST be entered which is the name of the new variable. This will be the name of the variable in the output file. In general the expression entered will involve variables, functions, and real numbers. The derived fields are evaluated before time and spatial (vertical and horizontal) averaging.
+Here are some rules about expressions
+#Fields in an expression can only be model fields.
+#If the model field has an alias you must use the alias in the expression.
+#You cannot mix center and edge fields in an expression. You can mix 2D and 3D fields if the 3D fields are all center or all edge. In this case each level of the 3D field is combined with the 2D field. Another way to think of this is that in an expression involving a 2D and a 3D field the 2D field gets promoted to a 3D field with the same data in each level.
+#When parsing an expression, the parser first checks whether the fields in the expression are part of the collection. Any model field in a collection can be used in an expression in the same collection. However, there may be cases where you wish to output an expression but not the model fields used in it. In that case, if the parser does not find a field in the collection, it looks for the model field in the gridded component named after the expression. If the field is found in that gridded component, it can be used in the expression. Note that if an expression uses two model fields from different gridded components, you cannot use this mechanism to output the expression without outputting either field: one of them must be in the collection.
+#The alias of an expression cannot be used in a subsequent expression.
+Here are the rules for the expressions themselves
+The following can appear in the expression string
+#The function string can contain the following mathematical operators +, -, *, /, ^ and ()
+#Variable names - Parsing of variable names is case sensitive.
+#The following single-argument Fortran intrinsic functions and user-defined functions are implemented: exp, log10, log, sqrt, sinh, cosh, tanh, sin, cos, tan, asin, acos, atan, heav (the Heaviside step function). Parsing of functions is case insensitive.
+#Integers or real constants. To be recognized as explicit constants these must conform to the format [+|-][nnn][.nnn][e|E|d|D][+|-][nnn] where nnn means any number of digits. The mantissa must contain at least one digit before or following an optional decimal point. Valid exponent identifiers are 'e', 'E', 'd' or 'D'. If they appear they must be followed by a valid exponent!
+#Operations are evaluated in the order
+##expressions in brackets
+##-X      unary minus
+##X^Y  exponentiation
+##X*Y X/Y multiplication and division
+##X+Y X-Y addition and subtraction
+In the following example we create a collection that has three derived fields: the magnitude of the wind, the temperature in Fahrenheit, and the temperature cubed:
+<pre>
+  geosgcm_prog.template:  '%y4%m2%d2_%h2%n2z.nc4',
+  geosgcm_prog.archive:   '%c/Y%y4',
+  geosgcm_prog.format:    'CFIO',
+  geosgcm_prog.frequency:  060000,
+  geosgcm_prog.resolution: 144 91,
+  geosgcm_prog.fields:    'U'             , 'DYN'          ,
+                          'V'             , 'DYN'          ,
+                          'T'             , 'DYN'          ,
+                          'sqrt(U*U+V*V)' , 'DYN'          , 'Wind_Magnitude'   ,
+                          '(T-273.15)*1.8+32.0' , 'DYN'    , 'TF' ,
+                          'T^3'           , 'DYN',         'T3' ,
+                          ::
+</pre>
 ----
@@ Line 252: / Line 430: @@
 '''Back to [[Documentation for GEOS GCM v10]]'''
-Contact Matthew Thompson at GMAO with questions and comments
+If you have any issues or questions, please email the GMAO SI Team at siteam_AT_gmao.gsfc.nasa.gov