Fortuna 2.5 User's Guide: Difference between revisions

(14 intermediate revisions by the same user not shown)

Line 191:

[[Image:F2.5-job-diagram002.png]]

Each time a segment ends, <code>gcm_run.j</code> submits a post-processing job before starting a new segment or exiting. The post-processing job moves the model output from the <code>scratch</code> directory to the respective collection directory under <code>holding</code>. It then determines whether there is enough output to create a monthly or seasonal mean; if so, it creates the means, moves them to the collection directories in the experiment directory, tars up the daily output, and submits an archiving job. The archiving job tries to move the tarred daily output, the monthly and seasonal means, and any tarred restarts to the user's space in the <code>archive</code> filesystem. The post-processing script also determines (assuming the default settings) whether enough output exists to create plots; if so, it submits a plotting job to the queue. The plotting script produces a number of predetermined plots as <code>.gif</code> files in the <code>plot_CLIM</code> directory in the experiment directory.

As explained above, the contents of the <code>cap_restart</code> file determine the start of the model run in model time, which in turn determines the boundary conditions and the time stamps of the output. The end time may be set in <code>CAP.rc</code> with the property <code>END_DATE</code> (format ''YYYYMMDD HHMMSS'', with a space), though integration is usually leisurely enough that one can just kill the job or rename the run script <code>gcm_run.j</code> so that it is not resubmitted to the job queue.
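The ''YYYYMMDD HHMMSS'' format can be illustrated with a short Python sketch. The helper names <code>format_end_date</code> and <code>parse_end_date</code> are hypothetical, and it is an assumption that the property is written on a single line as <code>END_DATE: value</code>, like the other <code>CAP.rc</code> entries shown in this guide.

```python
from datetime import datetime

END_DATE_FORMAT = "%Y%m%d %H%M%S"  # date and time separated by one space


def format_end_date(when: datetime) -> str:
    """Return an END_DATE line for CAP.rc (hypothetical helper)."""
    return "END_DATE: " + when.strftime(END_DATE_FORMAT)


def parse_end_date(value: str) -> datetime:
    """Parse the 'YYYYMMDD HHMMSS' value of END_DATE back into a datetime."""
    return datetime.strptime(value, END_DATE_FORMAT)


if __name__ == "__main__":
    print(format_end_date(datetime(2000, 4, 15, 21, 0, 0)))
```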

===Tuning a run===

Most of the other properties in <code>CAP.rc</code> are discussed elsewhere, but two are important for understanding how the batch jobs work: <code>JOB_SGMT</code>, the length of a segment, and <code>NUM_SGMT</code>, the number of segments that the job tries to run before resubmitting itself and exiting. <code>JOB_SGMT</code> is in the format ''YYYYMMDD HHMMSS'' (but is usually expressed in days) and <code>NUM_SGMT</code> is an integer, so the product of the two is the total model time that a job will attempt to run. It may be tempting to just run one long segment, but much housekeeping is done between segments, such as saving state in the form of restarts and spawning archiving jobs that keep your account from running over disk quota. So to tune for the maximum number of segments in a job, it is usually best to manipulate <code>JOB_SGMT</code>.
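The arithmetic above can be sketched in Python. The function names are hypothetical. This sketch handles only segment lengths given in days, hours, minutes and seconds; year and month lengths depend on the calendar, so non-zero values in those positions are rejected rather than guessed at.

```python
from datetime import timedelta


def parse_job_sgmt(value: str) -> timedelta:
    """Convert a 'YYYYMMDD HHMMSS' segment length into a timedelta."""
    date_part, time_part = value.split()
    if len(date_part) != 8 or len(time_part) != 6:
        raise ValueError("expected 'YYYYMMDD HHMMSS', got %r" % value)
    years, months, days = int(date_part[:4]), int(date_part[4:6]), int(date_part[6:])
    if years or months:
        # Calendar-dependent lengths are outside the scope of this sketch.
        raise ValueError("year and month segment lengths are not handled here")
    hours, minutes, seconds = int(time_part[:2]), int(time_part[2:4]), int(time_part[4:])
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


def total_model_time(job_sgmt: str, num_sgmt: int) -> timedelta:
    """Model time one batch job attempts: segment length times segment count."""
    return parse_job_sgmt(job_sgmt) * num_sgmt


if __name__ == "__main__":
    # Four 15-day segments add up to 60 days of model time per job.
    print(total_model_time("00000015 000000", 4))
```

For a fixed amount of model time per job, shortening <code>JOB_SGMT</code> and raising <code>NUM_SGMT</code> keeps the product unchanged while giving the housekeeping steps more chances to run.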

== Determining Output: <code>HISTORY.rc</code> ==

The contents of the file <code>HISTORY.rc</code> (in your experiment <code>HOME</code> directory) tell the model which state and diagnostic fields to output and how to output them. The default <code>HISTORY.rc</code> provides many fields as is, but you may want to modify it to suit your needs.

===File format===

The top of a default <code>HISTORY.rc</code> will look something like this:

<pre>
EXPID:  myexp42
EXPDSC: this_is_my_experiment
COLLECTIONS: 'geosgcm_prog'
             'geosgcm_surf'
             'geosgcm_moist'
             'geosgcm_turb'
</pre>

[....]

The attribute <code>EXPID</code> must match the name of the experiment <code>HOME</code> directory; this is only an issue if you copy the <code>HISTORY.rc</code> from a different experiment. The <code>EXPDSC</code> attribute is used to label the plots. The <code>COLLECTIONS</code> attribute contains a list of strings naming the output collections to be created. The content of the individual collections is specified after this list. Individual collections can be "turned off" by commenting out the relevant line with a <code>#</code>.
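As a rough illustration of how the <code>COLLECTIONS</code> list reads, the following Python sketch extracts the collections that are still active from the text of a <code>HISTORY.rc</code>. The function name <code>active_collections</code> is hypothetical. The sketch assumes one quoted name per line, that a disabled collection has <code>#</code> as the first non-blank character on its line, and that the list ends at the first line that is not a quoted name. It is not the parser the model itself uses.

```python
import re
from typing import List


def active_collections(history_text: str) -> List[str]:
    """Return the collection names listed under COLLECTIONS that are not commented out."""
    names = []
    in_list = False
    for raw in history_text.splitlines():
        line = raw.strip()
        if line.startswith("COLLECTIONS:"):
            in_list = True
            line = line[len("COLLECTIONS:"):].strip()
        elif not in_list:
            continue
        if not line or line.startswith("#"):
            continue  # blank line or a collection that has been turned off
        match = re.match(r"'([^']+)'", line)
        if not match:
            break  # first line that is not a quoted name ends the list
        names.append(match.group(1))
    return names


if __name__ == "__main__":
    sample = """EXPID:  myexp42
EXPDSC: this_is_my_experiment
COLLECTIONS: 'geosgcm_prog'
             'geosgcm_surf'
#            'geosgcm_moist'
             'geosgcm_turb'
"""
    print(active_collections(sample))
```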

The following is an example of a collection specification:


<pre>
  geosgcm_prog.template:  '%y4%m2%d2_%h2%n2z.nc4',
  geosgcm_prog.archive:   '%c/Y%y4',
  geosgcm_prog.format:    'CFIO',
  geosgcm_prog.frequency:  060000,
  geosgcm_prog.resolution: 144 91,
  geosgcm_prog.vscale:     100.0,
  geosgcm_prog.vunit:     'hPa',
  geosgcm_prog.vvars:     'log(PLE)' , 'DYN'          ,
  geosgcm_prog.levels:     1000 975 950 925 900 875 850 825 800 775 750 725 700 650 600 550 500 450 400 350 300 250 200 150 100 70 50 40 30 20 10 7 5 4 3 2 1 0.7 0.5 0.4 0.3 0.2
0.1 0.07 0.05 0.04 0.03 0.02,
  geosgcm_prog.fields:    'PHIS'     , 'AGCM'         ,
                          'T'        , 'DYN'          ,
                          'PS'       , 'DYN'          ,
                          'ZLE'      , 'DYN'          , 'H'   ,
                          'OMEGA'    , 'DYN'          ,
                          'Q'        , 'MOIST'        , 'QV'  ,
                          ::
</pre>

The individual collection attributes are described below, but the one users modify most often is the <code>fields</code> attribute, which determines which exports are saved in the collection. Each field record consists of a string with the name of an export from the model, followed by a string with the name of the gridded component that exports it, separated by a comma. An optional third column gives the name under which the export is saved in the collection file when that name differs from the export's own.
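The structure of a field record can be made concrete with a small Python sketch. The names <code>FieldRecord</code> and <code>parse_field_line</code> are hypothetical, and the parsing rules (split on commas, strip single quotes, ignore the empty column after the trailing comma) are assumptions read off the example above rather than the model's own parser.

```python
from typing import NamedTuple


class FieldRecord(NamedTuple):
    export: str      # name of the export in the model
    component: str   # gridded component that provides it
    alias: str       # name used in the collection file


def parse_field_line(line: str) -> FieldRecord:
    """Parse one record of a collection's 'fields' attribute."""
    # Drop anything before the first quote, such as 'geosgcm_prog.fields:'.
    if "'" in line:
        line = line[line.index("'"):]
    parts = [part.strip().strip("'") for part in line.split(",")]
    parts = [part for part in parts if part]  # trailing comma leaves an empty column
    if len(parts) not in (2, 3):
        raise ValueError("expected 2 or 3 columns, got %d" % len(parts))
    export, component = parts[0], parts[1]
    # Without a third column the export is saved under its own name.
    alias = parts[2] if len(parts) == 3 else export
    return FieldRecord(export, component, alias)


if __name__ == "__main__":
    print(parse_field_line("'ZLE'      , 'DYN'          , 'H'   ,"))
    print(parse_field_line("'T'        , 'DYN'          ,"))
```

In the example collection, <code>ZLE</code> from <code>DYN</code> is therefore written to the file as <code>H</code>, while <code>T</code> keeps its own name.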

===What exports are available?===

To add export fields to <code>HISTORY.rc</code> you will need to know which fields the model provides, which gridded component provides each of them, and what they are named. The most straightforward way to find out is to use <code>PRINTSPEC</code>, which is set in the file <code>CAP.rc</code>. By default the line looks like this:

 PRINTSPEC: 0  # (0: OFF, 1: IMPORT & EXPORT, 2: IMPORT, 3: EXPORT)

Setting <code>PRINTSPEC</code> to 3 will make the model send to standard output a list of exports available to <code>HISTORY.rc</code> in the model's current configuration, and then exit without integrating. The list includes each export's gridded component and short name (both necessary to include in <code>HISTORY.rc</code>), long (descriptive) name, units, and number of dimensions. Note that run-time options can affect the exports available, so see to it that you have those set as you intend. The other <code>PRINTSPEC</code> values are useful for debugging.

While you can set <code>PRINTSPEC</code>, submit <code>qsub gcm_run.j</code>, and get the export list as part of PBS standard output, there are quicker ways of obtaining the list. One way is to run it as a single column model on a single processor, as explained in [[Fortuna 2.5 Single Column Model]]. Another way is to run it in an existing experiment. In the <code>scratch</code> directory of an experiment that has already run, change <code>PRINTSPEC</code> in <code>CAP.rc</code> as above. Then, in the file <code>AGCM.rc</code>, change the values of <code>NX</code> and <code>NY</code> (near the beginning of the file) to 1. Then, from an interactive job (one processor will suffice), run the executable <code>GEOSgcm.x</code> in <code>scratch</code>. You will need to run <code>source src/g5_modules</code> in the model's build tree to set up the environment. The model executable will simply output the export list to <code>stdout</code>.
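The edits to <code>CAP.rc</code> and <code>AGCM.rc</code> described above can be made by hand; the Python sketch below only illustrates what changes. The helper <code>set_rc_value</code> is hypothetical, and it assumes that each setting sits on its own line in the form <code>KEY: value</code>, as the <code>PRINTSPEC</code> line shown earlier does. That <code>NX</code> and <code>NY</code> follow the same layout in <code>AGCM.rc</code> is an assumption.

```python
import re


def set_rc_value(text: str, key: str, value: str) -> str:
    """Replace the value of the first 'key:' entry, keeping any trailing comment."""
    pattern = re.compile(
        r"^([ \t]*" + re.escape(key) + r":[ \t]*)(\S+)", re.MULTILINE
    )
    # A function replacement avoids backslash handling in the new value.
    new_text, count = pattern.subn(lambda m: m.group(1) + value, text, count=1)
    if count == 0:
        raise KeyError(key)
    return new_text


if __name__ == "__main__":
    cap = "PRINTSPEC: 0  # (0: OFF, 1: IMPORT & EXPORT, 2: IMPORT, 3: EXPORT)\n"
    agcm = "NX: 4\nNY: 24\n"  # illustrative values only
    print(set_rc_value(cap, "PRINTSPEC", "3"), end="")
    print(set_rc_value(set_rc_value(agcm, "NX", "1"), "NY", "1"), end="")
```

Working on copies of the files makes it easy to restore the original settings before the experiment is resubmitted.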

== Special Requirements ==

=== Perpetual ("Groundhog Day") mode ===

GEOS-5 Fortuna 2.5 and later can be run in "perpetual mode", automatically running with the same forcings for a time period delineated as a calendar year, month or day. The time period desired is set in <code>CAP.rc</code> with the parameters <code>PERPETUAL_YEAR</code>, <code>PERPETUAL_MONTH</code> and <code>PERPETUAL_DAY</code>. Set all three to run with the forcings for a particular day, and <code>NUM_SGMT</code> to how many times you wish to run it -- the history collection files will be appended with dates starting with the one in <code>cap_restart</code> and generally incrementing for the number of days in <code>NUM_SGMT</code>.


=== Saving restarts during a segment ===


=== post.rc ===

@@ Line 191: / Line 191: @@
 [[Image:F2.5-job-diagram002.png]]
+Each time a segment ends, <code>gcm_run.j</code> submits a post-processing job before starting a new segment or exiting.  The post-processing job moves the model output from the  <code>scratch</code> directory to the respective collection directory under  <code>holding</code>.  Then it determines whether there is a enough output to create a monthly or seasonal mean, and if so, creates them and moves them to the collection directories in the experiment directory, and then tars up the daily output and submits an archiving job.  The archiving job tries to move the tarred daily output, the monthly and seasonal means and any tarred restarts to the user's space in <code>archive</code> filesystem.  The post-processing script also determines (assuming the default settings) whether enough output exists to create plots; if so, a plotting job is submitted to the queue.  The plotting script produces a number of pre-determined plots as <code>.gif</code> files in the <code>plot_CLIM</code> directory in the experiment directory.
+As explained above, the contents of the <code>cap_restart</code> file determine the start of the model run in model time, which determines boundary conditions and the times stamps of the output.  The end time may be set in <code>CAP.rc</code> with the property <code>END_DATE</code>  (format ''YYYYMMDD HHMMSS'', with a space), though integration is usually leisurely enough that one can just kill the job or rename the run script <code>gcm_run.j</code> so that it is not resubmitted to the job queue.
+===Tuning a run===
+Most of the other properties in <code>CAP.rc</code> are discussed elsewhere, but two that are important for understanding how the batch jobs work are <code>JOB_SGMT</code>, the length of the segment, and <code>NUM_SGMT</code>, the number of segments that the job tries to run before resubmitting itself and exiting.  <code>JOB_SGMT</code> is in the format of ''YYYYMMDD HHMMSS'' (but usually expressed in days) and <code>NUM_SGMT</code> as an integer, so the multiple of the two is the total model time that a job will attempt to run.  It may be tempting to just run one long segment, but much housekeeping is done between segments, such as saving state in the form of restarts and spawning archiving jobs that keep your account from running over disk quota.  So to tune for the maximum number of segments in a job, it is usually best to manipulate <code>JOB_SGMT</code>.
+== Determining Output: <code>HISTORY.rc</code> ==
+The contents of the the file <code>HISTORY.rc</code> (in your experiment <code>HOME</code> directory) tell the model what and how to output its state and diagnostic fields.  The default <code>HISTORY.rc</code> provides many fields as is, but you may want to modify it to suit your needs.
+===File format===
+The top of a default <code>HISTORY.rc</code> will look something like this:
+<pre>
+EXPID:  myexp42
+EXPDSC: this_is_my_experiment
+COLLECTIONS: 'geosgcm_prog'
+             'geosgcm_surf'
+             'geosgcm_moist'
+             'geosgcm_turb'
+</pre>
+[....]
+The attribute <code>EXPID</code> must match the name of the experiment <code>HOME</code> directory; this is only an issue if you copy the  <code>HISTORY.rc</code> from a different experiment.  The <code>EXPDSC</code> attribute is used to label the plots.  The <code>COLLECTIONS</code> attribute contains list of strings indicating the output collections to be created.  The content of the individual collections are determined after this list.  Individual collections can be "turned off" by commenting the relevant line with a <code>#</code>.
+The following is an example of a collection specification:
-Each time a segment ends, <code>gcm_run.j</code> submits a post-processing job before starting a new segment or exiting.  The post-processing job moves the model output from the  <code>scratch</code> directory to the respective collection directory under  <code>holding</code>.  Then it determines whether there is a enough output to create a monthly or seasonal mean, and if so, creates them and moves them to the collection directories in the experiment directory, and then tars up the daily output and submits an archiving job.  The archiving job tries to move the tarred daily output, the monthly and seasonal means and any tarred restarts to the user's space in <code>archive</code> filesystem.  The post-processing script also determines (assuming the default settings) whether enough output exists to create plots; if so, a plotting job is submitted to the queue.  The plotting script produces a number of pre-determined plots as <code>.gif</code> files in the <code>plot_CLIM</code> directory in the experiment directory.
+<pre>
+  geosgcm_prog.template:  '%y4%m2%d2_%h2%n2z.nc4',
+  geosgcm_prog.archive:   '%c/Y%y4',
+  geosgcm_prog.format:    'CFIO',
+  geosgcm_prog.frequency:  060000,
+  geosgcm_prog.resolution: 144 91,
+  geosgcm_prog.vscale:     100.0,
+  geosgcm_prog.vunit:     'hPa',
+  geosgcm_prog.vvars:     'log(PLE)' , 'DYN'          ,
+  geosgcm_prog.levels:     1000 975 950 925 900 875 850 825 800 775 750 725 700 650 600 550 500 450 400 350 300 250 200 150 100 70 50 40 30 20 10 7 5 4 3 2 1 0.7 0.5 0.4 0.3 0.2
+0.1 0.07 0.05 0.04 0.03 0.02,
+  geosgcm_prog.fields:    'PHIS'     , 'AGCM'         ,
+                          'T'        , 'DYN'          ,
+                          'PS'       , 'DYN'          ,
+                          'ZLE'      , 'DYN'          , 'H'   ,
+                          'OMEGA'    , 'DYN'          ,
+                          'Q'        , 'MOIST'        , 'QV'  ,
+                          ::
+</pre>
+The individual collection attributes are described below, but what users modify the most are the <code>fields</code> attribute.  This determines which exports are saved in the collection.  Each field record is a string with the name of an export from the model followed by a string with the name of the gridded component which exports it, separated by a comma.  The entries with a third column determine the name by which that export in saved in the collection file when the name is different from that of the export.
+===What exports are available?===
+To add export fields to the <code>HISTORY.rc</code> you will need to know what fields the model provides, which gridded component provides them, and their name.  The most straightforward way to do this is to use <code>PRINTSPEC</code>.  The setting for  <code>PRINTSPEC</code> is in the file <code>CAP.rc</code>.  By default the line looks like so:
+ PRINTSPEC: 0  # (0: OFF, 1: IMPORT & EXPORT, 2: IMPORT, 3: EXPORT)
+Setting <code>PRINTSPEC</code> to  3 will make the model send to standard output a list of exports available to <code>HISTORY.rc</code> in the model's current configuration, and then exit without integrating. The list includes each export's gridded component and short name (both necessary to include in <code>HISTORY.rc</code>), long (descriptive) name, units, and number of dimensions.  Note that run-time options can affect the exports available, so see to it that you have those set as you intend.  The other <code>PRINTSPEC</code> values are useful for debugging.
+While you can set  <code>PRINTSPEC</code>, submit <code>qsub gcm_run.j</code>, and get the export list as part of PBS standard output, there are quicker ways of obtaining the list.  One way is to run it as a single column model on a single processor, as explained in [[Fortuna 2.5 Single Column Model]].  Another way is to run it in an existing experiment.  In the <code>scratch</code> directory of an experiment that has already run, change <code>PRINTSPEC</code> in  <code>CAP.rc</code> as above.  Then, in the file <code>AGCM.rc</code>, change the values of <code>NX</code> and <code>NY</code> (near the beginning of the file) to 1.  Then, from an interactive job (one processor will suffice), run the executable <code>GEOSgcm.x</code> in <code>scratch</code>.  You will need to run <code>source src/g5_modules</code> in the model's build tree to set up the environment.  The model executable will simply output the export list to <code>stdout</code>.
+== Special Requirements ==
+=== Perpetual ("Groundhog Day") mode  ===
+GEOS-5 Fortuna 2.5 and later can be run in "perpetual mode", automatically running with the same forcings for a time period delineated as a calendar year, month or day.  The time period desired is set in <code>CAP.rc</code> with the parameters <code>PERPETUAL_YEAR</code>, <code>PERPETUAL_MONTH</code> and <code>PERPETUAL_DAY</code>.  Set all three to run with the forcings for a particular day, and <code>NUM_SGMT</code> to how many times you wish to run it -- the history collection files will be appended with dates starting with the one in <code>cap_restart</code> and generally incrementing for the number of days in <code>NUM_SGMT</code>.
-As explained above, the contents of the <code>cap_restart</code> file determine the start of the model run in model time, which determines boundary conditions and the times stamps of the output.  The end time may be set in <code>CAP.rc</code> with the property <code>END_DATE</code>  (format ''YYYYMMDD HHMMSS'', with a space), though integration is usually leisurely enough that one can just kill the job or rename the run script <code>gcm_run.j</code> so that it is not resubmitted to the job queue.
+=== Saving restarts during a segment ===
-Most of the other properties in <code>CAP.rc</code> are discussed elsewhere, but two that are importat for understanding how the batch jobs work are JOB_SGMT: NUM_SGMT:
+=== post.rc ===