As explained above, the contents of the <code>cap_restart</code> file determine the start of the model run in model time, which determines the boundary conditions and the time stamps of the output.  The end time may be set in <code>CAP.rc</code> with the property <code>END_DATE</code> (format ''YYYYMMDD HHMMSS'', with a space), though integration is usually leisurely enough that one can simply kill the job or rename the run script <code>gcm_run.j</code> so that it is not resubmitted to the job queue.
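<code>cap_restart</code> itself is a single line of text in the same format; for example, a run starting at 21 GMT on 15 April 2000 (a hypothetical date) would use:

<pre>
20000415 210000
</pre>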
===Tuning a run===


Most of the other properties in <code>CAP.rc</code> are discussed elsewhere, but two that are important for understanding how the batch jobs work are <code>JOB_SGMT</code>, the length of a segment, and <code>NUM_SGMT</code>, the number of segments that the job tries to run before resubmitting itself and exiting.  <code>JOB_SGMT</code> is in the format ''YYYYMMDD HHMMSS'' (but usually expressed in days) and <code>NUM_SGMT</code> is an integer, so the product of the two is the total model time that a job will attempt to run.  It may be tempting to just run one long segment, but much housekeeping is done between segments, such as saving state in the form of restarts and spawning archiving jobs that keep your account from running over disk quota.  So to tune for the maximum amount of model time in a job, it is usually best to manipulate <code>JOB_SGMT</code>.
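For example, with settings like the following in <code>CAP.rc</code> (the dates here are hypothetical), each job would integrate four 8-day segments, resubmitting itself until the end date is reached:

<pre>
END_DATE: 20091231 210000
JOB_SGMT: 00000008 000000
NUM_SGMT: 4
</pre>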




== Determining Output: <code>HISTORY.rc</code> ==
 
The contents of the file <code>HISTORY.rc</code> (in your experiment <code>HOME</code> directory) tell the model what state and diagnostic fields to output, and how.  The default <code>HISTORY.rc</code> provides many fields as is, but you may want to modify it to suit your needs.


===File format===


The top of a default <code>HISTORY.rc</code> will look something like this:


<pre>
EXPID:  myexp42
EXPDSC: this_is_my_experiment

COLLECTIONS: 'geosgcm_prog'
             'geosgcm_surf'
             'geosgcm_moist'
             'geosgcm_turb'
</pre>
 
[....]


The attribute <code>EXPID</code> must match the name of the experiment <code>HOME</code> directory; this is only an issue if you copy the <code>HISTORY.rc</code> from a different experiment.  The <code>EXPDSC</code> attribute is used to label the plots.  The <code>COLLECTIONS</code> attribute contains a list of strings naming the output collections to be created.  The contents of the individual collections are specified after this list.  Individual collections can be "turned off" by commenting out the relevant line with a <code>#</code>.
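For example, to disable the <code>geosgcm_turb</code> collection from the default list shown above while keeping the others:

<pre>
COLLECTIONS: 'geosgcm_prog'
             'geosgcm_surf'
             'geosgcm_moist'
#            'geosgcm_turb'
</pre>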


The following is an example of a collection specification:


<pre>
  geosgcm_prog.template:  '%y4%m2%d2_%h2%n2z.nc4',
  geosgcm_prog.archive:   '%c/Y%y4',
  geosgcm_prog.format:    'CFIO',
  geosgcm_prog.frequency: 060000,
  geosgcm_prog.resolution: 144 91,
  geosgcm_prog.vscale:    100.0,
  geosgcm_prog.vunit:     'hPa',
  geosgcm_prog.vvars:     'log(PLE)' , 'DYN'          ,
  geosgcm_prog.levels:    1000 975 950 925 900 875 850 825 800 775 750 725 700 650 600 550 500 450 400 350 300 250 200 150 100 70 50 40 30 20 10 7 5 4 3 2 1 0.7 0.5 0.4 0.3 0.2 0.1 0.07 0.05 0.04 0.03 0.02,
  geosgcm_prog.fields:    'PHIS'     , 'AGCM'         ,
                          'T'        , 'DYN'          ,
                          'PS'       , 'DYN'          ,
                          'ZLE'      , 'DYN'          , 'H'  ,
                          'OMEGA'    , 'DYN'          ,
                          'Q'        , 'MOIST'        , 'QV' ,
                          ::
</pre>


The individual collection attributes are described below, but what users modify most is the <code>fields</code> attribute, which determines which exports are saved in the collection.  Each field record is a string with the name of an export from the model, followed by a string with the name of the gridded component that exports it, separated by a comma.  Entries with a third column give the name under which that export is saved in the collection file when it differs from the name of the export; in the specification above, for instance, the <code>ZLE</code> export of <code>DYN</code> is saved as <code>H</code>.


===What exports are available?===


To add export fields to the <code>HISTORY.rc</code> you will need to know which fields the model provides, their names, and the gridded components that provide them.  The most straightforward way to find out is to use <code>PRINTSPEC</code>, which is set in the file <code>CAP.rc</code>.  By default the line looks like this:


  PRINTSPEC: 0  # (0: OFF, 1: IMPORT & EXPORT, 2: IMPORT, 3: EXPORT)


Setting <code>PRINTSPEC</code> to 3 will make the model write to standard output a list of the exports available to <code>HISTORY.rc</code> in the model's current configuration, and then exit without integrating.  The list includes each export's gridded component and short name (both necessary to include in <code>HISTORY.rc</code>), long (descriptive) name, units, and number of dimensions.  Note that run-time options can affect the exports available, so make sure you have those set as you intend.  The other <code>PRINTSPEC</code> values are useful for debugging.
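That is, to list the exports, change the line in <code>CAP.rc</code> to:

<pre>
PRINTSPEC: 3  # (0: OFF, 1: IMPORT & EXPORT, 2: IMPORT, 3: EXPORT)
</pre>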


While you can set <code>PRINTSPEC</code>, submit <code>gcm_run.j</code> with <code>qsub</code>, and get the export list as part of the PBS standard output, there are quicker ways of obtaining the list.  One way is to run the model as a single-column model on a single processor, as explained in [[Fortuna 2.5 Single Column Model]].  Another way is to use an existing experiment.  In the <code>scratch</code> directory of an experiment that has already run, change <code>PRINTSPEC</code> in <code>CAP.rc</code> as above.  Then, in the file <code>AGCM.rc</code>, change the values of <code>NX</code> and <code>NY</code> (near the beginning of the file) to 1.  Then, from an interactive job (one processor will suffice), run the executable <code>GEOSgcm.x</code> in <code>scratch</code>.  You will need to run <code>source src/g5_modules</code> in the model's build tree to set up the environment.  The model executable will simply write the export list to <code>stdout</code>.
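Put together, the quick procedure might look something like the following sketch (the experiment and build-tree paths are hypothetical):

<pre>
cd /discover/nobackup/USERID/myexp42/scratch
# set PRINTSPEC: 3 in CAP.rc and NX: 1, NY: 1 in AGCM.rc, then:
source ~/GEOSagcm/src/g5_modules
./GEOSgcm.x > exports.txt
</pre>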


== Special Requirements ==


=== Perpetual ("Groundhog Day") mode ===


GEOS-5 Fortuna 2.5 and later can be run in "perpetual mode", automatically running with the same forcings for a time period delineated as a calendar year, month, or day.  The desired time period is set in <code>CAP.rc</code> with the parameters <code>PERPETUAL_YEAR</code>, <code>PERPETUAL_MONTH</code> and <code>PERPETUAL_DAY</code>.  Set all three to run with the forcings for a particular day, and set <code>NUM_SGMT</code> to the number of times you wish to run it -- the history collection files will be appended with dates starting with the one in <code>cap_restart</code> and generally incrementing for the number of days in <code>NUM_SGMT</code>.
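For instance, to repeat the forcings of one particular (hypothetical) day, the relevant <code>CAP.rc</code> lines might look like this:

<pre>
PERPETUAL_YEAR:  2000
PERPETUAL_MONTH: 4
PERPETUAL_DAY:   15
</pre>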


=== Saving restarts during a segment ===


=== post.rc ===