Ganymed 1.0 Quick Start: Difference between revisions

 
(31 intermediate revisions by the same user not shown)
Line 1: Line 1:
This page describes the minimum steps required to build and run GEOS-5 Ganymed 1.0 on NCCS discover and NAS pleiades.  You should successfully complete the steps in these instructions before doing anything more complicated.   
This page describes the minimum steps required to build and run GEOS-5 Ganymed 1.0 on NCCS discover and NAS pleiades.  '''You should successfully complete the steps in these instructions before doing anything more complicated. Also, it is helpful to read this page in its entirety before starting.'''  


'''Back to [[GEOS-5 Documentation for Ganymed 1.0]]'''
'''Back to [[GEOS-5 Documentation for Ganymed 1.0]]'''
Line 22: Line 22:
where ''USERID'' is, of course, your repository username, which should be the same as your NASA and NCCS username.  Then, issue the command:
where ''USERID'' is, of course, your repository username, which should be the same as your NASA and NCCS username.  Then, issue the command:


  cvs co -r Ganymed-1_0_BETA5 Ganymed-1_0_BETA5 Ganymed
  cvs co -r Ganymed-1_0_p7 -d Ganymed-1_0_p7 Ganymed




This should check out the latest stable version of the model from the repository and create a directory called <code>GEOSagcm</code>.
This should check out the latest stable version of the model from the repository and create a directory called <code>Ganymed-1_0</code>.


=== Compiling the Model ===
=== Compiling the Model ===


<code>cd</code> into <code>GEOSagcm/src</code> and <code>source</code> the file called <code>g5_modules</code>:
<code>cd</code> into <code>Ganymed-1_0/src</code> and <code>source</code> the file called <code>g5_modules</code>:


  source g5_modules
  source g5_modules
Line 40: Line 40:


  Currently Loaded Modulefiles:
  Currently Loaded Modulefiles:
  1) comp/intel-11.0.083       3) lib/mkl-10.0.3.020         5) other/SIVO-PyD/spd_1.0.0
  1) comp/intel-11.0.083                     3) lib/mkl-10.0.3.020
  2) mpi/impi-3.2.2.006         4) other/comp/gcc-4.5
  2) mpi/impi-3.2.2.006                       4) other/SIVO-PyD/spd_1.6.0_gcc-4.3.4-sp1
 


If this all worked, then type:
If this all worked, then type:
Line 47: Line 48:
  gmake install
  gmake install


This will build the model.  It will take about 40 minutes.  If this works, it should create a directory under <code>GEOSagcm</code> called <code>Linux/bin</code>.  In here you should find the executable: <code>GEOSgcm.x</code> .
This will build the model.  It will take about 40 minutes.  If this works, it should create a directory under <code>Ganymed-1_0</code> called <code>Linux/bin</code>.  In here you should find the executable: <code>GEOSgcm.x</code> .
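
A quick way to confirm that the build finished is to test for the executable.  The following is only a sketch: the helper name <code>check_build</code> is made up for illustration, and it assumes you run it from the directory that contains <code>Ganymed-1_0</code>.

```shell
#!/bin/sh
# Hypothetical helper (not part of GEOS-5): report whether the build
# under directory $1 produced the executable Linux/bin/GEOSgcm.x.
check_build() {
    exe="$1/Linux/bin/GEOSgcm.x"
    if [ -x "$exe" ]; then
        echo "found $exe"
    else
        echo "missing $exe"
        return 1
    fi
}

# Assumes the checkout directory is ./Ganymed-1_0; adjust the path as needed.
check_build Ganymed-1_0 || echo "build does not appear to be complete"
```

If the executable is missing, look through the output of <code>gmake install</code> for the first error before trying anything else.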


== Setting up a Run ==
== Setting up a Run ==


First of all, to run jobs on the cluster you will need to set up passwordless <code>ssh</code> (which operates within the cluster).  To do so, run the following from your '''discover''' home directory:
First of all, to run jobs on the cluster you will need to set up passwordless <code>ssh</code> (which operates within the cluster, between the nodes running the job).  To do so, run the following from your '''discover''' home directory:


  cd .ssh
  cd .ssh
Line 63: Line 64:
Then, log into  '''dirac''' and cut and paste the contents of the <code>id_rsa.pub</code> and <code>id_dsa.pub</code> files on '''discover''' into the  <code>~/.ssh/authorized_keys</code> file on  '''dirac'''.  Problems with <code>ssh</code> should be referred to NCCS support.
Then, log into  '''dirac''' and cut and paste the contents of the <code>id_rsa.pub</code> and <code>id_dsa.pub</code> files on '''discover''' into the  <code>~/.ssh/authorized_keys</code> file on  '''dirac'''.  Problems with <code>ssh</code> should be referred to NCCS support.


To set the model up to run, in the  <code>GEOSagcm/src/Applications/GEOSgcm_App</code> directory we run:
To set the model up to run, cd to <code>Ganymed-1_0/src/Applications/GEOSgcm_App</code> and run:


  gcm_setup
  ./gcm_setup


The <code>gcm_setup</code> script asks you to provide an experiment name :
The <code>gcm_setup</code> script asks you to provide an experiment name :
Line 88: Line 89:
  Enter the Dynamical Core to use:  FV (Lat-Lon), FV3 (Cubed-Sphere)
  Enter the Dynamical Core to use:  FV (Lat-Lon), FV3 (Cubed-Sphere)


Enter <code>FV</code> for lat-lon.  On the next six questions, hit enter to accept the default:
Enter <code>FV</code> for lat-lon.  On the next seven questions, hit enter to accept the default:
   
   
  Do you wish to run the COUPLED Ocean/Sea-Ice Model? (Default: NO or FALSE)
  Do you wish to run the COUPLED Ocean/Sea-Ice Model? (Default: NO or FALSE)
  Enter the Data_Ocean Horizontal Resolution code: o1 (1  -deg,  360x180, (e.g. Reynolds) Default)
                                                   o8 (1/8-deg, 2880x1440, (e.g. OSTIA))


  Do you wish to run GOCART? (Default: NO or FALSE)
  Do you wish to run GOCART? (Default: NO or FALSE)
Line 121: Line 125:




The script produces an experiment directory (''EXPDIR'') in your space as <code>/discover/nobackup/''USERID''/''EXPID''</code>, which contains, among other things, the sub-directories:
The script produces an experiment directory (''EXPDIR'') in your space as <code>/discover/nobackup/''USERID''/''EXPID''</code>, which contains the files and sub-directories:


*<code>post</code> (containing the script and .rc file for post processing model output)
*<code>AGCM.rc</code> -- resource file with specifications of boundary conditions, initial conditions, parameters, etc.
*<code>archive</code> (containing an incomplete archiving job script)
*<code>archive/</code> -- contains job script for archiving output
*<code>plot</code> (containing plotting job script template and .rc file)
*<code>CAP.rc</code> -- resource file with run job parameters
*<code>ExtData.rc</code> -- sample resource file for external data, not used
*<code>forecasts/</code> -- contains scripts used for data assimilation mode
*<code>gcm_run.j</code> -- run script
*<code>GEOSgcm.x</code> -- model executable
*<code>HISTORY.rc</code> -- resource file specifying the fields in the model that are output as data
*<code>plot/</code> -- contains plotting job script template and .rc file
*<code>post/</code> -- contains the script template and .rc file for post-processing model output
*<code>RC/</code> -- contains resource files for various components of the model
*<code>regress/</code> -- contains scripts for doing regression testing of model


The post-processing script will complete (i.e., add necessary commands to) the archiving and plotting scripts as it runs.  The setup script that you ran also creates an experiment home directory (''HOMEDIR'') as <code>~''USERID''/geos5/''EXPID''</code>  containing the run scripts and GEOS resource (<code>.rc</code>) files.


The post-processing script will generate the archiving and plotting scripts as it runs.  The setup script that you ran also creates an experiment home directory (''HOMEDIR'') as <code>~''USERID''/geos5/''EXPID''</code>  containing the run scripts and GEOS resource (<code>.rc</code>) files.  (You can also specify the experiment home directory to be the same as the experiment directory.)


The run scripts need some more environment variables -- here are the minimum contents of a <code>.cshrc</code>:
== Running GEOS-5 ==
 
Before running the model, there is some more setup to be completed.  The run scripts need some environment variables set in <code>~/.cshrc</code> (regardless of which login shell you use -- the GEOS-5 scripts use <code>csh</code>).  Here are the minimum contents of a <code>.cshrc</code>:


  umask 022
  umask 022
Line 140: Line 155:
The <code>umask 022</code> is not strictly necessary, but it will make the various files readable to others, which will facilitate data sharing and user support.  Your home directory <code>~''USERID''</code> is also inaccessible to others by default; running <code>chmod 755 ~</code> is helpful.
The <code>umask 022</code> is not strictly necessary, but it will make the various files readable to others, which will facilitate data sharing and user support.  Your home directory <code>~''USERID''</code> is also inaccessible to others by default; running <code>chmod 755 ~</code> is helpful.


== Running GEOS-5 ==
Copy the restart (initial condition) files and associated <code>cap_restart</code> into ''EXPDIR''.  Keep the "originals" handy since if the job stumbles early in the run it might stop after having renamed them.  The model expects restart filenames to end in "rst" but produces them with the date and time appended, so you may have to rename them back to ending in "rst".  The <code>cap_restart</code> file is sometimes provided with a set of restarts, but if not you can create it: it is simply one line of text with the date of the restart files in the format ''<code>YYYYMMDD HHMMSS</code>'' (with a space).  The boundary conditions/forcings are provided by symbolic links created by the run script.  If you need an arbitrary set of restarts, you can copy them from <code>/archive/u/aeichman/restarts/Ganymed-1_0/</code>, where they are indexed by resolution and date.  If you are unfamiliar with the way that the <code>/archive</code> filesystem works, keep in mind that a <code>cp</code> from there might appear to stall while the tape is loaded -- see the NCCS documentation for details.
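
The renaming and the <code>cap_restart</code> file described above can be sketched as follows.  This is an illustration under assumptions, not part of the GEOS-5 scripts: the function names are made up, the dated suffix on the stand-in file is hypothetical, and the sketch simply strips whatever follows "rst" in a file name.  Check the names of your own restart files before using anything like it.

```shell
#!/bin/sh
# Sketch only: the dated suffix pattern (anything after "_rst.") is an assumption.

# Rename every file in directory $1 whose name contains "_rst." so that
# it ends in "rst" again.
restore_rst_names() {
    (
        cd "$1" || exit 1
        for f in *_rst.*; do
            [ -e "$f" ] || continue      # no matches: the glob stays literal
            mv "$f" "${f%_rst.*}_rst"
        done
    )
}

# Write a one-line cap_restart in directory $1: date $2 and time $3,
# separated by a single space (YYYYMMDD HHMMSS).
write_cap_restart() {
    printf '%s %s\n' "$2" "$3" > "$1/cap_restart"
}

# Demonstration in a scratch directory with an empty stand-in file.
demo=$(mktemp -d)
touch "$demo/fvcore_internal_rst.20000414_2100z"
restore_rst_names "$demo"
write_cap_restart "$demo" 20000414 210000
ls "$demo"
cat "$demo/cap_restart"
```

In a real experiment you would point both functions at ''EXPDIR'' instead of the scratch directory, after making a copy of the original restarts.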
 


Copy the restart (initial condition) files and associated <code>cap_restart</code> into ''EXPDIR''.  Keep the "originals" handy since if the job stumbles early in the run it might stop after having renamed them.  The model expects restart filenames to end in "rst" but produces them with the date and time appended, so you may have to rename them back to ending in "rst".  The <code>cap_restart</code> file is often provided with a set of restarts, but if not you can create it: is simply one line of text with the date of the restart files in the format ''<code>YYYYMMDD HHMMSS</code>'' (with a space).  The boundary conditions/forcings are provided by symbolic links created by the run script. 
The script you submit, <code>gcm_run.j</code>, is in ''HOMEDIR''.  It should be ready to go as is.  The parameter END_DATE in <code>CAP.rc</code> can be set to the date you want the run to stop.  Alternatively, you can stop the run by commenting out the line <code> if ( $capdate < $enddate ) qsub $HOMDIR/gcm_run.j</code> at the end of the script (which prevents the script from being resubmitted), by renaming the script file, or by killing the job (described below).  
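
Setting END_DATE can be done in any text editor; the sketch below shows one way to do it from the command line.  It assumes that <code>CAP.rc</code> contains a line of the form <code>END_DATE: YYYYMMDD HHMMSS</code>, and the function name <code>set_end_date</code> is made up for illustration.  The demonstration works on a stand-in file, not a real <code>CAP.rc</code>.

```shell
#!/bin/sh
# Sketch only: assumes CAP.rc holds a line "END_DATE: YYYYMMDD HHMMSS".

# Rewrite the END_DATE line of file $1 to date $2 and time $3,
# leaving every other line unchanged.
set_end_date() {
    tmp=$(mktemp)
    sed "s/^END_DATE:.*/END_DATE:     $2 $3/" "$1" > "$tmp" &&
        cat "$tmp" > "$1"                # overwrite in place, keeping permissions
    rm -f "$tmp"
}

# Demonstration on a stand-in file with made-up dates.
rc=$(mktemp)
printf 'BEG_DATE:     18910301 000000\nEND_DATE:     29990302 210000\n' > "$rc"
set_end_date "$rc" 20000515 210000
grep '^END_DATE' "$rc"
```

Writing through a temporary file avoids relying on <code>sed -i</code>, which behaves differently between systems.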
 
If you need an arbitrary set of restarts, you can copy them from <code>/archive/u/aeichman/restarts/Fortuna-2_5/</code>, where they are indexed by date and resolution.
 
 
The script you submit, <code>gcm_run.j</code>, is in ''HOMEDIR''.  It should be ready to go as is.  The parameter END_DATE in <code>CAP.rc</code> (previously in <code>gcm_run.j</code>) can be set to the date you want the run to stop.  An alternative way to stop the run is by commenting out the line <code> if ( $capdate < $enddate ) qsub $HOMDIR/gcm_run.j</code> at the end of the script, which will prevent the script from being resubmitted, or rename the script file.  You may eventually want to tune parameters in the <code>CAP.rc</code> file JOB_SGMT (the number of days per segment, the interval between saving restarts) and NUM_SGMT (the number of segments attempted in a job) to maximize your run time.


Submit the job with <code>qsub gcm_run.j</code>.  You can keep track of it with <code>qstat</code> or <code>qstat | grep ''USERID''</code>, or follow stdout with <code>tail -f /discover/pbs_spool/''JOBID''.OU</code>, ''JOBID'' being returned by <code>qsub</code> and displayed with <code>qstat</code>.  Jobs can be killed with <code>qdel ''JOBID''</code>.  The standard out and standard error will be delivered as files to the working directory at the time you submitted the job.
Submit the job with <code>qsub gcm_run.j</code>.  You can keep track of it with <code>qstat</code> or <code>qstat | grep ''USERID''</code>, or follow stdout with <code>tail -f /discover/pbs_spool/''JOBID''.OU</code>, ''JOBID'' being returned by <code>qsub</code> and displayed with <code>qstat</code>.  Jobs can be killed with <code>qdel ''JOBID''</code>.  The standard out and standard error will be delivered as files to the working directory at the time you submitted the job.
Line 154: Line 163:
== Output and Plots ==
== Output and Plots ==


During a normal run, the <code>gcm_run.j</code> script will run the model for the segment length (current default is 10 days).  The model creates output files (with an <code>nc4</code> extension), also called collections (of output variables), in  <code>''EXPDIR''/scratch</code> directory.  After each segment, the script moves the output to the <code>''EXPDIR''/holding</code> and spawns a post-processing batch job which partitions and moves the output files  within the <code>holding</code> directory to their own distinct collection directory, which is again partitioned into the appropriate year and month.  The  post processing script then checks to
During a normal run, the <code>gcm_run.j</code> script will run the model for the segment length (current default is 15 days in model time).  The model creates output files (with an <code>nc4</code> extension), also called collections (of output variables), in the <code>''EXPDIR''/scratch</code> directory.  After each segment, the script moves the output to the <code>''EXPDIR''/holding</code> directory and spawns a post-processing batch job which partitions and moves the output files within the <code>holding</code> directory to their own distinct collection directory, which is again partitioned into the appropriate year and month.  The post-processing script then checks to
see if  a full month of data is present.  If not, the post-processing job ends.  If there is a full month, the script will then run the time-averaging executable to produce a monthly mean file in <code>''EXPDIR''/geosgcm_*</code>.  The post-processing script then spawns a new batch job which will archive the data onto the mass-storage drives (<code>/archive/u/''USERID''/GEOS5.0/''EXPID''</code>).
see if  a full month of data is present.  If not, the post-processing job ends.  If there is a full month, the script will then run the time-averaging executable to produce a monthly mean file in <code>''EXPDIR''/geosgcm_*</code>.  The post-processing script then spawns a new batch job which will archive the data onto the mass-storage drives (<code>/archive/u/''USERID''/GEOS5.0/''EXPID''</code>).


Line 168: Line 177:
The contents of the output files (including which variables get saved) may be configured in the  <code>''HOMEDIR''/HISTORY.rc</code> -- a good description of this file may be found at http://modelingguru.nasa.gov/clearspace/docs/DOC-1190 .
The contents of the output files (including which variables get saved) may be configured in the  <code>''HOMEDIR''/HISTORY.rc</code> -- a good description of this file may be found at http://modelingguru.nasa.gov/clearspace/docs/DOC-1190 .


'''Back to [[GEOS-5 Documentation for Fortuna 2.5]]'''
'''Back to [[GEOS-5 Documentation for Ganymed 1.0]]'''