GEOS-5 Checkout and Build Instructions (Fortuna)

These instructions presume checking out the PIESA group tag: piesa-Fortuna-2_1_p2-m16.

How to Check Out and Build the Code

Find a place to store and build the code

The GEOS-5 (AGCM) source code checks out at about 200 MB. Once compiled, the complete package is about 1.3 GB. Your home space on discover may not be sufficient for checking out and building the code. You should consider either 1) requesting a larger quota in your home space (call x6-9120 and ask, telling them you are doing GEOS-5 development work) or 2) building in your (larger) nobackup space (recommended). Keep in mind, though, that nobackup is not backed up, so be careful...

One strategy I like is to check the code out to my nobackup space and then make a symlink from my home space back to it. For example, if I have my code stored at $NOBACKUP/GEOSagcm, I would make a symlink in my home space pointing to it like so:

% ln -s $NOBACKUP/GEOSagcm GEOSagcm

Check Out the Code

Let's get ready to check out the code. We'll be using the cvs command, whose basic syntax is:

% cvs -d $CVSROOT checkout -d DIRECTORY -r TAGNAME MODULENAME

Here, $CVSROOT specifies the CVS repository we'll be getting the code from, DIRECTORY is the name of the directory you would like to create to hold the code, MODULENAME is the particular module (set of code) we'll be checking out, and TAGNAME is a particular version of that module. Let's fill in the blanks:

% cvs -d :ext:pcolarco@progressdirect:/cvsroot/esma co -d GEOSagcm -r piesa-Fortuna-2_1_p2-m16 Fortuna

So our module is Fortuna and the tag is piesa-Fortuna-2_1_p2-m16. The code will check out to a newly created directory called GEOSagcm. Note that I substituted the shortcut co for checkout in the above command.

The above command is generally valid; you ought to be able to execute it and check out some code. If you don't have your ssh keys set up on progress, you will be prompted for your progress password. The assumption here is that your username on progress is the same as on the machine you are checking the code out on.
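
If you would rather not type your password every time, you can set up ssh keys. Here is a generic sketch only (the exact procedure and hostnames on your systems may differ; check with the progress administrators if unsure):

% ssh-keygen -t rsa
% ssh-copy-id YOUR_USERNAME@progressdirect

If ssh-copy-id is not available, manually append the contents of your ~/.ssh/id_rsa.pub to ~/.ssh/authorized_keys in your home directory on progress.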

Here's a shortcut. So that you don't have to type the -d :ext:pcolarco@progressdirect:/cvsroot/esma business all the time, you can add the following lines to, e.g., your .cshrc file:

setenv CVSROOT ':ext:pcolarco@progressdirect:/cvsroot/esma'
setenv CVS_RSH ssh

Modify as appropriate to put in your own username, or, if you use a different shell, put the analog of these lines into your shell startup file. Also, if you are on a machine other than discover, the path to the progress repository may be somewhat different.
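
For example, if you use bash, the analogous lines in your .bashrc would be:

export CVSROOT=':ext:pcolarco@progressdirect:/cvsroot/esma'
export CVS_RSH=ssh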

If you set that up, you should now be able to type:

% cvs co -d GEOSagcm -r piesa-Fortuna-2_1_p2-m16 Fortuna 

and the code will check out.

Build the Code

Now you've checked out the code. You should have a directory called GEOSagcm in front of you. You're almost ready to build the code at this point.

The first thing to do is to set up your shell environment. First, we need to define the environment variable ESMADIR, e.g.,

% setenv ESMADIR $NOBACKUP/GEOSagcm

assuming you put the source code in your $NOBACKUP directory.

Then:

% cd GEOSagcm/src
% source g5_modules

This loads the relevant "modules" for this version of the model. The "modules" specify the version of compiler, MPI libraries, and scientific libraries assumed by the code. This step also sets the environment variable BASEDIR, which specifies the version of the so-called "Baselibs" the code requires. The "Baselibs" contain, e.g., the netcdf and ESMF libraries.

By the way, I have an alias in my .cshrc file that actually accomplishes the above tasks. It looks like this:

% alias g5 'setenv ESMADIR $NOBACKUP/GEOSagcm;cd $ESMADIR/src;source g5_modules'
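
With that alias in place, setting up the build environment is just a matter of typing:

% g5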

Now, assuming you're in the source directory, you can build the code by issuing the following command:

% gmake RCDIR=PIESA install

Note: the "RCDIR=PIESA" setting in the gmake command says you will use the special PIESA resource files for GOCART/AGCM/etc. Omit to use default resource files.

If you do that, go away and take a coffee break. A long one. This may take an hour or more to build. There are a couple of ways to speed this process up. One way is to build the code without optimization:

% gmake RCDIR=PIESA install FOPT=-g

The code builds faster in this instance, but be warned that without optimization any generated code will run very slowly.

A better way is to do a parallel build. To do this, start an interactive queue (on discover):

% qsub -I -W group_list=g0604 -N g5Debug -l select=2:ncpus=4,walltime=03:00:00 -S /bin/tcsh -V -j eo

Note that the string following "group_list=" is your group-id code; it's the project that gets charged for the computer time you use. If you're not in "g0604", that's okay: the queue system will let you know, and it won't start your job. To find out which group you belong to, issue the following command:

% getsponsor

and you'll get a table of sponsor codes available to you. Enter one of those codes as the group_list string and try again.

Wait, what have we done here? We've started an interactive queue (interactive in the sense that you have a command line) where we now have 8 cpus allocated to us (and us alone!) for the next 3 hours. We can use all 8 of those cpus to speed up our build as follows:

% gmake --jobs=8 RCDIR=PIESA pinstall

The syntax here is that "--jobs=" specifies the number of cpus to use (up to the 8 we've requested in our queue) and "pinstall" means to do a parallel install. Don't worry, the result should be the same as "gmake install" above but takes a fraction of the time.

What if something goes wrong? Sometimes the build just doesn't go right. It's useful to save the output that scrolls by on the screen to a file so you can analyze it later. Modify any of the build examples above as follows to capture the text to a file called "make.log":

% gmake --jobs=8 RCDIR=PIESA pinstall |& tee make.log

and now you have a record of how the build progressed. When the build completes (successfully or otherwise) you can analyze the build results by issuing the following command:

% Config/gmh.pl -v make.log

and you'll be given a list of what compiled and what didn't, which will hopefully allow you to go in and find any problems. If problems are indicated on the first pass, try the build step again and see if they clear up.
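
By the way, the |& redirection used above is csh/tcsh syntax; if you happen to be working in bash, the equivalent would be:

% gmake --jobs=8 RCDIR=PIESA pinstall 2>&1 | tee make.log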

If all goes well, you should have a brand-new build of GEOS-5. Step back up out of the src directory and you should see the following sub-directories:

Config
CVS
Linux
src

In the Linux directory you'll find:

bin
Config
doc
etc
include
lib

The executables are in the bin directory. The resource files are in the etc directory.

In this example, the directory GEOSagcm is the root directory that everything ends up under. You can specify another location by setting the environment variable ESMADIR to some other location and installing again.
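
For instance, a hypothetical relocation might look like this (the target directory name below is just a placeholder, and your environment setup may need to be refreshed first):

% setenv ESMADIR $NOBACKUP/GEOSagcm-test
% cd $NOBACKUP/GEOSagcm/src
% gmake RCDIR=PIESA install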

How to Setup and Run an Experiment

Now that you've built the code, let's try to run it. In the exercise that follows, we will create a fresh experiment.

In what follows I will assume we are working on the NCCS computer discover.

Before we get going, let's make some light edits to your .cshrc file. First, near the top of your .cshrc file add the word:

unlimit

This is the csh built-in that removes the shell's resource limits (such as stack size). We're not sure exactly why it matters here, but Arlindo says it is important.

Let's make sure that your binaries from your compiled GEOS-5 code are in your path. Include the following line somewhere in your .cshrc file:

setenv PATH .:$NOBACKUP/GEOSagcm/Linux/bin:$PATH

where obviously you replace the particular path to the binaries with your path. Note what this implies: it won't be a good idea to move or clean this directory while the model is running!


Create the Experiment Directories

From your model build src directory, go into Applications/GEOSgcm_App. Then run gcm_setup.

% cd Applications/GEOSgcm_App
% ./gcm_setup

You will be prompted to answer some questions:

experiment ID
1-line description of the experiment
model resolution (IM JM)
aero provider (PCHEM or GOCART)
HOME directory
EXP directory
group ID

Upon completion, you will have two directories, the HOMEDIR which is in your home space (something like ~/geos5/EXPID) and the EXPDIR which is in your nobackup space.
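
For illustration only, a set of answers consistent with the rest of this page might look like the following (every value below is hypothetical; substitute your own):

experiment ID:       myexp01
description:         Interactive GOCART aerosol test
resolution (IM JM):  144 91
aero provider:       GOCART
HOME directory:      ~/geos5/myexp01
EXP directory:       $NOBACKUP/myexp01
group ID:            g0604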

Setup the Experiment HOMEDIR

Go to the HOMEDIR. You need to edit a couple of files to get the experiment to run the way you want:

AGCM.rc

This file controls the model run characteristics. For example, my AGCM.rc file has the following characteristic lines (to run GOCART with climatological aerosol forcing):

GOCART_INTERNAL_RESTART_FILE:           gocart_internal_rst 
GOCART_INTERNAL_RESTART_TYPE:           binary
GOCART_INTERNAL_CHECKPOINT_FILE:        gocart_internal_checkpoint
GOCART_INTERNAL_CHECKPOINT_TYPE:        binary
AEROCLIM:    ExtData/AeroCom/L72/aero_clm/gfedv2.aero.eta.%y4%m2clm.nc
AEROCLIMDEL: ExtData/AeroCom/L72/aero_clm/gfedv2.del_aero.eta.%y4%m2clm.nc
AEROCLIMYEAR: 2002
DIURNAL_BIOMASS_BURNING: no
RATS_PROVIDER: PCHEM   # options: PCHEM, GMICHEM, STRATCHEM (Radiatively active tracers)
AERO_PROVIDER: PCHEM   # options: PCHEM, GOCART             (Radiatively active aerosols)

To run with interactive aerosol forcing, modify the appropriate lines above to look like:

#AEROCLIM:    ExtData/AeroCom/L72/aero_clm/gfedv2.aero.eta.%y4%m2clm.nc
#AEROCLIMDEL: ExtData/AeroCom/L72/aero_clm/gfedv2.del_aero.eta.%y4%m2clm.nc
#AEROCLIMYEAR: 2002
AERO_PROVIDER: GOCART   # options: PCHEM, GOCART             (Radiatively active aerosols)

HISTORY.rc

This file controls the output streams from the model. Mine has the following collections:

COLLECTIONS: 'geosgcm_prog'
             'geosgcm_surf'
             'geosgcm_moist'
             'geosgcm_turb'
             'geosgcm_gwd'
             'geosgcm_tend'
             'geosgcm_bud'
             'tavg2d_aer_x'
             'inst3d_aer_v'
             ::

CAP.rc

This file controls the timing of the model run. For example, you specify the END_DATE, the number of days to run per segment (JOB_SGMT) and the number of segments (NUM_SGMT). If you modify nothing else, the model would run until it reached END_DATE or had run for NUM_SGMT segments.
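
As a purely illustrative sketch (dates and segment lengths below are placeholders; as noted under cap_restart below, BEG_DATE is ignored once a cap_restart file exists), the timing settings might look like:

BEG_DATE:     19991231 210000
END_DATE:     20010101 000000
JOB_SGMT:     00000032 000000
NUM_SGMT:     1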

gcm_run.j

This is the model run script. You might not need to edit anything, but check here to see whether it points to the right emission files. Mine has the following line:

setenv CHMDIR   /share/dasilva/fvInput

which points to the PIESA chemistry emission files presumed in this run. You might have to modify this line on a different system or to use different emission inventories.

cap_restart

This file doesn't exist here by default, so create it with your favorite editor. It holds the model start date (ignore BEG_DATE in CAP.rc; if this file exists, it supersedes that). It should contain a single line giving the start time of the model run as YYYYMMDD HHMMSS. For example:

19991231 210000

An Optional Modification

For some applications it is desirable to have the model stop running on fixed dates to generate restarts and then start up again. (Each instance of the model running for some number of days, then stopping and generating restarts, is called a segment.) For example, suppose we want restarts at the beginning and middle of each month. I've added this functionality, but you need to make a few edits. What works is to run the model to the middle of the first month (in practice, the 17th of the month), stop it, and then run from that point to the 1st of the next month. Here's how: edit CAP.rc so that

END_DATE:     20070101 210000
JOB_SGMT:     00000017 000000
NUM_SGMT:     12

substituting the appropriate YYYYMM in END_DATE for the overall time you want your experiment to end.

Edit gcm_run.j so the following environment variables are set like this

set STOP_MID_MONTH = 1
set DFLT_JOB_SGMT = 16

(By default, STOP_MID_MONTH = 0, which bypasses the logic here).

Here's what is happening: overall model control is handled by CAP.rc. We specify the model finish time with END_DATE in CAP.rc. As set up, JOB_SGMT is the number of days to run the model in a single segment, and NUM_SGMT is the number of segments to run in a single queue submission. With STOP_MID_MONTH set to 1, the JOB_SGMT setting is tweaked for each half-month of the model run. That is, the first segment runs the initial number of days (in this case, 17) and the next segment runs for however many days are needed to reach the 1st of the next month. (In this example, I'm presuming we're starting from an initial date of 19991231, which requires us to run for 17 days so that the model stops on 20000117, the desired monthly mid-point.) What's really happening is that CAP.rc is being edited after each segment to specify the (hopefully) correct number of days to run the next segment. Here, NUM_SGMT is set to 12, which will run the model for 6 months in a single queue submission (1/2 month per segment). This fits conservatively within a single 8-hour wallclock request at our resolution.

Running on Pleiades

This tag has successfully checked out, built, run, and archived results on the NAS machine Pleiades. There seem to be only two modifications to make to what is described above, both to the gcm_run.j script. First, the PIESA-style emissions are located in a different place on Pleiades, so:

setenv CHMDIR   /nobackup/pcolarco/fvInput

Setup the Experiment EXPDIR

Now go to your EXPDIR.

You might want to edit the resource files in the RC sub-directory (e.g., GEOS_ChemGridComp.rc to turn on GOCART and Chem_Registry.rc to turn components on/off).
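
As a rough illustration only (the key name below is an assumption on my part; check the actual contents of RC/GEOS_ChemGridComp.rc in your checkout, since names can change between tags), turning GOCART on amounts to flipping a logical flag in that file, something like:

ENABLE_GOCART: .TRUE.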

You also need some restarts to run the model. For Fortuna at "b" resolution you can copy a set (valid nominally at 19991231) from /discover/nobackup/project/gmao/iesa/aerosol/Data/restarts/144x91/19991231 (on Pleiades: /nobackup/pcolarco/restart0/piesa/b72). You can just copy these files directly into your experiment directory, but what I like to do is make a "restart0" directory in my EXPDIR, copy the files there, and then copy them back up into the EXPDIR, as sketched below.
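
A minimal sketch of that copy step on discover (YOUR_EXPDIR is a placeholder for your actual experiment directory):

% cd YOUR_EXPDIR
% mkdir restart0
% cp /discover/nobackup/project/gmao/iesa/aerosol/Data/restarts/144x91/19991231/* restart0/
% cp restart0/* .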

Run the Simulation

At this point we're ready to take a crack at running the model. Go into your experiment HOMEDIR. You can start the model job by issuing:

% qsub gcm_run.j

And check the progress by issuing:

% qstat | grep YOUR_USERNAME