GEOS-5 Checkout and Build Instructions (Fortuna)
Revision as of 07:03, 31 March 2011
These instructions presume checking out the PIESA group tag: AeroChem-Fortuna-2_4-b3.
How to Check Out and Build the Code
Find a place to store and build the code
The GEOS-5 (AGCM) source code checks out at about 55 MB of space. Once compiled, the complete package is about 1.3 GB. Your home space on discover may not be sufficient for checking out and building the code. You should consider either 1) requesting a larger quota in your home space (call the tag x6-9120 and ask, telling them you are doing GEOS-5 development work) or 2) building in your (larger) nobackup space (recommended). But consider, nobackup is not backed up. So be careful...
One strategy I like is to check the code out to my nobackup space, but then make a symlink from my home space back to it. For example, if I have my code stored at $NOBACKUP/GEOSagcm, I would make a symlink in my home space to point to that like:
% ln -s $NOBACKUP/GEOSagcm GEOSagcm
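A minimal sketch of the symlink trick, with temporary directories standing in for your real nobackup and home spaces:

```shell
# Placeholder directories stand in for $NOBACKUP and your home space.
NOBACKUP=$(mktemp -d)
mkdir $NOBACKUP/GEOSagcm        # pretend checkout location
cd $(mktemp -d)                 # pretend home directory
ln -s $NOBACKUP/GEOSagcm GEOSagcm
ls -ld GEOSagcm                 # shows the link pointing into nobackup
```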
Check Out the Code
Let's get ourselves ready to check out the code. We'll be using the cvs command to check out the code. The basic syntax is:
% cvs -d $CVSROOT checkout -d DIRECTORY -r TAGNAME MODULENAME
Here, $CVSROOT specifies the CVS repository we'll be getting the code from, DIRECTORY is the name of the directory you would like to create to hold the code, MODULENAME is the particular module (set of code) we'll be checking out, and TAGNAME is a particular version of that module. Let's fill in the blanks:
% cvs -d :ext:pcolarco@cvsacldirect:/cvsroot/esma co -d GEOSagcm -r AeroChem-Fortuna-2_4-b3 Fortuna
So our module is Fortuna and the tag is AeroChem-Fortuna-2_4-b3. The code will check out to a newly created directory called GEOSagcm. Note that I substituted the shortcut co for checkout in the above command.
The above command is generally valid. You ought to be able to execute it and check out some code. If you don't have your ssh keys set up on cvsacl then you should be prompted for your cvsacl password. The assumption here is that your username on cvsacl is the same as on the machine you are checking the code out on.
Here's a short cut. So that you don't have to type in the -d :ext:pcolarco@cvsacldirect:/cvsroot/esma business all the time, you can add the following lines to your, e.g., .cshrc file:
setenv CVSROOT ':ext:pcolarco@cvsacldirect:/cvsroot/esma'
setenv CVS_RSH ssh
Modify as appropriate to put your username in, or adapt if you use a different shell (i.e., put the analog of these lines into your .bashrc file or whatever). Or if you are on a different machine than discover, the path to the cvsacl repository may be somewhat different.
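For bash users, a rough equivalent for your .bashrc (substitute your own cvsacl username for the example one):

```shell
# Bash analog of the tcsh setenv lines above (goes in ~/.bashrc).
# "pcolarco" is the example username from the text; use your own.
export CVSROOT=':ext:pcolarco@cvsacldirect:/cvsroot/esma'
export CVS_RSH=ssh
echo $CVSROOT
```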
If you set that up, you should be able now to type in:
% cvs co -d GEOSagcm -r AeroChem-Fortuna-2_4-b3 Fortuna
and the code will check out.
Build the Code
Now you've checked out the code. You should have a directory called GEOSagcm in front of you. You're almost ready to build the code at this point.
The first thing to do is to set up your shell environment. First, we need to define the environment variable ESMADIR, e.g.,
% setenv ESMADIR $NOBACKUP/GEOSagcm
assuming you put the source code in your $NOBACKUP directory.
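For bash users, the equivalent would be (NOBACKUP is normally set by the site environment; the fallback here is only for illustration):

```shell
# Bash analog of the tcsh setenv above.
NOBACKUP=${NOBACKUP:-$HOME/nobackup}   # normally set by the site environment
export ESMADIR=$NOBACKUP/GEOSagcm
echo $ESMADIR
```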
Then:
% cd GEOSagcm/src
% source g5_modules
This loads the relevant "modules" for this version of the model. The "modules" specify the version of compiler, MPI libraries, and scientific libraries assumed by the code. This step also sets the environment variable BASEDIR, which specifies the version of the so-called "Baselibs" the code requires. The "Baselibs" contain, e.g., the netcdf and ESMF libraries.
By the way, I have an alias in my .cshrc file that actually accomplishes the above tasks. It looks like this:
% alias g5 'setenv ESMADIR $NOBACKUP/GEOSagcm;cd $ESMADIR/src;source g5_modules'
Now, assuming you're in the source directory, you can build the code by issuing the following command:
% gmake install
If you do that, go away and take a coffee break. A long one. This may take an hour or more to build. There are a couple of ways to speed this process up. One way is to build the code without optimization:
% gmake install FOPT=-g
The code builds faster in this instance, but be warned that without optimization any generated code will run very slowly.
A better way is to do a parallel build. To do this, start an interactive queue (on discover):
% qsub -I -W group_list=g0604 -N g5Debug -l select=2:ncpus=4,walltime=03:00:00 -S /bin/tcsh -V -j eo
Note that the string following "group_list=" is your group-id code. It's the project that gets charged for the computer time you use. If you're not on "g0604" that's okay, the queue system will let you know and it won't start your job. To find out which group you belong to, issue the following command:
% getsponsor
and you'll get a table of sponsor codes available to you. Enter one of those codes as the group_list string and try again.
Wait, what have we done here? We've started an interactive queue (interactive in the sense that you have a command line) where we now have 8 cpus allocated to us (and us alone!) for the next 3 hours. We can use all 8 of those cpus to speed up our build as follows:
% gmake --jobs=8 pinstall
The syntax here is that "--jobs=" specifies the number of cpus to use (up to the 8 we've requested in our queue) and "pinstall" means to do a parallel install. Don't worry, the result should be the same as "gmake install" above but take a fraction of the time.
What if something goes wrong? Sometimes the build just doesn't go right. It's useful to save the output that scrolls by on the screen to a file so you can analyze it later. Modify any of the build examples above as follows to capture the text to a file called "make.log":
% gmake --jobs=8 pinstall |& tee make.log
and now you have a record of how the build progressed. When the build completes (successfully or otherwise) you can analyze the build results by issuing the following command:
% Config/gmh.pl -v make.log
and you'll be given a list of what compiled and what didn't, which will hopefully allow you to go in and find any problems. If there are problems indicated on the first pass through, try the build step again and see if they are cleared up.
If all goes well, you should have a brand-new build of GEOS-5. Step back up out of the src directory and you should see the following sub-directories:
Config CVS Linux src
In the Linux directory you'll find:
bin Config doc etc include lib
The executables are in the bin directory. The resource files are in the etc directory.
In this example, the directory GEOSagcm is the root directory everything ends up under. You can specify another location by setting the environment variable ESMADIR to some other location and installing again.
How to Setup and Run an Experiment
Now that you've built the code, let's try to run it. In the exercise that follows, we will create a fresh experiment.
In what follows I will assume we are working on the NCCS computer discover.
Before we get going, let's make some light edits to your .cshrc file. First, near the top of your .cshrc file add the word:
unlimit
This removes the shell's resource limits (notably the stack size limit), which the model needs at run time; Arlindo says it is important.
Let's make sure that your binaries from your compiled GEOS-5 code are in your path. Include the following line somewhere in your .cshrc file:
setenv PATH .:$NOBACKUP/GEOSagcm/Linux/bin:$PATH
where obviously you replace the particular path to the binaries with your path. Note what this implies: it won't be a good idea to move or clean this directory while the model is running!
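For bash users, a rough .bashrc equivalent of the PATH line above (the build path is the same assumption as in the text):

```shell
# Bash analog of the tcsh setenv PATH line above (goes in ~/.bashrc).
NOBACKUP=${NOBACKUP:-$HOME/nobackup}   # normally set by the site environment
export PATH=.:$NOBACKUP/GEOSagcm/Linux/bin:$PATH
echo $PATH | cut -d: -f2               # the model bin directory
```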
Create the Experiment Directories
From your model build src directory, go into Applications/GEOSgcm_App. Then run gcm_setup.
% cd Applications/GEOSgcm_App
% ./gcm_setup
You will be prompted to answer some questions. Here they are with some typical responses:
Enter the Experiment ID:
    bR_control
Enter a 1-line Experiment Description:
    bR_control,descriptive_text,no_spaces_allowed
Enter the Lat/Lon Horizontal Resolution: IM JM
     or ..... the Cubed-Sphere Resolution: cNN
    144 91                  [Presumption is we're not doing cubed-sphere]
Enter the Model Vertical Resolution: LM (Default: 72)
    [Enter for default; have not tested alternative vertical resolutions]
Do you wish to run GOCART? (Default: NO or FALSE)
    YES
Enter the GOCART Emission Files to use: "CMIP" (Default), "PIESA", or "OPS":
    PIESA                   [chooses AeroCom emissions]
Enter the AERO_PROVIDER: GOCART (Default) or PCHEM:
    [Enter for default]
Enter Desired Location for HOME Directory (to contain scripts and RC files)
    [Enter for default]
Enter Desired Location for EXP Directory (to contain model output and restart files)
    [Enter for default]
Enter Location for Build directory containing: src/ Linux/ etc...
    [Enter for default]
Current GROUPS: [List of groups you belong to]
    [Choose group]
Upon completion, you will have two directories, the HOME directory which is in your home space (something like ~/geos5/EXPID, where EXPID is the Experiment ID you named) and the EXP directory which is in your nobackup space.
Setup the Experiment HOME Directory
Go to the HOME directory. You need to edit a couple of files to get the experiment to run the way you want:
AGCM.rc
This file controls the model run characteristics. For example, for our tag (and answering the setup questions as above) the AGCM.rc file has the following lines characteristic (to run GOCART with interactive aerosol forcing):
GOCART_INTERNAL_RESTART_FILE:    gocart_internal_rst
GOCART_INTERNAL_RESTART_TYPE:    binary
GOCART_INTERNAL_CHECKPOINT_FILE: gocart_internal_checkpoint
GOCART_INTERNAL_CHECKPOINT_TYPE: binary
#AEROCLIM:     ExtData/CMIP/L72/aero_clm/gfedv2.aero.eta.%y4%m2clm.nc
#AEROCLIMDEL:  ExtData/CMIP/L72/aero_clm/gfedv2.del_aero.eta.%y4%m2clm.nc
#AEROCLIMYEAR: 2002
DU_OPTICS: ExtData/CMIP/x/opticsBands_DU.v4.nc
SS_OPTICS: ExtData/CMIP/x/opticsBands_SS.v3.nc
SU_OPTICS: ExtData/CMIP/x/opticsBands_SU.v3.nc
OC_OPTICS: ExtData/CMIP/x/opticsBands_OC.nc
BC_OPTICS: ExtData/CMIP/x/opticsBands_BC.nc
NUM_BANDS: 18
DIURNAL_BIOMASS_BURNING: yes
RATS_PROVIDER: PCHEM    # options: PCHEM, GMICHEM, STRATCHEM (Radiatively active tracers)
AERO_PROVIDER: GOCART   # options: PCHEM, GOCART (Radiatively active aerosols)
ANALYSIS_OX_PROVIDER: PCHEM # options: PCHEM, GMICHEM, STRATCHEM, GOCART
To run with climatological aerosol forcing, modify the appropriate lines above to look like:
AEROCLIM:     ExtData/AeroCom/L72/aero_clm/gfedv2.aero.eta.%y4%m2clm.nc
AEROCLIMDEL:  ExtData/AeroCom/L72/aero_clm/gfedv2.del_aero.eta.%y4%m2clm.nc
AEROCLIMYEAR: 2002
AERO_PROVIDER: PCHEM    # options: PCHEM, GOCART (Radiatively active aerosols)
HISTORY.rc
This file controls the output streams from the model. Mine has the following collections:
COLLECTIONS: 'geosgcm_prog'
             'geosgcm_surf'
             'geosgcm_moist'
             'geosgcm_turb'
             'geosgcm_gwd'
             'geosgcm_tend'
             'geosgcm_bud'
             'tavg2d_aer_x'
             'inst3d_aer_v'
             'tavg3d_aer_p'
             'tavg2d_chm_x'
             'tavg3d_chm_p'
             'inst2d_hwl_x'
             'inst2d_force_x'
             'inst3d_force_p'
             ::
CAP.rc
This file controls the timing of the model run. For example, you specify the END_DATE, the number of days to run per segment (JOB_SGMT) and the number of segments (NUM_SGMT). If you modify nothing else, the model would run until it reached END_DATE or had run for NUM_SGMT segments. Note that you might think that BEG_DATE specifies a beginning date for the model to run. It is set by default to BEG_DATE: 18910301 000000, which is probably before the period you want to simulate. The actual model start time is specified later when we set up your EXPDIR.
gcm_run.j
This is the model run script. You might not need to edit anything, but check in here to see if we are pointing to the right emission files. This is system dependent. For our tag, on discover, there is a line:
setenv CHMDIR /discover/nobackup/projects/gmao/share/dao_ops/fvInput_nc3
which points to the emission files presumed in this run. You might have to modify this line on a different system or to use different emission inventories.
An Optional Modification
For some applications it is desirable to have the model stop running to generate restarts on fixed dates and then start up again. (Each instance of the model running for some number of days and then stopping and generating restarts is called a segment.) For example, suppose we want restarts at the beginning and middle of the month. I've added this functionality, but you need to make a few edits. What works is to ensure your model runs to the middle of the first month (in practice, the 17th of the month), stopping it, and then from that point to the 1st of the next month. Here's how: edit CAP.rc so that
END_DATE: 20070101 210000
JOB_SGMT: 00000017 000000
NUM_SGMT: 12
substituting the appropriate YYYYMM in END_DATE for the overall time you want your experiment to end.
By default the gcm_run.j on this tag has the following environment variables set like this:
set STOP_MID_MONTH = 1
set DFLT_JOB_SGMT  = 16
Here's what is happening: overall model control is handled by CAP.rc. We specify the model finish time with END_DATE in CAP.rc. As set up, JOB_SGMT is the number of days to run the model in a single segment, and NUM_SGMT is the number of segments to run in a single queue submission. With STOP_MID_MONTH set to 1 what happens is that the JOB_SGMT environment variable is tweaked for each half-month of the model run. That is, the first segment runs the initial number of days (in this case, 17) and the next segment runs for however many days are needed to reach the 1st of the next month. (In this example, I'm presuming we're starting from an initial date of 19991231, which requires us to run for 17 days to cause the model to stop on 20000117, which is the desired monthly mid-point.) What's really happening is that CAP.rc is being edited after each segment to specify the (hopefully) correct number of days to run the next segment. Here, NUM_SGMT is set to 12, which will run the model for 6 months in a single queue submission (1/2 month per segment). This seems to fit conservatively for our resolution in a single 8-hour wallclock time request.
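The half-month arithmetic above can be checked with a quick sketch (assuming GNU date; the start date and UTC timezone are illustrative):

```shell
# Verify the segment lengths described above, assuming GNU date is available.
export TZ=UTC                  # avoid daylight-saving edge cases in day counts
START=19991231                 # illustrative initial date from the text
MID=$(date -d "$START + 17 days" +%Y%m%d)
echo $MID                      # first stop: 20000117, the monthly mid-point
NEXT=$(date -d "$(date -d $MID +%Y%m01) + 1 month" +%Y%m%d)
DAYS=$(( ($(date -d $NEXT +%s) - $(date -d $MID +%s)) / 86400 ))
echo $DAYS                     # second segment: 15 days, reaching 20000201
```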
If you don't care about any of this, you can set STOP_MID_MONTH = 0, which bypasses this logic and all segments are exactly as long as JOB_SGMT specifies.
Running on Pleiades
This tag has successfully checked out, built, run, and archived results on NAS machine Pleiades. No modifications to scripts are needed, as the appropriate choices for that machine are handled in gcm_setup.
Setup the EXP directory
Now go to your EXP directory.
You want to specify the model start time at this point. Create a file called "cap_restart". The contents of the file should read:
YYYYMMDD HHMMSS
where you replace the strings above with the year-month-day and hour-minute-second of your desired model start time. For example:
20071231 210000
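For example, creating cap_restart from the shell (a temporary directory stands in for your real EXP directory here):

```shell
# Write the model start time into cap_restart in the EXP directory.
cd $(mktemp -d)                # stand-in for your real EXP directory
echo "20071231 210000" > cap_restart
cat cap_restart
```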
You might want to edit the resource files in the RC sub-directory (e.g., GEOS_ChemGridComp.rc to turn on GOCART and Chem_Registry.rc to turn components on/off).
You also need some restarts to run the model. For Fortuna at "b" resolution you can copy a set (valid nominally at 20071231) from /discover/nobackup/pcolarco/restart0/piesa/Fortuna-2_4/ (on pleiades: /nobackupp10/pcolarco/restart0/piesa/Fortuna-2_4/). You can just copy all the files *.rst into your experiment directory, but what I like to do is make a "restart0" directory in my EXPDIR, copy the files there, and then copy them back up into the EXPDIR.
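The restart0 staging can be sketched like this (all paths here are temporary stand-ins; on discover the source would be the pcolarco restart directory quoted above):

```shell
# Sketch of the restart staging described above, with placeholder paths.
RESTART_SRC=$(mktemp -d)       # stand-in for the shared restart directory
EXPDIR=$(mktemp -d)            # stand-in for your experiment directory
touch $RESTART_SRC/example_internal.rst   # fake restart file for illustration
mkdir -p $EXPDIR/restart0
cp $RESTART_SRC/*.rst $EXPDIR/restart0/   # keep a pristine copy
cp $EXPDIR/restart0/*.rst $EXPDIR/        # working copies the model will use
ls $EXPDIR
```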
Run the Simulation
At this point we're ready to take a crack at running the model. Go into your experiment HOME directory. You can start the model job by issuing:
% qsub gcm_run.j
And check the progress by issuing:
% qstat | grep YOUR_USERNAME