Run GEOS-5 on NCCS Haswells
Latest revision as of 11:25, 9 March 2015
NOTE: This page is fluid. As various code bases and compilers are updated, this information could change.
The introduction of the Haswells at NCCS has led to some interesting times with GEOS-5. This page will help users transition to this hardware as the code bases adjust to the new paradigm.
Compiling
The first issue with the Haswells is that, as far as the AGCM is concerned, Intel 13 just does not seem to work. It builds, but when it tries to run on the Haswells (or, rather, SLES 11 SP3), it dies immediately. So, to run the model, one needs to update to Intel 15. Note that for the AGCM, Intel 15 seems to be zero-diff, but this is not guaranteed. The DAS has not been fully tested yet, but it is expected to be non-zero-diff due to different compiler optimizations.
AGCM
To prepare the AGCM for building with Intel 15, there is a set of files that must be updated to a new tag.
Ganymed-4_x
For tags Ganymed-4_0 and Ganymed-4_1, you should update three files (under GEOSagcm/src):
g5_modules
Config/ESMA_arch.mk
GMAO_Shared/GFDL_fms/FMS_arch.mk
to the tag:
mat-Intel15-forG41
The CVS command to run would be:
cvs upd -r mat-Intel15-forG41 g5_modules Config/ESMA_arch.mk GMAO_Shared/GFDL_fms/FMS_arch.mk
Heracles-1_0
For Heracles-1_0 AGCM tags, the file (relative to GEOSagcm/src):
g5_modules
should be updated to the tag:
mat-Intel15-forH10U-2015Feb06
The CVS command to run would be:
cvs upd -r mat-Intel15-forH10U-2015Feb06 g5_modules
DAS
For the DAS (GEOSadas-5_13_1_UNSTABLE or EnADAS-5_13_6 era), you would update the files (relative to GEOSadas/src):
g5_modules
Config/ESMA_arch.mk
GMAO_Shared/GFDL_fms/FMS_arch.mk
Applications/GEOSgcm_App/GNUmakefile
Applications/GSI_App/GNUmakefile
Applications/NCEP_Paqc/oiqc/GNUmakefile
NCEP_Shared/NCEP_crtm/CRTM_MW_Water_SfcOptics.f90
NCEP_Shared/NCEP_crtm/CRTM_SfcOptics.f90
to the tag:
mat-Intel15-forDAS5D1U-2015Feb06
The CVS command to run would be:
cvs upd -r mat-Intel15-forDAS5D1U-2015Feb06 g5_modules Config/ESMA_arch.mk GMAO_Shared/GFDL_fms/FMS_arch.mk \
    Applications/GEOSgcm_App/GNUmakefile Applications/GSI_App/GNUmakefile Applications/NCEP_Paqc/oiqc/GNUmakefile \
    NCEP_Shared/NCEP_crtm/CRTM_MW_Water_SfcOptics.f90 NCEP_Shared/NCEP_crtm/CRTM_SfcOptics.f90
Make System Files
It is anticipated that the make system files above (*.mk and GNUmakefile) may soon be patched in the tags themselves. If so, they can be removed from this list.
NCEP_crtm
The two NCEP_crtm files above, CRTM_MW_Water_SfcOptics.f90 and CRTM_SfcOptics.f90, are the only true bug fixes or workarounds needed for Intel 15, specifically Intel 15.0.0.090. This bug is thought to be fixed in the later versions of Intel 15 recently installed on Discover. If these bugs are
Running
At present, in all versions of the model, run scripts must be changed to fully utilize all 28 cores on the new Haswell nodes. This is mainly due to assumptions in MAPL that are being fixed in a future version. Simply put, the safest way to run with MAPL right now is to have every node used in a GEOS-5 run carry the same number of cores. That is, MAPL does not accept a job where, say, Node 1 has 28 cores, Node 2 has 27 cores, Node 3 has 28 cores, and so on. Such a placement is possible with SLURM.
There are two ways around this. First, you can idle cores on each node and run, say, 24 cores per node with:
#SBATCH --ntasks-per-node=24
However, if you want to use all the cores, you'll need to change the layout.
AGCM
Edits to AGCM.rc
For the AGCM, you will want to edit the NX: and NY: entries such that they will fill the nodes and respect the cubed-sphere dynamics requirement that NY be divisible by 6. Some sample geometries that respect this are:
CS Resolution | Lat-Lon Equiv | NX | NY | Ntasks |
---|---|---|---|---|
c48 | 2° | 7 | 12 | 84 |
c90 | 1° | 7 | 24 | 168 |
c180 | ½° | 14 | 24 | 336 |
c360 | ¼° | 14 | 48 | 672 |
Note that in future unstable tags (Heracles-1_0_UNSTABLE, say), changes will soon be in place in MAPL that allow much more flexible core placement. Once these changes are in, any geometry that respects the cubed-sphere restriction should work.
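The two constraints on a layout (NY divisible by 6, and NX*NY filling whole nodes) can be checked before editing AGCM.rc. The following is a minimal bash sketch, not part of GEOS-5: the function name check_layout is hypothetical, and the default of 28 cores per node is taken from the Haswell core count mentioned above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of GEOS-5): test an NX x NY layout against
# the cubed-sphere and full-node constraints.
# Usage: check_layout NX NY [CORES_PER_NODE]   (default: 28)
check_layout() {
    local nx=$1 ny=$2 cores_per_node=${3:-28}
    local ntasks=$(( nx * ny ))

    # Cubed-sphere dynamics: NY must be divisible by 6.
    if (( ny % 6 != 0 )); then
        echo "NY=$ny is not divisible by 6" >&2
        return 1
    fi

    # Equal cores on every node: the task count must be a whole
    # multiple of the cores used per node.
    if (( ntasks % cores_per_node != 0 )); then
        echo "NX*NY=$ntasks does not fill ${cores_per_node}-core nodes" >&2
        return 1
    fi

    echo "ntasks=$ntasks nodes=$(( ntasks / cores_per_node ))"
}

check_layout 7 12     # c48 row of the table: ntasks=84 nodes=3
check_layout 14 48    # c360 row of the table: ntasks=672 nodes=24
```

Passing 24 as the third argument checks a layout for the idled-core case (--ntasks-per-node=24) instead.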
Edits to gcm_run.j
To use the Haswells, you'll need to change a few of the #SBATCH flags at the top of the script. First, make sure that the number of tasks you are requesting matches whatever you've used in AGCM.rc. For example, if you are running 7x12, be sure to use:
#SBATCH --ntasks=84
Now, to submit to the Haswells, you need to add:
#SBATCH --constraint=hasw
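The requirement that --ntasks in gcm_run.j match NX*NY in AGCM.rc can be verified with a short script before submitting. This is a rough bash sketch under stated assumptions: the function name check_ntasks is hypothetical, AGCM.rc is assumed to hold lines of the form "NX: 7" and "NY: 12", and gcm_run.j is assumed to hold a line of the form "#SBATCH --ntasks=84". Adjust the patterns if your files differ.

```shell
#!/usr/bin/env bash
# Hypothetical consistency check (not part of GEOS-5).
# Assumes "NX: <n>" and "NY: <n>" lines in the rc file and a
# "#SBATCH --ntasks=<n>" line in the run script.
# Usage: check_ntasks AGCM.rc gcm_run.j
check_ntasks() {
    local rc=$1 script=$2
    local nx ny requested

    # First NX: and NY: entries in the rc file.
    nx=$(awk '$1 == "NX:" { print $2; exit }' "$rc")
    ny=$(awk '$1 == "NY:" { print $2; exit }' "$rc")

    # First --ntasks=<n> directive; --ntasks-per-node does not match
    # because the pattern requires "=" directly after "--ntasks".
    requested=$(sed -n 's/^#SBATCH[[:space:]]*--ntasks=\([0-9]*\).*/\1/p' "$script" | head -n 1)

    if [[ -z $nx || -z $ny || -z $requested ]]; then
        echo "could not read NX, NY, or --ntasks" >&2
        return 2
    fi

    if (( nx * ny == requested )); then
        echo "OK: ${nx}x${ny} = $requested tasks"
    else
        echo "MISMATCH: ${nx}x${ny} = $(( nx * ny )), but --ntasks=$requested" >&2
        return 1
    fi
}
```

Running check_ntasks AGCM.rc gcm_run.j in the experiment directory reports either a match or the two conflicting values.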