Using ESMPy on Discover

From GEOS-5
Revision as of 11:50, 28 March 2017 by Mathomp4 (talk | contribs) (Load Modules: Update to snap25)

Load Modules

Base Modules

Only a couple of compiler+MPI combinations have been tested. These examples are based on the modules used in the GEOSagcm-bridge-DEVEL git repo:

comp/intel-17.0.0.098
mpi/impi-17.0.0.098
lib/mkl-17.0.1.132
other/comp/gcc-5.3-sp3
other/SSSO_Ana-PyD/SApd_4.1.1_py2.7_gcc-5.3-sp3

Then point to the Baselibs in:

/discover/swdev/mathomp4/Baselibs/ESMA-Baselibs-5.0.4-beta25/x86_64-unknown-linux-gnu/ifort_17.0.0.098-intelmpi_17.0.0.098

Extra Modules for mpi4py and ESMPy

Next, load mpi4py and ESMPy:

$ module use -a /home/mathomp4/modulefiles
$ module load python/mpi4py/2.0.0/ifort_17.0.0.098-intelmpi_17.0.0.098 python/ESMPy/7.1.0b25/ifort_17.0.0.098-intelmpi_17.0.0.098

This should set up pretty much everything. A simple test to make sure is to run:

$ python -c 'import ESMF; import mpi4py'

If that exits without an error, both modules were found and imported, which is a good sign.
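
The one-line check above stops at the first failing import and shows only a traceback. As a minimal sketch of a slightly more informative check, the script below tries each module separately and reports which one failed. The helper name check_imports is hypothetical (it is not part of ESMPy or mpi4py), and the script relies only on the standard-library importlib module.

```python
from __future__ import print_function  # keeps the script usable on Python 2.7

import importlib


def check_imports(names):
    """Try to import each module name.

    Returns a dict mapping name -> None on success, or a short error
    string on failure. check_imports is a hypothetical helper.
    """
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:  # ImportError, or a shared-library load error
            results[name] = "{0}: {1}".format(type(exc).__name__, exc)
    return results


if __name__ == "__main__":
    for name, error in sorted(check_imports(["ESMF", "mpi4py"]).items()):
        status = "OK" if error is None else "FAILED (" + error + ")"
        print("{0:8s} {1}".format(name, status))
```

If either line reports FAILED, the most likely cause is that one of the modules listed above was not loaded in the current shell.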

Running Examples

Copy Examples

To run the examples that ESMF provides, copy them into your current directory:

$ cp -r /discover/swdev/mathomp4/Baselibs/TmpBaselibs/GMAO-Baselibs-5_0_2-ESMF-7_1_0_beta_snapshot_24/src/esmf/src/addon/ESMPy/examples .

Download Test NC4 Files

On a head node, run:

$ python examples/run_examples_dryrun.py

This command downloads the data files used by the examples. Because it needs internet access, it does not work on a compute node.

Run Examples

On a compute node, run "Hello World":

$ mpirun -np 6 python examples/hello_world.py
srun.slurm: cluster configuration lacks support for cpu binding
Hello ESMPy World from PET (processor) 4!
Hello ESMPy World from PET (processor) 2!
Hello ESMPy World from PET (processor) 3!
Hello ESMPy World from PET (processor) 0!
Hello ESMPy World from PET (processor) 1!
Hello ESMPy World from PET (processor) 5!
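
The PETs print in no particular order, so it is easy to miss one that never reported. The sketch below checks captured output for a greeting from every PET. The function name missing_pets is hypothetical, and the pattern assumes the greeting has exactly the format shown in the output above.

```python
from __future__ import print_function  # keeps the script usable on Python 2.7

import re

# Matches the greeting line printed by examples/hello_world.py, as shown above.
PET_LINE = re.compile(r"Hello ESMPy World from PET \(processor\) (\d+)!")


def missing_pets(output, expected_count):
    """Return the sorted list of PET numbers in 0..expected_count-1
    that did NOT print a greeting in the captured output.

    missing_pets is a hypothetical helper, not part of ESMPy.
    """
    seen = set(int(match.group(1)) for match in PET_LINE.finditer(output))
    return sorted(set(range(expected_count)) - seen)
```

For the run above, calling missing_pets on the captured text with expected_count=6 gives an empty list, meaning all six PETs reported. Lines that do not match the greeting, such as the srun.slurm warning, are ignored.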

To run all of the examples:

$ python examples/run_examples.py --parallel

To run one individually:

$ mpirun -np 6 python examples/ungridded_dimension_regrid.py
srun.slurm: cluster configuration lacks support for cpu binding
ESMPy Ungridded Field Dimensions Example
  interpolation mean relative error = 0.000768815903364
  mass conservation relative error  = 1.49257625157e-16
$ mpirun -np 6 python examples/grid_mesh_regrid.py
srun.slurm: cluster configuration lacks support for cpu binding
ESMPy Grid Mesh Regridding Example
  interpolation mean relative error = 0.00235869859211
$ mpirun -np 6 python ./examples/cubed_sphere_to_mesh_regrid.py
srun.slurm: cluster configuration lacks support for cpu binding
ESMPy cubed sphere Grid Mesh Regridding Example
  interpolation mean relative error = 0.00302911738799
  interpolation max relative (pointwise) error = 0.0101182527126
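
When running several examples one at a time, it can help to collect the reported errors programmatically rather than reading them off the terminal. The following is a minimal sketch under stated assumptions: the function names mpirun_command, parse_errors, and run_example are hypothetical, the command line mirrors the mpirun invocations above, and the parser assumes each result line has the form "label error = value" as in the output shown.

```python
from __future__ import print_function  # keeps the script usable on Python 2.7

import re
import subprocess

# A result line looks like "  interpolation mean relative error = 0.00235869859211".
ERROR_LINE = re.compile(
    r"^[ \t]*(.+? error)[ \t]*=[ \t]*([0-9.eE+-]+)[ \t]*$", re.MULTILINE
)


def mpirun_command(script, nprocs=6):
    """Build the same command line used above (hypothetical helper)."""
    return ["mpirun", "-np", str(nprocs), "python", script]


def parse_errors(output):
    """Map each reported error label to its value as a float."""
    return {label.strip(): float(value) for label, value in ERROR_LINE.findall(output)}


def run_example(script, nprocs=6):
    """Run one example under mpirun and return its parsed error values.

    Must be called on a compute node with the modules above loaded.
    """
    raw = subprocess.check_output(mpirun_command(script, nprocs))
    return parse_errors(raw.decode("utf-8", "replace"))
```

For instance, run_example("examples/grid_mesh_regrid.py") would return a dict with the key "interpolation mean relative error", which could then be compared against whatever tolerance you consider acceptable. No tolerance is suggested here because the examples themselves do not state one in the output above.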

Launching MPI

The ESMPy API documentation describes a couple of ways to launch MPI. So far, only the mpirun method works. I have not been able to figure out how to run the mpi_spawn_regrid.py example. If anyone knows how, please inform me and edit this section, but the answer might just be "You can't on a cluster" or "You can't with Intel MPI".