G5NR Data Access Guide: Difference between revisions

From GEOS-5
Pchakrab (talk | contribs)
No edit summary
FTP is not supported anymore
 
(92 intermediate revisions by 2 users not shown)
Line 1: Line 1:
{{rightTOC}}
For questions or comments please send an email to g5nr at lists dot nasa dot gov.


== G5NR Model Config ==
== G5NR background ==


== G5NR File Spec ==
The GEOS-5 Nature Run (Ganymed release) is a 2-year global, non-hydrostatic mesoscale simulation for the period 2005-2006. In addition to standard meteorological parameters (wind, temperature, moisture, surface pressure), this simulation includes 15 aerosol tracers (dust, seasalt, sulfate, black and organic carbon), O3, CO and CO2. This model simulation is driven by prescribed sea-surface temperature and sea-ice, daily volcanic and biomass burning emissions, as well as high-resolution inventories of anthropogenic sources.


The G5NR data files are generated using the NetCDF-4 library [link] which uses HDF-5 [link] as the underlying format. For more details please see [link].
GEOS-5 files are generated with the Network Common Data Form (NetCDF-4) library, which uses Hierarchical Data Format Version 5 (HDF-5) as the underlying format. NetCDF-4 is an open-source product of UCAR/Unidata (https://www.unidata.ucar.edu/software/netcdf/) and HDF-5 is developed by the HDF Group (http://www.hdfgroup.org/). One convenient method of reading GEOS-5 files is to use the netCDF library, but the HDF-5 library can also be used directly.
Each GEOS-5 file contains a '''collection''' of geophysical quantities that we will refer to as "fields" or "variables" as well as a set of coordinate variables that contain information about the grid coordinates. The variables as well as the complete structure of the file can be quickly listed using common utilities like <code>ncdump</code> or <code>h5dump</code>.


== Download data ==
For more details about File Spec, please see [[File:G5NR-Ganymed-7km_FileSpec-ON6-V1.0.pdf]].


==== Retrieve global data from FTP server ====
For model configuration, please see [[File:GMAO-OfficeNote-5-V1-22Oct2014.pdf]].


[NOTE: once data is made public, would we still have a username?]
== Download data files ==


The base url for G5NR data is
==== Global data ====
ftp://G5NR:@ftp.nccs.nasa.gov/c1440_NR/DATA.


At this location, the data is organized by resolution (0.5000_deg/0.0625_deg), type (const/inst/tavg/tdav), collection name, year, month and day as follows:
<!--
 
===== [[Recipe: Retrieve (global) data from FTP server|FTP]] =====
|-- resolution
-->
|  |-- type
|  |  |-- collection
|  |  |  |-- year
|  |  |  |  |-- month
|  |  |  |  |  |-- day
 
[A couple of lines for how many files for inst, tavg, tdav etc.]
 
A web browser can be used to browse directories, read and retrieve files. To retrieve the collection ''inst01hr_3d_T_Cv'' for the day 2006-09-18, one needs to point to:
 
ftp://G5NR:@ftp.nccs.nasa.gov/c1440_NR/DATA/0.5000_deg/inst/inst01hr_3d_T_Cv/Y2006/M09/D18.
 
Alternatively, the command line tool ''wget'' can retrieve the same files:
 
wget ftp://G5NR:@ftp.nccs.nasa.gov/c1440_NR/DATA/0.5000_deg/inst/inst01hr_3d_T_Cv/Y2006/M09/D18/*
 
The * at the end would retrieve all (24) files for the given day.
 
To download a file for a specific time, say 0900z, try
wget ftp://G5NR:@ftp.nccs.nasa.gov/c1440_NR/DATA/0.5000_deg/inst/inst01hr_3d_T_Cv/Y2006/M09/D18/c1440_NR.inst01hr_3d_T_Cv.20060918_0900z.nc4
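
The directory layout and file name above can also be assembled programmatically. The following is a minimal sketch, not an official tool: the function names are hypothetical, and the file-name pattern is inferred from the single example above, so it may not hold for every collection or resolution. Only the path relative to the data root is built, so the caller can prepend whichever server prefix is currently valid.

```python
from datetime import datetime


def g5nr_relative_path(resolution, ftype, collection, when):
    """Build the path of one file relative to the data root.

    Layout assumed from the guide: resolution/type/collection/Yyyyy/Mmm/Ddd/file.
    The file-name pattern is inferred from a single example and is an assumption.
    """
    day_dir = when.strftime("Y%Y/M%m/D%d")
    stamp = when.strftime("%Y%m%d_%H%Mz")
    filename = f"c1440_NR.{collection}.{stamp}.nc4"
    return f"{resolution}/{ftype}/{collection}/{day_dir}/{filename}"


def g5nr_day_paths(resolution, ftype, collection, year, month, day):
    """Return the 24 hourly paths of an hourly collection for one day."""
    return [
        g5nr_relative_path(resolution, ftype, collection,
                           datetime(year, month, day, hour))
        for hour in range(24)
    ]


# Example: the 0900z file used in the wget command above
path = g5nr_relative_path("0.5000_deg", "inst", "inst01hr_3d_T_Cv",
                          datetime(2006, 9, 18, 9, 0))
```

Joining the result to a base URL with a single "/" gives the full address to pass to a download tool.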
 
==== Retrieve data subsets using download tool ====
 
The download tool can be used for server-side subsetting of collections. This can be done either via the tool's web interface or the command line.
 
* Web interface - The base url for the download tool for G5NR data is
 
http://portal.nccs.nasa.gov/cgi-lats4d/opendap.cgi?&path=/OSSE/GEOS-5.12.
 
To download a subset of the collection ''inst01hr_3d_T_Cv'', from the above url, one can follow links [0.5000_deg &rarr; inst &rarr; Download (for the appropriate collection)] to the collection's download page at http://portal.nccs.nasa.gov/cgi-lats4d/webform.cgi?&i=OSSE/GEOS-5.12/0.5000_deg/inst01hr_3d_T_Cv. Here one can select variables, the begin and end times, vertical levels (for 3D data), spatial subset (by drawing a box or specifying the lats/lons) and download the resulting data in a format of choice (compressed NetCDF-4, little endian binary etc.).
 
* Command line/Python script - There has to be a way to construct the POST request and then potentially use the  Python requests module???
 
== OPeNDAP client access ==
 
OPeNDAP is a data server architecture that lets users work with data files stored on remote computers using their favorite analysis and visualization tools. Opening an OPeNDAP file is as easy as replacing the file name in the client software with an OPeNDAP URL. All G5NR collections that are provided by ftp/download-tool are also available on the OPeNDAP server
 
http://opendap.nccs.nasa.gov/dods/OSSE/GEOS-5.12/BETA9.
 
The metadata of our example collection, ''inst01hr_3d_T_Cv'', can be viewed by following the links [0.5000_deg → inst → info (for the appropriate collection)] to the info page http://opendap.nccs.nasa.gov/dods/OSSE/GEOS-5.12/BETA9/0.5000_deg/inst/inst01hr_3d_T_Cv.info.
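
The link sequence above maps directly onto the URL, so the dataset and info addresses can be built from the resolution, type and collection name. This is a small sketch with hypothetical helper names; the base URL is the one quoted above, and it is only an assumption that other OPeNDAP roots for this data use the same resolution/type/collection layout.

```python
# Base URL quoted in this guide; pass a different one via `base` if it has moved.
OPENDAP_BASE = "http://opendap.nccs.nasa.gov/dods/OSSE/GEOS-5.12/BETA9"


def opendap_url(resolution, ftype, collection, base=OPENDAP_BASE):
    """Return the OPeNDAP dataset URL for a time-aggregated collection."""
    return "/".join([base.rstrip("/"), resolution, ftype, collection])


def opendap_info_url(resolution, ftype, collection, base=OPENDAP_BASE):
    """Return the URL of the collection's metadata (.info) page."""
    return opendap_url(resolution, ftype, collection, base) + ".info"


dataset_url = opendap_url("0.5000_deg", "inst", "inst01hr_3d_T_Cv")
info_url = opendap_info_url("0.5000_deg", "inst", "inst01hr_3d_T_Cv")
```

The dataset URL is what a client such as netcdf4-python would be given in place of a local file name.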


For retrieving aggregated data from the OPeNDAP server using your favorite client, see [[#Client access|Client access]] below.
===== [[Recipe: Retrieve (global) data from HTTPS server|HTTPS]] =====


== Client access ==
==== Data subsets ====
===== [[Recipe: Retrieve data subsets using download tool|Download tool]] =====


In the following, we read the field 'T' (air temperature) from collection ''inst01hr_3d_T_Cv'', compute its min/max and if applicable, plot it. We give an example for each of the two cases
== Read downloaded data files ==
==== [[Recipe: Fortran program to read data from downloaded file|Fortran program]] ====
==== [[Recipe: C program to read data from downloaded file|C program]] ====
==== [[Recipe: Python program to read data from downloaded file|Python script]] ====
==== [[Recipe: Matlab program to read data from downloaded file|Matlab script]] ====
==== [[Recipe: IDL program to read data from downloaded file|IDL script]] ====
==== [[Recipe: Visualize downloaded data using Panoply|Panoply]] ====


# a file has been downloaded either via ftp or using the download tool
== OPeNDAP access ==
# using the OPeNDAP server


For each case, we compute min/max for both
OPeNDAP is a data server architecture that lets users work with data files stored on remote computers using their favorite analysis and visualization tools. Opening an OPeNDAP file is as easy as replacing the file name in the client software with an OPeNDAP URL. All G5NR collections that are provided by https/download-tool are also available on the OPeNDAP server
# global temperature
# temperature over North America
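
For the regional case, the latitude/longitude box first has to be converted into index ranges on the grid. The sketch below assumes the regular 0.5-degree grid used in the Python example on this page (longitude starting at -180, latitude starting at -90); the function name and the North America bounds are hypothetical, chosen only for illustration.

```python
import math


def box_slices(lat_min, lat_max, lon_min, lon_max,
               step=0.5, lat0=-90.0, lon0=-180.0):
    """Return (lat_slice, lon_slice) covering all grid points inside the box.

    Assumes a regular grid whose first point is (lat0, lon0) and whose
    spacing is `step` degrees in both directions.
    """
    # round() guards against floating-point noise before ceil/floor
    j0 = math.ceil(round((lat_min - lat0) / step, 9))
    j1 = math.floor(round((lat_max - lat0) / step, 9)) + 1
    i0 = math.ceil(round((lon_min - lon0) / step, 9))
    i1 = math.floor(round((lon_max - lon0) / step, 9)) + 1
    return slice(j0, j1), slice(i0, i1)


# Whole globe on the 361 x 720 grid
global_lat, global_lon = box_slices(-90, 90, -180, 179.5)

# Illustrative (hypothetical) North America box: 15N-75N, 170W-50W
na_lat, na_lon = box_slices(15, 75, -170, -50)
```

With an array ordered (lev, lat, lon), the regional min/max would then be taken over <code>T[:, na_lat, na_lon]</code>.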


==== Programming ====
https://opendap.nccs.nasa.gov/dods/OSSE/G5NR/Ganymed/7km


===== [[Recipe: Fortran program as OPeNDAP client|Fortran client]] =====
===== [[Recipe: C program as OPeNDAP client|C client]] =====
===== [[Recipe: Python program as OPeNDAP client|Python client]] =====
===== [[Recipe: Matlab program as OPeNDAP client|Matlab client]] =====
===== [[Recipe: IDL program as OPeNDAP client|IDL client]] =====
===== [[Recipe: Visualize OPeNDAP data using Panoply|Panoply]] =====
<!--
<!--
===== [[G5NR data access using C|C]] =====
-->


===== Fortran =====
For reading a downloaded file or accessing directly via OPeNDAP using Fortran, please see [[G5NR data access using Fortran|this]] page.
<!--
===== Shmem example =====
-->
==== Free clients ====
In this section we read air temperature, compute its min/max (as with the 'programming' examples) and display the surface air temperature.
===== Python =====
====== netcdf4-python ======
If the netcdf4-python module is available, the following script reads air temperature for the specified time, computes its min and max values and plots it.
<syntaxhighlight lang="python" line>
#!/usr/bin/env python
import numpy as np
import netCDF4 as nc4
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap

# open the time-aggregated collection on the OPeNDAP server
rootgrp = nc4.Dataset('http://opendap.nccs.nasa.gov:9090/dods/OSSE/GEOS-5.12/BETA9/0.5000_deg/inst/inst01hr_3d_T_Cv', 'r')

# read air temperature for the 37th time step (index 36); result is (lev, lat, lon)
print('Reading T for time=37...', end=' ', flush=True)
Ttime37 = rootgrp.variables['t'][36,:,:,:]
print('done.')

# min/max over all levels
print('min(T):', np.min(Ttime37))
print('max(T):', np.max(Ttime37))

# set up cylindrical map
m = Basemap(
    projection='cyl',
    llcrnrlat=-90, urcrnrlat=90,
    llcrnrlon=-180, urcrnrlon=180,
    resolution='c'
    )
m.drawcoastlines(linewidth=0.5)
m.drawmapboundary()

# plot contours at level index 71, the last of the 72 levels
level = 71
X = np.arange(-180.0, 180.0, .5)
Y = np.arange(-90.0, 90.1, .5) # 90 is the last element
cp = plt.contour(X, Y, Ttime37[level,:,:], 20, zorder=2)
plt.clabel(cp, inline=1, fontsize=9)
plt.title('Air temperature at the surface')
plt.show()
</syntaxhighlight>
====== pygrads ======


===== R =====


This example requires the [http://cran.r-project.org/web/packages/ncdf4/index.html ncdf4] and [http://cran.r-project.org/web/packages/rworldmap/index.html rworldmap] packages.
This example requires the [https://cran.r-project.org/web/packages/ncdf4/index.html ncdf4] and [https://cran.r-project.org/web/packages/rworldmap/index.html rworldmap] packages.


<syntaxhighlight lang="rsplus">
<syntaxhighlight lang="rsplus">
Line 151: Line 64:
> jm <- 361
> lm <- 72
> nc <- nc_open("http://opendap.nccs.nasa.gov:9090/dods/OSSE/GEOS-5.12/BETA9/0.5000_deg/inst/inst01hr_3d_T_Cv")
> nc <- nc_open("https://opendap.nccs.nasa.gov:9090/dods/OSSE/GEOS-5.12/BETA9/0.5000_deg/inst/inst01hr_3d_T_Cv")
> t <- ncvar_get(nc,"t",start=c(1,1,1,37),count=c(im,jm,lm,1))
> str(t)
Line 165: Line 78:
===== IDV =====


[http://www.unidata.ucar.edu/software/idv/ IDV] is an OPeNDAP tool that can access and display the nature run data. On our OPeNDAP server, all files are time aggregated, so they appear as a single dataset for each location.
[https://www.unidata.ucar.edu/software/idv/ IDV] is an OPeNDAP tool that can access and display the nature run data. On our OPeNDAP server, all files are time aggregated, so they appear as a single dataset for each location.


This example opens and displays the field 'T' (air temperature) from the collection 'inst01hr_3d_T_Cv'. The OPeNDAP URL for this dataset is http://opendap.nccs.nasa.gov:80/dods/OSSE/GEOS-5.12/BETA9/0.5000_deg/inst/inst01hr_3d_T_Cv. The following steps are valid for IDV version 5.0u1 running on a Linux desktop.
This example opens and displays the field 'T' (air temperature) from the collection 'inst01hr_3d_T_Cv'. The OPeNDAP URL for this dataset is https://opendap.nccs.nasa.gov:80/dods/OSSE/GEOS-5.12/BETA9/0.5000_deg/inst/inst01hr_3d_T_Cv. The following steps are valid for IDV version 5.0u1 running on a Linux desktop.


From the 'Dashboard' panel
Line 175: Line 88:
* Select Field Selector and choose the 3D field 'air_temperature'. The 'Times' tab lists all the available levels and times for this data. At this point, one can select specific times, levels and regions (subsetting) from the 'Times', 'Level' and 'Region' tabs. Click on 'Create Display'.


==== Proprietary clients ====


===== Matlab =====
-->
 
===== IDL =====

Latest revision as of 11:17, 10 April 2019

For questions or comments please send an email to g5nr at lists dot nasa dot gov.

G5NR background

The GEOS-5 Nature Run (Ganymed release) is a 2-year global, non-hydrostatic mesoscale simulation for the period 2005-2006. In addition to standard meteorological parameters (wind, temperature, moisture, surface pressure), this simulation includes 15 aerosol tracers (dust, seasalt, sulfate, black and organic carbon), O3, CO and CO2. This model simulation is driven by prescribed sea-surface temperature and sea-ice, daily volcanic and biomass burning emissions, as well as high-resolution inventories of anthropogenic sources.

GEOS-5 files are generated with the Network Common Data Form (NetCDF-4) library, which uses Hierarchical Data Format Version 5 (HDF-5) as the underlying format. NetCDF-4 is an open-source product of UCAR/Unidata (https://www.unidata.ucar.edu/software/netcdf/) and HDF-5 is developed by the HDF Group (http://www.hdfgroup.org/). One convenient method of reading GEOS-5 files is to use the netCDF library, but the HDF-5 library can also be used directly.

Each GEOS-5 file contains a collection of geophysical quantities that we will refer to as "fields" or "variables" as well as a set of coordinate variables that contain information about the grid coordinates. The variables as well as the complete structure of the file can be quickly listed using common utilities like ncdump or h5dump.
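
The same listing can be produced from Python. The following is a rough sketch, assuming the third-party netcdf4-python module is installed; the function names and the example file name are hypothetical. The summary step is kept separate from the file opening so it works on any open dataset, local or remote.

```python
def summarize_variables(dataset):
    """Return {name: (dimension names, shape)} for every variable.

    `dataset` is any object exposing a `variables` mapping whose values
    have `dimensions` and `shape` attributes, as netCDF4.Dataset does.
    """
    return {
        name: (tuple(var.dimensions), tuple(var.shape))
        for name, var in dataset.variables.items()
    }


def summarize_file(path):
    """Open a NetCDF-4 file (or OPeNDAP URL) and summarize its variables."""
    import netCDF4  # third-party; imported here so the helper above has no dependency

    with netCDF4.Dataset(path, "r") as ds:
        return summarize_variables(ds)


def print_summary(summary):
    """Print one line per variable, similar in spirit to a header dump."""
    for name, (dims, shape) in sorted(summary.items()):
        print(f"{name}: dims={dims} shape={shape}")
```

For a downloaded file this would be called as <code>print_summary(summarize_file("some_file.nc4"))</code>, where the file name stands in for an actual G5NR file.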

For more details about File Spec, please see File:G5NR-Ganymed-7km FileSpec-ON6-V1.0.pdf.

For model configuration, please see File:GMAO-OfficeNote-5-V1-22Oct2014.pdf.

Download data files

Global data

HTTPS

Data subsets

Download tool

Read downloaded data files

Fortran program

C program

Python script

Matlab script

IDL script

Panoply

OPeNDAP access

OPeNDAP is a data server architecture that lets users work with data files stored on remote computers using their favorite analysis and visualization tools. Opening an OPeNDAP file is as easy as replacing the file name in the client software with an OPeNDAP URL. All G5NR collections that are provided by https/download-tool are also available on the OPeNDAP server

https://opendap.nccs.nasa.gov/dods/OSSE/G5NR/Ganymed/7km
Fortran client
C client
Python client
Matlab client
IDL client
Panoply