BUFR files

From GEOS-5
Msienkie (talk | contribs)

Latest revision as of 14:15, 23 July 2014

The GEOS-5 software uses NCEP's bufrlib software (http://www.nco.ncep.noaa.gov/sib/decoders/BUFRLIB/). Because the BUFR library handles 'endian' conversion internally, it needs to read the BUFR files in native mode (without endian conversion). Versions of the BUFR library prior to 10.2.0 use Fortran routines to carry out I/O, so they require the BUFR files to have Fortran file markers in native format. The most recent versions of the BUFR library (implemented in DAS versions after GEOSadas-5_12) use C routines for I/O and ignore any file markers that are present.

Determining file marker type

You can tell what kind of file markers are being used in a BUFR file by using the Unix command 'od -c', and looking at the first line of output.

BUFR files without file markers begin with the text "BUFR":

 % od -c gdas1.111202.t12z.prepbufr | head -2
0000000   B   U   F   R  \0   & 310 003  \0  \0 022  \0 003  \a  \0  \0
0000020  \v 001  \r 001  \0  \0  \0  \0  \0  \0  \0  \0   &  \0  \0 001 

If a BUFR file has file markers, they show up as the four characters (bytes) before the "BUFR". The following example shows "big-endian" file markers: the high-order (zero) bytes come at the beginning of the file marker and the low-order bytes come at the end:

 % od -c gdas1.111202.t12z.1bamua.tm00.bufr_d | head -2
0000000  \0  \0   & 270   B   U   F   R  \0   & 260 003  \0  \0 022  \0
0000020 003  \a  \0  \0  \v 001  \r 001  \0  \0  \0  \0  \0  \0  \0  \0 

The second example shows "little-endian" file markers: the byte order is reversed relative to the previous example, so the low-order bytes come first and the zero bytes come last:

 % od -c d5_merra_jan98.prepbufr.20111117.t00z.blk | head -2
0000000 370   &  \0  \0   B   U   F   R  \0   & 364 003  \0  \0 022  \0
0000020 003  \a  \0  \0  \v 001  \f 001  \0  \0  \0  \0  \0  \0  \0  \0 
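
The same check can be scripted. The following is a minimal Python sketch, not part of the DAS or of bufrlib; the function names are hypothetical. It assumes the file marker is a 4-byte record length, and it guesses the byte order with a heuristic: a record length is small compared to 2^32, so the byte order that yields the smaller integer is taken to be the one in use.

```python
def bufr_marker_type(header: bytes) -> str:
    """Classify the leading bytes of a BUFR file (at least 8 bytes).

    Returns "none", "big-endian", "little-endian" or "unknown".
    """
    if header[:4] == b"BUFR":
        # The file starts directly with the BUFR message: no file markers.
        return "none"
    if header[4:8] != b"BUFR":
        # No "BUFR" where a marked or unmarked file would have it.
        return "unknown"
    marker = header[:4]
    as_big = int.from_bytes(marker, "big")
    as_little = int.from_bytes(marker, "little")
    if as_big == as_little:
        # Symmetric byte pattern: the heuristic cannot decide.
        return "unknown"
    # Heuristic (an assumption): the plausible record length is the smaller one.
    return "big-endian" if as_big < as_little else "little-endian"


def bufr_file_marker_type(path: str) -> str:
    """Read the first 8 bytes of the file at `path` and classify them."""
    with open(path, "rb") as f:
        return bufr_marker_type(f.read(8))
```

For instance, calling `bufr_file_marker_type` on each file that a section of 'obsys.rc' refers to is one way to confirm that they all use the same kind of file markers.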


'obsys.rc' convention for BUFR file marker type

The marker type is indicated by the suffix, '.blk' or '.ublk', at the end of the file name given in the header of each section of your 'obsys.rc' file.

BEGIN ncep_prep_bufr => gdas1.%y4%m2%d2.t%h2z.prepbufr.ublk

The '.ublk' suffix denotes that these files have no f77 file markers; the assimilation will automatically run 'block' to add little-endian file markers to these files.

BEGIN gmao_airs_bufr => gmaoairs.%y4%m2%d2.t%h2z.bufr.blk

The '.blk' suffix denotes big-endian f77 file markers; the assimilation will automatically run 'reblock' to convert the file markers from big-endian to little-endian (or vice versa, if the markers are already little-endian, so be careful with the file name endings!).
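
The suffix convention can be summarized in a few lines of Python. This is only an illustrative sketch: the function name `marker_action` is hypothetical, and it handles only the 'BEGIN name => template' header form shown above, not the rest of the 'obsys.rc' syntax.

```python
def marker_action(begin_line: str):
    """Return (class_name, template, action) for an 'obsys.rc' BEGIN line.

    action is "block" for '.ublk', "reblock" for '.blk', and None when the
    template carries neither suffix.
    """
    head, arrow, template = begin_line.partition("=>")
    words = head.split()
    if not arrow or len(words) != 2 or words[0] != "BEGIN":
        raise ValueError(f"not a BEGIN header line: {begin_line!r}")
    template = template.strip()
    if template.endswith(".ublk"):
        action = "block"      # no file markers: add little-endian markers
    elif template.endswith(".blk"):
        action = "reblock"    # big-endian markers: swap the marker byte order
    else:
        action = None
    return words[1], template, action
```

Applied to the two header lines above, the sketch selects 'block' for the 'ncep_prep_bufr' class and 'reblock' for the 'gmao_airs_bufr' class.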

Note that all of the files listed under a section in the 'obsys.rc' need to have the same kind of file markers. If you are combining files from different sources, you may need to 'unblock', 'reblock' or 'block' the files to make them all consistent.
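
To make the terms concrete, here is a small Python sketch of what adding, removing and converting file markers amounts to. It is not the DAS 'block', 'unblock' or 'reblock' code; the function names are hypothetical. It assumes the usual Fortran sequential unformatted layout, in which each record is preceded and followed by a 4-byte record length, and it treats the record contents as opaque bytes.

```python
import struct


def _fmt(byteorder: str) -> str:
    """struct format for a 4-byte unsigned marker in the given byte order."""
    if byteorder not in ("little", "big"):
        raise ValueError("byteorder must be 'little' or 'big'")
    return "<I" if byteorder == "little" else ">I"


def add_markers(records, byteorder="little") -> bytes:
    """Wrap each record in leading and trailing length markers ('block')."""
    fmt = _fmt(byteorder)
    out = bytearray()
    for rec in records:
        marker = struct.pack(fmt, len(rec))
        out += marker + rec + marker
    return bytes(out)


def strip_markers(data: bytes, byteorder="little") -> list:
    """Split marked data back into its records ('unblock')."""
    fmt = _fmt(byteorder)
    records = []
    pos = 0
    while pos < len(data):
        if pos + 4 > len(data):
            raise ValueError("truncated leading marker")
        (length,) = struct.unpack_from(fmt, data, pos)
        start, end = pos + 4, pos + 4 + length
        if end + 4 > len(data):
            raise ValueError("truncated record")
        (trailer,) = struct.unpack_from(fmt, data, end)
        if trailer != length:
            raise ValueError("leading and trailing markers disagree")
        records.append(data[start:end])
        pos = end + 4
    return records


def swap_markers(data: bytes, source_order="big") -> bytes:
    """Rewrite the markers in the opposite byte order ('reblock')."""
    target = "little" if source_order == "big" else "big"
    return add_markers(strip_markers(data, source_order), target)
```

Reading with the wrong byte order typically fails the length checks, which is one reason the file name suffix has to match the actual contents of the file.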

Beginning on 31 July 2013, all the BUFR files that we receive from NCEP have been unblocked - written without any file markers. Prior to that date we received a mixture of unblocked files and files with big-endian file markers. The version of the 'block' routine implemented in the DAS in 2009 uses the C routines to read the BUFR files so it can read files with any file markers and write out BUFR files with the little-endian file markers. Thus it is possible to use data both prior to 31 July 2013 and after that date by just using '.ublk' in the specification and having 'block' process all the files to write the little-endian file markers.