
Obs catalogue.

Almost everyone working in the weather and climate sciences, at one time or another, will need to make use of a gridded observational data product. Such datasets are typically created from raw observational data (in the form of direct instrumental observations or satellite estimates) that are then processed (e.g. merged, homogenised and interpolated) and presented on a uniform temporal and spatial grid, for ease of use by the research community. In some cases these raw data, either before or after the initial processing stage, have been reanalysed by a state-of-the-art data assimilation and modelling system, in order to produce a temporally and spatially consistent multi-variable dataset (known as a ‘reanalysis’ dataset).

I’ve never been able to find a simple, concise and up-to-date summary of the latest, widely used, global and spatially complete observational datasets, so I’ve tried to provide one here (see this post for the rationale behind creating this catalogue). There are of course many other regional, spatially incomplete, superseded and less widely used datasets out there, but those are beyond the scope of a simple summary like this. If you’d like more detailed information on any of the datasets, I’d highly recommend the Climate Data Guide. Comments and suggested updates/improvements to the catalogue are welcome!


| Dataset | Name or institution | Reference | Notes |
|---|---|---|---|
| Surface air temperature | | | |
| GISTEMP | National Aeronautics and Space Administration (NASA) Goddard Institute for Space Studies, Surface Temperature Analysis | Hansen et al. (2010) | 1950 – |
| HadCRUT4 | Met Office Hadley Centre and Climatic Research Unit | Morice et al. (2012) | 1850 – |
| NCDC | National Oceanic and Atmospheric Administration (NOAA), National Climatic Data Centre Merged Land-Ocean surface temperature analysis | Smith et al. (2008) | 1880 – |
| Tropospheric and lower stratospheric temperature | | | |
| RSS | Remote Sensing Systems | Mears & Wentz (2009) | 1979 –; 4 layers |
| UAH | University of Alabama in Huntsville | Christy et al. (2000) | 1979 –; 4 layers |
| Sea surface temperature | | | |
| ERSST.v4 | Extended Reconstructed Sea Surface Temperature analysis, version 4 | Huang et al. (2014) | 1854 – |
| OISST.v2 | Optimum Interpolation Sea Surface Temperature, version 2 | Reynolds et al. (2002) | 1981 – |
| HadISST1 | Hadley Centre Sea Ice and Sea Surface Temperature data set, version 1.1 | Rayner et al. (2003) | 1870 – |
| COBE | Centennial in-situ Observation Based Estimates | Ishii et al. (2005) | 1891 – |
| Precipitation | | | |
| CMAP | Climate Prediction Centre Merged Analysis of Precipitation | Xie & Arkin (1997) | 1979 – |
| GPCPv2.2 | Global Precipitation Climatology Project, version 2.2 | Adler et al. (2003) | 1979 – |
| GPCPv1.1 | Global Precipitation Climatology Project, version 1.1 | | 1996 – |
| TMPA | Tropical Rainfall Measuring Mission (TRMM) Multi-Satellite Precipitation Analysis, product 3B43 | Huffman et al. (2007) | 1998 – |
| Reanalysis | | | |
| JRA-55 | Japanese 55-year reanalysis | Kobayashi et al. (2015) | 3rd gen; 1958 – |
| MERRA | Modern Era Retrospective-analysis for Research and Applications | Rienecker et al. (2011) | 3rd gen; 1979 – |
| ERA-Interim | European Centre for Medium-Range Weather Forecasts (ECMWF) | Dee et al. (2011) | 3rd gen; 1979 – |
| CFSR | Climate Forecast System Reanalysis | Saha et al. (2010) | 3rd gen |
| ERA-40 | European Centre for Medium-Range Weather Forecasts (ECMWF) | Uppala et al. (2005) | 2nd gen |
| NCEP/NCAR R-1 | National Centers for Environmental Prediction / National Center for Atmospheric Research, reanalysis-1 | Kistler et al. (2001) | 1st gen; 1948 – |
| ERA-20C | European Centre for Medium-Range Weather Forecasts (ECMWF) | Poli et al. (2016) | 1900-2010 |
| Twentieth Century Reanalysis, V2 | National Oceanic and Atmospheric Administration (NOAA) | Compo et al. (2011) | 1871-2010 |

Surface air temperature datasets

There are three major gridded station based global temperature datasets, which are produced by the following organisations: National Aeronautics and Space Administration Goddard Institute for Space Studies (NASA GISS), National Oceanic and Atmospheric Administration National Climatic Data Centre (NOAA NCDC), Met Office Hadley Centre and Climatic Research Unit (HadCRUT).

While all three datasets are purely station based, the methodology used by each to determine which stations are appropriate for inclusion and to ‘fill the gaps’ between stations differs. For instance:

  • HadCRUT uses temperature data from ~4,400 measuring stations, while NCDC and GISS use more than 7,200 (i.e. HadCRUT samples a limited subset of the entire database, preferring stations with longer records)
  • GISS fills in the gaps in data sparse areas with data from the nearest land stations, while HadCRUT leaves out data sparse areas (e.g. the Arctic Ocean) altogether.

All three global station based datasets are presented as anomalies relative to some base period. The reason to work with anomalies, rather than absolute temperature, is that absolute temperature varies markedly over short distances, while monthly or annual temperature anomalies are representative of a much larger region. Indeed, it has been shown that temperature anomalies are strongly correlated out to distances of the order of 1000 km. Unfortunately, the three datasets each use a different base period: 1951-80 (GISS), 1961-90 (HadCRUT), entire 20th Century (NCDC).
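Because the base periods differ, anomalies from the three datasets need to be put on a common base period before they can be compared. The following is a minimal sketch of that step, assuming annual-mean values held in NumPy arrays. The function names and the synthetic warming series are made up for illustration and are not taken from any of the dataset providers.

```python
import numpy as np


def anomalies(years, values, base_start, base_end):
    """Subtract the mean over the inclusive base period from every value."""
    years = np.asarray(years)
    values = np.asarray(values, dtype=float)
    in_base = (years >= base_start) & (years <= base_end)
    if not in_base.any():
        raise ValueError("base period does not overlap the record")
    return values - values[in_base].mean()


def rebaseline(years, anoms, base_start, base_end):
    """Re-express existing anomalies relative to a new base period.

    This is the same operation as above: subtracting the mean of the
    anomalies over the new base period shifts the whole series.
    """
    return anomalies(years, anoms, base_start, base_end)


# Synthetic (not observed) annual series: steady warming of 0.01 degrees per year.
years = np.arange(1901, 2001)
temps = 14.0 + 0.01 * (years - 1901)

giss_style = anomalies(years, temps, 1951, 1980)     # 1951-80 base period
hadcrut_style = anomalies(years, temps, 1961, 1990)  # 1961-90 base period

# The two anomaly series differ only by a constant offset.
offset = giss_style - hadcrut_style
converted = rebaseline(years, giss_style, 1961, 1990)
```

For monthly data the same idea applies, but the base-period mean would normally be calculated separately for each calendar month so that the seasonal cycle is removed as well.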
Tropospheric and lower stratospheric temperature datasets

For more information, see the Satellite Temperature Measurements Wikipedia page.

There are two major gridded satellite based global temperature datasets, which are produced by the following organisations: University of Alabama in Huntsville (UAH) and Remote Sensing Systems (RSS).

It is important to note that satellites do not measure temperature directly. They measure radiances in various wavelength bands, which must then be mathematically inverted to obtain indirect inferences of temperature. The resulting temperature profiles depend on details of the methods that are used to obtain temperatures from radiances. As a result, different groups that have analysed the satellite data have produced differing temperature data sets.

The satellite temperature series produced by UAH and RSS are also not fully homogeneous – they are constructed from a series of satellites with similar but not identical instrumentation. The sensors deteriorate over time, and corrections are necessary for orbital drift and decay. Particularly large differences between reconstructed temperature series occur at the few times when there is little temporal overlap between successive satellites, making calibration difficult.
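To make the overlap problem concrete, here is a rough sketch of one simple way to splice two satellite records: estimate a constant offset from the months both satellites observed, then shift the later record onto the earlier one. This is an illustrative assumption, not the procedure used by UAH or RSS (which also correct for orbital drift and sensor changes). The function name, the `min_overlap` threshold and the synthetic data are all hypothetical.

```python
import numpy as np


def merge_with_overlap(t_a, x_a, t_b, x_b, min_overlap=12):
    """Shift record B onto record A using their mean difference over shared times.

    Assumes each time array contains unique, sorted time indices (e.g. month counts).
    Returns the merged times, merged values, the estimated offset and its
    standard error.
    """
    t_a, t_b = np.asarray(t_a), np.asarray(t_b)
    x_a, x_b = np.asarray(x_a, dtype=float), np.asarray(x_b, dtype=float)

    # Find the time steps that both satellites observed.
    common, ia, ib = np.intersect1d(t_a, t_b, return_indices=True)
    if common.size < min_overlap:
        raise ValueError("overlap too short to estimate an offset")

    diff = x_a[ia] - x_b[ib]
    offset = diff.mean()
    stderr = diff.std(ddof=1) / np.sqrt(common.size)

    # Keep all of A, then append the adjusted part of B that follows it.
    later = t_b > t_a.max()
    t = np.concatenate([t_a, t_b[later]])
    x = np.concatenate([x_a, x_b[later] + offset])
    return t, x, offset, stderr


# Synthetic example: satellite B reads 0.3 degrees colder than satellite A.
months = np.arange(120)
truth = 0.002 * months
t_a, x_a = months[:60], truth[:60]
t_b, x_b = months[48:], truth[48:] - 0.3

t, merged, offset, stderr = merge_with_overlap(t_a, x_a, t_b, x_b)
```

With real, noisy data the standard error of the offset grows as the overlap shrinks (roughly as one over the square root of the number of shared time steps, if the errors are independent), which is one way to see why short overlaps make calibration difficult.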
Sea surface temperature datasets

There are four globally complete sea surface temperature datasets that are commonly used: ERSST.v3b, OISST.v2, HadISST and COBE. While all four are produced by combining ship and buoy observations with satellite estimates, subtle differences exist with respect to the precise source of the observations included and the methods used to blend the data and ‘fill in’ the temporal and spatial gaps. It should also be noted that ERSST.v3b and OISST.v2 provide error estimates with their data.

The details of a sea surface temperature dataset intercomparison project that is currently underway can be found here.
Precipitation datasets

There are two major multi-decadal gridded global rainfall datasets: Global Precipitation Climatology Project (GPCP) and Climate Prediction Centre Merged Analysis of Precipitation (CMAP). Both datasets blend satellite and gauge estimates of rainfall and extend back to 1979; however, the precise source of the gauge and satellite data differs slightly between the two datasets, as does the blending methodology (Gruber et al. 2000). The main differences relate to the fact that CMAP uses numerical weather prediction data to ‘fill the gaps’ in data sparse regions and makes use of atoll gauge measurements over the ocean, while GPCP does not.

It should be noted that CMAP and GPCP are generally considered to provide a more accurate representation of global rainfall than current reanalysis products (e.g. Beranger et al. 2006; Bosilovich et al. 2008). There is also some suggestion that GPCP provides a more credible precipitation climatology over tropical ocean regions (Yin et al. 2004).

In addition to these two relatively long term but coarse resolution datasets, two shorter term (i.e. mid-1990s onwards) high resolution gridded datasets have been developed: GPCP version 1.1 and TRMM Multi-Satellite Precipitation Analysis (TMPA). Like GPCP, the TMPA dataset combines satellite estimates (from TRMM and other satellites) and rain gauge data.
Reanalysis datasets

For more information, see the Climate Data Guide reanalysis overview. The Web-based Reanalyses Intercomparison Tools (WRIT) are also a useful resource.

Reanalysis datasets are created using an unchanging (“frozen”) data assimilation scheme and model(s) which ingest all available observations every 6-12 hours over the period being analyzed. This unchanging framework provides a dynamically consistent estimate of the climate state at each time step. The one component of this framework which does vary is the sources of the raw input data. This is unavoidable due to the ever changing observational network which includes, but is not limited to, radiosonde, satellite, buoy, aircraft and ship reports. Currently, approximately 7-9 million observations are ingested at each time step. Over the duration of each reanalysis product, the changing observation mix can produce artificial variability and spurious trends. Still, the various reanalysis products have proven to be quite useful when used with appropriate care.
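The effect of a changing observation mix can be illustrated with a toy example. The sketch below assumes a climate with no real trend and a hypothetical constant bias of -0.3 that disappears in 1979, when a new observation type is imagined to enter the assimilation. The breakpoint year, the bias size and the function name are all invented for illustration and do not describe any particular reanalysis.

```python
import numpy as np


def linear_trend_per_decade(years, values):
    """Least-squares linear trend, expressed per decade."""
    years = np.asarray(years, dtype=float)
    # Centre the time axis to keep the fit well conditioned.
    slope, _intercept = np.polyfit(years - years.mean(), values, 1)
    return 10.0 * slope


years = np.arange(1958, 2018)

# Hypothetical "true" climate signal with no trend at all.
true_signal = np.zeros(years.size)

# Hypothetical bias that vanishes once the new observations arrive in 1979.
step_bias = np.where(years >= 1979, 0.0, -0.3)

analysed = true_signal + step_bias

true_trend = linear_trend_per_decade(years, true_signal)
spurious_trend = linear_trend_per_decade(years, analysed)
```

Even though the underlying signal is flat, the fitted trend in `analysed` is positive, purely because of the step change in bias. This is the kind of artefact to watch for when calculating long-term trends from reanalysis data.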

Reanalysis products are generally referred to as an ‘n-th generation reanalysis,’ based on the incremental advancement of reanalysis techniques:

  • Third generation: JRA-55, CFSR, ERA-Interim, MERRA
  • Second generation: ERA-40, NCEP/DOE R-2, JRA-25
  • First generation: NCEP/NCAR R-1, ERA-15

Users are generally advised to use the third generation products. However, since most of these only extend back to 1979, some authors will still use either ERA-40 or NCEP/NCAR R-1 to go back further in time (which is why they were included in the table above). This practice will probably stop once JRA-55 gets established.

It should be noted that CFSR is the first global, coupled (i.e. atmosphere and ocean) reanalysis. It is only weakly coupled (see Dee et al. 2014 for a discussion of what that means), so ECMWF are currently working on ERA-CLIM2, which will be fully coupled (due for release in 2016 at the earliest).

The ERA-20C and NOAA twentieth century reanalysis (version 2) are additional reanalysis datasets that were created by only assimilating surface pressure reports (and marine wind observations in the case of ERA-20C) and using observed monthly sea-surface temperature and sea-ice distributions as boundary conditions. By using this limited subset of the full observational record, these datasets were able to go further back in time and are expected to contain fewer spurious trends due to changes in the availability of observational data over time. However, they would also be expected to be less accurate than other reanalysis datasets at any given time point, because they have not made use of all available observations.



Leave a Comment
  1. drclimate / Apr 4 2013 16:20

    The latest ocean reanalysis is the Ocean ReAnalysis System 4 (ORAS4)

  2. drclimate / Apr 4 2013 16:26

    In February 2013, NCDC announced the release of the Upper Air Temperature (UAT) Climate Data Record (CDR).

    It is a new satellite derived product that gives temperatures at 4 different atmospheric layers. The data can be found at: (you want the top dataset…RSS MSU/AMSU-A Mean Layer Atmospheric Temperatures)

    I’ve emailed the NCDC staff to ask how this dataset differs from the RSS and UAH datasets that are already available.

  3. Damien Irving / Nov 6 2013 13:53

    HadISST2 has been developed (version; – they are just waiting for the accompanying papers (there will be 3) to be accepted before it is released publicly. It has been shown to be an improvement over HadISST1. As the long version number suggests, they are planning on updating it regularly.

    The first paper in the series is here:

  4. Damien Irving / Jul 14 2014 13:08

    It would be worth keeping an eye on new precipitation products that arise from the Global Precipitation Measurement (GPM) mission, which was launched in Feb 2014.

  5. Damien Irving / May 26 2016 15:32

    I should also include the Argo gridded products:
