A Quick Guide to Reanalysis Datasets

Long-term observations of weather variables, such as temperature, precipitation, humidity and wind, are used to track changes in climate. In parts of Canada and the rest of the world, rigorous, long-term weather observations are spatially sparse for a number of reasons. To address this sparsity, other types of data are combined with station observations using a weather model to provide a more consistent description of historical changes in weather. This article describes one approach used to provide a more spatially and temporally complete description of past weather and climate conditions, namely reanalysis. It explains what reanalysis datasets are, how they are produced, and what their uses and limitations are.

Time to completion
10 min

Key Messages

  • Reanalysis datasets combine weather observations with weather models to produce seamless weather records across space and time.  
  • Reanalysis datasets are particularly useful in regions with limited weather station coverage. 
  • Reanalysis datasets usually contain many variables, including those measured by weather stations (such as temperature and precipitation) and those that are solely available from weather models (such as temperature and pressure at different levels of the atmosphere).
  • Because reanalysis datasets are the product of both observations and weather model outputs, these datasets have limitations that derive from the spatial and temporal resolution and quality of the weather observations and the inherent uncertainty in the weather model used in the analysis. 

Introduction – what are reanalysis datasets?

Reanalysis datasets are commonly referred to as “maps without gaps”.   

Long, rigorous and consistent weather records are important for understanding weather and climate. An accurate representation of Earth’s historical climate helps us understand how it is changing. In most countries, station-based weather measurements have not been taken consistently over time or space [1]. Other methods of measuring weather variables, such as radar or satellite observations, do not have consistent temporal or spatial coverage either. Reanalysis datasets combine the many methods used to measure weather variables with weather models, a technique referred to as data assimilation. These gridded historical datasets are complete spatial and temporal representations of past weather and are useful for climate studies.

Reanalysis datasets provide a three-dimensional picture of past weather [2]. They typically have horizontal grid spacings ranging from 10 km x 10 km to 100 km x 100 km, whereas the coverage and resolution of vertical levels (heights) vary between reanalysis products. The temporal resolution ranges from hours to days. Since weather observations have been re-analyzed using a weather model to produce the new dataset, the product is referred to as a “reanalysis”.

How are reanalysis datasets produced?

To produce reanalysis datasets, weather models combine (or assimilate) historical observational data from sources such as weather stations, weather balloons, aircraft, ships, and satellites, using atmospheric physics to blend overlapping information and fill in missing weather measurements. A reanalysis therefore produces a global or regional picture of past weather that is as close to reality as possible, at consistent time scales, from the Earth’s surface to the top of the atmosphere. Reanalysis datasets leverage the strengths of observations and weather models to provide a continuous representation of weather in time and space. Another way of describing reanalysis is as a re-interpretation of past observed data.
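The blending step described above can be illustrated with a minimal sketch. It assumes a single variable at a single location, one model first guess and one observation, each with a known error variance; the function name and all numbers are hypothetical. Operational reanalysis systems use far more elaborate, multivariate assimilation schemes, so this shows only the basic idea of weighting each source by how much it is trusted.

```python
def assimilate(background, background_var, observation, observation_var):
    """Blend a model first guess with one observation.

    Each input is weighted by its error variance: the less certain
    the model first guess, the more the result moves toward the observation.
    """
    # Weight given to the observation (0 = ignore it, 1 = trust it fully)
    gain = background_var / (background_var + observation_var)
    analysis = background + gain * (observation - background)
    # The blended estimate is more certain than the first guess alone
    analysis_var = (1.0 - gain) * background_var
    return analysis, analysis_var


# Hypothetical values: model first guess of 12 degrees C (variance 4)
# and a station reading of 14 degrees C (variance 1)
analysis, analysis_var = assimilate(12.0, 4.0, 14.0, 1.0)
print(f"analysis = {analysis:.1f} C, variance = {analysis_var:.1f}")
```

Because the station reading is assumed to be more reliable than the model first guess, the blended value lies closer to the observation. Where there is no observation, the model first guess is kept unchanged, which is how gaps are filled.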

Uses of reanalysis datasets

Given the wide variety of variables available and the thorough coverage of reanalysis datasets in both space and time, these datasets have many uses. Their main uses are explained below.

 

Filling in gaps in space and time

Weather observations from around the globe are not evenly distributed in space, nor do they cover the same periods of time. In Canada, for example, most of the weather stations are within the densely populated regions of the country. One of the major uses of reanalysis datasets is to better understand how the climate has changed in regions with a limited number of weather stations. A major benefit of reanalysis datasets is that they provide a more complete representation of the state of the atmosphere over time and space than observations alone. This makes reanalysis datasets useful for sectors that require consistent and reliable data, such as wind and hydroelectric power generation. It is important to remember, however, that reanalyses cannot capture local effects at scales finer than their grid resolution.

 

Accessing variables that are not measured

Since the atmospheric models that produce reanalysis datasets simulate the physical processes occurring in the atmosphere, reanalysis datasets often include variables that are not routinely measured by weather stations. Examples of such variables include soil temperature and the height of clouds.

 

Validating climate model output

Reanalysis data are often used in research, in conjunction with or in place of weather station observations, to verify the performance of outputs from climate models. This validation process requires consistent data for a variety of variables across space and time, preferably gridded, and therefore the use of reanalysis data is often more appropriate in these cases.
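As a rough illustration of this kind of validation, the sketch below compares a climate model series against a reanalysis series for the same grid cell and period, using two common summary statistics: mean bias and root-mean-square error. The function names and the temperature values are hypothetical, and real evaluations typically use many more variables, grid cells and metrics.

```python
import math


def mean_bias(model, reference):
    """Average of (model - reference); positive means the model is too high."""
    if len(model) != len(reference):
        raise ValueError("series must have the same length")
    return sum(m - r for m, r in zip(model, reference)) / len(model)


def rmse(model, reference):
    """Root-mean-square error between the model and the reference series."""
    if len(model) != len(reference):
        raise ValueError("series must have the same length")
    squared = [(m - r) ** 2 for m, r in zip(model, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical seasonal mean temperatures (degrees C) for one grid cell
reanalysis_temp_c = [10.0, 12.0, 14.0, 16.0]
climate_model_temp_c = [11.0, 13.0, 13.0, 19.0]

bias = mean_bias(climate_model_temp_c, reanalysis_temp_c)
error = rmse(climate_model_temp_c, reanalysis_temp_c)
print(f"mean bias = {bias:.2f} C, RMSE = {error:.2f} C")
```

The mean bias shows whether the climate model runs warm or cold on average relative to the reanalysis, while the RMSE also penalizes errors that cancel out in the average.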

 

Use of reanalysis datasets for developing climate scenarios

Reanalysis datasets can be used as a reference dataset when bias-adjusting outputs from climate models and when downscaling to a resolution more suitable for decision-making. A reanalysis dataset (ERA5-Land) was used to bias-adjust and downscale climate model outputs to create the Humidex projections for Canada [3] available on ClimateData.ca.
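To give a sense of what bias adjustment against a reference dataset involves, the sketch below applies a very simplified empirical quantile mapping: each climate model value is replaced by the reanalysis value that sits at the same rank in the historical distribution. This is an illustrative assumption only; it is not the method used to produce the Humidex projections, and the function name and data are hypothetical.

```python
from bisect import bisect_right


def empirical_quantile_map(value, model_hist, reference):
    """Map a model value to the reference value at the same empirical rank.

    model_hist: historical climate model values
    reference:  reanalysis values for the same historical period
    """
    model_sorted = sorted(model_hist)
    ref_sorted = sorted(reference)
    # Number of historical model values at or below `value`
    count = bisect_right(model_sorted, value)
    # Matching position in the sorted reference values (integer arithmetic)
    idx = count * len(ref_sorted) // len(model_sorted) - 1
    # Values outside the historical range are clamped to the reference extremes,
    # a known limitation of this simple form of quantile mapping
    idx = min(max(idx, 0), len(ref_sorted) - 1)
    return ref_sorted[idx]


# Hypothetical historical data: the model runs warmer than the reanalysis
model_hist = [12.0, 14.0, 16.0, 18.0, 20.0]
reanalysis_hist = [10.0, 11.0, 13.0, 15.0, 16.0]

model_values = [13.0, 17.0, 21.0]
adjusted = [empirical_quantile_map(v, model_hist, reanalysis_hist) for v in model_values]
print(adjusted)
```

Methods used in practice interpolate between quantiles, treat values outside the historical range more carefully, and often aim to preserve the climate change signal projected by the model.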

Limitations

Reanalysis datasets can be considered very good estimates of atmospheric variables because they are anchored by both observations and weather model outputs. However, they have some limitations, which mainly result from the following:

 

Weather observation accuracy

If there are inaccuracies or biases in the weather station observations, these will be reproduced in the reanalysis.

 

Sparse weather observation network and short observation periods

Weather observations are sparse for large regions of Canada and, at many locations, the observation records are short. Regions with less data to input to the reanalysis will have more inaccuracies than regions with more input data. However, reanalysis may still be valuable in regions with fewer observations: reanalysis datasets can outperform gridded observational datasets in data-sparse regions, such as mountainous terrain [4].

Since reanalysis products are the result of combining observations with modelled data, reanalysis results most closely match observations in areas where the observations are dense and complete.  In areas with sparse observations, the weather model outputs dominate the results (the model outputs are less influenced by the observations because there are fewer observations).  As such, reanalysis products have a higher degree of uncertainty in regions where observations are sparse.  Since different reanalysis products use different combinations of observations and weather models, it is useful to compare the outputs of these products for areas with sparse measurements to better understand uncertainty.
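One simple way to carry out such a comparison is to look at how much the products disagree at each time step. The sketch below assumes three reanalysis products have already been extracted for the same grid cell and the same months; the product names and precipitation values are hypothetical.

```python
import statistics

# Hypothetical monthly precipitation (mm/day) from three reanalysis
# products at the same data-sparse location
products = {
    "product_a": [1.0, 2.0, 3.0],
    "product_b": [1.5, 2.0, 5.0],
    "product_c": [0.5, 2.0, 4.0],
}


def per_step_mean_and_range(products):
    """Return the multi-product mean and range (max - min) at each time step."""
    series = list(products.values())
    means, ranges = [], []
    # zip(*series) groups the values from all products for each time step
    for values in zip(*series):
        means.append(statistics.mean(values))
        ranges.append(max(values) - min(values))
    return means, ranges


means, ranges = per_step_mean_and_range(products)
for step, (m, r) in enumerate(zip(means, ranges), start=1):
    print(f"month {step}: mean = {m:.1f}, range = {r:.1f}")
```

A large range between products at a given location or time suggests that the result depends heavily on the choice of weather model and input observations, and should be interpreted with more caution.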


Weather model accuracy

No physical model of the weather is perfect and, therefore, inaccuracies and simplifications in the model can impact the reanalysis data.

 

The temporal and spatial resolution of the reanalysis dataset

Reanalysis provides a consistent estimate of the state of the atmosphere over space and time, with the output produced in grid cell format, just like that of global climate models. Because each grid value reflects the average over the area covered by the grid cell, it may not reflect conditions at specific locations within that cell. Finer-resolution grids with smaller grid cells and more frequent time steps (e.g., hourly) generally result in a more accurate reanalysis dataset than those created at coarser resolutions.

References

  1. Mekis, 2018. An overview of surface-based precipitation observations at Environment and Climate Change Canada. Atmosphere-Ocean, 56(2), 71-95.
  2. ECMWF, 2024. Climate reanalysis. https://www.ecmwf.int/en/research/climate-reanalysis. Accessed 1/17/24.
  3. Chow, K. K. C., Sankaré, H., Diaconescu, E. P., Murdock, T. Q., & Cannon, A. J. Bias‐adjusted and downscaled humidex projections for heat preparedness and adaptation in Canada. Geoscience Data Journal.
  4. Essou, G. R., Brissette, F., & Lucas-Picher, P. (2017). The use of reanalyses and gridded observations as weather input data for a hydrological model: Comparison of performances of simulated river flows based on the density of weather stations. Journal of Hydrometeorology, 18(2), 497-513.