RealClimate: Operationalizing Climate Science
There is a need to make climate science more agile and more responsive, and that means moving (some of it) from research to operations.
Readers here will know that the climate science community has had a hard time giving quantitative explanations for what’s happened in climate over the last couple of decades. Similarly, we are still using scenarios that were designed more than a decade ago and have not been updated to take account of the myriad changes that have happened since. These problems have been widely noticed, and so there are plenty of ideas floating around to fix them.
As someone who works in one of the main modeling groups that provide their output to the IPCC and NCA assessments, and whose models inform the downscaled projections that are used in a lot of climate resilience work, I’ve been active in trying to remedy this state of affairs. For the CERESMIP project (Schmidt et al., 2023) we proposed updating the forcings datasets and redoing much of the attribution work that had been done before to focus specifically on explaining the trends in the CERES time period (2003 to present).
And in this week’s New York Times, Zeke Hausfather and I have an opinion piece arguing that climate science more broadly – and the CMIP process specifically – needs to become more operational. To be clear, this is not a radical notion, nor is it a fringe idea that only we are thinking about. For example, at a workshop last month in the UK on the inputs for the next round of CMIP simulations (CMIP7 for those keeping count), there was a lot of discussion about what a ‘sustained’ [footnote1] mode of extensions and updates to the input datasets would look like (and it’s definitely worth scrolling through some of the talks). Others have recently argued for a separate set of new institutions to run operational climate services (Jakob et al., 2023; Stevens, 2024).
Our opinion piece though was very focused on one key aspect – the updating of forcing data files, and the standardization of historical extension simulations by the modeling groups. This has come to the forefront partly because of the difficulties we have had as a community in explaining recent temperature anomalies, and partly as a response to the widespread frustration with the slow pace at which the scenarios and projections are being updated (e.g. Hausfather and Peters (2020)). Both of these issues stem from the realization that climate change is no longer purely a long-term issue for which an assessment updated every decade is sufficient.
The End of History
A big part of the effort to both understand past climate and project future climate is supported by the CMIP program. This is a bottom-up, basically self-organized, effort from the modeling groups to coordinate on what kinds of experiments to run with their models, what kind of data to output, and how one should document these efforts. Since its debut in the mid-1990s, this process has become more complex as models have become more complex and the range of useful questions that can be asked of the models has broadened. Where, at the beginning, there was really only one input parameter (the CO2 concentration) that needed to be coordinated, the inputs have now expanded to include myriad forcings related to other greenhouse gases, air pollution, land surface change, ozone, the sun, volcanoes, irrigation, meltwater, etc.
Since CMIP3, one of the key sets of experiments has been the ‘historical’ simulations (and variations on that theme). These are by far the most downloaded datasets and are used by thousands of researchers to evaluate the models over the instrumental period (starting in 1850). But when does ‘history’ end? [footnote2]
In modeling practice, ‘history’ stops a few years before the simulations need to start in order to feed into the IPCC reports. So for the 2007 report, the CMIP3 simulations were carried out around 2003, and so history stopped at the end of 2000. For CMIP5, history stopped in 2005, and for CMIP6 (the last go-around), it stopped in 2014. You will note that this is a decade ago.
Forcing the Issue
Depending on the specific forcing, the observations that go into the forcing datasets are available with different latencies. For instance, sea surface temperatures are available basically in real time, solar irradiance after a few days, greenhouse gas concentrations after a few weeks, etc. However, aerosol emissions are not directly observed, but rather are estimated based on economic data that often doesn’t get released for months. Other forcings, like the irrigation data or other land use changes, can take years to process and update. In practice, the main bottleneck is the estimate of the emissions of short-lived climate forcers (reactive gases, aerosols, etc.), which include things like the emissions from marine shipping. Changes in the other long-latency forcings aren’t really expected to have noticeable impacts on annual or sub-decadal time-scales.
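To make the bottleneck logic concrete, here is a minimal sketch in Python. The latency numbers are indicative guesses taken loosely from the paragraph above, not from any official release schedule:

```python
# A minimal sketch of the annual-update bottleneck. Latencies (in months
# from the end of the calendar year) are indicative, not official.
latency_months = {
    "sea surface temperature": 0.1,   # essentially real time
    "solar irradiance": 0.2,          # available after a few days
    "greenhouse gases": 1,            # a few weeks
    "SLCF/aerosol emissions": 9,      # wait on economic activity data
    "land use / irrigation": 36,      # can take years to process
}

# Only some inputs matter for a one-year extension: land use changes are
# not expected to have noticeable impacts on sub-decadal time-scales.
needed_for_annual_update = [
    "sea surface temperature", "solar irradiance",
    "greenhouse gases", "SLCF/aerosol emissions",
]
bottleneck = max(needed_for_annual_update, key=latency_months.get)
print(f"annual extension gated by: {bottleneck} "
      f"(~{latency_months[bottleneck]} months)")
```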
One perennial issue is also worth noting here: over the ~170 years of the historical record there are almost no totally consistent datasets. As instrumentation improved, coverage improved, and when satellite records started to be used, there were changes to the precision, variance, and bias over time. This can partially be corrected for, but for some models the consequences were substantial: for instance, the switch from decadal averages of biomass burning in the past to monthly varying data in recent years led to quite substantial increases in impacts, since the model’s response was highly non-linear (Fasullo et al., 2022).
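Why temporal averaging matters for a non-linear response is essentially Jensen’s inequality: for a convex response f, the mean of f(E) exceeds f of the mean of E. A toy numpy sketch (the quadratic response and the emission statistics are invented for illustration, not a model of CESM2):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy monthly biomass-burning emissions over one decade (arbitrary units):
# mean ~1, with strong month-to-month variability.
monthly = np.abs(1.0 + 0.8 * rng.standard_normal(120))

def response(e):
    """Hypothetical convex (non-linear) model response to emissions."""
    return e**2

# Driving the model with the decadal-average emission smooths out peaks...
r_decadal = response(monthly.mean())
# ...while monthly varying data lets the peaks through, and the convex
# response amplifies them, so the mean impact is larger.
r_monthly = response(monthly).mean()

print(f"response to decadal-mean forcing : {r_decadal:.2f}")
print(f"mean response to monthly forcing : {r_monthly:.2f}")  # larger
```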
Partially in response to this inhomogeneity over time, many of these forcings are in part modeled. For instance, solar irradiance is only directly measured after 1979, and before that has to be inferred from proxy information like sunspot activity. So not only do forcing datasets have to be extended with new data as time passes, but past estimates are frequently revised based on changes to the source data or updates in the modeling. Often the groups do the extension and the revision at the same time, which means that the dataset is not continuous with what had been used in the last set of simulations, making it hard to do extensions without going back to the beginning.
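To see why that discontinuity matters in practice, here is a toy splice (all numbers invented): naively appending revised values to an old series creates a spurious jump at the join, unless either the whole run is redone from 1850 or the extension is offset-adjusted onto the old baseline.

```python
import numpy as np

# Old forcing series used in the last simulations (toy numbers, W/m^2),
# with a steady trend of +0.02/yr, ending in 2014.
years_old = np.arange(2000, 2015)
old = 1.00 + 0.02 * (years_old - 2000)

# The data provider then revises the past (+0.05 everywhere, say) and
# extends the series to 2024 in a single release.
years_new = np.arange(2000, 2025)
new = 1.05 + 0.02 * (years_new - 2000)

# Naive extension: append the post-2014 revised values to the old series.
naive = np.concatenate([old, new[len(old):]])
# The join steps by 0.07 instead of the 0.02 trend step: a spurious jump.
print("step at join (naive):  ", round(float(naive[15] - naive[14]), 3))

# Offset-adjusting the extension to the old baseline over a 5-year overlap
# restores continuity, at the cost of ignoring the revised past.
offset = old[-5:].mean() - new[10:15].mean()
spliced = np.concatenate([old, new[len(old):] + offset])
print("step at join (spliced):", round(float(spliced[15] - spliced[14]), 3))
```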
How far does it go?
One thing that has only become apparent to me in recent months (and this is true for many in the CMIP community) is how widely the CMIP forcing data has come to be used outside its original purpose. It turns out that building consistent long-term syntheses of climate drivers is a useful activity. For instance, both the ECMWF reanalysis (ERA5) and the MERRA2 effort used the CMIP5 forcings from 2008 onwards for their solar forcing. But those fields are the predictions made around 2004, and are now about half a solar cycle out of sync with the real world. Similarly, the aerosol fields in the UKMO decadal prediction system are from a simulation of 2016 and are assumed fixed going forward. Having updated historical data and consistent forecasts might be key in reducing forecast errors beyond the sub-seasonal timescale.
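‘Half a solar cycle out of sync’ is about as bad as a phase error gets: the prescribed forcing peaks when the real cycle is at a minimum. A toy illustration with an idealized sinusoidal ~11-year cycle (the dates and the exact half-cycle offset are simplifications for the sketch):

```python
import numpy as np

period = 11.0                          # nominal solar cycle length, years
t = np.arange(2008, 2025, 1 / 12)      # monthly; CMIP5 solar used from 2008

real  = np.sin(2 * np.pi * t / period)                 # idealized true cycle
stale = np.sin(2 * np.pi * (t + period / 2) / period)  # half a cycle off

# Half a cycle of phase error flips the sign of the forcing anomaly:
# the prescribed solar maximum lands on the real solar minimum.
print(f"correlation with reality: {np.corrcoef(real, stale)[0, 1]:.2f}")  # -1.00
```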
What can be done?
As we mentioned in the opinion piece, and as (I think) was agreed as a target at the recent workshop, it should be possible to get a zeroth-order estimate of the last year’s data by July of the following year, i.e. we should be able to get the 2024 data extension by July 2025. That is sufficient for modeling groups to be able to quickly add a year to the historical ensembles and the single-forcing/grouped-forcing simulations that we use for attribution studies, and for these to be analyzed in time for the WMO State of the Climate report which comes out each November.
If, additionally, these extensions can be used to seed short-term forecasts (say covering the next five years), they would also be usable for the initialized decadal predictions which are also started in November. Reanalyses could also make use of these short-term forecasts to allow for updates in their forcing fields and help those efforts be more realistic.
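What ‘seeding’ could mean in the simplest case is just extrapolating each annual forcing series a few years ahead. A hypothetical sketch (the persistence/trend choice and the seed_forecast helper are mine for illustration, not anything any group has committed to):

```python
import numpy as np

def seed_forecast(years, values, horizon=5, method="trend"):
    """Extend an annual forcing series `horizon` years ahead.

    A placeholder for a real forecast scheme: 'persistence' holds the
    last value fixed; 'trend' extrapolates a linear fit to the last decade.
    """
    future = np.arange(years[-1] + 1, years[-1] + 1 + horizon)
    if method == "persistence":
        ext = np.full(horizon, values[-1], dtype=float)
    else:
        slope, intercept = np.polyfit(years[-10:], values[-10:], 1)
        ext = slope * future + intercept
    return future, ext

# Toy CO2-like concentration series (ppm, invented numbers) through 2024.
yrs = np.arange(2000, 2025)
co2 = 370.0 + 2.4 * (yrs - 2000)

fyrs, fco2 = seed_forecast(yrs, co2)
print(dict(zip(fyrs.tolist(), np.round(fco2, 1).tolist())))
```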
Of course, the big work right now is to update and extend the historical data from 2014 to at least 2022 or, ideally, 2023, and this should be done very shortly (preliminary versions very soon, finalized versions in the new year). And given these new updated pipelines, a consensus to extend them on an annual basis should be easier to build.
This will require a matching commitment from the climate modeling groups to run the extensions, process the output, and upload it to the data archives in a timely manner, but this is a relatively small ask compared to what they generally do for CMIP as a whole.
As John Kennedy noted recently, we need to shift more generally away from thinking about papers as the way to update our knowledge, to thinking about operational systems that automatically update (as much as possible) and that are continually available for analysis. We’ve now got used to this for surface temperatures and assorted data streams, but it needs to be more prevalent. This would make the attribution of anomalies such as we had in 2023/2024 much easier, and would reveal far more quickly whether there is something missing in our models.
Notes
[footnote1] For some reason, the word “operational” gives some program managers and agencies hives. I think this relates to a notion that making something operational is perceived as being an open-ended commitment that reduces their future autonomy in allocating funding. However, we are constantly being exhorted to do work that is R2O (‘research to operations’), but generally speaking this is assumed to be a hand-off to an existing operational program, rather than the creation of a new one. So ‘sustained’ it is.
[footnote2] Not in 1992, despite popular beliefs at the time.
References
G.A. Schmidt, T. Andrews, S.E. Bauer, P.J. Durack, N.G. Loeb, V. Ramaswamy, N.P. Arnold, M.G. Bosilovich, J. Cole, L.W. Horowitz, G.C. Johnson, J.M. Lyman, B. Medeiros, T. Michibata, D. Olonscheck, D. Paynter, S.P. Raghuraman, M. Schulz, D. Takasuka, V. Tallapragada, P.C. Taylor, and T. Ziehn, “CERESMIP: a climate modeling protocol to investigate recent trends in the Earth’s Energy Imbalance”, Frontiers in Climate, vol. 5, 2023. http://dx.doi.org/10.3389/fclim.2023.1202161
C. Jakob, A. Gettelman, and A. Pitman, “The need to operationalize climate modelling”, Nature Climate Change, vol. 13, pp. 1158-1160, 2023. http://dx.doi.org/10.1038/s41558-023-01849-4
B. Stevens, “A Perspective on the Future of CMIP”, AGU Advances, vol. 5, 2024. http://dx.doi.org/10.1029/2023AV001086
Z. Hausfather, and G.P. Peters, “RCP8.5 is a problematic scenario for near-term emissions”, Proceedings of the National Academy of Sciences, vol. 117, pp. 27791-27792, 2020. http://dx.doi.org/10.1073/pnas.2017124117
J.T. Fasullo, J. Lamarque, C. Hannay, N. Rosenbloom, S. Tilmes, P. DeRepentigny, A. Jahn, and C. Deser, “Spurious Late Historical‐Era Warming in CESM2 Driven by Prescribed Biomass Burning Emissions”, Geophysical Research Letters, vol. 49, 2022. http://dx.doi.org/10.1029/2021GL097420