Flexible simulation of future climate change through machine learning

International team with KIT participation develops data set to enable faster calculations of climate change scenarios using machine learning methods

Climate change is a major challenge for humans and the environment. Understanding future climate scenarios is essential for mitigating and adapting to it. Policy makers rely heavily on climate model simulations of the future to initiate concrete measures. These models simulate the climate under different scenarios based on assumptions about future global societal and industrial development and the associated greenhouse gas emissions.

However, this poses a number of challenges. Above all, climate models are extremely computationally expensive, which severely limits the number of scenarios that can be simulated. Machine learning offers great opportunities here because, once trained, ML models can make predictions very quickly and relatively cheaply. So far, though, these methods have concentrated mainly on current weather data or data from the last few decades (e.g. so-called reanalysis data). Such data cannot adequately represent the future climate, especially if aspects of climate change are to be included reliably.

To close this gap, a team of scientists from Germany, Israel, Canada and the USA has now developed a data set called "ClimateSet". The ClimateSet project aims to provide a large-scale, standardized data set of climate model data for machine learning research.

"Climate model data, due to its enormous dimensionality, complexity and relevance to the world, is something of a dream playground and at the same time an ultimate challenge for ML model developers. But to get there, you have had to fight your way through a global maze of inconsistencies. ClimateSet is meant to be the ML community's shortcut - or in other words, a quick way to prepare for the climate crisis," says team member Charlotte Lange, a researcher at the Mila Quebec AI Institute and the University of Osnabrück.

Schematic overview of the ClimateSet project (Image: Julia Kaltenborn)

ClimateSet contains inputs and outputs from 36 climate models, enabling researchers to develop and evaluate ML models for various climate-related tasks. Similar projects had previously included only one or two models. Using many models makes it possible, for the first time, to account for the current uncertainties in climate simulations, and it lets ML methods learn from significantly larger data sets, which opens up entirely new research questions. Examples of such tasks are the emulation of climate models, the prediction of extreme weather under different climate scenarios, and the development of higher-resolution climate model data (known as "downscaling"). The ClimateSet benchmarks focus on emulation, i.e. the use of ML models to reproduce the output of a climate model as accurately as possible.
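
To make the idea of emulation concrete, here is a minimal sketch in Python. It is not the ClimateSet code or any of its benchmark models. Everything in it is an assumption made for illustration: the function toy_climate_model stands in for an expensive climate model, the emission scenarios are made up, and the emulator is a plain least-squares fit rather than a neural network. The sketch fits the emulator on three scenarios and then checks how well it reproduces the stand-in model on a scenario it has never seen.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_climate_model(emissions):
    """Hypothetical stand-in for an expensive climate model (illustration only).

    Maps yearly CO2 emissions to a yearly temperature anomaly (deg C) that
    grows linearly with cumulative emissions, plus a little random noise.
    The coefficient is an illustrative value, not a calibrated one.
    """
    cumulative = np.cumsum(emissions)
    noise = rng.normal(0.0, 0.02, size=emissions.shape)
    return 0.00045 * cumulative + noise

def features(emissions):
    """Emulator inputs: cumulative emissions and a constant bias column."""
    cumulative = np.cumsum(emissions)
    return np.column_stack([cumulative, np.ones_like(cumulative)])

years = 80
# Three made-up training scenarios: low, medium, and high constant emissions.
train_scenarios = [np.full(years, level) for level in (10.0, 30.0, 50.0)]
# One held-out scenario (declining emissions) that is not used for fitting.
test_scenario = np.linspace(40.0, 5.0, years)

X_train = np.vstack([features(s) for s in train_scenarios])
y_train = np.concatenate([toy_climate_model(s) for s in train_scenarios])

# "Training" the emulator: ordinary least squares on the simulated outputs.
weights, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

def emulate(emissions):
    """Cheap prediction of what the stand-in climate model would output."""
    return features(emissions) @ weights

target = toy_climate_model(test_scenario)
prediction = emulate(test_scenario)
rmse = float(np.sqrt(np.mean((prediction - target) ** 2)))
print(f"held-out RMSE: {rmse:.3f} deg C")
```

Once fitted, the emulator answers a new scenario with a single matrix product, which is the speed advantage described above. Real climate model output is spatially resolved and far higher-dimensional than this toy case, which is why a large, standardized data set is needed to train and compare emulators.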

"ClimateSet allows us to train ML methods on climate model data on an unprecedented scale and compare the results directly. It is therefore an important step towards faster, more comprehensive and more energy-efficient climate change predictions. It is particularly important to me that the entirety of the latest climate models can be consistently taken into account, and thus the model uncertainties can also be reflected in their breadth," explains Professor Peer Nowack from the Institute for Theoretical Computer Science at KIT, who was involved in the development of ClimateSet.

"Machine learning can help us simulate the future of our planet faster, but it is still up to people - i.e. politicians, business and society - to tackle the climate crisis. For example, ClimateSet can be used to produce more climate scenarios and more regional scenarios to support policy decisions. Machine learning gives us new insights, but we still have to act," adds Julia Kaltenborn, first author of the paper, from the Mila Quebec AI Institute.

The researchers present their findings in their paper "ClimateSet: A Large-Scale Climate Model Dataset for Machine Learning", which was published at the "Neural Information Processing Systems" conference in New Orleans.