Background
It is well known that Numerical Weather Prediction and climate simulation models are complex computer programs that require massive amounts of computing resources. Even on supercomputers, the realism and accuracy of climate simulations are limited by the available computing power. One limiting factor is the spatial resolution at which these models are run: higher resolution means a better simulation, but also more computing resources. To complete their simulations in a reasonable time, climate researchers must therefore limit the spatial resolution of the models they use. This is particularly true for models that simulate the whole planet, known as global climate models.
Two main approaches have been adopted to tackle this issue, both aiming at ‘downscaling’* low-resolution climate model output. The first, called dynamical downscaling, increases the resolution, although only over part of the planet; this is the approach that the ESCER Centre in Montreal has been developing and using for years. The second, called statistical downscaling, is also very popular since it requires far less computing. Statistical downscaling techniques have been used for decades now, multiple linear regression being one of the most common approaches so far.
The challenge
This challenge is a call to the AI/ML hacker community to find and apply more advanced algorithms to increase the resolution of weather and climate output fields. Since such model outputs can be treated as images, it is expected, but not required, that AI computer vision algorithms will be tried on this problem, in particular those known as super resolution. The challenge falls into the Kaggle-style competition category: participants' models will be evaluated and ranked using a performance metric. The evaluation won't be limited to that metric, though; originality and ease of application will also be taken into consideration.
In more detail, this challenge aims at increasing the resolution (the level of detail) of 2D surface temperature forecasts from Environment and Climate Change Canada (ECCC)'s weather forecast model, using higher-resolution 2D temperature analyses as labelled images. The scale factor between the model and the higher-resolution analysis is 4 (from 10 km to 2.5 km). See the figure below for a comparison between the two resolutions: the reddish features show warmer temperatures prevailing in valleys over a mountainous region (the Canadian Rockies). These gridded weather forecast fields behave just like images. Note that although deep learning can be applied to such problems, weather and climate model output fields are large and can be costly to train on, especially during a 48 h hackathon.
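A natural starting point for the 4x scale factor is a plain interpolation baseline, against which any learned super-resolution model should be compared. The sketch below, which uses a synthetic field in place of the real dataset, upsamples a 10 km grid to a 2.5 km grid with cubic splines:

```python
import numpy as np
from scipy.ndimage import zoom

def upscale_bicubic(field, factor=4):
    """Upsample a 2D gridded field by `factor` using cubic spline
    interpolation with reflective boundaries (a non-learned baseline)."""
    return zoom(field, factor, order=3, mode="reflect")

# Synthetic 10 km surface temperature field in degrees C; the real fields
# come from the challenge dataset.
low_res = np.random.default_rng(0).normal(loc=15.0, scale=5.0, size=(64, 64))
high_res = upscale_bicubic(low_res)
print(high_res.shape)  # (256, 256)
```

Interpolation adds no new information, of course; its role is to set the floor that a trained model must beat on the evaluation metric.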
Numerous relevant 2D low-resolution weather forecast fields will be provided as predictors in the training set. In addition to temperature, these include fields such as cloud cover, wind, humidity, and topography. We will also provide thousands of gridded temperature analysis fields as target labels, corresponding to the same dates as the training fields. Participants should try to make the upscaled forecast images look as much as possible like these higher-resolution temperature analysis images.
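Because the predictor fields carry very different units (degrees, percent, metres), a common preprocessing step is to stack them into a multi-channel tensor and standardize each channel. The field names below are illustrative placeholders, not the dataset's actual variable names:

```python
import numpy as np

# Hypothetical predictor fields on the 10 km grid; the actual dataset
# supplies its own set of 2D fields per date.
rng = np.random.default_rng(1)
ny, nx = 64, 64
predictors = {
    "temperature": rng.normal(15.0, 5.0, (ny, nx)),   # degrees C
    "cloud_cover": rng.uniform(0.0, 1.0, (ny, nx)),   # fraction
    "humidity":    rng.uniform(0.0, 100.0, (ny, nx)), # percent
    "topography":  rng.uniform(0.0, 3000.0, (ny, nx)) # metres
}

def stack_and_normalize(fields):
    """Stack 2D predictor fields into a (channels, ny, nx) tensor and
    standardize each channel so mixed units don't dominate training."""
    x = np.stack(list(fields.values()))
    mean = x.mean(axis=(1, 2), keepdims=True)
    std = x.std(axis=(1, 2), keepdims=True)
    return (x - mean) / std

x = stack_and_normalize(predictors)
print(x.shape)  # (4, 64, 64)
```

In practice the per-channel statistics should be computed over the whole training set, not per sample, so the same normalization can be reused at inference time.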
The datasets have been packaged to ease their use; note that the total size is on the order of 25 GB. A tutorial (Python notebook) showing how to access and display the data is provided. Only a subset of the full dataset is available before the hackathon; the full dataset, for which a link is also provided in the tutorial, will be released after the pitch on Friday 22 Jan, around 5 pm EST (22:00 UTC).
Deliverables
Your submission should contain a short technical report and an accompanying proof of concept (code). We also encourage you to produce forecasts for the test set, which can be evaluated against the withheld labels.
To evaluate a submission, we will consider the following:
Originality and technical properties of the solution. Is it possible to operationalize it? Would it perform well in the presence of rare events? Would it need a large amount of data to perform well, or can it be used and trained quickly? Does it bring new ideas in the field of statistical processing of weather forecasts? And so on.
Quality of the prototype. Does it illustrate the technical feasibility of the proposed solution?
Quality of the super resolution over the test set. Are the high resolution forecasts realistic? Do they have good error metrics, such as root mean square error? Do they represent a significant improvement over baseline methods?
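Root mean square error, the metric named above, is straightforward to compute between a super-resolved forecast and the analysis it is scored against. A minimal sketch on synthetic fields (the real scoring uses the withheld test labels):

```python
import numpy as np

def rmse(prediction, target):
    """Root mean square error between two gridded fields."""
    prediction = np.asarray(prediction)
    target = np.asarray(target)
    return float(np.sqrt(np.mean((prediction - target) ** 2)))

# Illustrative check: a forecast that differs from the analysis by ~1 degree C
# of random noise should score an RMSE close to 1.
rng = np.random.default_rng(2)
analysis = rng.normal(10.0, 3.0, (256, 256))
forecast = analysis + rng.normal(0.0, 1.0, (256, 256))
err = rmse(forecast, analysis)  # close to the injected noise level
```

Reporting the same metric for a simple interpolation baseline alongside your model makes the "significant improvement over baseline methods" criterion easy to judge.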
*Note that the term ‘downscaling’ is used in the climate simulation community to describe the means by which the level of detail of a low-resolution simulation is increased. This yields more realistic climate simulations by adding detail to the simulated fields, surface temperature for instance. Downscaling techniques create information at the local scale (a few kilometres) from information at large scales (tens to hundreds of kilometres). Be aware that in the domain of computer vision, this process is called ‘upscaling’!
Built With
- ai