All beings on this planet are affected by the atmospheric phenomena we call the weather. Because of this, humans have invented all sorts of measuring tools, and luckily we have loads of data from observations. All this data has been on quite a technological journey: from being collected on paper and on local servers in basements, to now living on open cloud platforms that normalize data from many different sensors for anyone to research.
AI techniques have also made their way into building improved predictive models using neural networks, an approach called deep learning. My colleagues in Google Research who study AI for Weather & Climate built a fantastic model called MetNet. It performs the kind of precipitation forecasts we see on the evening news, at an amazingly high spatial resolution of 1 kilometer and a time resolution of 2 minutes, for up to 12 hours ahead. It outperforms traditional models, which forecast reliably only 7-8 hours out. More specifically, it can help forecast severe rainfall in a local region within a relatively short period.
In our 12-minute YouTube episode of our People & Planet AI series, we dive into how to approach building a weather forecasting model using Google Earth Engine and Google Cloud. We also share a notebook for technical audiences to try out. The total cost of building this sample model was less than $1 (as of this publishing date).
This article is a quick summary.
Table of Contents
- Physics Based vs Deep Learning Weather Models
- How to build a model with Google Cloud & Earth Engine
- Try it out!
- Follow us
If we want to understand the Earth at very high resolution, these are very large datasets. Using Google’s cloud platforms, we can download these large datasets and make them available to everybody who wants to study this work.
Jason Hickey, Google AI for Weather & Climate
Physics Based vs Deep Learning Weather Models
Historically, computing weather models has been incredibly challenging. Being physics based, they try to simulate many combinations of natural forces interacting with each other. Having computers perform these calculations has required writing programs with long sequences of complex rules that are computationally and time intensive.
Recently, this has been changing with a radically different approach: deep learning. In this manner, models can be built to find weather patterns in cloud behavior by training them on datasets of satellite images labeled with different precipitation phenomena. This means a model does not try to reproduce entire weather systems via simulations; instead, it focuses its compute power on recognizing visual patterns in a mosaic of pixels. You could say this is similar to how our eyes work. Such advances have also enabled other projects such as Dynamic World, which helps granularly measure changes on the Earth.
Furthermore, pairing this technique with powerful data processing hardware like GPUs and TPUs helps users get more accurate predictions at a fraction of the cost and time.
The following figures provide a selection of forecasts from MetNet-2 compared with the physics-based ensemble HREF and the ground truth MRMS.
How to build a model with Google Cloud & Earth Engine
To get started, you will need a Google Earth Engine account, which is free for non-commercial entities, and a Google Cloud account, which has a free tier for all users who are just getting started. I have broken up the products you would use to build a model by their function.
The model forecasts weather over a very short period, at 2 and 6 hours ahead; this is known as nowcasting.
The architecture for training and deploying our nowcasting model begins with our inputs: the data we will use, centralized in Earth Engine, on precipitation (GPM), satellite images (GOES-16), and elevation. It ends with our outputs, which are the labels we want the model to predict (millimeters of precipitation per hour, whether as rain or snow).
Dataflow is a data processing service that helps speed up the export process from hours to minutes using Earth Engine’s high-volume endpoint. We typically use stratifiedSample to gather a balanced number of points per class, but because we are working with regression, we have to bucket our labels into classification types.
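To illustrate the idea behind stratified sampling (Earth Engine’s stratifiedSample does this server-side; the plain-Python function below is only an illustrative stand-in, not the actual API):

```python
import random
from collections import defaultdict

def stratified_sample(points, points_per_class, seed=0):
    """Draw the same number of points from each label class,
    so no class dominates the training dataset."""
    by_class = defaultdict(list)
    for value, label in points:
        by_class[label].append((value, label))
    rng = random.Random(seed)
    sample = []
    for label, group in by_class.items():
        rng.shuffle(group)
        sample.extend(group[:points_per_class])
    return sample

# 100 points spread unevenly across 3 classes.
data = [(i, i % 3) for i in range(100)]
sample = stratified_sample(data, points_per_class=5)
print(len(sample))  # 15 (5 points from each of the 3 classes)
```

Without this balancing step, rare classes (like heavy rainfall) would be drowned out by the far more common "no precipitation" points.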
TIP: If you are new to some of these terms (i.e. regression or classification), I recommend checking out our 9-minute intro to deep learning video.
We do this by turning all our numeric labels into integers so they all fall into a classification bucket. We chose 31 buckets that represent 0 to 30 millimeters of precipitation per hour.
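As a minimal sketch of that bucketing step (the rounding and clipping choices here are assumptions; only the 31 buckets for 0 to 30 mm/hr come from the article):

```python
# Convert continuous precipitation labels (mm/hr) into integer
# classification buckets: 31 buckets representing 0 to 30 mm/hr.
NUM_BUCKETS = 31  # buckets 0..30, one per millimeter of precipitation

def label_to_bucket(mm_per_hour: float) -> int:
    """Round a precipitation value to its integer bucket,
    clipping anything outside the 0-30 range."""
    bucket = round(mm_per_hour)
    return min(max(bucket, 0), NUM_BUCKETS - 1)

print(label_to_bucket(0.2))   # light drizzle -> bucket 0
print(label_to_bucket(12.7))  # -> bucket 13
print(label_to_bucket(45.0))  # heavy rain, clipped -> bucket 30
```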
Note we chose to make 2 predictions, one at 2 hours and one at 6 hours in the future, but you can make as many as desired. In terms of the number of inputs needed, the more past data the better, but we have found that at least 3 input points are needed to give enough context to make predictions about the future.
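The shape of one training example described above (3 past inputs, predictions at 2 and 6 hours ahead) can be sketched like this; the 1-hour spacing between input snapshots is an illustrative assumption, not a value from the article:

```python
from datetime import datetime, timedelta

INPUT_STEPS = 3                    # at least 3 past snapshots for context
INPUT_SPACING = timedelta(hours=1) # assumed spacing, for illustration only
OUTPUT_OFFSETS = [timedelta(hours=2), timedelta(hours=6)]  # prediction targets

def example_timestamps(t0: datetime):
    """Return (input, output) timestamps for one training example
    anchored at time t0."""
    inputs = [t0 - i * INPUT_SPACING for i in range(INPUT_STEPS - 1, -1, -1)]
    outputs = [t0 + offset for offset in OUTPUT_OFFSETS]
    return inputs, outputs

ins, outs = example_timestamps(datetime(2022, 6, 1, 12, 0))
print([t.hour for t in ins])   # [10, 11, 12] -> the 3 input snapshots
print([t.hour for t in outs])  # [14, 18]    -> the 2 prediction times
```

Adding more prediction times is just a matter of extending OUTPUT_OFFSETS.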
Next, we write a script to train the model in Google’s machine learning platform, Vertex AI, using PyTorch (an ML library). We chose to use 90% of our downloaded dataset for training the model and 10% for later testing the model’s accuracy on data it has never seen before. We can eventually host the model on Cloud Run, a serverless hosting service. And finally, in order to visualize our predictions as a map, we can use a notebook like Colab, or we can bring the results back into Earth Engine (you can check out our 10-minute land cover video and code sample for this, which shows how to classify changes on the Earth from satellite imagery using a Fully Convolutional Network).
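The 90/10 split mentioned above can be sketched with a plain shuffled split (the actual sample uses PyTorch tooling; this stand-alone version is just for illustration):

```python
import random

def train_test_split(examples, test_fraction=0.1, seed=42):
    """Shuffle the dataset and hold out a fraction for testing,
    so the model is evaluated on data it has never seen."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    split = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:split], shuffled[split:]

train, test = train_test_split(range(1000))
print(len(train), len(test))  # 900 100
```

Fixing the seed makes the split reproducible, so repeated training runs are evaluated against the same held-out examples.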
Try it out
This is a screenshot of the 4th notebook that gives the final overview of how the model displays predictions.
This was a quick overview of how we would approach building a weather forecasting model using deep learning techniques and Google products. If you would like to try it out, check out our code sample on GitHub (click “open in colab” at the bottom of the screen to view the tutorial in our notebook format, or click this shortcut here). It’s broken up into 4 notebooks in case you wish to skip to specific parts; otherwise you can start from notebook 1 and finish with the 4th.
Using Colab lets you see all the code we used, or you can run the code live by clicking the “play” icon (you can enter your desired account credentials).
Thank you for your passion in using machine learning for environmental resilience. If you wish to follow more of our future content, you can find us on Twitter and YouTube.
By: Alexandrina Garcia-Verdin (Geo for Environment Developer Advocate)
Originally published at Google Cloud Blog