Accuracy and quality of weather data
Posted on 13 Jul 2020
Text by: Ilya Kalimulin
Data Scientist, OpenWeather
Figures by: Sergey Perminov
Software Engineer, OpenWeather
Nowadays everyone is constantly checking weather forecasts on their smartphones. Some of us might wonder how accurate the forecast is, and why it is constantly changing. We at OpenWeather use our own tools to monitor accuracy and quality metrics, to make sure we provide better data to our customers through our API. In this article, we describe our methods, and present the values of metrics that show how accurate our forecasts are.
Our numerical weather prediction model
To provide weather data through our API, we use our own numerical weather prediction (NWP) model, which uses several data sources:
- Global NWP models:
  - NOAA GFS (0.25° and 0.5° grid sizes)
  - NOAA CFS
  - ECMWF ERA
- Weather stations:
  - METAR stations
  - Users’ stations
  - Companies’ stations
- Weather radar data
- Satellite data
We download and store data from these sources, then process it with our in-house set of algorithms to improve its quality and accuracy. This processing runs in real time, so that our clients always receive the latest nowcasts and forecasts.
Metrics
To evaluate forecasts, we need reliable reference data to compare them against. In general, we use weather stations run by meteorological agencies. For precipitation, we use weather radar data.
There are plenty of metrics for evaluating the quality of weather forecasts, both general-purpose and specialised. We divide them into three groups:
- Common scores, intended for forecast users, which show in general the accuracy that our clients can rely on.
- Metrics for comparing raw data sources and post-processing algorithms, which we use to choose between them.
- Diagnostic metrics applied to localise certain types of errors in forecasts for further improvement.
In this article, we discuss scores from the first group. Other metrics are intended for internal use.
List of cities
We used 371 cities for evaluation. This list consists of national capitals and many other major cities.
Nowcast errors for temperature
Current weather data is also a type of prediction, because global NWP models and weather stations do not report at a regular one-minute time step, and not every location has a station. This type of prediction is known as nowcasting.
We consider several metrics to check how accurate the nowcast is. Two of these are the most commonly used statistical error metrics for temperature prediction: Mean Absolute Error (MAE) evaluates the average error, while Root Mean Square Error (RMSE) gives more weight to larger errors. The Reliability and Inaccuracy metrics give a qualitative description of accuracy in percentage terms. All calculations below were conducted in degrees Celsius.
- MAE – mean absolute difference between model values and station observations, in degrees; lower is better.
- RMSE – square root of the mean squared difference between model values and station observations, in degrees; lower is better.
- Reliability – percentage of time when model values were within ±2 degrees of ground truth; higher is better. We will regard an error of up to 2 degrees in the nowcast as acceptable.
- Inaccuracy – percentage of time when model values were not within ±5 degrees of ground truth; lower is better. We will regard an error of more than 5 degrees in the nowcast as inaccurate.
The numbers used as the thresholds for reliability and inaccuracy might vary for different industries.
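As a rough illustration of the four scores defined above, here is a minimal Python sketch. It is not our production code: the function name nowcast_scores, its parameter names and the sample temperatures are all made up for this example. It assumes that model values and station observations have already been paired by place and time, and that an error of exactly 2 degrees counts as reliable.

```python
import math


def nowcast_scores(predicted, observed, reliable_within=2.0, inaccurate_beyond=5.0):
    """Compute MAE, RMSE, reliability (%) and inaccuracy (%).

    predicted and observed are equal-length sequences of temperatures
    in degrees Celsius, already matched by location and time.
    The function and parameter names are hypothetical.
    """
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("need two non-empty sequences of equal length")

    # Absolute error of each model value against the station observation
    errors = [abs(p - o) for p, o in zip(predicted, observed)]
    n = len(errors)

    mae = sum(errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # Share of samples within the "acceptable" threshold (inclusive)
    reliability = 100.0 * sum(e <= reliable_within for e in errors) / n
    # Share of samples beyond the "inaccurate" threshold
    inaccuracy = 100.0 * sum(e > inaccurate_beyond for e in errors) / n

    return {
        "mae": mae,
        "rmse": rmse,
        "reliability": reliability,
        "inaccuracy": inaccuracy,
    }


# Made-up sample values, for illustration only
observed = [10.0, 12.0, 15.0, 20.0]
predicted = [10.5, 11.0, 15.0, 26.0]
scores = nowcast_scores(predicted, observed)
print(scores)
```

In this made-up sample, the single 6-degree miss raises RMSE well above MAE, which is why the two are reported side by side. The two thresholds are passed as parameters because, as noted above, suitable values may differ between industries.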
The period of the analysis is from 13 April 2020 00:00 to 26 June 2020 00:00 (UTC time zone).
Figure 1. Quality metrics for the nowcast. Series shown: OpenWeather NWP model; raw data from NOAA GFS05 and GFS025; NOAA GFS05 and GFS025 corrected by our algorithm; weather provider 1.
We analysed the behaviour of the metrics for the OpenWeather NWP model over the analysis period. The figure shows that MAE is about 0.5 degrees, RMSE is less than 2 degrees, reliability is between 90% and 100%, and inaccuracy is about 1%. Of the sources compared, the OpenWeather NWP model provides the most accurate result. However, we should keep in mind that the actual difference between the nowcast and the real situation at a specific place and time could be bigger than these average errors.
Conclusion
This is our first article about evaluating accuracy and quality. We have analysed four common metrics for temperature nowcasting. In the future, we will add more weather parameters, such as precipitation, humidity and wind. Another direction for improvement is to apply more metrics. We hope this article will help you to make decisions about using our API and other products. If you have any questions or suggestions, please do not hesitate to write to us.