Researchers Advance Prediction of Flood Season Rainfall Combining Machine Learning and Climate System Model
Update time: 2024-12-31
With climate change driving more frequent and intense extreme precipitation events, accurately predicting rainfall during the flood season has become more critical than ever.
A recent study has used machine learning (ML) algorithms to address the nonlinear challenges in traditional models for predicting flood season rainfall, achieving significant improvements in accuracy. The findings were published in Advances in Atmospheric Sciences.
Predictions for flood season rainfall currently depend largely on outputs from climate system numerical models. These models often have systematic biases, so historical observational data are combined with statistical methods to correct the outputs and reduce errors.
This approach, known as the dynamical-statistical method, has limitations. Prediction errors from numerical models grow nonlinearly over time, and traditional correction methods, which are mainly linear, struggle to address these errors effectively.
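The limitation described above can be illustrated with a toy sketch. It assumes a purely synthetic forecast series with a quadratic bias (the numbers are hypothetical, not data from the study) and compares a linear correction with a piecewise-constant, binned correction. The binned fit is only a crude stand-in for the kind of nonlinear fit a tree-based learner produces.

```python
# Toy illustration only: synthetic numbers, not data from the study.
from math import sqrt


def fit_linear(x, y):
    """Ordinary least squares for y ~ a*x + b; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx


def fit_binned(x, y, n_bins=10):
    """Piecewise-constant fit: mean of y inside equal-width bins of x."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / n_bins

    def bin_of(xi):
        # Clamp so values at or beyond the edges fall in the end bins.
        return max(0, min(int((xi - lo) / width), n_bins - 1))

    sums, counts = [0.0] * n_bins, [0] * n_bins
    for xi, yi in zip(x, y):
        b = bin_of(xi)
        sums[b] += yi
        counts[b] += 1
    means = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    return lambda xi: means[bin_of(xi)]


def rmse(pred, obs):
    """Root-mean-square error between two equal-length sequences."""
    return sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


# Hypothetical raw forecasts, and "observations" that differ from them
# by a quadratic (nonlinear) bias.
forecast = [-2 + 4 * i / 199 for i in range(200)]
observed = [f + f ** 2 for f in forecast]

a, b = fit_linear(forecast, observed)
linear_pred = [a * f + b for f in forecast]

binned = fit_binned(forecast, observed)
binned_pred = [binned(f) for f in forecast]

rmse_linear = rmse(linear_pred, observed)
rmse_binned = rmse(binned_pred, observed)
print(f"linear RMSE: {rmse_linear:.3f}, binned RMSE: {rmse_binned:.3f}")
```

On this synthetic series the linear fit cannot follow the curved bias, so its residual error stays larger than that of the binned fit. Real forecast errors are noisier and higher-dimensional, so this is a picture of the idea rather than of the study's correction scheme.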
Recognizing ML’s strength in managing nonlinear relationships, the study applied the LightGBM algorithm to enhance the dynamical-statistical correction method (Fig. 1). In trials from 2019 to 2022, the prediction score (PS) rose from 68.6 to 74, a relative increase of 7.87%. The corrected forecasts also scored 6.63% higher than those from traditional dynamical-statistical methods, substantially boosting the accuracy of flood season rainfall predictions (Fig. 2).
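As a quick check on the quoted figures, the 7.87% appears to be a relative change in PS rather than a difference in score points. The short sketch below reproduces it from the two PS values given in the article; the helper name `relative_gain` is introduced here for illustration only.

```python
def relative_gain(before, after):
    """Percentage change from `before` to `after`."""
    return (after - before) / before * 100


# PS values quoted in the article for the 2019-2022 trials.
ps_before, ps_after = 68.6, 74.0

gain = relative_gain(ps_before, ps_after)
print(f"{gain:.2f}%")
```

The absolute change is 5.4 score points, and dividing by the starting value of 68.6 gives the relative figure.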
Many data-driven ML methods used for climate prediction are considered to lack sufficient physical interpretability. To tackle this, the study selected meteorological factors with clear physical connections to rainfall and integrated them into the climate system model. The team also quantified the contribution of each forecasting factor, offering a clearer understanding of the physical significance of the predictors used.
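The article does not say how the contribution of each factor was quantified. One common, model-agnostic option is permutation importance, sketched below as an assumption rather than as the study's method. The predictors `factor_a` and `factor_b`, the synthetic data, and the stand-in model are all hypothetical.

```python
# Toy illustration only: synthetic predictors, not the study's factors or method.
import random


def mse(model, rows, target):
    """Mean squared error of `model` over the given rows."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, target)) / len(rows)


def permutation_importance(model, rows, target, column, rng):
    """Increase in MSE after shuffling one predictor column."""
    baseline = mse(model, rows, target)
    shuffled = [r[column] for r in rows]
    rng.shuffle(shuffled)
    permuted = [
        r[:column] + (v,) + r[column + 1:] for r, v in zip(rows, shuffled)
    ]
    return mse(model, permuted, target) - baseline


rng = random.Random(0)

# Two hypothetical predictors; the target depends strongly on the first
# and only weakly on the second.
rows = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(500)]
target = [3.0 * x1 + 0.3 * x2 for x1, x2 in rows]


def model(row):
    """Stand-in for a fitted model (here it matches the target exactly)."""
    x1, x2 = row
    return 3.0 * x1 + 0.3 * x2


importance = {
    name: permutation_importance(model, rows, target, col, rng)
    for col, name in enumerate(["factor_a", "factor_b"])
}
print(importance)
```

Shuffling a predictor breaks its link to the target, so the resulting rise in error indicates how much the model relies on it. In this toy setup `factor_a` should rank well above `factor_b`.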
The study emphasizes a key point: relying solely on physical models or ML models to improve predictions of flood season rainfall has inherent limitations. It therefore explores a climate prediction method that integrates ML with physical models. The rapidly evolving fields of artificial intelligence and big data offer new ways to optimize and refine model outputs, helping to address nonlinear and complex challenges that traditional dynamical-statistical methods cannot. In doing so, the study demonstrates a feasible path for developing the traditional dynamical-statistical method into a dynamic-ML method.
Despite the progress, challenges remain. “Our next steps will focus on extracting pre-existing and real-time signals from research on flood season precipitation formation mechanisms to develop [a] dynamic-ML method with stronger physical interpretability,” explained Dr. Haipeng Yu, the corresponding author from the Northwest Institute of Eco-Environment and Resources, Chinese Academy of Sciences.
This research marks significant progress in precipitation prediction and offers valuable insights for developing future meteorological methods that integrate artificial intelligence and big data.
“Our ultimate goal is to create an efficient, stable, and interpretable system combining climate system models and ML techniques for predicting flood season rainfall, helping to mitigate the impacts of extreme precipitation and related disasters,” said Yu.
As technology advances, integrating physical mechanisms with ML-based prediction methods holds immense potential for tackling the challenges of climate change.
Fig. 1. Comparison of ACC and PS metrics for CatBoost, XGBoost, LR, LightGBM, RF, SVM, BP, and climate model output (BCC-CSM). The line plot represents ACC (left y-axis), and the bar plot represents PS (right y-axis); the x-axis represents the years from 2016 to 2022. (Image by Haipeng YU)
Fig. 2. The spatial anomaly percentage distribution for 2019–22, where the years are in increasing order from top to bottom. Panels (a), (d), (g), and (j) are the observed distributions for the four years; (b), (e), (h), and (k) are the results of the LightGBM method proposed in this study; and (c), (f), (i), and (l) are the results of the previous linear observational constraint correction model. (Image by Haipeng YU)