Application of random forest regression to the calculation of
gas-phase chemistry within the GEOS-Chem chemistry model v10

Keller, Christoph A.; Evans, M. J.

doi:10.5194/gmd-2018-229

Cited by 23 publications

(40 citation statements)

References 36 publications


“…While the correlation and MSE summary statistics are very similar for both the DNN and Random Forest regression, the Random Forest prediction time is 27% slower than the DNN. Though the model prediction time is not perfectly optimized, these results are consistent with previous work demonstrating rapid computation from DNNs (Rasp et al, ), and slower implementations of Random Forests (Keller & Evans, ). Given the computational challenges many models already face, the lack of process‐based information currently available in models of Vd, and the great potential for DNN model portability and retraining (Chollet & Allaire, ), we believe that the application of a DNN for this purpose is well justified.…”

Section: Results (supporting)

confidence: 87%

“…Machine learning methods have gained popularity in recent years as high‐quality methods to make rapid and accurate predictions from data in the earth and environmental sciences. Keller and Evans () demonstrate the potential of using a Random Forest algorithm to replace gas phase chemistry in a global chemical transport model. Nowack et al () use a Ridge regression technique to linearly parameterize ozone‐temperature relationships for climate models and find that their machine learning model can predict global ozone fields quite well at a fraction of the computational cost of traditional nonparameterized methods.…”

Section: Introduction (mentioning)

confidence: 99%


A Deep Learning Parameterization for Ozone Dry Deposition Velocities

Silva, Heald, Ravela et al. 2019

Geophysical Research Letters


The loss of ozone to terrestrial and aquatic systems, known as dry deposition, is a highly uncertain process governed by turbulent transport, interfacial chemistry, and plant physiology. We demonstrate the value of using Deep Neural Networks (DNN) in predicting ozone dry deposition velocities. We find that a feedforward DNN trained on observations from a coniferous forest site (Hyytiälä, Finland) can predict hourly ozone dry deposition velocities at a mixed forest site (Harvard Forest, Massachusetts) more accurately than modern theoretical models, with a reduction in the normalized mean bias (0.05 versus ~0.1). The same DNN model, when driven by assimilated meteorology at 2° × 2.5° spatial resolution, outperforms the Wesely scheme as implemented in the GEOS‐Chem model. With more available training data from other climate and ecological zones, this methodology could yield a generalizable DNN suitable for global models.


“…On the more interpretable end, machine learning algorithms are being used increasingly within environmental sciences, with recent examples including linear Ridge Regression and Random Forest models to replace computationally-expensive processes (Keller and Evans, 2018; Nowack et al, 2018) and Gaussian Process emulation to explore model biases on a global scale (Lee et al, 2011; Revell et al, 2018).…”

Section: Introduction (mentioning)

confidence: 99%

A machine learning based global sea-surface iodide distribution

Sherwen, Chance, Tinel et al. 2019

Preprint

Self Cite


Abstract. Iodide in the sea-surface plays an important role in the Earth system. It modulates the oxidising capacity of the troposphere and provides iodine to terrestrial ecosystems. However, our understanding of its distribution is limited due to a paucity of observations. Previous efforts to generate global distributions have generally fitted sea-surface iodide observations to relatively simple functions of sea-surface temperature (Chance et al., 2014; MacDonald et al., 2014). This approach fails to account for coastal influences and variation in the bio-geochemical environment. Here we use a machine learning regression approach (Random Forest Regression) to generate a high resolution (0.125° x 0.125°, ∼ 12.5 km), monthly dataset of present-day global sea-surface iodide. We use a compilation of iodide observations (Chance et al., 2019b) that is 45 % larger than has been used previously (Chance et al., 2014) as the dependent variable and co-located ancillary parameters (temperature, nitrate, phosphate, salinity, shortwave radiation, topographic depth, mixed layer depth, and chlorophyll-a) from global climatologies as the independent variables. We investigate the regression models generated using different combinations of ancillary parameters and select the ten best-performing models to be included in an ensemble prediction. We then use this ensemble of models, combined with global fields of the ancillary parameters, to predict a new high resolution global sea-surface iodide field. Sea-surface temperature is the most important variable in all of the top ten models. We estimate a global average sea-surface iodide concentration of 106 nM (with an uncertainty of ∼ 20 %), which is within the range of previous estimates (60–130 nM). Similar to previous work, higher concentrations are predicted for the tropics than for the extra-tropics. 
Unlike the previous parameterisations, higher concentrations are also predicted for shallow areas such as coastal regions and the South China Sea. Compared to previous work, the new parameterisation better captures observed variability. The iodide concentrations calculated here are significantly higher (40 % on a global basis) than the commonly used MacDonald et al. (2014) parameterisation, with implications for our understanding of iodine in the atmosphere. The global iodide dataset is made freely available to the community (DOI: https://doi.org/10/gfv5v3) and as new observations are made, we will update the global dataset through a "living data" model.


“…The ensemble learning algorithms are a kind of machine learning that have been increasingly used in geoscientific applications (Catani et al, ; Keller & Evans, ; O'Gorman & Dwyer, ; Reichstein et al, ). The basic idea behind ensemble learning is to combine multiple weak learners for obtaining better predictions.…”

Section: Methods (mentioning)

confidence: 99%

Can Terrestrial Water Storage Dynamics be Estimated From Climate Anomalies?

Jing, Zhao, Yao et al. 2020

Earth and Space Science


Freshwater stored on land is an extremely vital resource for all the terrestrial life on Earth. But our ability to record the change of land water storage is weak despite its importance. In this study, we attempt to establish a data‐driven model for simulating terrestrial water storage dynamics by relating climate forcings with terrestrial water storage anomalies (TWSAs) from the Gravity Recovery and Climate Experiment (GRACE) satellites. In the case study in Pearl River basin, China, the relationships were learned by using two ensemble learning algorithms, the Random Forest (RF) and eXtreme Gradient Boost (XGB), respectively. The TWSA in the basin was reconstructed back to past decades and compared with the TWSA derived from global land surface models. As a result, the RF and XGB algorithms both perform well and could nicely reproduce the spatial pattern and value range of GRACE observations, outperforming the land surface models. Temporal behaviors of the reconstructed TWSA time series well capture those of both GRACE and land surface models time series. A multiscale GRACE‐based drought index was proposed, and the index matches the Standardized Precipitation‐Evapotranspiration Index time series at different time scales. The case analysis for years of 1963 and 1998 indicates the ability of the reconstructed TWSA for identifying past drought and flood extremes. The importance of different input variables to the TWSA estimation model was quantified, and the precipitation of the prior 2 months is the most important variable for simulating the TWSA of the current month in the model. Results of this study highlight the great potentials for estimating terrestrial water storage dynamics from climate forcing data by using machine learning to achieve comparable results than complex physical models.
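The citation statement for this paper describes ensemble learning as combining multiple weak learners to obtain better predictions. A minimal sketch of that idea follows, assuming bootstrap aggregation (bagging) of one-split regression stumps on made-up toy data. It is not the code of any cited paper: a Random Forest additionally randomizes the features considered at each split and grows deeper trees, and the function names `fit_stump` and `fit_bagged_stumps` are hypothetical.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Fit a weak learner: a single-split regression stump on one feature."""
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue  # a split must leave data on both sides
        lm, rm = mean(left), mean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:
        # All x values identical: fall back to predicting the overall mean.
        m = mean(ys)
        return lambda x: m
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def fit_bagged_stumps(xs, ys, n_learners=50, seed=0):
    """Train each stump on a bootstrap resample and average their predictions."""
    rng = random.Random(seed)
    n = len(xs)
    learners = []
    for _ in range(n_learners):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        learners.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: mean(f(x) for f in learners)


# Toy data (an assumption for illustration only): y = x^2 on [0, 4).
toy_x = [i / 10 for i in range(40)]
toy_y = [x * x for x in toy_x]

single = fit_stump(toy_x, toy_y)
ensemble = fit_bagged_stumps(toy_x, toy_y)
for x in (0.5, 2.0, 3.5):
    print(x, single(x), ensemble(x))
```

Each stump alone can only output two values. Because the bootstrap resamples place the split at slightly different thresholds, the averaged prediction can take intermediate values near the split, which is the sense in which weak learners are combined.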

