Random forests with spatial proxies for environmental modelling: opportunities and pitfalls

  • dc.contributor.author Milà, Carles
  • dc.contributor.author Ludwig, Marvin
  • dc.contributor.author Pebesma, Edzer
  • dc.contributor.author Tonne, Cathryn
  • dc.contributor.author Meyer, Hannah V.
  • dc.date.accessioned 2024-11-21T07:41:39Z
  • dc.date.available 2024-11-21T07:41:39Z
  • dc.date.issued 2024
  • dc.description.abstract Spatial proxies such as coordinates and Euclidean distance fields are often added as predictors in random forest models; however, their suitability in different predictive conditions has not yet been thoroughly assessed. We investigated 1) the conditions under which spatial proxies are suitable, 2) the reasons for such adequacy, and 3) how proxy suitability can be assessed using cross-validation. In a simulation and two case studies, we found that adding spatial proxies improved model performance when both residual spatial autocorrelation, and regularly or randomly-distributed training samples, were present. Otherwise, inclusion of proxies was neutral or counterproductive and resulted in feature extrapolation for clustered samples. Random k-fold cross-validation systematically favoured models with spatial proxies even when not appropriate. As the benefits of spatial proxies are not universal, we recommend using spatial exploratory and validation analyses to determine their suitability, and considering alternative inherently spatial RF-GLS models.
  • dc.description.sponsorship Carles Milà was supported by a PhD fellowship funded by the Spanish Ministerio de Ciencia e Innovación (grant no. PRE2020-092303). We also acknowledge support from grant no. CEX2018-000806-S, funded by MCIN/AEI/10.13039/501100011033, and from the Generalitat de Catalunya through the CERCA programme.
  • dc.format.mimetype application/pdf
  • dc.identifier.citation Milà C, Ludwig M, Pebesma E, Tonne C, Meyer H. Random forests with spatial proxies for environmental modelling: opportunities and pitfalls. Geoscientific Model Development. 2024;17:6007-33. DOI: 10.5194/gmd-17-6007-2024
  • dc.identifier.doi http://dx.doi.org/10.5194/gmd-17-6007-2024
  • dc.identifier.issn 1991-959X
  • dc.identifier.uri http://hdl.handle.net/10230/68762
  • dc.language.iso eng
  • dc.publisher European Geosciences Union (EGU)
  • dc.relation.ispartof Geoscientific Model Development. 2024;17:6007-33
  • dc.relation.projectID info:eu-repo/grantAgreement/ES/2PE/PRE2020-092303
  • dc.rights © Author(s) 2024. This work is distributed under the Creative Commons Attribution 4.0 License (http://creativecommons.org/licenses/by/4.0/).
  • dc.rights.accessRights info:eu-repo/semantics/openAccess
  • dc.rights.uri http://creativecommons.org/licenses/by/4.0/
  • dc.title Random forests with spatial proxies for environmental modelling: opportunities and pitfalls
  • dc.type info:eu-repo/semantics/article
  • dc.type.version info:eu-repo/semantics/publishedVersion
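
The abstract refers to coordinates and Euclidean distance fields as spatial proxies added to random forest predictors. As a rough illustration only, the sketch below builds one common variant of such proxies for a single location: the raw coordinates plus distances to the four corners and the centre of the study area. The function name `spatial_proxies`, the `bbox` argument and the output key names are hypothetical and are not taken from the article; the authors' own implementation may differ.

```python
import math


def spatial_proxies(x, y, bbox):
    """Return coordinate and distance-field proxies for one location.

    bbox is (xmin, ymin, xmax, ymax) of the study area; this input and
    the key names below are assumptions made for illustration.
    """
    xmin, ymin, xmax, ymax = bbox
    # Reference points: the four corners plus the centre of the study area.
    refs = {
        "d_sw": (xmin, ymin),
        "d_se": (xmax, ymin),
        "d_nw": (xmin, ymax),
        "d_ne": (xmax, ymax),
        "d_centre": ((xmin + xmax) / 2, (ymin + ymax) / 2),
    }
    # The coordinates themselves are the simplest spatial proxies.
    proxies = {"x": x, "y": y}
    # Each distance field is the Euclidean distance to one reference point.
    for name, (rx, ry) in refs.items():
        proxies[name] = math.hypot(x - rx, y - ry)
    return proxies
```

The resulting values would be appended as extra predictor columns next to the environmental covariates. Per the abstract, whether doing so helps depends on residual spatial autocorrelation and on the sampling pattern, so their suitability should be checked with spatial validation rather than random k-fold cross-validation.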