Modeling arsenic in European topsoils with a coupled semiparametric (GAMLSS-RF) model for censored data

Bibliographic Details
Main Authors: Arthur Nicolaus Fendrich, Elise Van Eynde, Dimitrios M. Stasinopoulos, Robert A. Rigby, Felipe Yunta Mezquita, Panos Panagos
Format: Article
Language: English
Published: Elsevier 2024-03-01
Series: Environment International
Subjects: Arsenic; GAMLSS; Random forest; Soil contamination; Statistical modeling; Trace element
Online Access: http://www.sciencedirect.com/science/article/pii/S0160412024001302
_version_ 1797248436611842048
author Arthur Nicolaus Fendrich
Elise Van Eynde
Dimitrios M. Stasinopoulos
Robert A. Rigby
Felipe Yunta Mezquita
Panos Panagos
author_sort Arthur Nicolaus Fendrich
collection DOAJ
description Arsenic (As) is a versatile heavy metalloid trace element extensively used in industrial applications. As is a carcinogen, poses health risks through both inhalation and ingestion, and is associated with an increased risk of liver, kidney, lung, and bladder tumors. In the agricultural context, the repeated application of arsenical products leads to elevated soil concentrations, which are also affected by environmental and management variables. Since exposure to As poses risks, effective assessment tools are needed to support environmental and health policies. However, the most comprehensive soil As data available, the Land Use/Cover Area frame statistical Survey (LUCAS) database, has severe limitations due to high detection limits. Although these detection limits comply with International Organization for Standardization standards, they preclude the adoption of standard methodologies for data analysis. The present work focused on developing a new method to model As contamination in European soils using LUCAS soil samples. We introduce the GAMLSS-RF model, a novel approach that couples Random Forests with Generalized Additive Models for Location, Scale, and Shape. The semiparametric model can capture non-linear interactions among input variables while accommodating censored and non-censored observations, and it can be calibrated to include information from other campaign databases. After fitting and validating a spatial model, we produced European-scale As concentration maps at a 250 m spatial resolution and evaluated the patterns against reference values (i.e., two action levels and a background concentration). We found significant variability in As concentration across the continent, with lower concentrations in Northern countries and higher concentrations in Portugal, Spain, Austria, France and Belgium. By overcoming limitations in existing databases and methodologies, the present approach provides an alternative way to handle highly censored data. The model also constitutes a valuable probabilistic tool for assessing As contamination risks in soils, contributing to informed policy-making for environmental and health protection.
first_indexed 2024-03-07T14:02:22Z
format Article
id doaj.art-baba776e7d49455ba48e26c30f6db0ad
institution Directory Open Access Journal
issn 0160-4120
language English
last_indexed 2024-04-24T20:14:34Z
publishDate 2024-03-01
publisher Elsevier
record_format Article
series Environment International
spelling doaj.art-baba776e7d49455ba48e26c30f6db0ad
2024-03-23T06:22:18Z
eng
Elsevier
Environment International
0160-4120
2024-03-01
185
108544
Modeling arsenic in European topsoils with a coupled semiparametric (GAMLSS-RF) model for censored data
Arthur Nicolaus Fendrich (0): European Commission, Joint Research Centre (JRC), Ispra, VA, Italy; Laboratoire des Sciences du Climat et de l’Environnement, CEA-CNRS-UVSQ-UPSACLAY, 91190 Gif sur Yvette, France; Université Paris-Saclay, INRAE, AgroParisTech, UMR SAD-APT, 91120 Palaiseau, France; Corresponding author at: Laboratoire des Sciences du Climat et de l’Environnement, CEA-CNRS-UVSQ-UPSACLAY, 91190 Gif sur Yvette, France.
Elise Van Eynde (1): European Commission, Joint Research Centre (JRC), Ispra, VA, Italy
Dimitrios M. Stasinopoulos (2): School of Computing and Mathematical Sciences, University of Greenwich, Greenwich, UK
Robert A. Rigby (3): School of Computing and Mathematical Sciences, University of Greenwich, Greenwich, UK
Felipe Yunta Mezquita (4): European Commission, Joint Research Centre (JRC), Ispra, VA, Italy
Panos Panagos (5): European Commission, Joint Research Centre (JRC), Ispra, VA, Italy
http://www.sciencedirect.com/science/article/pii/S0160412024001302
Arsenic
GAMLSS
Random forest
Soil contamination
Statistical modeling
Trace element
title Modeling arsenic in European topsoils with a coupled semiparametric (GAMLSS-RF) model for censored data
title_sort modeling arsenic in european topsoils with a coupled semiparametric gamlss rf model for censored data
topic Arsenic
GAMLSS
Random forest
Soil contamination
Statistical modeling
Trace element
url http://www.sciencedirect.com/science/article/pii/S0160412024001302