Combining Variable Selection and Multiple Linear Regression for Soil Organic Matter and Total Nitrogen Estimation by DRIFT-MIR Spectroscopy
The successful estimation of soil organic matter (SOM) and soil total nitrogen (TN) contents with mid-infrared (MIR) reflectance spectroscopy depends on selecting appropriate variable selection techniques and multivariate methods for regression analysis. This study aimed to explore the potential of...
Main Authors: | Hong Li, Junwei Wang, Jixiong Zhang, Tongqing Liu, Gifty E. Acquah, Huimin Yuan |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-03-01 |
Series: | Agronomy |
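Subjects: | precision agriculture; mid-infrared soil spectroscopy; spectral variable selection; multiple linear regression |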
Online Access: | https://www.mdpi.com/2073-4395/12/3/638 |
_version_ | 1797447384021598208 |
---|---|
author | Hong Li, Junwei Wang, Jixiong Zhang, Tongqing Liu, Gifty E. Acquah, Huimin Yuan |
author_sort | Hong Li |
collection | DOAJ |
description | The successful estimation of soil organic matter (SOM) and soil total nitrogen (TN) contents with mid-infrared (MIR) reflectance spectroscopy depends on selecting appropriate variable selection techniques and multivariate methods for regression analysis. This study aimed to explore the potential of combining a multivariate method and spectral variable selection for soil SOM and TN estimation using MIR spectroscopy. Five hundred and ten topsoil samples were collected from Quzhou County, Hebei Province, China, and their SOM and TN contents and reflectance spectra were measured using DRIFT-MIR spectroscopy (diffuse reflectance infrared Fourier transform in the mid-infrared range, MIR, wavenumber: 4000–400 cm<sup>−1</sup>; wavelength: 2500–25,000 nm). Two multivariate methods (partial least-squares regression, PLSR; multiple linear regression, MLR) combined with two variable selection techniques (stability competitive adaptive reweighted sampling, sCARS; bootstrapping soft shrinkage approach, BOSS) were used for model calibration. The MLR model combined with the sCARS method yielded the most accurate estimation result for both SOM (R<sub>p</sub><sup>2</sup> = 0.72 and RPD = 1.89) and TN (R<sub>p</sub><sup>2</sup> = 0.84 and RPD = 2.50). Out of the 2382 wavenumbers in a full spectrum, sCARS determined that only 31 variables were important for SOM estimation (accounting for 1.30% of all variables) and 27 variables were important for TN estimation (accounting for 1.13% of all variables). The results demonstrated that sCARS was a highly efficient approach for extracting information on wavenumbers and mitigating redundant wavenumbers. In addition, the current study indicated that MLR, which is simpler than PLSR, when combined with spectral variable selection, can achieve high-precision prediction of SOM and TN content. As such, DRIFT-MIR spectroscopy coupled with MLR and sCARS is a good alternative for estimating the SOM and TN of soils. |
first_indexed | 2024-03-09T13:55:31Z |
format | Article |
id | doaj.art-e1fbc8f0e7784dbb957f0db6a0870eed |
institution | Directory of Open Access Journals |
issn | 2073-4395 |
language | English |
last_indexed | 2024-03-09T13:55:31Z |
publishDate | 2022-03-01 |
publisher | MDPI AG |
record_format | Article |
series | Agronomy |
spelling | doaj.art-e1fbc8f0e7784dbb957f0db6a0870eed; 2023-11-30T20:44:25Z; eng; MDPI AG; Agronomy; ISSN 2073-4395; 2022-03-01; vol. 12, no. 3, art. 638; DOI 10.3390/agronomy12030638. Affiliations: Hong Li, Junwei Wang, Jixiong Zhang, Tongqing Liu, Huimin Yuan: College of Resources and Environmental Sciences, National Academy of Agriculture Green Development, Key Laboratory of Plant-Soil Interactions, Ministry of Education, China Agricultural University, Beijing 100193, China. Gifty E. Acquah: Department of Sustainable Agriculture Sciences, Rothamsted Research, Harpenden, Hertfordshire AL5 2JQ, UK. Keywords: precision agriculture; mid-infrared soil spectroscopy; spectral variable selection; multiple linear regression. https://www.mdpi.com/2073-4395/12/3/638 |
title | Combining Variable Selection and Multiple Linear Regression for Soil Organic Matter and Total Nitrogen Estimation by DRIFT-MIR Spectroscopy |
topic | precision agriculture; mid-infrared soil spectroscopy; spectral variable selection; multiple linear regression |
url | https://www.mdpi.com/2073-4395/12/3/638 |
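
The arithmetic in the description above can be checked with a short sketch. It is not the authors' code: the function names, the synthetic data, and the hard-coded `selected` column indices are all hypothetical, and the selection step (sCARS or BOSS in the study) is not implemented here. The sketch only shows the wavenumber-to-wavelength conversion, the share of selected variables, and how R² and RPD could be computed for a multiple linear regression fitted on a pre-selected subset of spectral variables. RPD is taken as the standard deviation of the reference values divided by the prediction RMSE, which is one common definition and is assumed here.

```python
import numpy as np


def wavenumber_to_wavelength_nm(wavenumber_cm1):
    """Convert a wavenumber in cm^-1 to a wavelength in nm (lambda = 1e7 / nu)."""
    return 1e7 / wavenumber_cm1


def selected_fraction_percent(n_selected, n_total):
    """Share of retained variables, as a percentage of all variables."""
    return 100.0 * n_selected / n_total


def fit_mlr(X, y):
    """Ordinary least-squares MLR with an intercept; returns [b0, b1, ...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    """Apply fitted MLR coefficients to new predictor rows."""
    return coef[0] + X @ coef[1:]


def r2_and_rpd(y_true, y_pred):
    """Return (R^2, RPD), with RPD assumed to be SD(reference) / RMSE."""
    residuals = y_true - y_pred
    rmse = np.sqrt(np.mean(residuals ** 2))
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - np.sum(residuals ** 2) / ss_tot
    rpd = np.std(y_true, ddof=1) / rmse
    return r2, rpd


def approx_r2_from_rpd(rpd):
    """Large-sample approximation R^2 ~ 1 - 1/RPD^2 under the definitions above."""
    return 1.0 - 1.0 / rpd ** 2


# Synthetic stand-in for spectra: 200 samples, 50 variables (not real soil data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
selected = [3, 17, 42]  # hypothetical output of a variable selection step
y = 2.0 * X[:, 3] - X[:, 17] + 0.5 * X[:, 42] + rng.normal(scale=0.1, size=200)

# Calibrate on the first 150 samples, validate on the remaining 50.
X_cal, X_val = X[:150][:, selected], X[150:][:, selected]
coef = fit_mlr(X_cal, y[:150])
r2, rpd = r2_and_rpd(y[150:], predict_mlr(coef, X_val))

print(wavenumber_to_wavelength_nm(4000), wavenumber_to_wavelength_nm(400))
print(selected_fraction_percent(31, 2382), selected_fraction_percent(27, 2382))
print(r2, rpd)
```

Under these assumed definitions, R² and RPD are linked by roughly R² ≈ 1 − 1/RPD², so the reported pairs (0.72 with 1.89, and 0.84 with 2.50) are mutually consistent. The record itself does not state which exact formulas the authors used.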