Normalizing and Correcting Variable and Complex LC–MS Metabolomic Data with the R Package pseudoDrift

In biological research domains, liquid chromatography–mass spectroscopy (LC-MS) has prevailed as the preferred technique for generating high quality metabolomic data. However, even with advanced instrumentation and established data acquisition protocols, technical errors are still routinely encounte...

Bibliographic Details
Main Authors: Jonas Rodriguez, Lina Gomez-Cano, Erich Grotewold, Natalia de Leon
Format: Article
Language: English
Published: MDPI AG 2022-05-01
Series: Metabolites
Subjects: maize, metabolomics, LC–MS, signal drift, data normalization
Online Access: https://www.mdpi.com/2218-1989/12/5/435
collection DOAJ
description In biological research domains, liquid chromatography–mass spectroscopy (LC-MS) has prevailed as the preferred technique for generating high quality metabolomic data. However, even with advanced instrumentation and established data acquisition protocols, technical errors are still routinely encountered and can pose a significant challenge to unveiling biologically relevant information. In large-scale studies, signal drift and batch effects are how technical errors are most commonly manifested. We developed pseudoDrift, an R package with capabilities for data simulation and outlier detection, and a new training and testing approach that is implemented to capture and to optionally correct for technical errors in LC–MS metabolomic data. Using data simulation, we demonstrate here that our approach performs equally as well as existing methods and offers increased flexibility to the researcher. As part of our study, we generated a targeted LC–MS dataset that profiled 33 phenolic compounds from seedling stem tissue in 602 genetically diverse non-transgenic maize inbred lines. This dataset provides a unique opportunity to investigate the dynamics of specialized metabolism in plants.
id doaj.art-d62567f83f754188b0a6121d2fc34c2e
institution Directory Open Access Journal
issn 2218-1989
record doaj.art-d62567f83f754188b0a6121d2fc34c2e (2023-11-23T12:07:25Z)
citation Metabolites (MDPI AG), ISSN 2218-1989, published 2022-05-01, volume 12, issue 5, article 435
doi 10.3390/metabo12050435
affiliations Jonas Rodriguez: Department of Agronomy, University of Wisconsin-Madison, Madison, WI 53706, USA
Lina Gomez-Cano: Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
Erich Grotewold: Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
Natalia de Leon: Department of Agronomy, University of Wisconsin-Madison, Madison, WI 53706, USA
topic maize
metabolomics
LC–MS
signal drift
data normalization
url https://www.mdpi.com/2218-1989/12/5/435