Normalizing and Correcting Variable and Complex LC–MS Metabolomic Data with the R Package pseudoDrift
In biological research domains, liquid chromatography–mass spectrometry (LC–MS) has prevailed as the preferred technique for generating high-quality metabolomic data. However, even with advanced instrumentation and established data acquisition protocols, technical errors are still routinely encounte...
Main Authors: | Jonas Rodriguez, Lina Gomez-Cano, Erich Grotewold, Natalia de Leon |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-05-01 |
Series: | Metabolites |
Subjects: | maize, metabolomics, LC–MS, signal drift, data normalization |
Online Access: | https://www.mdpi.com/2218-1989/12/5/435 |
_version_ | 1797497951092736000 |
---|---|
author | Jonas Rodriguez; Lina Gomez-Cano; Erich Grotewold; Natalia de Leon |
author_facet | Jonas Rodriguez; Lina Gomez-Cano; Erich Grotewold; Natalia de Leon |
author_sort | Jonas Rodriguez |
collection | DOAJ |
description | In biological research domains, liquid chromatography–mass spectrometry (LC–MS) has prevailed as the preferred technique for generating high-quality metabolomic data. However, even with advanced instrumentation and established data acquisition protocols, technical errors are still routinely encountered and can pose a significant challenge to uncovering biologically relevant information. In large-scale studies, technical errors most commonly manifest as signal drift and batch effects. We developed pseudoDrift, an R package with capabilities for data simulation and outlier detection, and a new training and testing approach that captures and optionally corrects for technical errors in LC–MS metabolomic data. Using data simulation, we demonstrate that our approach performs as well as existing methods and offers increased flexibility to the researcher. As part of our study, we generated a targeted LC–MS dataset that profiled 33 phenolic compounds from seedling stem tissue in 602 genetically diverse non-transgenic maize inbred lines. This dataset provides a unique opportunity to investigate the dynamics of specialized metabolism in plants. |
first_indexed | 2024-03-10T03:26:28Z |
format | Article |
id | doaj.art-d62567f83f754188b0a6121d2fc34c2e |
institution | Directory of Open Access Journals |
issn | 2218-1989 |
language | English |
last_indexed | 2024-03-10T03:26:28Z |
publishDate | 2022-05-01 |
publisher | MDPI AG |
record_format | Article |
series | Metabolites |
spelling | doaj.art-d62567f83f754188b0a6121d2fc34c2e; 2023-11-23T12:07:25Z; eng; MDPI AG; Metabolites; 2218-1989; 2022-05-01; volume 12, issue 5, article 435; 10.3390/metabo12050435; Normalizing and Correcting Variable and Complex LC–MS Metabolomic Data with the R Package pseudoDrift; Jonas Rodriguez (Department of Agronomy, University of Wisconsin-Madison, Madison, WI 53706, USA); Lina Gomez-Cano (Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA); Erich Grotewold (Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA); Natalia de Leon (Department of Agronomy, University of Wisconsin-Madison, Madison, WI 53706, USA); https://www.mdpi.com/2218-1989/12/5/435; maize; metabolomics; LC–MS; signal drift; data normalization |
title | Normalizing and Correcting Variable and Complex LC–MS Metabolomic Data with the R Package pseudoDrift |
topic | maize; metabolomics; LC–MS; signal drift; data normalization |
url | https://www.mdpi.com/2218-1989/12/5/435 |