Transcriptomics in Toxicogenomics, Part II: Preprocessing and Differential Expression Analysis for High Quality Data
Preprocessing of transcriptomics data plays a pivotal role in the development of toxicogenomics-driven tools for chemical toxicity assessment. The generation and exploitation of large volumes of molecular profiles, following an appropriate experimental design, allows the employment of toxicogenomics...
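Main Authors: | Antonio Federico, Angela Serra, My Kieu Ha, Pekka Kohonen, Jang-Sik Choi, Irene Liampa, Penny Nymark, Natasha Sanabria, Luca Cattelani, Michele Fratello, Pia Anneli Sofia Kinaret, Karolina Jagiello, Tomasz Puzyn, Georgia Melagraki, Mary Gulumian, Antreas Afantitis, Haralambos Sarimveis, Tae-Hyun Yoon, Roland Grafström, Dario Greco |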
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2020-05-01 |
Series: | Nanomaterials |
Subjects: | toxicogenomics; transcriptomics; RNA-Seq; scRNA-Seq; microarray; data preprocessing |
Online Access: | https://www.mdpi.com/2079-4991/10/5/903 |
_version_ | 1797568521039773696 |
---|---|
author | Antonio Federico; Angela Serra; My Kieu Ha; Pekka Kohonen; Jang-Sik Choi; Irene Liampa; Penny Nymark; Natasha Sanabria; Luca Cattelani; Michele Fratello; Pia Anneli Sofia Kinaret; Karolina Jagiello; Tomasz Puzyn; Georgia Melagraki; Mary Gulumian; Antreas Afantitis; Haralambos Sarimveis; Tae-Hyun Yoon; Roland Grafström; Dario Greco |
author_sort | Antonio Federico |
collection | DOAJ |
description | Preprocessing of transcriptomics data plays a pivotal role in the development of toxicogenomics-driven tools for chemical toxicity assessment. The generation and exploitation of large volumes of molecular profiles, following an appropriate experimental design, enables the use of toxicogenomics (TGx) approaches for a thorough characterisation of the mechanism of action (MOA) of different compounds. To date, a plethora of data preprocessing methodologies have been suggested. However, in most cases, building the optimal analytical workflow is not straightforward. The tools must be selected carefully, since the choice affects the downstream analyses and modelling approaches. Transcriptomics data preprocessing spans multiple steps, such as quality check, filtering, normalization, and batch effect detection and correction. Currently, there is a lack of standard guidelines for data preprocessing in the TGx field. Defining the optimal tools and procedures for transcriptomics data preprocessing will lead to the generation of homogeneous and unbiased data, allowing the development of more reliable, robust and accurate predictive models. In this review, we outline methods for the preprocessing of data from three main transcriptomic technologies: microarray, bulk RNA-Sequencing (RNA-Seq), and single-cell RNA-Sequencing (scRNA-Seq). Moreover, we discuss the most common methods for identifying differentially expressed genes and performing functional enrichment analysis. This review is the second part of a three-article series on Transcriptomics in Toxicogenomics. |
first_indexed | 2024-03-10T19:58:08Z |
format | Article |
id | doaj.art-16fe4d100e26449aac0a033b4f6e951a |
institution | Directory of Open Access Journals |
issn | 2079-4991 |
language | English |
last_indexed | 2024-03-10T19:58:08Z |
publishDate | 2020-05-01 |
publisher | MDPI AG |
record_format | Article |
series | Nanomaterials |
spelling | doaj.art-16fe4d100e26449aac0a033b4f6e951a; 2023-11-19T23:46:39Z; eng; MDPI AG; Nanomaterials; ISSN 2079-4991; 2020-05-01; volume 10, issue 5, article 903; DOI 10.3390/nano10050903; https://www.mdpi.com/2079-4991/10/5/903. Affiliations: Faculty of Medicine and Health Technology, Tampere University, FI-33014 Tampere, Finland (Antonio Federico, Angela Serra, Luca Cattelani, Michele Fratello, Pia Anneli Sofia Kinaret, Dario Greco); Center for Next Generation Cytometry, Hanyang University, Seoul 04763, Korea (My Kieu Ha, Jang-Sik Choi, Tae-Hyun Yoon); Institute of Environmental Medicine, Karolinska Institutet, 171 77 Stockholm, Sweden (Pekka Kohonen, Penny Nymark, Roland Grafström); School of Chemical Engineering, National Technical University of Athens, 157 80 Athens, Greece (Irene Liampa, Haralambos Sarimveis); National Institute for Occupational Health, Johannesburg 30333, South Africa (Natasha Sanabria, Mary Gulumian); QSAR Lab Ltd., Aleja Grunwaldzka 190/102, 80-266 Gdansk, Poland (Karolina Jagiello, Tomasz Puzyn); Nanoinformatics Department, NovaMechanics Ltd., Nicosia 1065, Cyprus (Georgia Melagraki, Antreas Afantitis) |
title | Transcriptomics in Toxicogenomics, Part II: Preprocessing and Differential Expression Analysis for High Quality Data |
topic | toxicogenomics; transcriptomics; RNA-Seq; scRNA-Seq; microarray; data preprocessing |
url | https://www.mdpi.com/2079-4991/10/5/903 |