Transcriptomics in Toxicogenomics, Part II: Preprocessing and Differential Expression Analysis for High Quality Data

Preprocessing of transcriptomics data plays a pivotal role in the development of toxicogenomics-driven tools for chemical toxicity assessment. The generation and exploitation of large volumes of molecular profiles, following an appropriate experimental design, allow the use of toxicogenomics...

Bibliographic Details
Main Authors: Antonio Federico, Angela Serra, My Kieu Ha, Pekka Kohonen, Jang-Sik Choi, Irene Liampa, Penny Nymark, Natasha Sanabria, Luca Cattelani, Michele Fratello, Pia Anneli Sofia Kinaret, Karolina Jagiello, Tomasz Puzyn, Georgia Melagraki, Mary Gulumian, Antreas Afantitis, Haralambos Sarimveis, Tae-Hyun Yoon, Roland Grafström, Dario Greco
Format: Article
Language: English
Published: MDPI AG 2020-05-01
Series: Nanomaterials
Subjects: toxicogenomics, transcriptomics, RNA-Seq, scRNA-Seq, microarray, data preprocessing
Online Access: https://www.mdpi.com/2079-4991/10/5/903
author Antonio Federico
Angela Serra
My Kieu Ha
Pekka Kohonen
Jang-Sik Choi
Irene Liampa
Penny Nymark
Natasha Sanabria
Luca Cattelani
Michele Fratello
Pia Anneli Sofia Kinaret
Karolina Jagiello
Tomasz Puzyn
Georgia Melagraki
Mary Gulumian
Antreas Afantitis
Haralambos Sarimveis
Tae-Hyun Yoon
Roland Grafström
Dario Greco
author_sort Antonio Federico
collection DOAJ
description Preprocessing of transcriptomics data plays a pivotal role in the development of toxicogenomics-driven tools for chemical toxicity assessment. The generation and exploitation of large volumes of molecular profiles, following an appropriate experimental design, allow the use of toxicogenomics (TGx) approaches for a thorough characterisation of the mechanism of action (MOA) of different compounds. To date, a plethora of data preprocessing methodologies have been suggested. However, in most cases, building the optimal analytical workflow is not straightforward. The tools must be selected carefully, since the choice affects the downstream analyses and modelling approaches. Transcriptomics data preprocessing spans multiple steps, such as quality checking, filtering, normalization, and batch effect detection and correction. Currently, there is a lack of standard guidelines for data preprocessing in the TGx field. Defining the optimal tools and procedures for transcriptomics data preprocessing will lead to the generation of homogeneous and unbiased data, allowing the development of more reliable, robust and accurate predictive models. In this review, we outline methods for the preprocessing of data from three main transcriptomic technologies: microarray, bulk RNA-Sequencing (RNA-Seq), and single cell RNA-Sequencing (scRNA-Seq). Moreover, we discuss the most common methods for identifying differentially expressed genes and for performing functional enrichment analysis. This review is the second part of a three-article series on Transcriptomics in Toxicogenomics.
first_indexed 2024-03-10T19:58:08Z
format Article
id doaj.art-16fe4d100e26449aac0a033b4f6e951a
institution Directory of Open Access Journals
issn 2079-4991
language English
last_indexed 2024-03-10T19:58:08Z
publishDate 2020-05-01
publisher MDPI AG
record_format Article
series Nanomaterials
spelling doaj.art-16fe4d100e26449aac0a033b4f6e951a
2023-11-19T23:46:39Z
eng
MDPI AG
Nanomaterials
2079-4991
2020-05-01
10(5): 903
10.3390/nano10050903
Transcriptomics in Toxicogenomics, Part II: Preprocessing and Differential Expression Analysis for High Quality Data
Antonio Federico: Faculty of Medicine and Health Technology, Tampere University, FI-33014 Tampere, Finland
Angela Serra: Faculty of Medicine and Health Technology, Tampere University, FI-33014 Tampere, Finland
My Kieu Ha: Center for Next Generation Cytometry, Hanyang University, Seoul 04763, Korea
Pekka Kohonen: Institute of Environmental Medicine, Karolinska Institutet, 171 77 Stockholm, Sweden
Jang-Sik Choi: Center for Next Generation Cytometry, Hanyang University, Seoul 04763, Korea
Irene Liampa: School of Chemical Engineering, National Technical University of Athens, 157 80 Athens, Greece
Penny Nymark: Institute of Environmental Medicine, Karolinska Institutet, 171 77 Stockholm, Sweden
Natasha Sanabria: National Institute for Occupational Health, Johannesburg 30333, South Africa
Luca Cattelani: Faculty of Medicine and Health Technology, Tampere University, FI-33014 Tampere, Finland
Michele Fratello: Faculty of Medicine and Health Technology, Tampere University, FI-33014 Tampere, Finland
Pia Anneli Sofia Kinaret: Faculty of Medicine and Health Technology, Tampere University, FI-33014 Tampere, Finland
Karolina Jagiello: QSAR Lab Ltd., Aleja Grunwaldzka 190/102, 80-266 Gdansk, Poland
Tomasz Puzyn: QSAR Lab Ltd., Aleja Grunwaldzka 190/102, 80-266 Gdansk, Poland
Georgia Melagraki: Nanoinformatics Department, NovaMechanics Ltd., Nicosia 1065, Cyprus
Mary Gulumian: National Institute for Occupational Health, Johannesburg 30333, South Africa
Antreas Afantitis: Nanoinformatics Department, NovaMechanics Ltd., Nicosia 1065, Cyprus
Haralambos Sarimveis: School of Chemical Engineering, National Technical University of Athens, 157 80 Athens, Greece
Tae-Hyun Yoon: Center for Next Generation Cytometry, Hanyang University, Seoul 04763, Korea
Roland Grafström: Institute of Environmental Medicine, Karolinska Institutet, 171 77 Stockholm, Sweden
Dario Greco: Faculty of Medicine and Health Technology, Tampere University, FI-33014 Tampere, Finland
title Transcriptomics in Toxicogenomics, Part II: Preprocessing and Differential Expression Analysis for High Quality Data
topic toxicogenomics
transcriptomics
RNA-Seq
scRNA-Seq
microarray
data preprocessing
url https://www.mdpi.com/2079-4991/10/5/903