Evaluating the Performance of the Generalized Linear Model (glm) R Package Using Single-Cell RNA-Sequencing Data

The glm R package is commonly used for generalized linear modeling. In this paper, we evaluate the ability of the glm package to predict binomial outcomes using logistic regression. We use single-cell RNA-sequencing datasets, after a series of normalization steps, to fit glm models repeatedly us...

Bibliographic Details
Main Authors: Omar Alaqeeli, Raad Alturki
Format: Article
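Language: English
Published: MDPI AG 2023-10-01
Series: Applied Sciences
Subjects: generalized linear model; classification; single-cell RNA-sequencing; precision; recall; F1-Score
Online Access: https://www.mdpi.com/2076-3417/13/20/11512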
author Omar Alaqeeli
Raad Alturki
collection DOAJ
description The glm R package is commonly used for generalized linear modeling. In this paper, we evaluate the ability of the glm package to predict binomial outcomes using logistic regression. We use single-cell RNA-sequencing datasets, after a series of normalization steps, to fit glm models repeatedly using 10-fold cross-validation over 100 iterations. Our evaluation criteria are glm’s Precision, Recall, F1-Score, Area Under the Curve (AUC), and Runtime. Scores for each evaluation category are collected, and their medians are calculated. Our findings show that glm’s Precision and F1-Score fluctuate, its Recall is more stable, and its AUC is remarkably high. The Runtime of glm is also consistent. Finally, the size of the fitted data does not correlate with glm’s Precision, Recall, F1-Score, or AUC; it correlates only with Runtime.
first_indexed 2024-03-10T21:28:47Z
format Article
id doaj.art-d64e02f50b2b4b399d8943fd1b4f91f7
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-10T21:28:47Z
publishDate 2023-10-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-d64e02f50b2b4b399d8943fd1b4f91f7 2023-11-19T15:33:12Z eng MDPI AG, Applied Sciences, ISSN 2076-3417, 2023-10-01, vol. 13, no. 20, art. 11512, DOI 10.3390/app132011512
Omar Alaqeeli (0), Raad Alturki (1)
0: Department of Computer Science, Saudi Electronic University, Riyadh 11673, Saudi Arabia
1: Department of Computer Science, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11432, Saudi Arabia
title Evaluating the Performance of the Generalized Linear Model (glm) R Package Using Single-Cell RNA-Sequencing Data
topic generalized linear model
classification
single-cell RNA-sequencing
precision
recall
F1-Score
url https://www.mdpi.com/2076-3417/13/20/11512
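
The evaluation criteria named in the description (Precision, Recall, F1-Score, and AUC, summarized by their medians across cross-validation folds) can be illustrated with a minimal sketch. It is written in Python rather than R, does not call glm, and is not the authors' code. The function names, the 0.5 decision threshold, and the per-fold input layout are assumptions made for illustration only.

```python
from statistics import median


def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 for binary labels (1 = positive, 0 = negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def summarize(folds, threshold=0.5):
    """Median of each metric over folds.

    `folds` is a list of (y_true, scores) pairs, one per held-out fold;
    `scores` are predicted probabilities. The threshold is an assumption.
    """
    rows = []
    for y_true, scores in folds:
        y_pred = [1 if s >= threshold else 0 for s in scores]
        p, r, f = precision_recall_f1(y_true, y_pred)
        rows.append((p, r, f, auc(y_true, scores)))
    names = ("precision", "recall", "f1", "auc")
    return {name: median(col) for name, col in zip(names, zip(*rows))}
```

In a setup like the one the description outlines, `folds` would hold the held-out labels and predicted probabilities from each of the 10 folds, and `summarize` would be called once per iteration or over all folds pooled; which of the two the authors used is not stated in this record.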