chemmodlab: a cheminformatics modeling laboratory R package for fitting and assessing machine learning models

Abstract The goal of chemmodlab is to streamline the fitting and assessment pipeline for many machine learning models in R, making it easy for researchers to compare the utility of these models. While focused on implementing methods for model fitting and assessment that have been accepted by experts...

Bibliographic Details
Main Authors: Jeremy R. Ash, Jacqueline M. Hughes-Oliver
Format: Article
Language: English
Published: BMC 2018-11-01
Series: Journal of Cheminformatics
Subjects: Machine learning; QSAR; R package; Initial enhancement; Enrichment factor; Accumulation curve
Online Access: http://link.springer.com/article/10.1186/s13321-018-0309-4
author Jeremy R. Ash
Jacqueline M. Hughes-Oliver
collection DOAJ
description Abstract The goal of chemmodlab is to streamline the fitting and assessment pipeline for many machine learning models in R, making it easy for researchers to compare the utility of these models. While focused on implementing methods for model fitting and assessment that have been accepted by experts in the cheminformatics field, all of the methods in chemmodlab have broad utility for the machine learning community. chemmodlab contains several assessment utilities, including a plotting function that constructs accumulation curves and a function that computes many performance measures. The most novel feature of chemmodlab is the ease with which statistically significant performance differences for many machine learning models are presented by means of the multiple comparisons similarity plot. Differences are assessed using repeated k-fold cross-validation, where blocking increases precision and multiplicity adjustments are applied. chemmodlab is freely available on CRAN at https://cran.r-project.org/web/packages/chemmodlab/index.html.
first_indexed 2024-12-11T08:13:05Z
format Article
id doaj.art-38c09a78e6fe4546a8fba2acfe5d2a5e
institution Directory Open Access Journal
issn 1758-2946
language English
last_indexed 2024-12-11T08:13:05Z
publishDate 2018-11-01
publisher BMC
record_format Article
series Journal of Cheminformatics
spelling doaj.art-38c09a78e6fe4546a8fba2acfe5d2a5e | 2022-12-22T01:14:50Z | eng | BMC | Journal of Cheminformatics | 1758-2946 | 2018-11-01 | 10 | 1 | 1 | 20 | 10.1186/s13321-018-0309-4 | chemmodlab: a cheminformatics modeling laboratory R package for fitting and assessing machine learning models | Jeremy R. Ash 0 | Jacqueline M. Hughes-Oliver 1 | Department of Statistics, Bioinformatics Research Center, North Carolina State University | Department of Statistics, North Carolina State University | Abstract The goal of chemmodlab is to streamline the fitting and assessment pipeline for many machine learning models in R, making it easy for researchers to compare the utility of these models. While focused on implementing methods for model fitting and assessment that have been accepted by experts in the cheminformatics field, all of the methods in chemmodlab have broad utility for the machine learning community. chemmodlab contains several assessment utilities, including a plotting function that constructs accumulation curves and a function that computes many performance measures. The most novel feature of chemmodlab is the ease with which statistically significant performance differences for many machine learning models are presented by means of the multiple comparisons similarity plot. Differences are assessed using repeated k-fold cross-validation, where blocking increases precision and multiplicity adjustments are applied. chemmodlab is freely available on CRAN at https://cran.r-project.org/web/packages/chemmodlab/index.html. | http://link.springer.com/article/10.1186/s13321-018-0309-4 | Machine learning | QSAR | R package | Initial enhancement | Enrichment factor | Accumulation curve
title chemmodlab: a cheminformatics modeling laboratory R package for fitting and assessing machine learning models
topic Machine learning
QSAR
R package
Initial enhancement
Enrichment factor
Accumulation curve
url http://link.springer.com/article/10.1186/s13321-018-0309-4