Predicting xylose yield from prehydrolysis of hardwoods: A machine learning approach

Hemicelluloses are amorphous polymers of sugar molecules that make up a major fraction of lignocellulosic biomasses. They have applications in the bioenergy, textile, mining, cosmetic, and pharmaceutical industries. Industrial use of hemicellulose often requires that the polymer be hydrolyzed into c...


Bibliographic Details
Main Authors:	Edward Wang, Riley Ballachay, Genpei Cai, Yankai Cao, Heather L. Trajano
Format:	Article
Language:	English
Published:	Frontiers Media S.A. 2022-10-01
Series:	Frontiers in Chemical Engineering
Subjects:	hemicellulose; dilute acid hydrolysis; autohydrolysis; kinetics; machine learning; support vector regression
Online Access:	https://www.frontiersin.org/articles/10.3389/fceng.2022.994428/full

author	Edward Wang, Riley Ballachay, Genpei Cai, Yankai Cao, Heather L. Trajano
author_sort	Edward Wang
collection	DOAJ
description	Hemicelluloses are amorphous polymers of sugar molecules that make up a major fraction of lignocellulosic biomasses. They have applications in the bioenergy, textile, mining, cosmetic, and pharmaceutical industries. Industrial use of hemicellulose often requires that the polymer be hydrolyzed into constituent oligomers and monomers. Traditional models of hemicellulose degradation are kinetic, and usually only appropriate for limited operating regimes and specific species. The study of hemicellulose hydrolysis has yielded substantial data in the literature, enabling a diverse data set to be collected for general and widely applicable machine learning models. In this paper, a dataset containing 1955 experimental data points on batch hemicellulose hydrolysis of hardwood was collected from 71 published papers dated from 1985 to 2019. Three machine learning models (ridge regression, support vector regression and artificial neural networks) are assessed on their ability to predict xylose yield and compared to a kinetic model. Although the performance of ridge regression was unsatisfactory, both support vector regression and artificial neural networks outperformed the simple kinetic model. The artificial neural network outperformed support vector regression, reducing the mean absolute error in predicting soluble xylose yield of test data to 6.18%. The results suggest that machine learning models trained on historical data may be used to supplement experimental data, reducing the number of experiments needed.
first_indexed	2024-04-11T10:07:57Z
format	Article
id	doaj.art-fa033e5c76094f39916201f2ef44dab0
institution	Directory of Open Access Journals
issn	2673-2718
language	English
last_indexed	2024-04-11T10:07:57Z
publishDate	2022-10-01
publisher	Frontiers Media S.A.
record_format	Article
series	Frontiers in Chemical Engineering
spelling	doaj.art-fa033e5c76094f39916201f2ef44dab0 | 2022-12-22T04:30:11Z | eng | Frontiers Media S.A. | Frontiers in Chemical Engineering | 2673-2718 | 2022-10-01 | 4 | 10.3389/fceng.2022.994428 | 994428 | Predicting xylose yield from prehydrolysis of hardwoods: A machine learning approach | Edward Wang [0], Riley Ballachay [1], Genpei Cai [2], Yankai Cao [3], Heather L. Trajano [4, 5] | [0-4] Department of Chemical and Biological Engineering, The University of British Columbia, Vancouver, BC, Canada; [5] BioProducts Institute, The University of British Columbia, Vancouver, BC, Canada | https://www.frontiersin.org/articles/10.3389/fceng.2022.994428/full | hemicellulose; dilute acid hydrolysis; autohydrolysis; kinetics; machine learning; support vector regression
title	Predicting xylose yield from prehydrolysis of hardwoods: A machine learning approach
topic	hemicellulose; dilute acid hydrolysis; autohydrolysis; kinetics; machine learning; support vector regression
url	https://www.frontiersin.org/articles/10.3389/fceng.2022.994428/full


Similar Items