A Convolutional Neural Network-based gradient boosting framework for prediction of the band gap of photo-active catalysts
A recent trend in chemical synthesis is photo-catalysis, which uses photo-active catalyst materials that are semiconductor materials. A well-known electronic property of semiconducting materials is the band gap. A photo-catalyst’s desired band gap range is between 1.5 eV and 6.2 eV. A rational desig...
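The band gap window quoted above can be read as a simple screening rule. The following is a minimal sketch, not the authors' code: the function names and the candidate values are hypothetical, and the only outside fact used is the standard photon energy relation λ (nm) ≈ 1239.84 / E (eV), which shows that the 1.5 eV to 6.2 eV window corresponds to absorption edges of roughly 827 nm down to 200 nm.

```python
# Product of Planck's constant and the speed of light, in eV*nm.
HC_EV_NM = 1239.84198


def photon_wavelength_nm(energy_ev):
    """Wavelength (nm) of a photon whose energy equals the given band gap (eV)."""
    return HC_EV_NM / energy_ev


def in_photoactive_window(band_gap_ev, low=1.5, high=6.2):
    """True if the band gap lies inside the window quoted in the abstract."""
    return low <= band_gap_ev <= high


# Hypothetical candidates: label -> band gap in eV (illustrative values only).
candidates = {"sample_a": 0.9, "sample_b": 3.2, "sample_c": 7.0}

# Keep only the candidates that pass the initial band gap screen.
screened = {name: gap for name, gap in candidates.items()
            if in_photoactive_window(gap)}
```

Here `screened` keeps only `sample_b`, the one hypothetical candidate whose band gap falls inside the window.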
Main Authors: | Avan Kumar, Sreedevi Upadhyayula, Hariprasad Kodamana |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier 2023-09-01 |
Series: | Digital Chemical Engineering |
Subjects: | Photo-active catalyst; Band gap; Deep learning models; CNN; VGG; Gradient boosting |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2772508123000273 |
_version_ | 1797798569353150464 |
---|---|
author | Avan Kumar; Sreedevi Upadhyayula; Hariprasad Kodamana |
author_sort | Avan Kumar |
collection | DOAJ |
description | A recent trend in chemical synthesis is photo-catalysis, which uses photo-active catalyst materials that are semiconductor materials. A well-known electronic property of semiconducting materials is the band gap. A photo-catalyst’s desired band gap range is between 1.5 eV and 6.2 eV. The rational design and synthesis of photo-active catalysts require knowledge of the band gap as an initial screening parameter. Herein, we propose an integrated deep learning-based framework to classify photo-active catalysts and predict their band gap using compositional features. To this end, we have utilized a dataset extracted from the “catalyst hub” site by web scraping with the help of a Python script. Extensive data cleaning and pre-processing are done to make the input data amenable for training the models. Additional features are generated using two methods: (a) one-hot encoding and (b) calculating the mean of the embeddings of catalysts computed by Mat2Vec, a pre-trained transformer-based model. With the help of this generated feature set, we have proposed a two-stage deep-learning framework for classification and regression tasks. In the first stage, a 2D-Convolutional Neural Net (CNN)-based classifier is used to classify whether a catalyst belongs to the photo-active catalyst class. After the first-stage screening, in the second stage, we use a 1D-VGG-based gradient boosting framework to predict the band gap of the photo-active catalyst using only compositional features as inputs. The 2D-CNN for the classification task has an accuracy of 0.903 and 0.886 for the train and test datasets, respectively. Further, the proposed integrated model that uses 1D-Convolutional layers of VGG followed by the XGBoostRegressor has a test R² of 0.750, much higher than baseline models reported in the literature. |
first_indexed | 2024-03-13T04:05:44Z |
format | Article |
id | doaj.art-e0d4fd3a2dd740bf9bc0f060270f78bf |
institution | Directory Open Access Journal |
issn | 2772-5081 |
language | English |
last_indexed | 2024-03-13T04:05:44Z |
publishDate | 2023-09-01 |
publisher | Elsevier |
record_format | Article |
series | Digital Chemical Engineering |
spelling | doaj.art-e0d4fd3a2dd740bf9bc0f060270f78bf; 2023-06-21T07:01:35Z; eng; Elsevier; Digital Chemical Engineering; 2772-5081; 2023-09-01; 8; 100109; A Convolutional Neural Network-based gradient boosting framework for prediction of the band gap of photo-active catalysts; Avan Kumar (0); Sreedevi Upadhyayula (1); Hariprasad Kodamana (2); (0) Department of Chemical Engineering, Indian Institute of Technology Delhi, New Delhi, 110016, India; (1) Department of Chemical Engineering, Indian Institute of Technology Delhi, New Delhi, 110016, India; (2) Department of Chemical Engineering, Indian Institute of Technology Delhi, New Delhi, 110016, India; Yardi School of Artificial Intelligence, Indian Institute of Technology Delhi, New Delhi, 110016, India; Corresponding author at: Department of Chemical Engineering, Indian Institute of Technology Delhi, New Delhi, 110016, India; http://www.sciencedirect.com/science/article/pii/S2772508123000273; Photo-active catalyst; Band gap; Deep learning models; CNN; VGG; Gradient boosting |
title | A Convolutional Neural Network-based gradient boosting framework for prediction of the band gap of photo-active catalysts |
title_sort | convolutional neural network based gradient boosting framework for prediction of the band gap of photo active catalysts |
topic | Photo-active catalyst; Band gap; Deep learning models; CNN; VGG; Gradient boosting |
url | http://www.sciencedirect.com/science/article/pii/S2772508123000273 |
work_keys_str_mv | AT avankumar aconvolutionalneuralnetworkbasedgradientboostingframeworkforpredictionofthebandgapofphotoactivecatalysts AT sreedeviupadhyayula aconvolutionalneuralnetworkbasedgradientboostingframeworkforpredictionofthebandgapofphotoactivecatalysts AT hariprasadkodamana aconvolutionalneuralnetworkbasedgradientboostingframeworkforpredictionofthebandgapofphotoactivecatalysts AT avankumar convolutionalneuralnetworkbasedgradientboostingframeworkforpredictionofthebandgapofphotoactivecatalysts AT sreedeviupadhyayula convolutionalneuralnetworkbasedgradientboostingframeworkforpredictionofthebandgapofphotoactivecatalysts AT hariprasadkodamana convolutionalneuralnetworkbasedgradientboostingframeworkforpredictionofthebandgapofphotoactivecatalysts |
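
The description field mentions two compositional feature types: a one-hot encoding and the mean of pre-trained embeddings. The record does not say how either is computed, so the following is only a sketch under stated assumptions: the encoding is taken per element, the embedding table `TOY_EMBEDDINGS` holds made-up 3-dimensional vectors rather than real Mat2Vec output, and the helper names `parse_formula`, `one_hot`, `mean_embedding`, and `featurize` are hypothetical. The parser handles only flat formulas without parentheses.

```python
import re
from collections import Counter

# Hypothetical element embeddings for illustration; NOT real Mat2Vec vectors.
TOY_EMBEDDINGS = {
    "Ti": [0.2, -0.1, 0.5],
    "O": [-0.3, 0.4, 0.1],
    "Zn": [0.6, 0.0, -0.2],
}
# Fixed element order for the one-hot part: ['O', 'Ti', 'Zn'].
VOCAB = sorted(TOY_EMBEDDINGS)

# An element symbol followed by an optional integer or decimal amount.
TOKEN = re.compile(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?")


def parse_formula(formula):
    """Map each element symbol in a flat formula to its amount, e.g. TiO2."""
    counts = Counter()
    for symbol, amount in TOKEN.findall(formula):
        counts[symbol] += float(amount) if amount else 1.0
    return dict(counts)


def one_hot(formula):
    """1 for every vocabulary element present in the formula, else 0."""
    present = parse_formula(formula)
    return [1 if element in present else 0 for element in VOCAB]


def mean_embedding(formula):
    """Unweighted mean of the embeddings of the elements in the formula."""
    elements = list(parse_formula(formula))
    dim = len(next(iter(TOY_EMBEDDINGS.values())))
    return [
        sum(TOY_EMBEDDINGS[element][i] for element in elements) / len(elements)
        for i in range(dim)
    ]


def featurize(formula):
    """Concatenate the one-hot part and the mean-embedding part."""
    return one_hot(formula) + mean_embedding(formula)


features = featurize("TiO2")
```

For `"TiO2"` the vector has six entries: three presence flags followed by the three averaged embedding components. A feature vector of this kind would be the input to the classifier and regressor stages named in the description; those models are not sketched here because the record gives no architectural details.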