Predicting the fuel performance of coal-based liquids using the ML-QSPR method

A comprehensive understanding of the composition and physicochemical properties of coal-based liquids, such as coal tar or coal direct liquefaction oil, is conducive to the rapid development of multi-purpose, high-performance and high-value-added products and the efficient use of oil properties. A f...

Bibliographic Details
Main Authors: Wenying LI, Xiangling WANG, Huanhuan FAN, Hongxia FAN, Jie FENG
Format: Article
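Language: zho
Published: Editorial Office of Journal of China Coal Society 2024-03-01
Series: Meitan xuebao
Subjects: coal tar; liquids from direct coal liquefaction; coal structure; coal composition; molecular descriptors
Online Access: http://www.mtxb.com.cn/article/doi/10.13225/j.cnki.jccs.2023.1701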
author Wenying LI
Xiangling WANG
Huanhuan FAN
Hongxia FAN
Jie FENG
collection DOAJ
description A comprehensive understanding of the composition and physicochemical properties of coal-based liquids, such as coal tar or coal direct liquefaction oil, supports the rapid development of multi-purpose, high-performance, high-value-added products and the efficient use of oil properties. Knowing which ideal components make up a coal-based liquid mixture, and what their physical and chemical properties are, is also the key to designing liquid fuels with specific properties. The authors use the RDKit toolkit in Python, working from Simplified Molecular Input Line Entry System (SMILES) strings, to construct molecular descriptors suited to the substances in coal-based liquids. The descriptors extract the required structural fragments from these molecules. Because coal-based liquids consist mainly of the elements C, H, O, N, and S and contain many polycyclic aromatic structures, the structural fragment descriptors are designed mainly around the element counts and ring counts of polycyclic aromatic compounds. Descriptors for the number of atoms and the molecular weight are added to the structural fragment descriptors, giving 115 molecular descriptors in total. Compared with traditional manual information extraction, the constructed descriptors can quickly extract the information contained in a large number of coal-based liquid molecules.
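The descriptor construction described above can be illustrated with a minimal sketch. It is not the authors' 115-descriptor set, which the abstract does not list. The function name `describe`, the feature names, and the single aromatic-hydroxyl SMARTS fragment are assumptions chosen for illustration; only standard RDKit calls are used.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors, rdMolDescriptors

# Elements named in the abstract as the main constituents of coal-based liquids.
ELEMENTS = ("C", "H", "O", "N", "S")

# Hypothetical example fragment (hydroxyl on an aromatic carbon); the
# authors' actual fragment list is not given in the abstract.
AROMATIC_OH = Chem.MolFromSmarts("c[OX2H]")


def describe(smiles: str) -> dict:
    """Return a small, illustrative descriptor dictionary for one SMILES string."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")

    # Make hydrogens explicit so that they appear in the atom counts.
    mol_h = Chem.AddHs(mol)

    counts = {element: 0 for element in ELEMENTS}
    for atom in mol_h.GetAtoms():
        symbol = atom.GetSymbol()
        if symbol in counts:
            counts[symbol] += 1

    features = {f"n_{element}": n for element, n in counts.items()}
    features["n_atoms"] = mol_h.GetNumAtoms()
    features["n_rings"] = rdMolDescriptors.CalcNumRings(mol)
    features["n_aromatic_rings"] = rdMolDescriptors.CalcNumAromaticRings(mol)
    features["n_aromatic_OH"] = len(mol.GetSubstructMatches(AROMATIC_OH))
    features["mol_weight"] = Descriptors.MolWt(mol)
    return features


if __name__ == "__main__":
    # Naphthalene and phenol, two simple aromatic molecules used as examples.
    for smiles in ("c1ccc2ccccc2c1", "Oc1ccccc1"):
        print(smiles, describe(smiles))
```

Each dictionary is one row of a feature table; a full descriptor set would add more fragment patterns in the same way.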
The structural fragments, molecular weights and atom counts obtained with the constructed descriptors are used as input features for machine learning (ML) to establish a quantitative structure-property relationship method (ML-QSPR) for coal-based liquids. The method gives fast and accurate predictions of four properties: the lower heating value (LHV), the liquid density (ρ), the flash point (FP) and the cetane number (CN). Model validation shows R² values of 0.996, 0.988, and 0.987 for LHV, ρ, and FP, respectively. The CN model, built with mixtures added to the data, reaches R² = 0.959. The ML-QSPR method is more accurate than the methods reported in the literature and obtains properties much faster than traditional experimental methods. Using the property database obtained from the ML-QSPR predictions, the authors examine how the four combustion performance parameters of different groups of substances change as the number of carbon atoms increases; all four properties are significantly affected by the carbon number (n). Comparing substances of different families shows that their LHV differs little and that the magnitude of LHV is mainly determined by n. For ρ, FP and CN, the differences between families are obvious. The trained model can be used to predict the properties of new molecules for new fuel design. The ML-QSPR method is expected to serve as a transfer learning model for the property analysis of different coal-based liquids in other application scenarios.
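The R² values quoted above are coefficients of determination. The abstract does not say how they were evaluated (for example, on a held-out test set or by cross-validation), so the following is only a sketch of the standard definition, R² = 1 - SS_res / SS_tot; the function name `r_squared` is an assumption.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: squared errors of the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: squared deviations of the observations from their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are identical")
    return 1.0 - ss_res / ss_tot
```

A perfect prediction gives R² = 1, and always predicting the mean of the observed values gives R² = 0, so values such as 0.996 mean that the residual error is a small fraction of the total variance in the measured property.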
first_indexed 2024-04-24T15:31:26Z
format Article
id doaj.art-ae44913c0d3f4af1af081986393807b6
institution Directory Open Access Journal
issn 0253-9993
language zho
last_indexed 2024-04-24T15:31:26Z
publishDate 2024-03-01
publisher Editorial Office of Journal of China Coal Society
record_format Article
series Meitan xuebao
spelling doaj.art-ae44913c0d3f4af1af081986393807b6 2024-04-02T04:00:38Z zho Editorial Office of Journal of China Coal Society, Meitan xuebao, ISSN 0253-9993, 2024-03-01, vol. 49, no. 2, pp. 1098-1110, doi 10.13225/j.cnki.jccs.2023.1701, article 2023-1701
Predicting the fuel performance of coal-based liquids using the ML-QSPR method
Wenying LI, Xiangling WANG, Huanhuan FAN, Hongxia FAN, Jie FENG (all at State Key Laboratory of Clean and Efficient Coal Utilization, Taiyuan University of Technology, Taiyuan 030024, China)
http://www.mtxb.com.cn/article/doi/10.13225/j.cnki.jccs.2023.1701
coal tar; liquids from direct coal liquefaction; coal structure; coal composition; molecular descriptors
title Predicting the fuel performance of coal-based liquids using the ML-QSPR method
topic coal tar
liquids from direct coal liquefaction
coal structure
coal composition
molecular descriptors
url http://www.mtxb.com.cn/article/doi/10.13225/j.cnki.jccs.2023.1701