Data mining and knowledge discovery in chemical processes: Effect of alternative processing techniques
Data mining and knowledge discovery (DMKD) focuses on extracting useful information from data. In the chemical process industry, tasks such as process monitoring, fault detection, process control, optimization, etc., can be achieved using DMKD. However, the selection of the appropriate method for ea...
Main Authors: | Luis A. Briceno-Mena, Miriam Nnadili, Michael G. Benton, Jose A. Romagnoli |
---|---|
Format: | Article |
Language: | English |
Published: | Cambridge University Press, 2022-01-01 |
Series: | Data-Centric Engineering |
Subjects: | Data mining; knowledge discovery; machine learning; process monitoring; unsupervised learning |
Online Access: | https://www.cambridge.org/core/product/identifier/S2632673622000211/type/journal_article |
_version_ | 1811156423967506432 |
---|---|
author | Luis A. Briceno-Mena; Miriam Nnadili; Michael G. Benton; Jose A. Romagnoli |
author_sort | Luis A. Briceno-Mena |
collection | DOAJ |
description | Data mining and knowledge discovery (DMKD) focuses on extracting useful information from data. In the chemical process industry, tasks such as process monitoring, fault detection, process control, optimization, etc., can be achieved using DMKD. However, the selection of the appropriate method for each step in the DMKD process, namely data cleaning, sampling, scaling, dimensionality reduction (DR), clustering, clustering analysis and data visualization to obtain meaningful insights is far from trivial. In this contribution, a computational environment (FastMan) is introduced and used to illustrate how method selection affects DMKD in chemical process data. Two case studies, using data from a simulated natural gas liquid plant and real data from an industrial pyrolysis unit, were conducted to demonstrate the applicability of these methodologies in real-life scenarios. Sampling and normalization methods were found to have a great impact on the quality of the DMKD results. Also, a neighbor graphs method for DR, t-distributed stochastic neighbor embedding, outperformed principal component analysis, a matrix factorization method frequently used in the chemical process industry for identifying both local and global changes. |
first_indexed | 2024-04-10T04:51:20Z |
format | Article |
id | doaj.art-3fd963e292fe42ffafcb0a09f115814f |
institution | Directory Open Access Journal |
issn | 2632-6736 |
language | English |
last_indexed | 2024-04-10T04:51:20Z |
publishDate | 2022-01-01 |
publisher | Cambridge University Press |
record_format | Article |
series | Data-Centric Engineering |
spelling | doaj.art-3fd963e292fe42ffafcb0a09f115814f; 2023-03-09T12:31:51Z; eng; Cambridge University Press; Data-Centric Engineering; 2632-6736; 2022-01-01; vol. 3; doi:10.1017/dce.2022.21; Data mining and knowledge discovery in chemical processes: Effect of alternative processing techniques; Luis A. Briceno-Mena (https://orcid.org/0000-0003-3684-4232), Miriam Nnadili, Michael G. Benton, Jose A. Romagnoli (all: Cain Department of Chemical Engineering, Louisiana State University, Baton Rouge, Louisiana 70803, USA); https://www.cambridge.org/core/product/identifier/S2632673622000211/type/journal_article; Data mining; knowledge discovery; machine learning; process monitoring; unsupervised learning |
title | Data mining and knowledge discovery in chemical processes: Effect of alternative processing techniques |
title_sort | data mining and knowledge discovery in chemical processes effect of alternative processing techniques |
topic | Data mining; knowledge discovery; machine learning; process monitoring; unsupervised learning |
url | https://www.cambridge.org/core/product/identifier/S2632673622000211/type/journal_article |
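
The description in this record reports that normalization methods had a great impact on the quality of the DMKD results. The sketch below is a minimal illustration of why that can happen, not the FastMan environment or the authors' procedure: it uses five hypothetical samples (invented temperature and mole-fraction values, not data from the article) and shows that the nearest neighbor of a sample can change once each variable is z-score scaled. Distance-based steps such as clustering and neighbor-graph dimensionality reduction depend on exactly these neighbor relations.

```python
import math
import statistics

# Hypothetical samples (NOT from the article): (temperature in K, mole fraction).
SAMPLES = {
    "A": (350.0, 0.10),
    "B": (352.0, 0.90),
    "C": (360.0, 0.12),
    "D": (400.0, 0.50),
    "E": (300.0, 0.55),
}


def zscore_scale(samples):
    """Scale each variable to zero mean and unit (population) standard deviation."""
    columns = list(zip(*samples.values()))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]
    return {
        name: tuple((v - m) / s for v, m, s in zip(row, means, stds))
        for name, row in samples.items()
    }


def nearest_neighbor(samples, query):
    """Return the name of the sample closest to `query` in Euclidean distance."""
    others = (name for name in samples if name != query)
    return min(others, key=lambda name: math.dist(samples[name], samples[query]))


# Unscaled, temperature (tens of kelvin) dominates the distance.
raw_nn = nearest_neighbor(SAMPLES, "A")
# After z-scoring, both variables contribute on a comparable scale.
scaled_nn = nearest_neighbor(zscore_scale(SAMPLES), "A")
print(raw_nn, scaled_nn)  # prints: B C
```

Without scaling, sample A is paired with B, which has a similar temperature but a very different composition; after scaling, A is paired with C, which is close in both variables. Which pairing is more meaningful depends on the process and the question being asked.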