SPECTROSCOPY DATA CALIBRATION USING STACKED ENSEMBLE MACHINE LEARNING
Near infrared spectroscopy (NIRS) is a widely used analytical technique for non-destructive analysis of various materials including food fraud detection. However, the accurate calibration of NIRS data can be challenging due to the complexity of the underlying relationships between the spectral data...
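Main Authors: | Mahmud Iwan Solihin, Chan Jin Yuan, Wan Siu Hong, Liew Phing Pui, Ang Chun Kit, Wafa Hossain, Affiani Machmudah |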
---|---|
Format: | Article |
Language: | English |
Published: | IIUM Press, International Islamic University Malaysia, 2024-01-01 |
Series: | International Islamic University Malaysia Engineering Journal |
Subjects: | chemometrics calibration; ensemble machine learning; near infrared spectroscopy |
Online Access: | https://journals.iium.edu.my/ejournal/index.php/iiumej/article/view/2796 |
_version_ | 1827394210669527040 |
---|---|
author | Mahmud Iwan Solihin; Chan Jin Yuan; Wan Siu Hong; Liew Phing Pui; Ang Chun Kit; Wafa Hossain; Affiani Machmudah |
author_sort | Mahmud Iwan Solihin |
collection | DOAJ |
description |
Near-infrared spectroscopy (NIRS) is a widely used analytical technique for the non-destructive analysis of various materials, with applications that include food fraud detection. However, accurate calibration of NIRS data can be challenging because of the complex relationships between the spectral data and the target variables of interest. Ensemble learning, which combines multiple models to make predictions, has been shown to improve the accuracy and robustness of predictive models in various domains. This paper proposes stacking ensemble machine learning (SEML), which involves two levels of learning, for the calibration of NIRS data. Eight (8) spectroscopy datasets from a public repository and from the authors' previously published work are used as case studies. The model generalized well, reaching an R² of at least 0.8 on the test samples in the regression tasks and a classification accuracy (CA) of at least 0.8 in the classification tasks. In addition, the proposed SEML improves on, or at least matches, the accuracy of the individual base learners on both training and test samples for all regression and classification datasets. It shows superior performance on the test samples, with R² ranging from 0.86 to nearly 1 for the regression datasets and CA ranging from 0.89 to 1 for the classification datasets.
ABSTRAK (Malay abstract, translated): Near-infrared spectroscopy (NIRS) is an analytical technique widely used to analyse various materials without damaging them, including in the detection of food fraud. However, accurate calibration of NIRS data is very challenging because the relationship between the spectral data and the target variables under study is complex. Ensemble learning, which combines multiple models to make predictions, has been shown to improve the accuracy and efficiency of predictive models in various settings. This study proposes stacking ensemble machine learning (SEML) for calibrating NIRS data, involving two levels of learning. Eight (8) spectroscopy datasets from a public repository and from the authors' earlier studies were used as case studies. The model generalized the data in the respective regression tasks with an R² of at least 0.8 on the test samples, and in the respective classification tasks with a classification accuracy (CA) of at least 0.8. In addition, the proposed SEML improves on, or at least matches, the accuracy of the individual base learners on both training and test samples for all regression and classification datasets. It shows the best performance on the test samples for both groups of datasets, with R² ranging from 0.86 to nearly 1 and CA ranging from 0.89 to 1.
|
first_indexed | 2024-03-08T18:08:17Z |
format | Article |
id | doaj.art-1fb93a9c14134066b3fdc87615a0dba2 |
institution | Directory Open Access Journal |
issn | 1511-788X 2289-7860 |
language | English |
last_indexed | 2024-03-08T18:08:17Z |
publishDate | 2024-01-01 |
publisher | IIUM Press, International Islamic University Malaysia |
record_format | Article |
series | International Islamic University Malaysia Engineering Journal |
spelling | doaj.art-1fb93a9c14134066b3fdc87615a0dba2; 2024-01-01T11:24:05Z; eng; IIUM Press, International Islamic University Malaysia; International Islamic University Malaysia Engineering Journal; ISSN 1511-788X, 2289-7860; 2024-01-01; vol. 25, no. 1; DOI 10.31436/iiumej.v25i1.2796; SPECTROSCOPY DATA CALIBRATION USING STACKED ENSEMBLE MACHINE LEARNING; Mahmud Iwan Solihin (https://orcid.org/0000-0002-5293-7466), Chan Jin Yuan (https://orcid.org/0000-0002-4262-1628), Wan Siu Hong (https://orcid.org/0000-0002-8254-4962), Liew Phing Pui, Ang Chun Kit (https://orcid.org/0000-0002-1215-909X), Wafa Hossain, Affiani Machmudah; affiliations in listed order: UCSI University (six entries), Airlangga University (one entry); https://journals.iium.edu.my/ejournal/index.php/iiumej/article/view/2796; chemometrics calibration; ensemble machine learning; near infrared spectroscopy |
title | SPECTROSCOPY DATA CALIBRATION USING STACKED ENSEMBLE MACHINE LEARNING |
topic | chemometrics calibration; ensemble machine learning; near infrared spectroscopy |
url | https://journals.iium.edu.my/ejournal/index.php/iiumej/article/view/2796 |
work_keys_str_mv | AT mahmudiwansolihin spectroscopydatacalibrationusingstackedensemblemachinelearning AT chanjinyuan spectroscopydatacalibrationusingstackedensemblemachinelearning AT wansiuhong spectroscopydatacalibrationusingstackedensemblemachinelearning AT liewphingpui spectroscopydatacalibrationusingstackedensemblemachinelearning AT angchunkit spectroscopydatacalibrationusingstackedensemblemachinelearning AT wafahossain spectroscopydatacalibrationusingstackedensemblemachinelearning AT affianimachmudah spectroscopydatacalibrationusingstackedensemblemachinelearning |
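
The two-level stacking scheme summarized in the description field can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not name the base learners, the meta-learner, or the datasets' layout, so the two base learners here (a one-feature least-squares line and k-nearest-neighbour regression), the no-intercept linear meta-learner, the five-fold split, and the synthetic data are all assumptions chosen only to keep the example self-contained. Level 0 produces out-of-fold predictions; level 1 learns how to weight them.

```python
"""Two-level stacking sketch in pure Python (illustrative; synthetic data)."""
import math
import random


def fit_linear(X, y):
    """Assumed base learner 1: least-squares line on the first feature only."""
    xs = [row[0] for row in X]
    mx, my = sum(xs) / len(xs), sum(y) / len(y)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (t - my) for x, t in zip(xs, y)) / sxx if sxx else 0.0
    return lambda row: my + slope * (row[0] - mx)


def fit_knn(X, y, k=3):
    """Assumed base learner 2: k-nearest-neighbour regression."""
    def predict(row):
        order = sorted(
            range(len(X)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], row)),
        )
        nearest = order[:k]
        return sum(y[i] for i in nearest) / len(nearest)
    return predict


def fit_meta(P, y):
    """Level-1 learner: least-squares weights (no intercept) for two columns."""
    a = sum(p[0] * p[0] for p in P)
    b = sum(p[0] * p[1] for p in P)
    d = sum(p[1] * p[1] for p in P)
    u = sum(p[0] * t for p, t in zip(P, y))
    v = sum(p[1] * t for p, t in zip(P, y))
    det = a * d - b * b
    if abs(det) < 1e-12:  # collinear base predictions: fall back to averaging
        return (0.5, 0.5)
    return ((u * d - v * b) / det, (a * v - b * u) / det)


def fit_stack(X, y, folds=5):
    """Fit base learners out-of-fold, then fit the meta-learner on their outputs."""
    fitters = (fit_linear, fit_knn)
    oof = [[0.0, 0.0] for _ in X]  # out-of-fold predictions, one column per learner
    for f in range(folds):
        train = [i for i in range(len(X)) if i % folds != f]
        held = [i for i in range(len(X)) if i % folds == f]
        Xt, yt = [X[i] for i in train], [y[i] for i in train]
        for j, fit in enumerate(fitters):
            model = fit(Xt, yt)
            for i in held:
                oof[i][j] = model(X[i])
    weights = fit_meta(oof, y)
    models = [fit(X, y) for fit in fitters]  # refit base learners on all data
    return lambda row: sum(w * m(row) for w, m in zip(weights, models))


def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def demo(seed=0):
    """Train on 150 synthetic samples and return R^2 on 50 held-out samples."""
    rng = random.Random(seed)
    X = [[rng.random(), rng.random()] for _ in range(200)]
    y = [2.0 * a + math.sin(3.0 * b) + rng.gauss(0.0, 0.05) for a, b in X]
    model = fit_stack(X[:150], y[:150])
    return r2(y[150:], [model(row) for row in X[150:]])


print(f"test R^2 = {demo():.3f}")
```

Fitting the meta-learner on out-of-fold predictions, rather than on predictions for samples the base learners were trained on, is the usual way to keep the second level from simply rewarding whichever base learner overfits most. For real spectra, a library implementation such as scikit-learn's stacking estimators would normally replace these hand-written learners.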