Big Data Mining and Classification of Intelligent Material Science Data Using Machine Learning

There is a strong need in the material science community for a big data repository of material compositions and their derived metal-strength analytics. Currently, many researchers maintain their own Excel sheets, prepared manually by their team by tabulating the experimental data collected from s...

Bibliographic Details
Main Authors: Swetha Chittam, Balakrishna Gokaraju, Zhigang Xu, Jagannathan Sankar, Kaushik Roy
Format: Article
Language: English
Published: MDPI AG 2021-09-01
Series: Applied Sciences
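Subjects: data mining; mongodb; No-SQL database; classification algorithms; logistic regression; support vector machine SVM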
Online Access: https://www.mdpi.com/2076-3417/11/18/8596
collection DOAJ
description There is a strong need in the material science community for a big data repository of material compositions and their derived metal-strength analytics. Currently, many researchers maintain their own Excel sheets, which their teams prepare manually by tabulating experimental data collected from scientific journals and then analyze by performing manual calculations with formulas to determine the strength of the material. In this study, we propose a big data storage for material science data and its processing-parameter information to address the laborious process of tabulating data from scientific articles, data mining techniques to retrieve the information from databases for big data analytics, and a machine learning prediction model to determine material strength insights. Three models are proposed, based on the Logistic Regression, Support Vector Machine (SVM), and Random Forest algorithms. These models are trained and tested using a 10-fold cross-validation approach. The Random Forest classification model performed best on the independent dataset, with 87% accuracy, compared with 72% for Logistic Regression and 78% for SVM.
first_indexed 2024-03-10T07:55:09Z
format Article
id doaj.art-70666d519bdd4efda7b5d4df5246abe7
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-10T07:55:09Z
publishDate 2021-09-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-70666d519bdd4efda7b5d4df5246abe7
2023-11-22T11:55:14Z
eng
MDPI AG
Applied Sciences
2076-3417
2021-09-01
Volume 11, Issue 18, Article 8596
10.3390/app11188596
Big Data Mining and Classification of Intelligent Material Science Data Using Machine Learning
Swetha Chittam [0]
Balakrishna Gokaraju [1]
Zhigang Xu [2]
Jagannathan Sankar [3]
Kaushik Roy [4]
[0] Department of Computer Science, College of Engineering, North Carolina A&T University, 1601 E. Market Street, Greensboro, NC 27411, USA
[1] Engineering Research Center & Center for Visualization and Computation Advancing Research (ViCAR), Department of Computational Data Science and Engineering, College of Engineering, North Carolina A&T University, 1601 E. Market Street, Greensboro, NC 27411, USA
[2] Department of Mechanical Engineering & Engineering Research Center, College of Engineering, North Carolina A&T University, 1601 E. Market Street, Greensboro, NC 27411, USA
[3] Department of Mechanical Engineering & Engineering Research Center, College of Engineering, North Carolina A&T University, 1601 E. Market Street, Greensboro, NC 27411, USA
[4] Department of Computer Science, College of Engineering, North Carolina A&T University, 1601 E. Market Street, Greensboro, NC 27411, USA
https://www.mdpi.com/2076-3417/11/18/8596
data mining
mongodb
No-SQL database
classification algorithms
logistic regression
support vector machine SVM
topic data mining
mongodb
No-SQL database
classification algorithms
logistic regression
support vector machine SVM
url https://www.mdpi.com/2076-3417/11/18/8596