Summary: | Abstract: In manufacturing, the technology to capture and store large volumes of data developed earlier and faster than the corresponding capabilities to analyze, interpret, and apply it. The result for many manufacturers is a collection of unanalyzed data and uncertainty about where to begin. This paper examines big data as both an enabler and a challenge for the connected manufacturing enterprise and presents a framework that sequentially tests and selects independent variables for training applied machine learning models. Unsuitable features are discarded, and each remaining feature receives a crisp numeric output and a linguistic label, both of which measure the feature’s suitability. The framework is tested on three datasets with time series, binary, and continuous input data. Models trained on the filtered features are compared with models trained on the base, unfiltered feature sets using a proposed performance-size ratio metric. The framework outperforms the base feature sets in all tested cases. Proposed future research will implement the framework in a case study in electronic assembly manufacturing.
|