Analysis of Data from the U.S. Shipbuilding Industry and Application to Improve Performance Metrics

The U.S. Navy is seeking to increase the number of ships in the fleet due to growing threats; however, shipyards are facing numerous issues that delay the delivery of naval warships and cause cost overruns. At the same time, there is significant data available from the construction proces...

Bibliographic Details
Main Author: Willis, Heather L.
Other Authors: Gillespy, Andrew
Format: Thesis
Published: Massachusetts Institute of Technology 2024
Online Access: https://hdl.handle.net/1721.1/155863
collection MIT
description The U.S. Navy is seeking to increase the number of ships in the fleet due to growing threats; however, shipyards are facing numerous issues that delay the delivery of naval warships and cause cost overruns. At the same time, significant data is available from the construction process, creating an opportunity for data analysis aimed at identifying and, ideally, resolving some of these issues. To address these concerns, this thesis examines Earned Value Management (EVM) data from actual shipbuilding projects, using the available datasets to help identify the root causes of such delays. The study begins with data cleaning, an essential step that ensures the integrity and relevance of the real-world data. Preliminary data analysis was then conducted to explore cost variance, schedule adherence, and the learning curve effect observed across different hulls, setting the stage for deeper investigative modeling. Following model exploration and selection, the core of the thesis is a predictive model that uses polynomial and linear regression to predict the progression of costs over time, which is then compared with the prediction metrics currently in use. A regression model was chosen over more complex models such as a long short-term memory (LSTM) neural network because of its simplicity, interpretability, and ease of retraining with new data, so that stakeholders can readily understand and apply the model's insights while it stays relevant over time. The target prediction metric for this model is the Actual Cost of Work Performed (ACWP); however, similar models could also be used to predict schedule. In creating this model, several features were analyzed, including the Budgeted Cost of Work Scheduled (BCWS) and the Budget at Completion (BAC), both of which are known at the start of construction.
After testing various combinations of these features and comparing the mean squared error (MSE), the chosen model uses time and BCWS divided by BAC, which serves as a budgeted completion percentage, as input features. The model is further tailored to reflect industry-specific cost behaviors by enforcing non-negative, cumulative cost predictions. The model was trained, tested, and validated using EVM data from one key event (KE), a specific subset of the overall ship construction process, with the intent that it could be applied to all key events and aggregated to provide cost predictions for an entire hull. This thesis will ideally serve as a framework for shipyards to improve project cost predictions and identify indicators of large cost overruns early enough to correct them within the ship construction timeline.
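The feature construction and output constraint described in the abstract can be illustrated with a minimal sketch. This is not the thesis's code: the function names, the sample numbers, and the way the constraint is enforced (clipping at zero, then taking a running maximum) are all assumptions made for illustration. The record does not say how the thesis implements the non-negative, cumulative requirement.

```python
from itertools import accumulate


def budgeted_completion(bcws, bac):
    """Return BCWS / BAC for each period (the budgeted completion fraction)."""
    if bac <= 0:
        raise ValueError("BAC must be positive")
    return [value / bac for value in bcws]


def enforce_cumulative(predictions):
    """Clip negatives to zero, then take a running maximum so the series
    never decreases. This is one assumed way to enforce the constraint."""
    clipped = [max(0.0, p) for p in predictions]
    return list(accumulate(clipped, max))


def mean_squared_error(actual, predicted):
    """Mean squared error between two equal-length series."""
    if len(actual) != len(predicted):
        raise ValueError("series must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Made-up numbers, for illustration only (not data from the thesis).
bcws = [10.0, 30.0, 60.0, 100.0]             # cumulative budgeted cost per period
bac = 100.0                                   # budget at completion
raw_predictions = [-2.0, 35.0, 33.0, 110.0]   # hypothetical raw regression output
acwp = [12.0, 34.0, 70.0, 115.0]              # hypothetical actual costs

# Input features: (time period, BCWS / BAC) pairs.
features = list(enumerate(budgeted_completion(bcws, bac), start=1))

# Post-processed predictions: non-negative and non-decreasing.
constrained = enforce_cumulative(raw_predictions)

# Score the constrained predictions against the hypothetical actuals.
mse = mean_squared_error(acwp, constrained)
```

In a setup like this, candidate feature sets would each be fitted with a regression model and the one with the lowest MSE on held-out data would be kept, which mirrors the selection step the abstract describes.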
first_indexed 2024-09-23T10:44:45Z
format Thesis
id mit-1721.1/155863
institution Massachusetts Institute of Technology
last_indexed 2024-09-23T10:44:45Z
publishDate 2024
publisher Massachusetts Institute of Technology
record_format dspace
spelling mit-1721.1/155863 2024-08-02T03:43:21Z Analysis of Data from the U.S. Shipbuilding Industry and Application to Improve Performance Metrics Willis, Heather L. Gillespy, Andrew Daniel, Luca Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Massachusetts Institute of Technology. Department of Mechanical Engineering S.M. Nav.E. 2024-08-01T19:02:04Z 2024-08-01T19:02:04Z 2024-05 2024-06-13T16:50:48.957Z Thesis https://hdl.handle.net/1721.1/155863 In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/ application/pdf Massachusetts Institute of Technology
title Analysis of Data from the U.S. Shipbuilding Industry and Application to Improve Performance Metrics
url https://hdl.handle.net/1721.1/155863