Semantic knowledge management system for design documentation with heterogeneous data using machine learning

Design documentation is presumed to contain massive amounts of valuable information and expert knowledge that is useful for learning from past successes and failures. However, the current practice of documenting design in most industries does not result in big data that can support a true digita...

Bibliographic Details
Main Authors: Gammack, Jack, Akay, Haluk, Ceylan, Ceylan, Kim, Sang-Gook
Other Authors: Massachusetts Institute of Technology. Department of Mechanical Engineering
Format: Article
Language: English
Published: Elsevier BV 2024
Subjects:
Online Access: https://hdl.handle.net/1721.1/153622
_version_ 1811082116006412288
author Gammack, Jack
Akay, Haluk
Ceylan, Ceylan
Kim, Sang-Gook
author2 Massachusetts Institute of Technology. Department of Mechanical Engineering
author_facet Massachusetts Institute of Technology. Department of Mechanical Engineering
Gammack, Jack
Akay, Haluk
Ceylan, Ceylan
Kim, Sang-Gook
author_sort Gammack, Jack
collection MIT
description Design documentation is presumed to contain massive amounts of valuable information and expert knowledge that is useful for learning from past successes and failures. However, the current practice of documenting design in most industries does not produce big data that can support a true digital transformation of the enterprise. Very little information on concepts and decisions in early product design has been digitally captured, and accessing and retrieving it through taxonomy-based knowledge management systems is very challenging because most rule-based classification and search systems cannot concurrently process heterogeneous data (text, figures, tables, references). When experts retire or leave a design unit, industry often cannot benefit from past knowledge in future product design and is left to reinvent the wheel repeatedly. In this work, we present AI-based Natural Language Processing (NLP) models trained to contextually represent technical documents containing text, figures, and tables, enabling semantic search for the retrieval of relevant data across large corpora of documents. By connecting textual and non-textual data through an associative database, the semantic search question-answering system we developed can provide more comprehensive answers in the context of users’ questions. To demonstrate and assess this model, the semantic search question-answering system is applied to the Intergovernmental Panel on Climate Change (IPCC) Special Report 2019, which is more than 600 pages long and difficult to read and understand, even for most experts. Users can input custom queries relating to climate change concerns and receive evidence from the report that is contextually meaningful. We expect that this method can transform current repositories of design documentation with heterogeneous data forms into structured knowledge bases that return relevant information efficiently and can evolve to embody manageable big data for the true digital transformation of design.
first_indexed 2024-09-23T11:57:49Z
format Article
id mit-1721.1/153622
institution Massachusetts Institute of Technology
language English
last_indexed 2024-09-23T11:57:49Z
publishDate 2024
publisher Elsevier BV
record_format dspace
spelling mit-1721.1/153622 2024-09-19T15:18:47Z Semantic knowledge management system for design documentation with heterogeneous data using machine learning Gammack, Jack Akay, Haluk Ceylan, Ceylan Kim, Sang-Gook Massachusetts Institute of Technology. Department of Mechanical Engineering General Medicine 2024-02-29T21:38:39Z 2024-02-29T21:38:39Z 2022 2024-02-29T21:26:52Z Article http://purl.org/eprint/type/JournalArticle 2212-8271 https://hdl.handle.net/1721.1/153622 Gammack, Jack, Akay, Haluk, Ceylan, Ceylan and Kim, Sang-Gook. 2022. "Semantic knowledge management system for design documentation with heterogeneous data using machine learning." Procedia CIRP, 109. en 10.1016/j.procir.2022.05.220 Procedia CIRP Creative Commons Attribution-Noncommercial-No Derivatives http://creativecommons.org/licenses/by-nc-nd/4.0/ application/pdf Elsevier BV Elsevier BV
spellingShingle General Medicine
Gammack, Jack
Akay, Haluk
Ceylan, Ceylan
Kim, Sang-Gook
Semantic knowledge management system for design documentation with heterogeneous data using machine learning
title Semantic knowledge management system for design documentation with heterogeneous data using machine learning
title_full Semantic knowledge management system for design documentation with heterogeneous data using machine learning
title_fullStr Semantic knowledge management system for design documentation with heterogeneous data using machine learning
title_full_unstemmed Semantic knowledge management system for design documentation with heterogeneous data using machine learning
title_short Semantic knowledge management system for design documentation with heterogeneous data using machine learning
title_sort semantic knowledge management system for design documentation with heterogeneous data using machine learning
topic General Medicine
url https://hdl.handle.net/1721.1/153622
work_keys_str_mv AT gammackjack semanticknowledgemanagementsystemfordesigndocumentationwithheterogeneousdatausingmachinelearning
AT akayhaluk semanticknowledgemanagementsystemfordesigndocumentationwithheterogeneousdatausingmachinelearning
AT ceylanceylan semanticknowledgemanagementsystemfordesigndocumentationwithheterogeneousdatausingmachinelearning
AT kimsanggook semanticknowledgemanagementsystemfordesigndocumentationwithheterogeneousdatausingmachinelearning