Summary: | With the rise of the Information Age and the widespread use of the Internet, a wealth of knowledge is available in many areas of interest. The resurgence of big data and machine learning has raised hopes that designers can learn from past successes and failures. However, when the available data is a mixture of textual, numerical, and graphical forms, currently popular deep learning tools cannot be applied directly. The open question is how to represent such heterogeneous data and how to retrieve relevant information efficiently from a huge repository.
My study of data preparation is part of a larger group effort to apply Artificial Intelligence based Natural Language Processing models to large corpora of technical design documentation, such as climate change reports, so that accurate information can be retrieved through semantic search. The methodology successfully retrieved suitable answers to the user’s questions without requiring the user to read hundreds of pages of reports. In addition, the query process surfaced Figures and Tables that gave meaningful context to the answers, because associated data was linked during the data reading and embedding process.
|