Clustering of Similar Incident Tickets Using Natural Language Processing

As businesses increasingly rely on digital tools for operational efficiency and value creation, Software Asset Management (SAM) becomes an important business practice. This thesis explores the use of natural language processing (NLP) and clustering algorithms to identify recurring issues affecting s...


Bibliographic Details
Main Author: Chen, Jackie
Other Authors: Lykouris, Thodoris
Format: Thesis
Published: Massachusetts Institute of Technology 2024
Online Access: https://hdl.handle.net/1721.1/155983
_version_ 1826206953885401088
author Chen, Jackie
author2 Lykouris, Thodoris
author_facet Lykouris, Thodoris
Chen, Jackie
author_sort Chen, Jackie
collection MIT
description As businesses increasingly rely on digital tools for operational efficiency and value creation, Software Asset Management (SAM) becomes an important business practice. This thesis explores the use of natural language processing (NLP) and clustering algorithms to identify recurring issues affecting software applications with the objectives to assess the technical health of applications and to identify opportunities to address software issues that repeatedly plague users. Using a dataset of incident tickets from a business unit of a pharmaceutical company, various machine learning models were designed and tested to identify recurring issues affecting the business' applications. Through a dashboard that visualizes the outputs of the models, the business is provided with insights into recurring issues affecting their digital tools. As validated through user feedback and visual inspection, the model outputs indicate promising results in the clustering of incident tickets, offering valuable insights to users to understand and address recurrent software problems. However, it is important to acknowledge the inherent challenges of unsupervised machine learning. While the results can help enhance business operations, caution is advised regarding the implications to users and the business when models produce unexpected results. This project is another example of the balance between leveraging machine learning for problem-solving and understanding the limitations of the models.
first_indexed 2024-09-23T13:41:11Z
format Thesis
id mit-1721.1/155983
institution Massachusetts Institute of Technology
last_indexed 2024-09-23T13:41:11Z
publishDate 2024
publisher Massachusetts Institute of Technology
record_format dspace
spelling mit-1721.1/155983 2024-08-13T03:25:35Z Clustering of Similar Incident Tickets Using Natural Language Processing Chen, Jackie Lykouris, Thodoris Daniel, Luca Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Sloan School of Management M.B.A. S.M. 2024-08-12T14:13:09Z 2024-08-12T14:13:09Z 2024-05 2024-06-25T18:10:23.391Z Thesis https://hdl.handle.net/1721.1/155983 In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/ application/pdf Massachusetts Institute of Technology
spellingShingle Chen, Jackie
Clustering of Similar Incident Tickets Using Natural Language Processing
title Clustering of Similar Incident Tickets Using Natural Language Processing
url https://hdl.handle.net/1721.1/155983