Clustering of Similar Incident Tickets Using Natural Language Processing

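As businesses increasingly rely on digital tools for operational efficiency and value creation, Software Asset Management (SAM) becomes an important business practice. This thesis explores the use of natural language processing (NLP) and clustering algorithms to identify recurring issues affecting software applications, with the objectives of assessing the technical health of applications and identifying opportunities to address software issues that repeatedly plague users.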

Bibliographic Details
Main Author:	Chen, Jackie
Other Authors:	Lykouris, Thodoris
Format:	Thesis
Published:	Massachusetts Institute of Technology 2024
Online Access:	https://hdl.handle.net/1721.1/155983

_version_	1826206953885401088
author	Chen, Jackie
author2	Lykouris, Thodoris
author_facet	Lykouris, Thodoris Chen, Jackie
author_sort	Chen, Jackie
collection	MIT
description	As businesses increasingly rely on digital tools for operational efficiency and value creation, Software Asset Management (SAM) becomes an important business practice. This thesis explores the use of natural language processing (NLP) and clustering algorithms to identify recurring issues affecting software applications, with the objectives of assessing the technical health of applications and identifying opportunities to address software issues that repeatedly plague users. Using a dataset of incident tickets from a business unit of a pharmaceutical company, various machine learning models were designed and tested to identify recurring issues affecting the business's applications. A dashboard that visualizes the outputs of the models gives the business insight into recurring issues affecting its digital tools. As validated through user feedback and visual inspection, the model outputs indicate promising results in the clustering of incident tickets, offering valuable insights that help users understand and address recurrent software problems. However, it is important to acknowledge the inherent challenges of unsupervised machine learning. While the results can help enhance business operations, caution is advised regarding the implications for users and the business when models produce unexpected results. This project is another example of the balance between leveraging machine learning for problem-solving and understanding the limitations of the models.
first_indexed	2024-09-23T13:41:11Z
format	Thesis
id	mit-1721.1/155983
institution	Massachusetts Institute of Technology
last_indexed	2024-09-23T13:41:11Z
publishDate	2024
publisher	Massachusetts Institute of Technology
record_format	dspace
spelling	mit-1721.1/1559832024-08-13T03:25:35Z Clustering of Similar Incident Tickets Using Natural Language Processing Chen, Jackie Lykouris, Thodoris Daniel, Luca Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Sloan School of Management As businesses increasingly rely on digital tools for operational efficiency and value creation, Software Asset Management (SAM) becomes an important business practice. This thesis explores the use of natural language processing (NLP) and clustering algorithms to identify recurring issues affecting software applications with the objectives to assess the technical health of applications and to identify opportunities to address software issues that repeatedly plague users. Using a dataset of incident tickets from a business unit of a pharmaceutical company, various machine learning models were designed and tested to identify recurring issues affecting the business' applications. Through a dashboard that visualizes the outputs of the models, the business is provided with insights into recurring issues affecting their digital tools. As validated through user feedback and visual inspection, the model outputs indicate promising results in the clustering of incident tickets, offering valuable insights to users to understand and address recurrent software problems. However, it is important to acknowledge the inherent challenges of unsupervised machine learning. While the results can help enhance business operations, caution is advised regarding the implications to users and the business when models produce unexpected results. This project is another example of the balance between leveraging machine learning for problem-solving and understanding the limitations of the models. M.B.A. S.M. 
2024-08-12T14:13:09Z 2024-08-12T14:13:09Z 2024-05 2024-06-25T18:10:23.391Z Thesis https://hdl.handle.net/1721.1/155983 In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/ application/pdf Massachusetts Institute of Technology
spellingShingle	Chen, Jackie Clustering of Similar Incident Tickets Using Natural Language Processing
title	Clustering of Similar Incident Tickets Using Natural Language Processing
title_full	Clustering of Similar Incident Tickets Using Natural Language Processing
title_fullStr	Clustering of Similar Incident Tickets Using Natural Language Processing
title_full_unstemmed	Clustering of Similar Incident Tickets Using Natural Language Processing
title_short	Clustering of Similar Incident Tickets Using Natural Language Processing
title_sort	clustering of similar incident tickets using natural language processing
url	https://hdl.handle.net/1721.1/155983
work_keys_str_mv	AT chenjackie clusteringofsimilarincidentticketsusingnaturallanguageprocessing
