Stack Overflow data explorer : dictionary tool for software engineering terms from user-generated content

In this information age, a large volume of informal text is generated and posted online. Such user-generated content may contain many misspelled terms and abbreviations that hinder understanding. This is common on software engineering community websites and technology blogs. The...


Bibliographic Details
Main Author: Wang, Ximing
Other Authors: Xing Zhenchang
Format: Final Year Project (FYP)
Language: English
Published: 2016
Subjects: DRNTU::Engineering
Online Access: http://hdl.handle.net/10356/66751
collection NTU
description In this information age, a large volume of informal text is generated and posted online. Such user-generated content may contain many misspelled terms and abbreviations that hinder understanding. This is common on software engineering community websites and technology blogs, where the various forms of the same term can be misleading. Whenever people encounter a software engineering term they do not understand, they must run additional searches to find an explanation. Queries such as which library to use for a certain task demand even more effort: extensive reading, information extraction, and comparison. This project aims to shorten the time taken to look up software engineering terms online and to help people find direct answers to their queries. We developed a dictionary tool specific to the software engineering domain that retrieves the explanation and related terms for a misspelled word or abbreviation, and gives direct answers in the form of library recommendations, by utilizing relational and analogical knowledge mined from Stack Overflow. The software engineering corpus is built from a large set of unlabeled text, from which the semantics of terms are learned and related terms are extracted; abbreviations and morphological forms of the terms are then identified among the semantically related terms. Our solution provides a helpful way to look up the various forms of software engineering domain terms and offers smart recommendations in answer to a given query. According to the survey responses we received, our software engineering dictionary tool assists the lookup and understanding of terms effectively, and users find the tool very useful.
institution Nanyang Technological University
spelling ntu-10356/66751 2023-03-03T20:38:38Z Wang, Ximing Xing Zhenchang School of Computer Engineering DRNTU::Engineering
Bachelor of Engineering (Computer Science) 2016-04-25T03:26:32Z 2016-04-25T03:26:32Z 2016 Final Year Project (FYP) http://hdl.handle.net/10356/66751 en Nanyang Technological University application/pdf
topic DRNTU::Engineering