Enhanced bag-of-words model for web forum question post detection

A web forum is an online discussion board that connects people with common interests. It is a problem-solving platform that has proven useful for tackling technical issues with the help of experts across the globe. Research in this domain has concentrated on answer detection with th...


Bibliographic Details
Main Authors: Obasa, Adekunle Isiaka, Salim, Naomie, Khan, Atif
Format: Conference or Workshop Item
Published: 2015
Subjects: TK5015.888 Web sites
_version_ 1796861056672333824
author Obasa, Adekunle Isiaka
Salim, Naomie
Khan, Atif
collection ePrints
description A web forum is an online discussion board that connects people with common interests. It is a problem-solving platform that has proven useful for tackling technical issues with the help of experts across the globe. Research in this domain has concentrated on answer detection, under the assumption that the starting post is a question post. The quality of web forum question posts, however, varies from excellent to mediocre or even spam. Detecting good question posts requires the use of salient features. In this paper, we enhance the popular bag-of-words (BoW) model with web forum metadata and a simple rule based on question marks and question words to mine question posts. We empirically address the following questions. Does integrating the simple rule of question marks and question words with forum metadata perform better than either of the two alone? Can dimensionality reduction of BoW using chi-square enhance question post detection in web forums? Can combining BoW with the simple rule of question marks and question words and with forum metadata further enhance question post detection? We used three publicly available datasets of varying technical degrees for the experiments. The experimental results revealed that an enhanced BoW can perform better than complex techniques that implement higher-order N-grams with part-of-speech tagging.
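The feature construction outlined in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: the question-word list, the tokenizer, the `is_first_post` metadata field, and all function names are assumptions made for illustration, since this record does not give those details. The sketch scores terms with the standard chi-square statistic for a 2x2 term/label table, keeps the top-k terms as a reduced BoW vocabulary, and joins the BoW counts with the question-mark and question-word rule and any forum metadata.

```python
import re
from collections import Counter

# Hypothetical question-word list; the paper's actual list is not given here.
QUESTION_WORDS = {"what", "why", "how", "when", "where", "who", "which",
                  "can", "could", "does", "do", "is", "are"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def rule_features(text):
    """Simple rule: does the post contain a question mark or a question word?"""
    tokens = tokenize(text)
    return {
        "has_question_mark": int("?" in text),
        "has_question_word": int(any(t in QUESTION_WORDS for t in tokens)),
    }


def chi_square(posts, labels):
    """Chi-square score of each term against a binary label (1 = question)."""
    n = len(posts)
    n_pos = sum(labels)
    n_neg = n - n_pos
    df_pos, df_neg = Counter(), Counter()
    for text, label in zip(posts, labels):
        for term in set(tokenize(text)):
            (df_pos if label else df_neg)[term] += 1
    scores = {}
    for term in set(df_pos) | set(df_neg):
        a = df_pos[term]   # question posts containing the term
        b = df_neg[term]   # other posts containing the term
        c = n_pos - a      # question posts without the term
        d = n_neg - b      # other posts without the term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_vocabulary(posts, labels, k):
    """Keep the k highest-scoring terms (ties broken alphabetically)."""
    scores = chi_square(posts, labels)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


def featurize(text, vocabulary, metadata=None):
    """Join reduced BoW counts, rule features, and forum metadata."""
    counts = Counter(tokenize(text))
    features = {f"bow:{term}": counts[term] for term in vocabulary}
    features.update(rule_features(text))
    features.update(metadata or {})  # e.g. a hypothetical is_first_post flag
    return features
```

The resulting feature dictionaries could then be passed to any standard classifier. Which classifier and which metadata fields the paper actually used is not stated in this record.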
first_indexed 2024-03-05T19:50:36Z
format Conference or Workshop Item
id utm.eprints-61201
institution Universiti Teknologi Malaysia - ePrints
last_indexed 2024-03-05T19:50:36Z
publishDate 2015
record_format dspace
spelling utm.eprints-61201 2017-08-20T07:36:25Z http://eprints.utm.my/61201/ Enhanced bag-of-words model for web forum question post detection. Obasa, Adekunle Isiaka; Salim, Naomie; Khan, Atif. TK5015.888 Web sites. 2015. Conference or Workshop Item. PeerReviewed. Obasa, Adekunle Isiaka and Salim, Naomie and Khan, Atif (2015) Enhanced bag-of-words model for web forum question post detection. In: The 2nd International Conference on Soft Computing and Computational Mathematics (ICSCCM 2015), 10-11 Dec 2015, Langkawi, Malaysia. http://www.icsccm.com/2015/WB/
title Enhanced bag-of-words model for web forum question post detection
topic TK5015.888 Web sites