Stylometric authorship balanced attribution prediction method

Stylometric authorship attribution is an important approach in the text mining field that has received growing attention because of its delicate nature. It is concerned with analyzing texts such as novels and plays written by famous authors, trying to measure their writing style by choosi...

Bibliographic Details
Main Author: Mustafa, Tareef Kamil
Format: Thesis
Language: English
Published: 2011
Subjects: Text processing (Computer science); Authorship - Style manuals; Prediction (Logic)
Online Access: http://psasir.upm.edu.my/id/eprint/27377/1/FSKTM%202011%2016R.pdf
_version_ 1825947115528912896
author Mustafa, Tareef Kamil
collection UPM
description Stylometric authorship attribution is an important approach in the text mining field that has received growing attention because of its delicate nature. It is concerned with analyzing texts such as novels and plays written by famous authors, and tries to measure their writing style by choosing attributes that belong uniquely to the author, on the assumption that each author has a distinctive artistic way of writing that no other author shares. Two major problems hold back progress in this field: prediction accuracy and the reliance on human expert judgment. The techniques used for such predictions rely either on statistical attributes such as frequent words or on more sophisticated semantic techniques such as lexicons. Nonetheless, the results of these techniques remain insufficiently accurate. In this research, we propose a new stylometric method, Stylometric authorship balanced attribution (SABA), that overcomes these problems by achieving higher prediction accuracy while being independent of human judgment, which means that the method does not rely on domain experts. The new method is implemented by merging three methods: the computational approach, the Winnow algorithm and the Burrows-delta method. The proposed method also uses a set of attributes that are more effective than those of the frequent words method. This yields higher stylometric prediction accuracy and provides more evidence of an author's artistic writing style for authorship recognition and prediction. The effective attributes are the word pair and the word trio, both of which are multiple-word attributes. The proposed SABA method is compared against three other methods: the computational approach, the Winnow algorithm, and the Burrows-delta method. The results showed that the proposed method produces superior prediction accuracy and even provides a completely correct result during the final stage of the experiment.
first_indexed 2024-03-06T08:08:16Z
format Thesis
id upm.eprints-27377
institution Universiti Putra Malaysia
language English
last_indexed 2024-03-06T08:08:16Z
publishDate 2011
record_format dspace
spelling upm.eprints-27377 2014-02-27T00:53:54Z http://psasir.upm.edu.my/id/eprint/27377/ Stylometric authorship balanced attribution prediction method Mustafa, Tareef Kamil 2011-08 Thesis NonPeerReviewed application/pdf en http://psasir.upm.edu.my/id/eprint/27377/1/FSKTM%202011%2016R.pdf Mustafa, Tareef Kamil (2011) Stylometric authorship balanced attribution prediction method. PhD thesis, Universiti Putra Malaysia. Text processing (Computer science) Authorship - Style manuals Prediction (Logic) English
title Stylometric authorship balanced attribution prediction method
topic Text processing (Computer science)
Authorship - Style manuals
Prediction (Logic)
url http://psasir.upm.edu.my/id/eprint/27377/1/FSKTM%202011%2016R.pdf
work_keys_str_mv AT mustafatareefkamil stylometricauthorshipbalancedattributionpredictionmethod
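
Illustrative sketch (not taken from the thesis): the description names word pairs and word trios as attributes and the Burrows-delta method as one of the merged components. The Python below shows one common way such pieces can fit together, assuming whitespace tokenization and the standard form of Burrows's Delta (mean absolute difference of z-scored relative frequencies). The function names, the input shape, and the top_k parameter are hypothetical; this is not the SABA method itself, and it does not cover the Winnow algorithm or the computational approach.

```python
from collections import Counter
from statistics import mean, pstdev


def ngrams(text, n):
    """Return the n-word sequences in text (lowercased, whitespace-split)."""
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def relative_freqs(text, n):
    """Map each n-gram to its relative frequency in text."""
    grams = ngrams(text, n)
    total = len(grams)
    return {g: c / total for g, c in Counter(grams).items()} if total else {}


def burrows_delta(candidates, unknown, n=2, top_k=50):
    """Score each candidate author against an unknown text.

    candidates: dict of author name -> sample text (assumed input shape).
    n=2 uses word pairs, n=3 uses word trios.
    Returns a dict of author -> Delta; the lowest value is the closest style.
    """
    profiles = {a: relative_freqs(t, n) for a, t in candidates.items()}

    # Feature set: the top_k most frequent n-grams over all candidate texts.
    pooled = Counter()
    for text in candidates.values():
        pooled.update(ngrams(text, n))
    features = [g for g, _ in pooled.most_common(top_k)]

    # Per-feature mean and standard deviation across candidate authors.
    stats = {}
    for g in features:
        vals = [p.get(g, 0.0) for p in profiles.values()]
        sd = pstdev(vals)
        if sd > 0:  # skip features that do not vary between authors
            stats[g] = (mean(vals), sd)

    unk = relative_freqs(unknown, n)
    scores = {}
    for author, p in profiles.items():
        diffs = [
            abs((unk.get(g, 0.0) - m) / sd - (p.get(g, 0.0) - m) / sd)
            for g, (m, sd) in stats.items()
        ]
        scores[author] = sum(diffs) / len(diffs) if diffs else float("inf")
    return scores
```

Calling burrows_delta({"A": text_a, "B": text_b}, unknown_text, n=3) would score word trios instead of word pairs; the candidate with the smallest Delta is the nearest stylistic match under these assumptions.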