A two level learning model for authorship authentication

Forensic authorship authentication plays a vital role in identifying the growing number of unknown authors that results from the world’s rapidly rising internet use. This paper presents a two-level learning technique for authorship authentication. The learning technique is supplied with linguistic kno...


Bibliographic Details
Main Authors:	Ahmed Taha, Heba M. Khalil, Tarek El-shishtawy
Format:	Article
Language:	English
Published:	Public Library of Science (PLoS) 2021-01-01
Series:	PLoS ONE
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8341647/?tool=EBI

author	Ahmed Taha, Heba M. Khalil, Tarek El-shishtawy
collection	DOAJ
description	Forensic authorship authentication plays a vital role in identifying the growing number of unknown authors that results from the world’s rapidly rising internet use. This paper presents a two-level learning technique for authorship authentication. Rather than relying on learning alone, the technique is supplied with linguistic knowledge, statistical features, and vocabulary features to improve its efficiency. The linguistic knowledge is represented through lexical analysis features such as part of speech. A two-level classifier is presented to achieve the best predictive performance for identifying authorship. The first classifier is based on vocabulary features that capture how frequently each author uses certain words. Its results are fed to the second classifier, which is based on a learning technique and depends on lexical, statistical, and linguistic features. All three sets of features describe the author’s writing style in numerical form. Many new features are proposed for identifying the author’s writing style. Although the proposed methodology is tested on Arabic writings, it is general and can be applied to any language. The experiments show that, depending on the machine learning model used, the trained two-level classifier achieves an accuracy ranging from 94% to 96.16%.
first_indexed	2024-12-13T19:55:15Z
format	Article
id	doaj.art-b97e36d3593a465683b5febab36c892e
institution	Directory of Open Access Journals
issn	1932-6203
language	English
last_indexed	2024-12-13T19:55:15Z
publishDate	2021-01-01
publisher	Public Library of Science (PLoS)
record_format	Article
series	PLoS ONE
spelling	doaj.art-b97e36d3593a465683b5febab36c892e | 2022-12-21T23:33:19Z | eng | Public Library of Science (PLoS) | PLoS ONE | 1932-6203 | 2021-01-01 | 16 | 8 | A two level learning model for authorship authentication | Ahmed Taha | Heba M. Khalil | Tarek El-shishtawy | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8341647/?tool=EBI
title	A two level learning model for authorship authentication
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8341647/?tool=EBI
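
The two-level scheme summarized in the description (a vocabulary-based first classifier whose results feed a second, feature-based learner) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: whitespace tokenization, cosine similarity for the first level, the two toy style features, the nearest-centroid rule for the second level, the function names, and the English toy corpus (the paper reports experiments on Arabic writings).

```python
from collections import Counter
import math


def tokenize(text):
    # Assumption: plain lowercase whitespace tokenization.
    return text.lower().split()


def vocab_profile(texts):
    """Relative word frequencies over a list of texts."""
    counts = Counter(w for t in texts for w in tokenize(t))
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def level1_scores(text, profiles):
    """Level 1: cosine similarity of the text's word frequencies
    to each author's vocabulary profile (authors in sorted order)."""
    doc = vocab_profile([text])
    doc_norm = math.sqrt(sum(f * f for f in doc.values()))
    scores = []
    for author in sorted(profiles):
        prof = profiles[author]
        dot = sum(f * prof.get(w, 0.0) for w, f in doc.items())
        norm = doc_norm * math.sqrt(sum(f * f for f in prof.values()))
        scores.append(dot / norm if norm else 0.0)
    return scores


def style_features(text):
    """Toy statistical features (hypothetical, not the paper's):
    scaled mean word length and type-token ratio."""
    words = tokenize(text)
    if not words:
        return [0.0, 0.0]
    mean_len = sum(len(w) for w in words) / len(words)
    return [mean_len / 10.0, len(set(words)) / len(words)]


def level2_vector(text, profiles):
    # The level-1 scores are fed forward as extra inputs to level 2.
    return level1_scores(text, profiles) + style_features(text)


def train(corpus):
    """corpus maps author -> list of texts. Returns (profiles, centroids)."""
    profiles = {a: vocab_profile(ts) for a, ts in corpus.items()}
    centroids = {}
    for author, texts in corpus.items():
        vecs = [level2_vector(t, profiles) for t in texts]
        centroids[author] = [sum(col) / len(col) for col in zip(*vecs)]
    return profiles, centroids


def predict(text, profiles, centroids):
    """Level 2: nearest centroid in the combined feature space."""
    vec = level2_vector(text, profiles)
    return min(centroids, key=lambda a: math.dist(vec, centroids[a]))


# Toy corpus, invented for this sketch.
corpus = {
    "author_a": ["the sea is calm and the sea is wide",
                 "the calm sea meets the wide sky"],
    "author_b": ["markets fell sharply as investors sold shares",
                 "investors bought shares when markets rose sharply"],
}
profiles, centroids = train(corpus)
print(predict("the wide sea is calm today", profiles, centroids))
```

Feeding the first-level similarity scores into the second-level feature vector is what makes the sketch two-level; a real system would replace the nearest-centroid rule with a trained model and the toy features with lexical, statistical, and part-of-speech features.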


Similar Items