A two level learning model for authorship authentication.

Forensic authorship authentication now plays a vital role in identifying the growing number of unknown authors that results from the world's rapidly rising internet use. This paper presents a two-level learning technique for authorship authentication. The learning technique is supplied with linguisti...

Bibliographic Details
Main Authors: Ahmed Taha, Heba M Khalil, Tarek El-Shishtawy
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2021-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0255661
author Ahmed Taha
Heba M Khalil
Tarek El-Shishtawy
collection DOAJ
description Forensic authorship authentication now plays a vital role in identifying the growing number of unknown authors that results from the world's rapidly rising internet use. This paper presents a two-level learning technique for authorship authentication. Rather than relying on learning alone, the technique is supplied with linguistic knowledge, statistical features, and vocabulary features to improve its efficiency. The linguistic knowledge is represented through lexical analysis features such as part of speech. A two-level classifier is presented to obtain the best predictive performance for identifying authorship. The first classifier is based on vocabulary features that capture how frequently each author uses certain words. Its results are fed to the second classifier, which is based on a learning technique and depends on lexical, statistical, and linguistic features. All three feature sets describe the author's writing style in numerical form. Many new features are proposed for identifying the author's writing style. Although the proposed methodology is tested on Arabic writings, it is general and can be applied to any language. Depending on the machine learning model used, the experiments show that the trained two-level classifier achieves an accuracy ranging from 94% to 96.16%.
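The two-level arrangement described above can be illustrated with a minimal sketch. This is not the authors' implementation: the toy English corpus, the add-one smoothed word-frequency scoring, the two style features (average word length and average sentence length), and the nearest-centroid second level are all assumptions chosen to keep the example small and dependency-free. The function names are hypothetical.

```python
import math
from collections import Counter

PUNCT = ".,;:!?\"'()"


def tokenize(text):
    # Lowercased whitespace tokens with surrounding punctuation stripped.
    stripped = (w.strip(PUNCT).lower() for w in text.split())
    return [w for w in stripped if w]


def train_level1(docs_by_author):
    # Level 1 (assumed form): one word-frequency profile per author.
    return {author: Counter(w for doc in docs for w in tokenize(doc))
            for author, docs in docs_by_author.items()}


def level1_scores(profiles, text):
    # Average add-one smoothed log-probability of the text under each
    # author's profile, in sorted author order.
    vocab = set().union(*profiles.values())
    words = tokenize(text)
    scores = []
    for author in sorted(profiles):
        counts = profiles[author]
        total = sum(counts.values()) + len(vocab) + 1
        log_lik = sum(math.log((counts[w] + 1) / total) for w in words)
        scores.append(log_lik / max(len(words), 1))
    return scores


def style_features(text):
    # Two simple statistical features standing in for the paper's
    # lexical, statistical, and linguistic feature sets (assumption).
    words = tokenize(text)
    unified = text.replace("!", ".").replace("?", ".")
    sentences = [s for s in unified.split(".") if s.strip()]
    avg_word_len = sum(len(w) for w in words) / max(len(words), 1)
    avg_sent_len = len(words) / max(len(sentences), 1)
    return [avg_word_len, avg_sent_len]


def level2_vector(profiles, text):
    # Level-1 outputs are fed to level 2 alongside the style features.
    return level1_scores(profiles, text) + style_features(text)


def train_level2(profiles, docs_by_author):
    # Level 2 (assumed form): one centroid per author. Note that the
    # level-1 scores here are computed on the same documents the
    # profiles were built from; a real setup would hold data out.
    centroids = {}
    for author, docs in docs_by_author.items():
        vectors = [level2_vector(profiles, doc) for doc in docs]
        centroids[author] = [sum(col) / len(vectors) for col in zip(*vectors)]
    return centroids


def predict(profiles, centroids, text):
    # Return the author whose centroid is nearest in squared distance.
    vec = level2_vector(profiles, text)
    return min(centroids, key=lambda a: sum(
        (x - y) ** 2 for x, y in zip(vec, centroids[a])))


# Toy corpus, invented purely for illustration.
corpus = {
    "A": ["the cat sat on the mat. the cat ran.",
          "the cat likes the mat."],
    "B": ["quantum entanglement perplexes physicists considerably. "
          "entanglement persists.",
          "physicists measure quantum entanglement."],
}
profiles = train_level1(corpus)
centroids = train_level2(profiles, corpus)
print(predict(profiles, centroids, "the cat sat."))
```

The features are left unscaled here, so the larger-valued ones dominate the distance; a fuller version would normalise them and would swap the centroid step for whichever learning model is being evaluated.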
first_indexed 2024-12-17T13:31:38Z
format Article
id doaj.art-5d3d3afe3d364aae94f29a9017460d30
institution Directory Open Access Journal
issn 1932-6203
language English
last_indexed 2024-12-17T13:31:38Z
publishDate 2021-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj.art-5d3d3afe3d364aae94f29a9017460d30 | 2022-12-21T21:46:33Z | eng | Public Library of Science (PLoS) | PLoS ONE | 1932-6203 | 2021-01-01 | 16 | 8 | e0255661 | 10.1371/journal.pone.0255661 | A two level learning model for authorship authentication. | Ahmed Taha; Heba M Khalil; Tarek El-Shishtawy | https://doi.org/10.1371/journal.pone.0255661
title A two level learning model for authorship authentication.
url https://doi.org/10.1371/journal.pone.0255661