A Complex Network Approach to Stylometry.

Statistical methods have been widely employed to study the fundamental properties of language. In recent years, methods from complex and dynamical systems have proved useful for creating several language models. Despite the many studies devoted to representing texts with physical models, only a limi...

Bibliographic Details
Main Author: Diego Raphael Amancio
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2015-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC4552030?pdf=render
collection DOAJ
description Statistical methods have been widely employed to study the fundamental properties of language. In recent years, methods from complex and dynamical systems have proved useful for creating several language models. Despite the many studies devoted to representing texts with physical models, only a limited number have shown how the properties of the underlying physical systems can be employed to improve the performance of natural language processing tasks. In this paper, I address this problem by devising complex network methods that are able to improve the performance of current statistical methods. Using a fuzzy classification strategy, I show that the topological properties extracted from texts complement the traditional textual description. In several cases, hybrid approaches outperformed both the traditional and the networked methods used alone. Because the proposed model is generic, the framework devised here could be straightforwardly used to study similar textual applications where the topology plays a pivotal role in the description of the interacting agents.
first_indexed 2024-12-10T20:41:55Z
id doaj.art-5b97cd385ceb4dcb9ce7bf96671007e8
institution Directory Open Access Journal
issn 1932-6203
last_indexed 2024-12-10T20:41:55Z
spelling doaj.art-5b97cd385ceb4dcb9ce7bf96671007e8; 2022-12-22T01:34:20Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2015-01-01; 10; 8; e0136076; 10.1371/journal.pone.0136076; A Complex Network Approach to Stylometry.; Diego Raphael Amancio; http://europepmc.org/articles/PMC4552030?pdf=render