Words by the tail: Assessing lexical diversity in scholarly titles using frequency-rank distribution tail fits.

This research assesses the evolution of lexical diversity in scholarly titles using a new indicator based on zipfian frequency-rank distribution tail fits. At the operational level, while both head and tail fits of zipfian word distributions are more independent of corpus size than other lexical div...

Bibliographic Details
Main Authors: Nicolas Bérubé, Maxime Sainte-Marie, Philippe Mongeon, Vincent Larivière
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2018-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC6037356?pdf=render
_version_ 1811321931795791872
author Nicolas Bérubé
Maxime Sainte-Marie
Philippe Mongeon
Vincent Larivière
collection DOAJ
description This research assesses the evolution of lexical diversity in scholarly titles using a new indicator based on zipfian frequency-rank distribution tail fits. At the operational level, while both head and tail fits of zipfian word distributions are more independent of corpus size than other lexical diversity indicators, the latter clearly outperform the former in that regard. This benchmark-setting performance of zipfian distribution tails proves extremely handy in distinguishing actual patterns in lexical diversity from the statistical noise generated by other indicators due to corpus size fluctuations. From an empirical perspective, analysis of Web of Science (WoS) article titles from 1975 to 2014 shows that the lexical concentration of scholarly titles in Natural Sciences & Engineering (NSE) and Social Sciences & Humanities (SSH) articles increases by a little less than 8% over the whole period. With the exception of the lexically concentrated Mathematics, Earth & Space, and Physics, NSE article titles all increased in lexical concentration, suggesting a probable convergence of concentration levels in the near future. As regards SSH disciplines, aggregation effects observed at the disciplinary group level suggest that, behind the stable concentration levels of SSH disciplines, a cross-disciplinary homogenization of the highest word frequency ranks may be at work. Overall, these trends suggest a progressive standardization of wording in scientific article titles, as titles are written using an increasingly restricted and cross-disciplinary set of words.
first_indexed 2024-04-13T13:26:18Z
format Article
id doaj.art-8b9b90e8bdb54f579c5b68c6910fb7d9
institution Directory Open Access Journal
issn 1932-6203
language English
last_indexed 2024-04-13T13:26:18Z
publishDate 2018-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj.art-8b9b90e8bdb54f579c5b68c6910fb7d9 2022-12-22T02:45:07Z eng Public Library of Science (PLoS) PLoS ONE 1932-6203 2018-01-01 13 7 e0197775 10.1371/journal.pone.0197775
title Words by the tail: Assessing lexical diversity in scholarly titles using frequency-rank distribution tail fits.
url http://europepmc.org/articles/PMC6037356?pdf=render