Two is better than one: Using a single emotion lexicon can lead to unreliable conclusions.

Bibliographic Details
Main Authors: Gabriela Czarnek, David Stillwell
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2022-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0275910
_version_ 1811337544334311424
author Gabriela Czarnek
David Stillwell
author_facet Gabriela Czarnek
David Stillwell
author_sort Gabriela Czarnek
collection DOAJ
description Emotion lexicons have become a popular method for quantifying affect in large amounts of textual data (e.g., social media posts). There are multiple independently developed emotion lexicons which tend to correlate positively with one another, but not perfectly. Such differences between lexicons may not matter if they are just unsystematic noise, but if there are systematic differences, this could affect the conclusions of a study. The goal of this paper is to examine whether two extensively used, apparently domain-independent lexicons for emotion analysis would give the same answer to a theory-driven research question. Specifically, we use the Linguistic Inquiry and Word Count (LIWC) and the NRC Word-Emotion Association Lexicon (NRC). As an example, we investigate whether older people express more positive affect through their language use. We examined nearly 5 million tweets created by 3,573 people between 18 and 78 years old and found that both methods show an increase in positive affect until age 50. After that age, however, according to LIWC, positive affect drops sharply, whereas according to NRC, positive affect increases steadily until age 65 and then levels off. Thus, using one or the other method would lead researchers to drastically different theoretical conclusions regarding affect in older age. We unpack why the two methods give inconsistent conclusions and show this was mostly due to a particular class of words: those related to politics. We conclude that using a single lexicon might lead to unreliable conclusions, so we suggest that researchers should routinely use at least two lexicons. If both lexicons come to the same conclusion, then the research evidence is reliable, but if not, then researchers should further examine the lexicons to find out what difference might be causing the inconclusive result.
first_indexed 2024-04-13T17:56:34Z
format Article
id doaj.art-d4f10f16dd2643018f5171184bf2be49
institution Directory Open Access Journal
issn 1932-6203
language English
last_indexed 2024-04-13T17:56:34Z
publishDate 2022-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj.art-d4f10f16dd2643018f5171184bf2be49 2022-12-22T02:36:30Z eng Public Library of Science (PLoS) PLoS ONE 1932-6203 2022-01-01 17 10 e0275910 10.1371/journal.pone.0275910 Two is better than one: Using a single emotion lexicon can lead to unreliable conclusions. Gabriela Czarnek David Stillwell https://doi.org/10.1371/journal.pone.0275910
spellingShingle Gabriela Czarnek
David Stillwell
Two is better than one: Using a single emotion lexicon can lead to unreliable conclusions.
PLoS ONE
title Two is better than one: Using a single emotion lexicon can lead to unreliable conclusions.
title_sort two is better than one using a single emotion lexicon can lead to unreliable conclusions
url https://doi.org/10.1371/journal.pone.0275910