Know an Emotion by the Company It Keeps: Word Embeddings from Reddit/Coronavirus

Bibliographic Details
Main Authors: Alejandro García-Rudolph, David Sanchez-Pinsach, Dietmar Frey, Eloy Opisso, Katryna Cisek, John D. Kelleher
Format: Article
Language: English
Published: MDPI AG 2023-05-01
Series: Applied Sciences
Subjects: COVID-19, social media, Reddit, natural language processing, emotions, resilience
Online Access: https://www.mdpi.com/2076-3417/13/11/6713
author Alejandro García-Rudolph
David Sanchez-Pinsach
Dietmar Frey
Eloy Opisso
Katryna Cisek
John D. Kelleher
collection DOAJ
description Social media is a crucial communication tool (e.g., online forums such as Reddit have 430 million monthly active users), which makes it a target for Natural Language Processing (NLP) techniques. One such technique, word embeddings, is based on the quotation, “You shall know a word by the company it keeps,” which highlights the importance of context in NLP. Meanwhile, “Context is everything in Emotion Research.” Therefore, we aimed to train a model (W2V) for generating word associations (also known as embeddings) using a popular Coronavirus Reddit forum, validate them using public evidence, and apply them to the discovery of context for specific emotions previously reported as related to psychological resilience. We used the Pushshiftr, quanteda, broom, wordVectors, and superheat R packages. We collected all 374,421 posts submitted by 104,351 users to the Reddit/Coronavirus forum between January 2020 and July 2021. W2V identified 64 terms representing the context for seven positive emotions (gratitude, compassion, love, relief, hope, calm, and admiration) and 52 terms for seven negative emotions (anger, loneliness, boredom, fear, anxiety, confusion, and sadness), all from valid experienced situations. We clustered them visually, highlighting contextual similarity. Although trained on a “small” dataset, W2V can be used for context discovery to expand on concepts such as psychological resilience.
first_indexed 2024-03-11T03:11:46Z
format Article
id doaj.art-bca650bfd367486098086b7675f71986
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-11T03:11:46Z
publishDate 2023-05-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-bca650bfd367486098086b7675f71986
2023-11-18T07:35:45Z
eng
MDPI AG
Applied Sciences
2076-3417
2023-05-01
13 (11), 6713
10.3390/app13116713
Know an Emotion by the Company It Keeps: Word Embeddings from Reddit/Coronavirus
Alejandro García-Rudolph (Department of Research and Innovation, Institut Guttmann, Institut Universitari de Neurorehabilitació Adscrit a la UAB, 08027 Badalona, Spain)
David Sanchez-Pinsach (Department of Research and Innovation, Institut Guttmann, Institut Universitari de Neurorehabilitació Adscrit a la UAB, 08027 Badalona, Spain)
Dietmar Frey (CLAIM Charité Lab for AI in Medicine, Charité Universitätsmedizin Berlin, 10117 Berlin, Germany)
Eloy Opisso (Department of Research and Innovation, Institut Guttmann, Institut Universitari de Neurorehabilitació Adscrit a la UAB, 08027 Badalona, Spain)
Katryna Cisek (Information, Communication and Entertainment Research Institute, Technological University Dublin (TU Dublin), D7 EWV4 Dublin, Ireland)
John D. Kelleher (Information, Communication and Entertainment Research Institute, Technological University Dublin (TU Dublin), D7 EWV4 Dublin, Ireland)
https://www.mdpi.com/2076-3417/13/11/6713
COVID-19; social media; Reddit; natural language processing; emotions; resilience
title Know an Emotion by the Company It Keeps: Word Embeddings from Reddit/Coronavirus
topic COVID-19
social media
Reddit
natural language processing
emotions
resilience
url https://www.mdpi.com/2076-3417/13/11/6713
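
Illustrative sketch (not part of the catalog record): the description above rests on the idea that a term's context can be discovered from the words that appear near it. The following minimal Python example shows that idea with plain co-occurrence counts and cosine similarity. It is not the authors' W2V model and does not use the R packages named in the description; the posts, function names, and window size are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` tokens of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_terms(vectors, target, top_n=3):
    """Return the top_n (term, similarity) pairs closest to `target`."""
    # .get avoids inserting a new key into the defaultdict while iterating.
    target_vec = vectors.get(target, Counter())
    scores = [
        (other, cosine(target_vec, vec))
        for other, vec in vectors.items()
        if other != target
    ]
    # Highest similarity first; ties broken alphabetically for stable output.
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return scores[:top_n]


# Hypothetical toy posts, invented for this sketch (not data from the study).
toy_posts = [
    "so grateful for our nurses",
    "so grateful for our doctors",
    "really afraid of the lockdown",
]

vectors = cooccurrence_vectors(toy_posts)
for term, score in nearest_terms(vectors, "nurses"):
    print(f"{term}\t{score:.3f}")
```

In this toy corpus, "nurses" and "doctors" share exactly the same neighbouring words, so they come out as each other's nearest term, while "lockdown" shares none. A trained word2vec model replaces the raw counts with learned dense vectors, but the nearest-neighbour query has the same shape.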