Know an Emotion by the Company It Keeps: Word Embeddings from Reddit/Coronavirus
Social media is a crucial communication tool (e.g., online forums such as Reddit have 430 million monthly active users), which makes it a target for Natural Language Processing (NLP) techniques. One such technique, word embeddings, is based on the quotation, “You shall know a word by the company it keeps,” hi...
Main Authors: | Alejandro García-Rudolph, David Sanchez-Pinsach, Dietmar Frey, Eloy Opisso, Katryna Cisek, John D. Kelleher |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-05-01 |
Series: | Applied Sciences |
Subjects: | COVID-19, social media, Reddit, natural language processing, emotions, resilience |
Online Access: | https://www.mdpi.com/2076-3417/13/11/6713 |
_version_ | 1827739698246713344 |
---|---|
author | Alejandro García-Rudolph; David Sanchez-Pinsach; Dietmar Frey; Eloy Opisso; Katryna Cisek; John D. Kelleher |
author_sort | Alejandro García-Rudolph |
collection | DOAJ |
description | Social media is a crucial communication tool (e.g., online forums such as Reddit have 430 million monthly active users), which makes it a target for Natural Language Processing (NLP) techniques. One such technique, word embeddings, is based on the quotation, “You shall know a word by the company it keeps,” highlighting the importance of context in NLP. Meanwhile, “Context is everything in Emotion Research.” Therefore, we aimed to train a model (W2V) for generating word associations (also known as embeddings) using a popular Coronavirus Reddit forum, validate them using public evidence, and apply them to the discovery of context for specific emotions previously reported as related to psychological resilience. We used the Pushshiftr, quanteda, broom, wordVectors, and superheat R packages. We collected all 374,421 posts submitted by 104,351 users to the Reddit/Coronavirus forum between January 2020 and July 2021. W2V identified 64 terms representing the context for seven positive emotions (gratitude, compassion, love, relief, hope, calm, and admiration) and 52 terms for seven negative emotions (anger, loneliness, boredom, fear, anxiety, confusion, and sadness), all drawn from valid experienced situations. We clustered them visually, highlighting contextual similarity. Although trained on a “small” dataset, W2V can be used for context discovery to expand on concepts such as psychological resilience. |
first_indexed | 2024-03-11T03:11:46Z |
format | Article |
id | doaj.art-bca650bfd367486098086b7675f71986 |
institution | Directory Open Access Journal |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-11T03:11:46Z |
publishDate | 2023-05-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-bca650bfd367486098086b7675f71986; 2023-11-18T07:35:45Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2023-05-01; vol. 13, no. 11, art. 6713; DOI 10.3390/app13116713; Know an Emotion by the Company It Keeps: Word Embeddings from Reddit/Coronavirus; Alejandro García-Rudolph (Department of Research and Innovation, Institut Guttmann, Institut Universitari de Neurorehabilitació Adscrit a la UAB, 08027 Badalona, Spain); David Sanchez-Pinsach (Department of Research and Innovation, Institut Guttmann, Institut Universitari de Neurorehabilitació Adscrit a la UAB, 08027 Badalona, Spain); Dietmar Frey (CLAIM Charité Lab for AI in Medicine, Charité Universitätsmedizin Berlin, 10117 Berlin, Germany); Eloy Opisso (Department of Research and Innovation, Institut Guttmann, Institut Universitari de Neurorehabilitació Adscrit a la UAB, 08027 Badalona, Spain); Katryna Cisek (Information, Communication and Entertainment Research Institute, Technological University Dublin (TU Dublin), D7 EWV4 Dublin, Ireland); John D. Kelleher (Information, Communication and Entertainment Research Institute, Technological University Dublin (TU Dublin), D7 EWV4 Dublin, Ireland); https://www.mdpi.com/2076-3417/13/11/6713; keywords: COVID-19, social media, Reddit, natural language processing, emotions, resilience |
title | Know an Emotion by the Company It Keeps: Word Embeddings from Reddit/Coronavirus |
topic | COVID-19; social media; natural language processing; emotions; resilience |
url | https://www.mdpi.com/2076-3417/13/11/6713 |
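
The abstract in this record describes using word embeddings to find the terms that form the context of an emotion word. As a rough illustration of that general idea only, the following minimal sketch ranks terms by cosine similarity to a query word. It is not the authors' pipeline (the record states they used R packages such as wordVectors); the function names `cosine` and `nearest_terms`, the dictionary `TOY_VECTORS`, and every vector value and placeholder term in it are hypothetical and invented for this example, not results from the article.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_terms(query, vectors, k=3):
    """Return the k terms most similar to `query`, best match first."""
    q = vectors[query]
    scored = [
        (term, cosine(q, vec))
        for term, vec in vectors.items()
        if term != query  # a word is trivially its own nearest neighbour
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical 2-dimensional vectors, made up for illustration only.
# Real embeddings would be learned from a corpus and have many more dimensions.
TOY_VECTORS = {
    "hope": [1.0, 0.0],
    "fear": [0.1, 0.9],
    "term_a": [0.9, 0.1],
    "term_b": [0.0, 1.0],
}

if __name__ == "__main__":
    for emotion in ("hope", "fear"):
        print(emotion, nearest_terms(emotion, TOY_VECTORS, k=2))
```

With trained embeddings in place of the toy dictionary, the same ranking step would list the words that occur in contexts similar to each emotion word.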