Automatic Personality Evaluation from Transliterations of YouTube Vlogs Using Classical and State of the art Word Embeddings

The study of automatic personality recognition has gained attention in the last decade thanks to a variety of applications that derive from this field. The big five model (also known as OCEAN) constitutes a well-known method to label different personality traits. This work considers transliterations...

Full description

Bibliographic Details
Main Authors: Felipe Orlando López Pabón, Juan Rafael Orozco Arroyave
Format: Article
Language:English
Published: Universidad Nacional de Colombia 2021-12-01
Series:Ingeniería e Investigación
Subjects:
Online Access:https://revistas.unal.edu.co/index.php/ingeinv/article/view/93803
Description
Summary:The study of automatic personality recognition has gained attention in the last decade thanks to a variety of applications that derive from this field. The big five model (also known as OCEAN) constitutes a well-known method to label different personality traits. This work considers transliterations of video recordings collected from YouTube (originally provided by the Idiap research institute) and automatically generated scores for the five personality traits which also were provided in the database. The transliterations are modeled with two different word embedding approaches, Word2Vec and GloVe and three different levels of analysis are included: regression to predict the score of each personality trait, binary classification between strong vs. weak presence of each trait, and the tri-class classification according to three different levels of manifestations in each trait (low, medium, and high). According to our findings, the proposed approach provides similar results to others reported in the state-of-the-art. We think that further research is required to find better results. Our results, as well as others reported in the literature, suggest that there is a big gap in the study of personality traits based on linguistic patterns, which make it necessary to work on collecting and labeling data considering the knowledge of expert psychologists and psycholinguists.
ISSN:0120-5609
2248-8723