Image–Music Synesthesia-Aware Learning Based on Emotional Similarity Recognition

Synesthesia is a phenomenon in which humans experience cross-sensory interaction in perception. However, it is hard to bridge two sensory modalities in artificial intelligence. Emotion, the universal content across multiple media modalities, can be a cue to connect sensory perceptions for developin...


Bibliographic Details
Main Authors: Baixi Xing, Kejun Zhang, Lekai Zhang, Xinda Wu, Jian Dou, Shouqian Sun
Format: Article
Language: English
Published: IEEE 2019-01-01
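Series: IEEE Access
Subjects: Cross-modal retrieval; affective computing; synesthesia-aware modeling; Pearson correlation coefficient
Online Access: https://ieeexplore.ieee.org/document/8843988/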
author Baixi Xing
Kejun Zhang
Lekai Zhang
Xinda Wu
Jian Dou
Shouqian Sun
collection DOAJ
description Synesthesia is a phenomenon in which humans experience cross-sensory interaction in perception. However, it is hard to bridge two sensory modalities in artificial intelligence. Emotion, the universal content across multiple media modalities, can serve as a cue to connect sensory perceptions for developing computer-based synesthetic intelligence. In this study, we present an image-music cross-synesthesia-aware model based on the similarity of images and music in the emotion space. We built an affective synesthesia database of 250,000 image-music pairs, and multiple music and image features were extracted to form the database. Emotional representation is abstract and complex in perception, and the recognition of emotional similarity is fraught with uncertainty. In this work, the Pearson correlation coefficient (PCC) and Euclidean distance (ED) methods were compared to obtain the emotional similarity label of each affective image-music pair. The proposed method could predict emotional similarity with a mean squared error of 0.0075, which demonstrates the effectiveness of our approach and may shed light on the development of cross-modal synesthesia-aware systems.
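The two similarity measures named in the description, and the mean squared error used to report the result, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the four-dimensional emotion vectors, the function names, and the 1 / (1 + d) mapping from distance to similarity are all assumptions made for illustration, since the record does not say how emotions are encoded or how ED is turned into a label.

```python
import math


def pearson_similarity(x, y):
    """Pearson correlation coefficient (PCC) of two equal-length vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("vectors must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("PCC is undefined for a constant vector")
    return cov / (norm_x * norm_y)


def euclidean_distance(x, y):
    """Euclidean distance (ED) between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def distance_to_similarity(d):
    """Assumed mapping from a distance to a similarity in (0, 1]."""
    return 1.0 / (1.0 + d)


def mean_squared_error(predicted, target):
    """Mean squared error between predicted and reference similarity labels."""
    if len(predicted) != len(target) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)


# Hypothetical emotion ratings for one image and one music clip.
image_emotion = [0.8, 0.1, 0.3, 0.6]
music_emotion = [0.7, 0.2, 0.4, 0.5]

pcc_label = pearson_similarity(image_emotion, music_emotion)
ed_label = distance_to_similarity(euclidean_distance(image_emotion, music_emotion))
print(f"PCC label: {pcc_label:.3f}, ED-based label: {ed_label:.3f}")
```

PCC compares the shape of the two emotion profiles and ignores overall offset and scale, while ED is sensitive to absolute differences in each dimension, which is one reason the two methods can yield different similarity labels for the same pair.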
first_indexed 2024-12-19T08:06:48Z
format Article
id doaj.art-55573efa87cd4baf8efd74782d67ca67
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-19T08:06:48Z
publishDate 2019-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-55573efa87cd4baf8efd74782d67ca67
Record updated: 2022-12-21T20:29:44Z
Language: eng
Publisher: IEEE
Journal: IEEE Access
ISSN: 2169-3536
Published: 2019-01-01
Volume: 7
Pages: 136378-136390
DOI: 10.1109/ACCESS.2019.2942073
Article number: 8843988
Title: Image–Music Synesthesia-Aware Learning Based on Emotional Similarity Recognition
Authors and affiliations:
Baixi Xing (https://orcid.org/0000-0001-6778-3650), Institute of Industrial Design, Zhejiang University of Technology, Hangzhou, China
Kejun Zhang, College of Computer Science and Technology, Zhejiang University, Hangzhou, China
Lekai Zhang, College of Computer Science and Technology, Zhejiang University, Hangzhou, China
Xinda Wu, College of Computer Science and Technology, Zhejiang University, Hangzhou, China
Jian Dou, School of Media and Design, Hangzhou Dianzi University, Hangzhou, China
Shouqian Sun, College of Computer Science and Technology, Zhejiang University, Hangzhou, China
URL: https://ieeexplore.ieee.org/document/8843988/
Keywords: Cross-modal retrieval; affective computing; synesthesia-aware modeling; Pearson correlation coefficient
title Image–Music Synesthesia-Aware Learning Based on Emotional Similarity Recognition
topic Cross-modal retrieval
affective computing
synesthesia-aware modeling
Pearson correlation coefficient
url https://ieeexplore.ieee.org/document/8843988/