Image–Music Synesthesia-Aware Learning Based on Emotional Similarity Recognition
Synesthesia is a phenomenon in which humans experience cross-sensory interaction in perception. However, it is hard to bridge two sensory modalities in artificial intelligence. Emotion, the universal content across multiple media modalities, can be a cue to connect sensory perceptions for developin...
Main Authors: | Baixi Xing, Kejun Zhang, Lekai Zhang, Xinda Wu, Jian Dou, Shouqian Sun |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2019-01-01 |
Series: | IEEE Access |
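Subjects: | Cross-modal retrieval, affective computing, synesthesia-aware modeling, Pearson correlation coefficient |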
Online Access: | https://ieeexplore.ieee.org/document/8843988/ |
_version_ | 1818855324909568000 |
---|---|
author | Baixi Xing Kejun Zhang Lekai Zhang Xinda Wu Jian Dou Shouqian Sun |
author_facet | Baixi Xing Kejun Zhang Lekai Zhang Xinda Wu Jian Dou Shouqian Sun |
author_sort | Baixi Xing |
collection | DOAJ |
description | Synesthesia is a phenomenon in which humans experience cross-sensory interaction in perception. However, it is hard to bridge two sensory modalities in artificial intelligence. Emotion, the universal content across multiple media modalities, can be a cue to connect sensory perceptions for developing computer-based synesthetic intelligence. In this study, we present an image-music cross-synesthesia-aware model based on the similarity of images and music in the emotion space. For the experiment, we built an affective synesthesia database of 250,000 image-music pairs. Multiple music and image features were extracted to form the database. Emotional representation is abstract and complex in perception, and the recognition of emotional similarity is fraught with uncertainty. In this work, the Pearson correlation coefficient (PCC) and Euclidean distance (ED) methods were compared to obtain the emotional similarity labels of each affective image-music pair. The proposed method predicted emotional similarity with a mean squared error of 0.0075, which demonstrates the effectiveness of our approach and may shed light on the development of cross-modal synesthesia-aware systems. |
first_indexed | 2024-12-19T08:06:48Z |
format | Article |
id | doaj.art-55573efa87cd4baf8efd74782d67ca67 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-19T08:06:48Z |
publishDate | 2019-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-55573efa87cd4baf8efd74782d67ca67 2022-12-21T20:29:44Z eng IEEE IEEE Access 2169-3536 2019-01-01 7 136378 136390 10.1109/ACCESS.2019.2942073 8843988 Image–Music Synesthesia-Aware Learning Based on Emotional Similarity Recognition Baixi Xing [0] https://orcid.org/0000-0001-6778-3650 Kejun Zhang [1] Lekai Zhang [2] Xinda Wu [3] Jian Dou [4] Shouqian Sun [5] [0] Institute of Industrial Design, Zhejiang University of Technology, Hangzhou, China [1] College of Computer Science and Technology, Zhejiang University, Hangzhou, China [2] College of Computer Science and Technology, Zhejiang University, Hangzhou, China [3] College of Computer Science and Technology, Zhejiang University, Hangzhou, China [4] School of Media and Design, Hangzhou Dianzi University, Hangzhou, China [5] College of Computer Science and Technology, Zhejiang University, Hangzhou, China Synesthesia is a phenomenon in which humans experience cross-sensory interaction in perception. However, it is hard to bridge two sensory modalities in artificial intelligence. Emotion, the universal content across multiple media modalities, can be a cue to connect sensory perceptions for developing computer-based synesthetic intelligence. In this study, we present an image-music cross-synesthesia-aware model based on the similarity of images and music in the emotion space. For the experiment, we built an affective synesthesia database of 250,000 image-music pairs. Multiple music and image features were extracted to form the database. Emotional representation is abstract and complex in perception, and the recognition of emotional similarity is fraught with uncertainty. In this work, the Pearson correlation coefficient (PCC) and Euclidean distance (ED) methods were compared to obtain the emotional similarity labels of each affective image-music pair. The proposed method predicted emotional similarity with a mean squared error of 0.0075, which demonstrates the effectiveness of our approach and may shed light on the development of cross-modal synesthesia-aware systems. https://ieeexplore.ieee.org/document/8843988/ Cross-modal retrieval affective computing synesthesia-aware modeling Pearson correlation coefficient |
spellingShingle | Baixi Xing Kejun Zhang Lekai Zhang Xinda Wu Jian Dou Shouqian Sun Image–Music Synesthesia-Aware Learning Based on Emotional Similarity Recognition IEEE Access Cross-modal retrieval affective computing synesthesia-aware modeling Pearson correlation coefficient |
title | Image–Music Synesthesia-Aware Learning Based on Emotional Similarity Recognition |
title_full | Image–Music Synesthesia-Aware Learning Based on Emotional Similarity Recognition |
title_fullStr | Image–Music Synesthesia-Aware Learning Based on Emotional Similarity Recognition |
title_full_unstemmed | Image–Music Synesthesia-Aware Learning Based on Emotional Similarity Recognition |
title_short | Image–Music Synesthesia-Aware Learning Based on Emotional Similarity Recognition |
title_sort | image music synesthesia aware learning based on emotional similarity recognition |
topic | Cross-modal retrieval affective computing synesthesia-aware modeling Pearson correlation coefficient |
url | https://ieeexplore.ieee.org/document/8843988/ |
work_keys_str_mv | AT baixixing imagex2013musicsynesthesiaawarelearningbasedonemotionalsimilarityrecognition AT kejunzhang imagex2013musicsynesthesiaawarelearningbasedonemotionalsimilarityrecognition AT lekaizhang imagex2013musicsynesthesiaawarelearningbasedonemotionalsimilarityrecognition AT xindawu imagex2013musicsynesthesiaawarelearningbasedonemotionalsimilarityrecognition AT jiandou imagex2013musicsynesthesiaawarelearningbasedonemotionalsimilarityrecognition AT shouqiansun imagex2013musicsynesthesiaawarelearningbasedonemotionalsimilarityrecognition |
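
The abstract above names two ways of scoring how close an image and a piece of music are in emotion space, the Pearson correlation coefficient (PCC) and the Euclidean distance (ED), and reports a mean squared error for the predicted similarity. The record does not say how emotions are encoded or how the labels are scaled, so the following is only a minimal sketch of the three standard formulas. The four-dimensional emotion vectors and all function names are hypothetical and are not taken from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        # PCC is undefined when one vector has no variance.
        raise ValueError("PCC is undefined for a constant vector")
    return cov / (norm_x * norm_y)


def euclidean(x, y):
    """Euclidean distance between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def mean_squared_error(predicted, target):
    """Mean of squared differences between predictions and labels."""
    if len(predicted) != len(target):
        raise ValueError("sequences must have the same length")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)


# Hypothetical emotion ratings for one image and one music clip
# (assumed encoding: four emotion dimensions scored in [0, 1]).
image_emotion = [0.8, 0.6, 0.1, 0.2]
music_emotion = [0.7, 0.5, 0.2, 0.1]

pcc_score = pearson(image_emotion, music_emotion)     # higher means more similar
ed_score = euclidean(image_emotion, music_emotion)    # lower means more similar
print(f"PCC = {pcc_score:.3f}, ED = {ed_score:.3f}")
```

Note that the two measures point in opposite directions: PCC grows with similarity and ignores offset and scale, while ED shrinks with similarity and is sensitive to both. That difference is one plausible reason a study would compare them when deriving similarity labels.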