Improved Separation of Polyphonic Chamber Music Signals by Integrating Instrument Activity Labels

The separation of music signals is a very challenging task, especially in case of polyphonic chamber music signals because of the similar frequency ranges and sound characteristics of the different instruments to separate. In this work, a joint separation approach in the time domain with a U-Net arc...

Bibliographic Details
Main Authors: Markus Schwabe, Michael Heizmann
Format: Article
Language: English
Published: IEEE 2023-01-01
Series: IEEE Access
Subjects: Music source separation; polyphonic chamber music; active instruments; end-to-end deep learning
Online Access: https://ieeexplore.ieee.org/document/10109737/
author Markus Schwabe
Michael Heizmann
collection DOAJ
description The separation of music signals is a very challenging task, especially for polyphonic chamber music, because the instruments to be separated have similar frequency ranges and sound characteristics. In this work, a joint separation approach in the time domain with a U-Net architecture is extended to incorporate additional time-dependent instrument activity information for improved instrument track extraction. Different stages for integrating the additional information are investigated; an input before the deepest encoder block achieves the best separation results as well as the highest robustness against randomly wrong labels. This approach outperforms label integration by multiplication and the input of a static instrument label. Targeted data augmentation by incoherent mixtures is used for a trio example of violin, trumpet, and flute to improve separation results. Moreover, an alternative separation approach with one independent separation model for each instrument is investigated, which enables a more flexible architecture. In this case, an input after the deepest encoder block achieves the best separation results, but the robustness is slightly reduced compared to the joint model. The improvements from additional information on active instruments are verified by using real instrument activity predictions for both the joint and the independent separation approaches.
first_indexed 2024-04-09T13:16:07Z
format Article
id doaj.art-e0fc645539cf4b60bf8cbce40e6a6260
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-04-09T13:16:07Z
publishDate 2023-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-e0fc645539cf4b60bf8cbce40e6a6260 | 2023-05-11T23:00:43Z | eng | IEEE | IEEE Access | 2169-3536 | 2023-01-01 | 11 | 42999 | 43007 | 10.1109/ACCESS.2023.3271146 | 10109737 | Improved Separation of Polyphonic Chamber Music Signals by Integrating Instrument Activity Labels | Markus Schwabe (0) https://orcid.org/0009-0000-5827-9516 | Michael Heizmann (1) https://orcid.org/0000-0001-9339-2055 | (0) Institute of Industrial Information Technology (IIIT), Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany | (1) Institute of Industrial Information Technology (IIIT), Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany | https://ieeexplore.ieee.org/document/10109737/ | Music source separation | polyphonic chamber music | active instruments | end-to-end deep learning
title Improved Separation of Polyphonic Chamber Music Signals by Integrating Instrument Activity Labels
topic Music source separation
polyphonic chamber music
active instruments
end-to-end deep learning
url https://ieeexplore.ieee.org/document/10109737/
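
Illustrative note: the description mentions data augmentation by incoherent mixtures and the input of time-dependent instrument activity labels into a U-Net. The following is a minimal NumPy sketch of those two ideas, not the authors' implementation. All function names, the RMS-threshold labelling rule, and the use of channel concatenation to feed labels into a feature map are assumptions made for illustration; the record does not specify them.

```python
import numpy as np


def incoherent_mixture(stems_by_instrument, rng, length):
    """Build one training mixture from independently drawn excerpts.

    stems_by_instrument: list (one entry per instrument) of lists of 1-D arrays.
    Each instrument's excerpt comes from a randomly chosen recording and offset,
    so the sources in the mixture are musically unrelated ("incoherent").
    """
    sources = []
    for stems in stems_by_instrument:
        stem = stems[rng.integers(len(stems))]
        start = rng.integers(0, len(stem) - length + 1)
        sources.append(stem[start:start + length])
    sources = np.stack(sources)  # shape: (n_instruments, length)
    return sources.sum(axis=0), sources


def frame_activity(sources, frame_len, threshold=1e-3):
    """Binary per-frame activity labels from frame RMS (hypothetical rule)."""
    n_inst, length = sources.shape
    n_frames = length // frame_len
    frames = sources[:, :n_frames * frame_len].reshape(n_inst, n_frames, frame_len)
    rms = np.sqrt((frames ** 2).mean(axis=-1))
    return (rms > threshold).astype(np.float32)  # shape: (n_instruments, n_frames)


def concat_labels(features, labels):
    """Append activity labels as extra channels of a (channels, steps) feature map.

    Labels are resampled to the feature map's time resolution by picking, for
    each feature step, the label frame that covers it. Concatenation is an
    assumed integration scheme, chosen here only for illustration.
    """
    n_steps = features.shape[1]
    idx = np.arange(n_steps) * labels.shape[1] // n_steps
    return np.concatenate([features, labels[:, idx]], axis=0)
```

In a network, a function like `concat_labels` would be applied to the feature map entering a chosen encoder or decoder block; the description reports that the best insertion point differs between the joint model and the independent per-instrument models.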