Improved Separation of Polyphonic Chamber Music Signals by Integrating Instrument Activity Labels
The separation of music signals is a very challenging task, especially in case of polyphonic chamber music signals because of the similar frequency ranges and sound characteristics of the different instruments to separate. In this work, a joint separation approach in the time domain with a U-Net arc...
Main Authors: | Markus Schwabe, Michael Heizmann |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2023-01-01 |
Series: | IEEE Access |
Subjects: | Music source separation; polyphonic chamber music; active instruments; end-to-end deep learning |
Online Access: | https://ieeexplore.ieee.org/document/10109737/ |
_version_ | 1797829168959848448 |
---|---|
author | Markus Schwabe; Michael Heizmann |
author_facet | Markus Schwabe; Michael Heizmann |
author_sort | Markus Schwabe |
collection | DOAJ |
description | The separation of music signals is a very challenging task, especially in case of polyphonic chamber music signals because of the similar frequency ranges and sound characteristics of the different instruments to separate. In this work, a joint separation approach in the time domain with a U-Net architecture is extended to incorporate additional time-dependent instrument activity information for improved instrument track extractions. Different stages are investigated to integrate the additional information, but an input before the deepest encoder block achieves best separation results as well as highest robustness against randomly wrong labels. This approach outperforms a label integration by multiplication and the input of a static instrument label. Targeted data augmentation by incoherent mixtures is used for a trio example of violin, trumpet, and flute to improve separation results. Moreover, an alternative separation approach with one independent separation model for each instrument is investigated, which enables a more flexible architecture. In this case, an input after the deepest encoder block achieves best separation results, but the robustness is slightly reduced compared to the joint model. The improvements by additional information on active instruments are verified by using real instrument activity predictions for both the joint and the independent separation approaches. |
first_indexed | 2024-04-09T13:16:07Z |
format | Article |
id | doaj.art-e0fc645539cf4b60bf8cbce40e6a6260 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-04-09T13:16:07Z |
publishDate | 2023-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-e0fc645539cf4b60bf8cbce40e6a6260; 2023-05-11T23:00:43Z; eng; IEEE; IEEE Access; 2169-3536; 2023-01-01; 11; 42999; 43007; 10.1109/ACCESS.2023.3271146; 10109737; Improved Separation of Polyphonic Chamber Music Signals by Integrating Instrument Activity Labels; Markus Schwabe; 0; https://orcid.org/0009-0000-5827-9516; Michael Heizmann; 1; https://orcid.org/0000-0001-9339-2055; Institute of Industrial Information Technology (IIIT), Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany; Institute of Industrial Information Technology (IIIT), Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany; The separation of music signals is a very challenging task, especially in case of polyphonic chamber music signals because of the similar frequency ranges and sound characteristics of the different instruments to separate. In this work, a joint separation approach in the time domain with a U-Net architecture is extended to incorporate additional time-dependent instrument activity information for improved instrument track extractions. Different stages are investigated to integrate the additional information, but an input before the deepest encoder block achieves best separation results as well as highest robustness against randomly wrong labels. This approach outperforms a label integration by multiplication and the input of a static instrument label. Targeted data augmentation by incoherent mixtures is used for a trio example of violin, trumpet, and flute to improve separation results. Moreover, an alternative separation approach with one independent separation model for each instrument is investigated, which enables a more flexible architecture. In this case, an input after the deepest encoder block achieves best separation results, but the robustness is slightly reduced compared to the joint model. The improvements by additional information on active instruments are verified by using real instrument activity predictions for both the joint and the independent separation approaches.; https://ieeexplore.ieee.org/document/10109737/; Music source separation; polyphonic chamber music; active instruments; end-to-end deep learning |
spellingShingle | Markus Schwabe Michael Heizmann Improved Separation of Polyphonic Chamber Music Signals by Integrating Instrument Activity Labels IEEE Access Music source separation polyphonic chamber music active instruments end-to-end deep learning |
title | Improved Separation of Polyphonic Chamber Music Signals by Integrating Instrument Activity Labels |
title_full | Improved Separation of Polyphonic Chamber Music Signals by Integrating Instrument Activity Labels |
title_fullStr | Improved Separation of Polyphonic Chamber Music Signals by Integrating Instrument Activity Labels |
title_full_unstemmed | Improved Separation of Polyphonic Chamber Music Signals by Integrating Instrument Activity Labels |
title_short | Improved Separation of Polyphonic Chamber Music Signals by Integrating Instrument Activity Labels |
title_sort | improved separation of polyphonic chamber music signals by integrating instrument activity labels |
topic | Music source separation; polyphonic chamber music; active instruments; end-to-end deep learning |
url | https://ieeexplore.ieee.org/document/10109737/ |
work_keys_str_mv | AT markusschwabe improvedseparationofpolyphonicchambermusicsignalsbyintegratinginstrumentactivitylabels AT michaelheizmann improvedseparationofpolyphonicchambermusicsignalsbyintegratinginstrumentactivitylabels |