A Long Sequence Speech Perceptual Hashing Authentication Algorithm Based on Constant Q Transform and Tensor Decomposition
Most speech authentication algorithms are over-optimized for robustness and efficiency, which results in poor discrimination. Hashing a shorter sequence makes it more likely that the same hash sequence is produced by different speech segments, which causes serious authentication errors. Few people pa...
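
The collision argument above can be made concrete with a small calculation. The sketch below is not taken from the paper. It assumes an idealized model in which the hash bits of two unrelated speech segments are independent fair coin flips, and it assumes two hashes count as a match when their normalized Hamming distance falls below a threshold (the value 0.25 and the function name `false_accept_rate` are purely illustrative). Under those assumptions, the chance of falsely matching different segments is a binomial tail that shrinks quickly as the hash gets longer.

```python
from math import comb


def false_accept_rate(n_bits: int, threshold: float) -> float:
    """Probability that two unrelated n_bits hashes differ in fewer than
    threshold * n_bits positions, assuming i.i.d. fair bits (an idealization)."""
    if n_bits <= 0:
        raise ValueError("n_bits must be positive")
    # Count the bit-difference patterns whose normalized Hamming distance
    # is strictly below the threshold.
    favourable = sum(
        comb(n_bits, k) for k in range(n_bits + 1) if k < threshold * n_bits
    )
    # Integer division by 2**n_bits stays exact until the final rounding.
    return favourable / 2 ** n_bits


if __name__ == "__main__":
    # Hypothetical hash lengths, chosen only to show the trend.
    for n in (64, 256, 1024):
        print(n, false_accept_rate(n, 0.25))
```

Real hash bits are correlated, so actual collision rates are higher than this model predicts, but the trend (longer hash, fewer accidental matches) is the point being illustrated.
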
Main Authors: | Yibo Huang, Hexiang Hou, Yong Wang, Yuan Zhang, Manhong Fan |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2020-01-01 |
Series: | IEEE Access |
Subjects: | Speech authentication; perceptual hashing; CQT; TD; hashing long sequence; discrimination |
Online Access: | https://ieeexplore.ieee.org/document/8999553/ |
_version_ | 1818330164966195200 |
---|---|
author | Yibo Huang, Hexiang Hou, Yong Wang, Yuan Zhang, Manhong Fan |
author_sort | Yibo Huang |
collection | DOAJ |
description | Most speech authentication algorithms are over-optimized for robustness and efficiency, which results in poor discrimination. Hashing a shorter sequence makes it more likely that the same hash sequence is produced by different speech segments, which causes serious authentication errors. Little research has examined how hash sequence length affects discrimination, so this paper proposes a long-sequence speech authentication algorithm based on the constant Q transform (CQT) and tensor decomposition (TD). A long hash sequence is used to address the poor collision resistance of existing algorithms, so that fast and accurate authentication can be achieved for important speech fragments with large data volumes. The frequency-domain sub-bands are first divided into separate matrices, then the set of sub-band variances is computed, and finally the feature values are obtained by applying the CQT and TD. The resulting feature values are highly robust and can cope with interference from complex channel environments. The Texas Instruments and Massachusetts Institute of Technology (TIMIT) speech database and Text to Speech (TTS) are used to build a database of 51,600 speech samples for verifying the algorithm's performance. Experimental results show that, compared with existing speech authentication algorithms, the proposed algorithm offers high discrimination, strong robustness, and high efficiency. |
first_indexed | 2024-12-13T12:59:37Z |
format | Article |
id | doaj.art-a9287e520ae848459af56a1b11c34f6a |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-13T12:59:37Z |
publishDate | 2020-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-a9287e520ae848459af56a1b11c34f6a; 2022-12-21T23:45:04Z; eng; IEEE; IEEE Access; 2169-3536; 2020-01-01; 8; 34140; 34152; 10.1109/ACCESS.2020.2974029; 8999553; A Long Sequence Speech Perceptual Hashing Authentication Algorithm Based on Constant Q Transform and Tensor Decomposition; Yibo Huang (https://orcid.org/0000-0003-1667-3114), Hexiang Hou, Yong Wang, Yuan Zhang, Manhong Fan (all: College of Physics and Electronic Engineering, Northwest Normal University, Lanzhou, China); https://ieeexplore.ieee.org/document/8999553/; Speech authentication; perceptual hashing; CQT; TD; hashing long sequence; discrimination |
title | A Long Sequence Speech Perceptual Hashing Authentication Algorithm Based on Constant Q Transform and Tensor Decomposition |
topic | Speech authentication; perceptual hashing; CQT; TD; hashing long sequence; discrimination |
url | https://ieeexplore.ieee.org/document/8999553/ |
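
The description in this record outlines a pipeline (split into sub-bands, take variances, derive features, hash, authenticate) without giving its details. The following is a deliberately simplified stand-in, not the authors' method: it omits the CQT and tensor decomposition entirely, computes block variances on a synthetic time-domain signal, and assumes both the binarization rule (compare each variance with the median) and the matching rule (bit error rate below a threshold). All function names, the threshold, and the toy signal are hypothetical. It is meant only to show the general shape of variance-based perceptual hashing and matching.

```python
import random
from statistics import median, pvariance


def variance_hash(samples, n_bits):
    """Binary hash: one bit per block, set when the block variance exceeds
    the median block variance (this binarization rule is an assumption)."""
    size = len(samples) // n_bits
    if size < 1:
        raise ValueError("need at least one sample per hash bit")
    variances = [pvariance(samples[i * size:(i + 1) * size]) for i in range(n_bits)]
    cutoff = median(variances)
    return [1 if v > cutoff else 0 for v in variances]


def bit_error_rate(hash_a, hash_b):
    """Normalized Hamming distance between two equal-length hashes."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b)) / len(hash_a)


def toy_signal(seed, n_blocks=128, block_len=64):
    """Synthetic stand-in for speech: noise whose loudness changes per block."""
    rng = random.Random(seed)
    out = []
    for _ in range(n_blocks):
        gain = rng.uniform(0.1, 1.0)
        out.extend(gain * rng.gauss(0.0, 1.0) for _ in range(block_len))
    return out


def add_noise(samples, sigma, seed):
    """Simulate a mild channel disturbance with additive Gaussian noise."""
    rng = random.Random(seed)
    return [s + rng.gauss(0.0, sigma) for s in samples]


THRESHOLD = 0.25  # hypothetical decision threshold, not from the paper

original = toy_signal(seed=1)
degraded = add_noise(original, sigma=0.01, seed=2)  # same content, slightly noisy
other = toy_signal(seed=3)                          # unrelated content

reference = variance_hash(original, 128)
ber_same = bit_error_rate(reference, variance_hash(degraded, 128))
ber_other = bit_error_rate(reference, variance_hash(other, 128))
print("same content:", ber_same, "accepted" if ber_same < THRESHOLD else "rejected")
print("different content:", ber_other, "accepted" if ber_other < THRESHOLD else "rejected")
```

A hash of this kind is robust when content-preserving changes flip few bits and discriminative when unrelated content flips about half of them. The record's claims about robustness and discrimination refer to the paper's own CQT and TD features, which this sketch does not reproduce.
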