A Long Sequence Speech Perceptual Hashing Authentication Algorithm Based on Constant Q Transform and Tensor Decomposition

Most speech authentication algorithms are over-optimized for robustness and efficiency, which results in poor discrimination. Short hash sequences make it likely that different speech segments map to the same hash sequence, causing serious authentication errors. Few people pa...

Bibliographic Details
Main Authors: Yibo Huang, Hexiang Hou, Yong Wang, Yuan Zhang, Manhong Fan
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects: Speech authentication; perceptual hashing; CQT; TD; hashing long sequence; discrimination
Online Access: https://ieeexplore.ieee.org/document/8999553/
_version_ 1818330164966195200
author Yibo Huang
Hexiang Hou
Yong Wang
Yuan Zhang
Manhong Fan
author_sort Yibo Huang
collection DOAJ
description Most speech authentication algorithms are over-optimized for robustness and efficiency, which results in poor discrimination. Short hash sequences make it likely that different speech segments map to the same hash sequence, causing serious authentication errors. Little research has addressed how hash sequence length affects discrimination, so this paper proposes a long-sequence speech authentication algorithm based on the constant Q transform (CQT) and tensor decomposition (TD). Long hash sequences are used to address the poor collision resistance of existing algorithms, so that important speech fragments with large data volumes can be authenticated quickly and accurately. The frequency-domain sub-bands are first divided into separate matrices, then the set of sub-band variances is computed, and finally the feature values are obtained by applying the CQT and TD. The resulting feature values are strongly robust and can cope with interference from complex channel environments. A database of 51600 speech clips, built from the Texas Instruments and Massachusetts Institute of Technology (TIMIT) speech database and Text to Speech (TTS) output, is used to verify the algorithm's performance. Experimental results show that, compared with existing speech authentication algorithms, the proposed algorithm offers high discrimination, strong robustness, and high efficiency.
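The link between hash length and discrimination described in the abstract above can be illustrated with a generic sketch. It is not the authors' CQT/TD pipeline: it only shows the matching step common to perceptual hashing schemes, where two binary hashes are compared by bit error rate (BER). The function names, the 0.25 threshold, and the assumption of independent, unbiased hash bits are all illustrative choices and do not come from the record.

```python
import math


def bit_error_rate(h1, h2):
    """Fraction of positions where two equal-length binary hashes differ."""
    if len(h1) != len(h2):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(h1, h2)) / len(h1)


def false_accept_rate(n_bits, threshold):
    """Approximate P(BER < threshold) for hashes of two unrelated clips.

    Assumes independent, unbiased bits (an idealization), so the BER is
    roughly normal with mean 0.5 and standard deviation 0.5 / sqrt(n_bits).
    """
    sigma = 0.5 / math.sqrt(n_bits)
    z = (threshold - 0.5) / sigma
    # Standard normal CDF written with the complementary error function.
    return 0.5 * math.erfc(-z / math.sqrt(2))


def is_same_content(h1, h2, threshold=0.25):
    """Accept the pair as perceptually identical when the BER is below threshold.

    The default threshold is a hypothetical value chosen for illustration.
    """
    return bit_error_rate(h1, h2) < threshold
```

Under these assumptions, increasing `n_bits` narrows the BER distribution around 0.5, so `false_accept_rate(512, 0.25)` is far smaller than `false_accept_rate(128, 0.25)`. This is one way to read the abstract's claim that longer hash sequences improve collision resistance.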
first_indexed 2024-12-13T12:59:37Z
format Article
id doaj.art-a9287e520ae848459af56a1b11c34f6a
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-13T12:59:37Z
publishDate 2020-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-a9287e520ae848459af56a1b11c34f6a | 2022-12-21T23:45:04Z | eng | IEEE | IEEE Access | 2169-3536 | 2020-01-01 | 8 | 34140 | 34152 | 10.1109/ACCESS.2020.2974029 | 8999553 | A Long Sequence Speech Perceptual Hashing Authentication Algorithm Based on Constant Q Transform and Tensor Decomposition | Yibo Huang (0, https://orcid.org/0000-0003-1667-3114); Hexiang Hou (1); Yong Wang (2); Yuan Zhang (3); Manhong Fan (4) | College of Physics and Electronic Engineering, Northwest Normal University, Lanzhou, China (all five authors) | https://ieeexplore.ieee.org/document/8999553/ | Speech authentication; perceptual hashing; CQT; TD; hashing long sequence; discrimination
title A Long Sequence Speech Perceptual Hashing Authentication Algorithm Based on Constant Q Transform and Tensor Decomposition
topic Speech authentication
perceptual hashing
CQT
TD
hashing long sequence
discrimination
url https://ieeexplore.ieee.org/document/8999553/