Review of methods for coding of speech signals

Bibliographic Details
Main Author: Douglas O’Shaughnessy
Format: Article
Language: English
Published: SpringerOpen 2023-02-01
Series: EURASIP Journal on Audio, Speech, and Music Processing
Online Access: https://doi.org/10.1186/s13636-023-00274-x
Abstract: Speech is the most common form of human communication, and many conversations use digital communication links. For efficient transmission, acoustic speech waveforms are usually converted to digital form, with reduced bit rates, while maintaining decoded speech quality. This paper reviews the history of speech coding techniques, from early mu-law logarithmic compression to recent neural-network methods. The techniques are examined in terms of output quality, algorithmic complexity, delay, and cost. Focus is on which aspects of speech can be exploited for high-quality transmission. The choices made to code speech are motivated by efficiency, the needs of applications, and access to information in the speech signal that is useful for both intelligibility and naturalness in the reconstructed speech at the decoder.
ISSN: 1687-4722
Author Affiliation: INRS-EMT