Review of methods for coding of speech signals
Abstract: Speech is the most common form of human communication, and many conversations use digital communication links. For efficient transmission, acoustic speech waveforms are usually converted to digital form, with reduced bit rates, while maintaining decoded speech quality. This paper reviews th...
Main Author: | Douglas O’Shaughnessy |
---|---|
Format: | Article |
Language: | English |
Published: | SpringerOpen, 2023-02-01 |
Series: | EURASIP Journal on Audio, Speech, and Music Processing |
Online Access: | https://doi.org/10.1186/s13636-023-00274-x |
_version_ | 1811165811123945472 |
---|---|
author | Douglas O’Shaughnessy |
collection | DOAJ |
description | Speech is the most common form of human communication, and many conversations use digital communication links. For efficient transmission, acoustic speech waveforms are usually converted to digital form, with reduced bit rates, while maintaining decoded speech quality. This paper reviews the history of speech coding techniques, from early mu-law logarithmic compression to recent neural-network methods. The techniques are examined in terms of output quality, algorithmic complexity, delay, and cost. Focus is on which aspects of speech can be exploited for high-quality transmission. The choices made to code speech are motivated by efficiency, the needs of applications, and access to information in the speech signal that is useful for both intelligibility and naturalness in the reconstructed speech at the decoder. |
first_indexed | 2024-04-10T15:42:40Z |
format | Article |
id | doaj.art-fd6d7c61140f4eb4befa3d26f3a7f1d3 |
institution | Directory Open Access Journal |
issn | 1687-4722 |
language | English |
last_indexed | 2024-04-10T15:42:40Z |
publishDate | 2023-02-01 |
publisher | SpringerOpen |
record_format | Article |
series | EURASIP Journal on Audio, Speech, and Music Processing |
spelling | doaj.art-fd6d7c61140f4eb4befa3d26f3a7f1d3; 2023-02-12T12:18:50Z; eng; SpringerOpen; EURASIP Journal on Audio, Speech, and Music Processing; 1687-4722; 2023-02-01; 20231125; 10.1186/s13636-023-00274-x; Review of methods for coding of speech signals; Douglas O’Shaughnessy; 0; INRS-EMT; https://doi.org/10.1186/s13636-023-00274-x |
title | Review of methods for coding of speech signals |
url | https://doi.org/10.1186/s13636-023-00274-x |