Finding Sparse Subnetworks in Self-Supervised Speech Recognition and Speech Synthesis

The modern paradigm in speech processing has demonstrated the importance of scale and compute for end-to-end speech recognition and synthesis. For instance, state-of-the-art self-supervised speech representation learning models typically consist of more than 300M model parameters and are trained...

Bibliographic Details
Main Author:	Lai, Cheng-I Jeff
Other Authors:	Glass, James R.
Format:	Thesis
Published:	Massachusetts Institute of Technology 2022
Online Access:	https://hdl.handle.net/1721.1/144615

Similar Items

Evaluating Self-Supervised Speech Representations for Speech Emotion Recognition
by: Bagus Tris Atmaja, et al.
Published: (2022-01-01)

Self-Supervised Audio-Visual Speech Diarization and Recognition
by: Wongprommoon, Arun
Published: (2024)

Speech Representation Models for Speech Synthesis and Multimodal Speech Recognition
by: Sun, Felix (Felix W.)
Published: (2017)

Speech synthesis and recognition/
by: Holmes, J. N.
Published: (1988)

Speech recognition and synthesis
by: Kang, Yi Da
Published: (2023)

Self-Supervised Learning for Speech Processing
by: Chung, Yu-An
Published: (2022)

Digital speech processing: speech coding, synthesis, and recognition /
by: Ince, A. Nejat
Published: (1992)

Grammar-Supervised End-to-End Speech Recognition with Part-of-Speech Tagging and Dependency Parsing
by: Genshun Wan, et al.
Published: (2023-03-01)

Speech Recognition for Task Domains with Sparse Matched Training Data
by: Byung Ok Kang, et al.
Published: (2020-09-01)

Dual supervised learning for non-native speech recognition
by: Kacper Radzikowski, et al.
Published: (2019-01-01)

Connected speech recognition systems /
by: Wilson, Jeff

Speech synthesis and recognition systems /
by: Yannakoudakis, E. J., et al.
Published: (1987)

Speech recognition and speech synthesis design for students information services /
by: Ling, Kee Soon
Published: (2001)

Cross-Corpus Training Strategy for Speech Emotion Recognition Using Self-Supervised Representations
by: Miguel A. Pastor, et al.
Published: (2023-08-01)

Speech2Action: Cross-modal supervision for action recognition
by: Nagrani, A, et al.
Published: (2020)

Speech recognition and speech synthesis design for students information services [microfilm] /
by: Ling, Kee Soon
Published: (2001)

Speech recognition and text-to-speech synthesis for student matric number and CPA /
by: Foo, Chin Ho
Published: (2001)

Digital speech processing, synthesis and recognition /
by: Furui, Sadaoki
Published: (2001)

Adapting Pre-Trained Self-Supervised Learning Model for Speech Recognition with Light-Weight Adapters
by: Xianghu Yue, et al.
Published: (2024-01-01)

Harris Hawks Sparse Auto-Encoder Networks for Automatic Speech Recognition System
by: Mohammed Hasan Ali, et al.
Published: (2022-01-01)

Finding acoustic regularities in speech : applications to phonetic recognition
Published: (2004)

Finding acoustic regularities in speech : applications to phonetic recognition
by: Glass, James Robert
Published: (2005)

Speech recognition and text-to-speech synthesis for student matric number, CPA and GPA /
by: Ee, Su Fun
Published: (2002)

SelfRemaster: Self-Supervised Speech Restoration for Historical Audio Resources
by: Takaaki Saeki, et al.
Published: (2023-01-01)

Synthesis of speech
by: Yeo, Poh Cheng
Published: (2015)

Speech recognition /
by: Mohammad Masroor Ahmed, et al.
Published: (2004)

Speech Recognition
by: Adrian Morariu
Published: (2009-01-01)

Self-supervised contrastive video-speech representation learning for ultrasound
by: Jiao, J, et al.
Published: (2020)

Disentangled Speech Embeddings Using Cross-Modal Self-Supervision
by: Nagrani, A, et al.
Published: (2020)

Self-supervised learning for Formosan speech representation and linguistic phylogeny
by: Shu-Kai Hsieh, et al.
Published: (2024-03-01)

KsponSpeech: Korean Spontaneous Speech Corpus for Automatic Speech Recognition
by: Jeong-Uk Bang, et al.
Published: (2020-10-01)

Fundamentals of speech synthesis and speech recognition : basic concepts, state-of-the-art and future challenges /
by: Keller, Eric
Published: (1994)

Deep Learning, Ensemble and Supervised Machine Learning for Arabic Speech Emotion Recognition
by: Wahiba Ismaiel, et al.
Published: (2024-04-01)

Semi-Supervised Speech Recognition Acoustic Model Training Using Policy Gradient
by: Hoon Chung, et al.
Published: (2020-05-01)

Semi-Supervised Learning for Robust Emotional Speech Synthesis with Limited Data
by: Jialin Zhang, et al.
Published: (2023-05-01)

A Survey of Automatic Speech Recognition for Dysarthric Speech
by: Zhaopeng Qian, et al.
Published: (2023-10-01)

Hand Gesture Recognition and Conversion to Speech for Speech Impaired
by: Annapoorna E., et al.
Published: (2023-01-01)

Combined spectral and speech features for pig speech recognition.
by: Xuan Wu, et al.
Published: (2022-01-01)

Robust speech features and acoustic models for speech recognition
by: Xiao, Xiong
Published: (2010)