Finding Sparse Subnetworks in Self-Supervised Speech Recognition and Speech Synthesis
The modern paradigm in speech processing has demonstrated the importance of scale and compute for end-to-end speech recognition and synthesis. For instance, state-of-the-art self-supervised speech representation learning models typically consist of more than 300M parameters and are trained...
Main Author: | Lai, Cheng-I Jeff |
---|---|
Other Authors: | Glass, James R. |
Format: | Thesis |
Published: | Massachusetts Institute of Technology, 2022 |
Online Access: | https://hdl.handle.net/1721.1/144615 |
Similar Items
- Evaluating Self-Supervised Speech Representations for Speech Emotion Recognition
  by: Bagus Tris Atmaja, et al.
  Published: (2022-01-01)
- Self-Supervised Audio-Visual Speech Diarization and Recognition
  by: Wongprommoon, Arun
  Published: (2024)
- Speech Representation Models for Speech Synthesis and Multimodal Speech Recognition
  by: Sun, Felix (Felix W.)
  Published: (2017)
- Speech synthesis and recognition
  by: Holmes, J. N.
  Published: (1988)
- Speech recognition and synthesis
  by: Kang, Yi Da
  Published: (2023)