DP-Parse: Finding Word Boundaries from Raw Speech with an Instance Lexicon

Abstract: Finding word boundaries in continuous speech is challenging as there is little or no equivalent of a ‘space’ delimiter between words. Popular Bayesian non-parametric models for text segmentation (Goldwater et al., 2006, 2009) use a Dirichlet process to jointly segment sentenc...

Bibliographic Details
Main Authors: Robin Algayres, Tristan Ricoul, Julien Karadayi, Hugo Laurençon, Salah Zaiem, Abdelrahman Mohamed, Benoît Sagot, Emmanuel Dupoux
Format: Article
Language: English
Published: The MIT Press 2022-01-01
Series: Transactions of the Association for Computational Linguistics
Online Access: https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00505/113018/DP-Parse-Finding-Word-Boundaries-from-Raw-Speech