Unsupervised spoken keyword spotting via segmental DTW on Gaussian posteriorgrams


In this paper, we present an unsupervised learning framework to address the problem of detecting spoken keywords. Without any transcription information, a Gaussian Mixture Model is trained to label speech frames with a Gaussian posteriorgram. Given one or more spoken examples of a keyword, we use se...
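As a rough illustration of the two ingredients named in the abstract, the sketch below computes a Gaussian posteriorgram from an already trained diagonal-covariance GMM and compares two posteriorgrams with plain dynamic time warping. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the negative-log inner product used as the local distance, and the length normalization are all illustrative choices, and the segmental variant of DTW named in the title is not reproduced here.

```python
import numpy as np

def gaussian_posteriorgram(frames, weights, means, variances):
    """Return a (T, K) matrix of P(component k | frame t).

    Assumes a diagonal-covariance GMM that has already been trained:
    frames is (T, D), weights is (K,), means and variances are (K, D).
    """
    diff = frames[:, None, :] - means[None, :, :]                 # (T, K, D)
    log_lik = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                      + np.sum(np.log(2 * np.pi * variances), axis=1))
    log_post = np.log(weights) + log_lik                          # (T, K)
    log_post -= log_post.max(axis=1, keepdims=True)               # numerical stability
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)

def dtw_cost(P, Q, eps=1e-10):
    """Length-normalized DTW cost between two posteriorgrams.

    Assumption: the local distance is -log of the inner product of two
    posterior vectors; this is an illustrative choice.
    """
    local = -np.log(np.clip(P @ Q.T, eps, None))                  # (Tp, Tq)
    n_p, n_q = local.shape
    acc = np.full((n_p + 1, n_q + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n_p + 1):
        for j in range(1, n_q + 1):
            acc[i, j] = local[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return acc[n_p, n_q] / (n_p + n_q)
```

In a keyword-spotting setting, a lower cost between a keyword example and a stretch of test speech would indicate a more likely match.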


Bibliographic Details
Main Authors:	Glass, James R., Zhang, Yaodong, Ph. D. Massachusetts Institute of Technology
Other Authors:	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format:	Article
Language:	en_US
Published:	Institute of Electrical and Electronics Engineers (IEEE) 2012
Online Access:	http://hdl.handle.net/1721.1/73507 https://orcid.org/0000-0002-3097-360X

Similar Items

Unsupervised spoken keyword spotting and learning of acoustically meaningful units
by: Zhang, Yaodong, Ph. D. Massachusetts Institute of Technology
Published: (2010)

Morphological segmentation : an unsupervised method and application to Keyword Spotting
by: Narasimhan, Karthik Rajagopal
Published: (2014)

Unsupervised speech processing with applications to query-by-example spoken term detection
by: Zhang, Yaodong, Ph. D. Massachusetts Institute of Technology
Published: (2013)

Morphological Segmentation for Keyword Spotting
by: Narasimhan, Karthik Rajagopal, et al.
Published: (2015)

Unsupervised learning of spoken language with visual context
by: Harwath, David, et al.
Published: (2020)

A study on out-of-vocabulary word modelling for a segment-based keyword spotting system
by: Manos, Alexandros Sterios
Published: (2007)

Algorithms and low power hardware for keyword spotting
by: Wang, Miaorong
Published: (2018)

Spoken ObjectNet: A Bias-Controlled Spoken Caption Dataset
by: Palmer, Ian, et al.
Published: (2022)

Small footprint model for noisy far-field keyword spotting
by: Pang, Jin Hui
Published: (2022)

Discourse segmentation of spoken dialogue : an empirical approach
by: Flammia, Giovanni, 1963-
Published: (2005)

Minimum cut model for spoken lecture segmentation
by: Malioutov, Igor (Igor Mikhailovich)
Published: (2010)

Spoken Malay language influence on automatic transcription and segmentation
by: Husni, Husniza, et al.
Published: (2013)

Spoken command of large mobile robots in outdoor environments
by: Chuangsuwanich, Ekapol, et al.
Published: (2011)

Speech rhythm guided syllable nuclei detection
by: Glass, James R., et al.
Published: (2010)

NN with DTW-FF Coefficients and Pitch Feature for Speaker Recognition
by: Sudirman, Rubita, et al.
Published: (2006)

Unsupervised multilingual learning
by: Snyder, Benjamin, Ph. D. Massachusetts Institute of Technology
Published: (2011)

Unsupervised multi-texture image segmentation
by: Neena Mittal
Published: (2008)

NN speech recognition utilizing aligned DTW local distance scores
by: Sudirman, Rubita, et al.
Published: (2005)

The Effectiveness of DTW-FF Coefficients and Pitch Feature in NN Speech Recognition
by: Sudirman, Rubita, et al.
Published: (2006)

SmooSeg: smoothness prior for unsupervised semantic segmentation
by: Lan, Mengcheng, et al.
Published: (2024)

Local DTW coefficients and pitch feature for back-propagation NN digits recognition
by: Sudirman, R., et al.
Published: (2006)

Local DTW Coefficients and Pitch Feature for Back-Propagation NN Digits Recognition
by: Sudirman, Rubita, et al.
Published: (2006)

Beyond Keywords
by: Tounaka, Nobuaki, et al.
Published: (2019)

Unsupervised image segmentation using robust clustering
by: Pan, Hong
Published: (2008)

Unsupervised domain adaptation for LiDAR segmentation
by: Kong, Lingdong
Published: (2022)

Unsupervised action segmentation in videos with clustering algorithms
by: Lim, Isaac Sheng Yang
Published: (2024)

Unsupervised learning of morphological forests
by: Luo, Jiaming, (Scientist in Electrical Engineering and Computer Science) Massachusetts Institute of Technology
Published: (2017)

Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input
by: Harwath, David, et al.
Published: (2021)

Unsupervised Lexicon Discovery from Acoustic Input
by: Lee, Chia-ying, et al.
Published: (2015)

Fractional means based method for multi-oriented keyword spotting in video/scene/license plate images
by: Shivakumara, Palaiahnakote, et al.
Published: (2019)

Unsupervised summarization of public talk radio
by: O'Brien, Shayne, S.M. Massachusetts Institute of Technology
Published: (2020)

Unsupervised learning of lexical subclasses from phonotactics
by: Morita, Takashi, Ph. D. Massachusetts Institute of Technology
Published: (2019)

Keyword Join: Realizing Keyword Search for Information Integration
by: Yu, Bei, et al.
Published: (2005)

Anatomical Priors in Convolutional Networks for Unsupervised Biomedical Segmentation
by: Dalca, Adrian Vasile, et al.
Published: (2020)

Unsupervised Deep Learning for Bayesian Brain MRI Segmentation
by: Dalca, Adrian Vasile, et al.
Published: (2021)

Programming with keywords
by: Little, Greg (Danny Greg)
Published: (2008)

Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input
by: Harwath, David F., et al.
Published: (2020)

Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input
by: Harwath, David F., et al.
Published: (2022)

Validity of LIWC in measuring personality expression via spoken language
by: Goh, Bei Jun, et al.
Published: (2023)

Blood cell image segmentation using unsupervised clustering techniques
by: Tuan Muda, Tuan Zalizam, et al.
Published: (2009)