Generalizing speech recognition techniques for monotonic sequence modeling applications

Bibliographic Details
Main Author: Shillingford, B
Other Authors: Lukasiewicz, T
Format: Thesis
Language: English
Published: 2019
Subjects:
Description
Summary: This thesis demonstrates how modeling techniques from speech recognition can be advantageous in a variety of other problems. The problems to which we apply these techniques include visual-only speech recognition (lipreading), restoration of characters missing from old Hawaiian orthography but present in the modern orthography, and an interactive lipreading-based keyboard.

In particular, modeling techniques from speech recognition carry two main benefits that are applied throughout this thesis. The first is their exploitation of monotonicity, a property found in many sequence modeling problems beyond speech. The second is the separation of the problem into a combination of learned and fixed pieces, the latter of which facilitates the injection of expert knowledge (in the form of finite state transducers) at arbitrary places in a larger system. Both features encode inductive biases about the problem and thereby improve data efficiency, which is especially beneficial when paired data for learning the end-to-end problem from scratch is scarce or entirely unavailable. We can thus get the best of both worlds: explicit representation of expert knowledge and other inductive biases, and learned neural network components for the parts that are hard to model by hand and for which data is plentiful.