Self-Supervised Learning of Neural Speech Representations From Unlabeled Intracranial Signals

Bibliographic Details
Main Authors: Srdjan Lesaja, Morgan Stuart, Jerry J. Shih, Pedram Z. Soroush, Tanja Schultz, Milos Manic, Dean J. Krusienski
Format: Article
Language: English
Published: IEEE 2022-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/9992205/
Description
Summary: Neuroprosthetics have demonstrated the potential to decode speech from intracranial brain signals, and hold promise for one day returning the ability to speak to those who have lost it. However, data in this domain are scarce, highly variable, and costly to label for supervised modeling. In order to address these constraints, we present brain2vec, a transformer-based approach for learning feature representations from intracranial electroencephalogram data. Brain2vec combines a self-supervised learning methodology, neuroanatomical positional embeddings, and the contextual representations of transformers to achieve three novelties: (1) learning from unlabeled intracranial brain signals, (2) learning from multiple participants simultaneously, all while (3) utilizing only raw unprocessed data. To assess our approach, we use a leave-one-participant-out validation procedure to separate brain2vec’s feature learning from the holdout participant’s speech-related supervised classification tasks. With only two linear layers, we achieve 90% accuracy on a canonical speech detection task, 42% accuracy on a more challenging 4-class speech-related behavior recognition, and 53% accuracy when applied to a 10-class, few-shot word classification task. Combined with the visualizations of unsupervised class separation in the learned features, our results evidence brain2vec’s ability to learn highly generalized representations of neural activity without the need for labels or consistent sensor location.
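The leave-one-participant-out procedure described in the abstract can be sketched as follows. This is a minimal illustration of the splitting logic only, not the authors' code; the names (`make_loo_splits`, `records`) are hypothetical.

```python
# Minimal sketch of leave-one-participant-out validation: for each
# participant, all other participants' data form the training pool and
# the holdout participant's data are reserved for the supervised probe.
# All names here are illustrative, not taken from the brain2vec paper.

def make_loo_splits(records):
    """Yield (holdout_id, train_records, test_records) per participant.

    `records` is a list of (participant_id, sample) pairs.
    """
    participant_ids = sorted({pid for pid, _ in records})
    for holdout in participant_ids:
        train = [r for r in records if r[0] != holdout]
        test = [r for r in records if r[0] == holdout]
        yield holdout, train, test

# Example: three participants with two samples each.
records = [(p, i) for p in ("P1", "P2", "P3") for i in range(2)]
for holdout, train, test in make_loo_splits(records):
    # No holdout data leaks into the training pool.
    assert all(pid != holdout for pid, _ in train)
    assert all(pid == holdout for pid, _ in test)
```

In the paper's setup, the self-supervised representations would be learned only on the training pool, with a small supervised head (reported as two linear layers) then fit and evaluated on the holdout participant's task data.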
ISSN:2169-3536