Self-supervised learning for Formosan speech representation and linguistic phylogeny

Formosan languages, spoken by the indigenous peoples of Taiwan, play a unique role in the reconstruction of Proto-Austronesian. This paper presents a real-world Formosan language speech dataset, including 144 h of news footage for 16 Formosan languages, and uses self-supervised models to ob...

Bibliographic Details
Main Authors: Shu-Kai Hsieh, Yu-Hsiang Tseng, Da-Chen Lian, Chi-Wei Wang
Format: Article
Language: English
Published: Frontiers Media S.A. 2024-03-01
Series: Frontiers in Language Sciences
Online Access: https://www.frontiersin.org/articles/10.3389/flang.2024.1338684/full