Semantic Coherence Dataset: Speech transcripts

The Semantic Coherence Dataset was designed for experimenting with semantic coherence metrics. More specifically, it was built to test whether probabilistic measures, such as perplexity, provide stable scores for analyzing spoken language. Perplexity, which was originally...

Bibliographic Details
Main Authors: Davide Colla, Matteo Delsanto, Daniele P. Radicioni
Format: Article
Language: English
Published: Elsevier 2023-02-01
Series: Data in Brief
Subjects:
Online Access: http://www.sciencedirect.com/science/article/pii/S2352340922010022