Summary: | Spoken discourse is a uniquely valuable source of data in cognitive research. A natural way of representing spoken discourse is in the form of a transcript in standard orthography. However, since transcribing is, for neuroscientists at any rate, no more than a means to an end, many researchers give only cursory descriptions of the transcription process, including the assessment of agreement between transcribers. This article introduces a novel approach to the systematic assessment of agreement between transcripts. The method first involves the automated alignment of two texts, followed by the automatic identification and quantification of discrepancies. A similarity score is then computed, providing researchers with a tool to evaluate the accuracy of the pair of transcripts in question. Most importantly, the automated production of a set of comparison tables reveals and summarizes the actual mismatches found, making it possible to identify common causes of discrepancy. Through applying this approach to medical data collected for an investigation of dementia, the present study demonstrates its value in the amendment of transcripts and the improvement of transcription practices, which pave the way towards more reliable transcriptions for research purposes. © The Author 2011. Published by Oxford University Press on behalf of ALLC. All rights reserved.
|
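The pipeline outlined in the summary (align two transcripts, identify and count the discrepancies, compute a similarity score, tabulate the mismatches) can be illustrated with a minimal sketch. This is not the article's own algorithm, which the summary does not specify: it assumes word-level alignment using `difflib.SequenceMatcher` from the Python standard library, a simple lowercase whitespace tokenizer, and the matcher's built-in ratio as the similarity score. The names `tokenize` and `compare_transcripts` are hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, deliberately simple tokenizer)."""
    return text.lower().split()


def compare_transcripts(text_a, text_b):
    """Align two transcripts word by word and summarize where they differ.

    Returns a similarity score in [0, 1] and a Counter that maps each
    discrepancy (kind, words in A, words in B) to how often it occurs.
    """
    words_a, words_b = tokenize(text_a), tokenize(text_b)

    # Stand-in alignment step: difflib's matcher, not the article's method.
    matcher = SequenceMatcher(a=words_a, b=words_b, autojunk=False)

    mismatches = Counter()
    for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if tag == "equal":
            continue  # aligned and identical, so nothing to report
        # tag is "replace", "delete" (only in A), or "insert" (only in B)
        key = (
            tag,
            " ".join(words_a[a_start:a_end]),
            " ".join(words_b[b_start:b_end]),
        )
        mismatches[key] += 1

    # ratio() is 2 * matched words / total words across both transcripts.
    return matcher.ratio(), mismatches


if __name__ == "__main__":
    score, table = compare_transcripts(
        "the cat sat on the mat",
        "the cat sit on a mat",
    )
    print(f"similarity: {score:.2f}")
    for (kind, in_a, in_b), count in table.most_common():
        print(f"{kind:8} {in_a!r:10} {in_b!r:10} x{count}")
```

Sorting the counter by frequency gives a rough analogue of a comparison table: recurring substitutions rise to the top, which is where systematic causes of disagreement between transcribers would be expected to show up.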