A data structure for representing multi-version texts online

Bibliographic Details
Main Authors: Schmidt, Desmond, Colomb, Robert
Format: Article
Published: Elsevier 2009
Subjects: QA75 Electronic computers. Computer science
author Schmidt, Desmond
Colomb, Robert
collection ePrints
description The digitisation of cultural heritage and linguistics texts has long been troubled by the problem of how to represent overlapping structures arising from different markup perspectives ('overlapping hierarchies') or from different versions of the same work ('textual variation'). These two problems can be reduced to one by observing that every case of overlapping hierarchies is also a case of textual variation. Overlapping textual structures can be accurately modelled either as a minimally redundant directed graph, or, more practically, as an ordered list of pairs, each containing a set of versions and a fragment of text or data. This 'pairs-list' representation is provably equivalent to the graph representation. It can record texts consisting of thousands of versions or perspectives without becoming overloaded with data, and the most common operations on variant text, e.g. comparison between two versions, can be performed in linear time. This representation also separates variation or other overlapping structures from the document content, leading to a simplification of markup suitable for wiki-like web applications.
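The 'pairs-list' idea in the description above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Pair`, `read_version`, `compare` and the sample sentence are hypothetical, and versions are assumed to be identified by small integers. Each pair holds a set of versions and a text fragment; reading one version or comparing two versions is a single pass over the list, which is where the linear-time claim comes from.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Pair:
    """One entry of the list: the versions that share a fragment, and the fragment."""
    versions: FrozenSet[int]
    fragment: str


def read_version(pairs: List[Pair], version: int) -> str:
    """Reassemble one version by concatenating, in order, every fragment it belongs to."""
    return "".join(p.fragment for p in pairs if version in p.versions)


def compare(pairs: List[Pair], a: int, b: int) -> List[Tuple[str, str]]:
    """Classify each fragment as shared or unique to one version, in one pass."""
    result = []
    for p in pairs:
        in_a, in_b = a in p.versions, b in p.versions
        if in_a and in_b:
            result.append(("both", p.fragment))
        elif in_a:
            result.append(("only_a", p.fragment))
        elif in_b:
            result.append(("only_b", p.fragment))
        # Fragments belonging to neither version are skipped.
    return result


# Hypothetical three-version text; shared fragments are stored only once.
PAIRS = [
    Pair(frozenset({1, 2, 3}), "The quick "),
    Pair(frozenset({1}), "brown"),
    Pair(frozenset({2, 3}), "red"),
    Pair(frozenset({1, 2, 3}), " fox jumps over the "),
    Pair(frozenset({1, 2}), "lazy "),
    Pair(frozenset({1, 2, 3}), "dog."),
]

if __name__ == "__main__":
    print(read_version(PAIRS, 1))
    print(read_version(PAIRS, 3))
    for label, fragment in compare(PAIRS, 1, 3):
        print(label, repr(fragment))
```

Because the variation lives entirely in the version sets, the fragments themselves carry no variant markup, which matches the separation of variation from content that the description mentions.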
first_indexed 2024-03-05T18:21:31Z
format Article
id utm.eprints-11771
institution Universiti Teknologi Malaysia - ePrints
last_indexed 2024-03-05T18:21:31Z
publishDate 2009
publisher Elsevier
record_format dspace
spelling utm.eprints-11771 2011-01-17T12:34:57Z http://eprints.utm.my/11771/ Schmidt, Desmond and Colomb, Robert (2009) A data structure for representing multi-version texts online. International Journal of Human Computer Studies, 67 (6). pp. 497-514. ISSN 1071-5819. Elsevier, 2009-06. Article, PeerReviewed. QA75 Electronic computers. Computer science. http://dx.doi.org/10.1016/j.ijhcs.2009.02.001 doi:10.1016/j.ijhcs.2009.02.001
title A data structure for representing multi-version texts online
topic QA75 Electronic computers. Computer science