A data structure for representing multi-version texts online
The digitisation of cultural heritage and linguistics texts has long been troubled by the problem of how to represent overlapping structures arising from different markup perspectives ('overlapping hierarchies') or from different versions of the same work ('textual variation'). T...
Main Authors: | Schmidt, Desmond; Colomb, Robert |
---|---|
Format: | Article |
Published: | Elsevier 2009 |
Subjects: | QA75 Electronic computers. Computer science |
_version_ | 1796854942940528640 |
---|---|
author | Schmidt, Desmond; Colomb, Robert |
author_facet | Schmidt, Desmond; Colomb, Robert |
author_sort | Schmidt, Desmond |
collection | ePrints |
description | The digitisation of cultural heritage and linguistics texts has long been troubled by the problem of how to represent overlapping structures arising from different markup perspectives ('overlapping hierarchies') or from different versions of the same work ('textual variation'). These two problems can be reduced to one by observing that every case of overlapping hierarchies is also a case of textual variation. Overlapping textual structures can be accurately modelled either as a minimally redundant directed graph, or, more practically, as an ordered list of pairs, each containing a set of versions and a fragment of text or data. This 'pairs-list' representation is provably equivalent to the graph representation. It can record texts consisting of thousands of versions or perspectives without becoming overloaded with data, and the most common operations on variant text, e.g. comparison between two versions, can be performed in linear time. This representation also separates variation or other overlapping structures from the document content, leading to a simplification of markup suitable for wiki-like web applications. |
first_indexed | 2024-03-05T18:21:31Z |
format | Article |
id | utm.eprints-11771 |
institution | Universiti Teknologi Malaysia - ePrints |
last_indexed | 2024-03-05T18:21:31Z |
publishDate | 2009 |
publisher | Elsevier |
record_format | dspace |
spelling | utm.eprints-11771 2011-01-17T12:34:57Z http://eprints.utm.my/11771/ QA75 Electronic computers. Computer science Elsevier 2009-06 Article PeerReviewed Schmidt, Desmond and Colomb, Robert (2009) A data structure for representing multi-version texts online. International Journal of Human Computer Studies, 67 (6). pp. 497-514. ISSN 1071-5819 http://dx.doi.org/10.1016/j.ijhcs.2009.02.001 doi:10.1016/j.ijhcs.2009.02.001 |
title | A data structure for representing multi-version texts online |
topic | QA75 Electronic computers. Computer science |
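
The 'pairs-list' representation described in the abstract (an ordered list of pairs, each holding a set of versions and a fragment of text) can be illustrated with a minimal Python sketch. This is not the authors' implementation, which this record does not show. The names `Pair`, `read_version` and `compare`, the version labels and the sample text are all hypothetical, chosen only for illustration. The single pass in `compare` is linear in the number of pairs, assuming constant-time set membership.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pair:
    """One entry of the pairs-list (hypothetical name)."""
    versions: frozenset  # identifiers of the versions that share this fragment
    text: str            # the fragment of text itself


def read_version(pairs, version):
    """Concatenate, in list order, the fragments belonging to `version`."""
    return "".join(p.text for p in pairs if version in p.versions)


def compare(pairs, a, b):
    """Walk the list once and label each fragment relative to versions a and b.

    Fragments that belong to neither version are skipped.
    """
    result = []
    for p in pairs:
        in_a = a in p.versions
        in_b = b in p.versions
        if in_a and in_b:
            result.append(("both", p.text))
        elif in_a:
            result.append(("only_a", p.text))
        elif in_b:
            result.append(("only_b", p.text))
    return result


# Invented sample data: three versions (A, B, C) of one short sentence.
pairs = [
    Pair(frozenset({"A", "B", "C"}), "The quick "),
    Pair(frozenset({"A"}), "brown"),
    Pair(frozenset({"B", "C"}), "red"),
    Pair(frozenset({"A", "B", "C"}), " fox"),
    Pair(frozenset({"C"}), " jumps"),
]

if __name__ == "__main__":
    for v in ("A", "B", "C"):
        print(v, "->", read_version(pairs, v))
    print(compare(pairs, "A", "B"))
```

In this sketch, text shared by several versions is stored once, and the variation lives entirely in the version sets rather than in markup embedded in the text. That mirrors the separation of variation from document content that the abstract describes.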