Towards the Development of a Test Corpus of Digital Objects for the Evaluation of File Format Identification Tools and Signatures
The digital preservation community currently utilises a number of tools and automated processes to identify and validate digital objects. The identification of digital objects is a vital first step in their long-term preservation, but the results returned by tools used for this purpose are lacking i...
Main Authors: | Andrew Fetherston, Tim Gollins |
---|---|
Format: | Article |
Language: | English |
Published: | University of Edinburgh, 2012-03-01 |
Series: | International Journal of Digital Curation |
Online Access: | https://129.215.67.1/ijdc/article/view/211 |
_version_ | 1797393150067605504 |
---|---|
author | Andrew Fetherston, Tim Gollins |
collection | DOAJ |
description | The digital preservation community currently utilises a number of tools and automated processes to identify and validate digital objects. The identification of digital objects is a vital first step in their long-term preservation, but the results returned by tools used for this purpose are lacking in transparency, and are not easily tested or verified. This paper suggests that a test corpus of digital objects is one way of providing this verification and validation, ultimately improving trust in the tools, and providing further stimulus to their development. Issues to be considered are outlined, and attention is drawn to particular examples of existing digital corpora which could conceivably provide a useable framework or starting point for our own community's needs. This paper does not seek to answer all questions in this area, but merely attempts to set out areas for consideration in any next step that is taken. |
first_indexed | 2024-03-08T23:59:01Z |
format | Article |
id | doaj.art-af1c407998df4ce6922eca6fba4a36d6 |
institution | Directory of Open Access Journals |
issn | 1746-8256 |
language | English |
last_indexed | 2024-03-08T23:59:01Z |
publishDate | 2012-03-01 |
publisher | University of Edinburgh |
record_format | Article |
series | International Journal of Digital Curation |
title | Towards the Development of a Test Corpus of Digital Objects for the Evaluation of File Format Identification Tools and Signatures |
url | https://129.215.67.1/ijdc/article/view/211 |