Grammar-Based Specification and Parsing of Binary File Formats
The capability to validate and view or play binary file formats, as well as to convert binary file formats to standard or current file formats, is critically important to the preservation of digital data and records. This paper describes the extension of context-free grammars from strings to binary...
Main Author: | William Underwood |
---|---|
Format: | Article |
Language: | English |
Published: | University of Edinburgh, 2012-03-01 |
Series: | International Journal of Digital Curation |
Online Access: | https://ijdc.net/index.php/ijdc/article/view/217 |
_version_ | 1797323814027132928 |
---|---|
author | William Underwood |
collection | DOAJ |
description | The capability to validate and view or play binary file formats, as well as to convert binary file formats to standard or current file formats, is critically important to the preservation of digital data and records. This paper describes the extension of context-free grammars from strings to binary files. Binary files are arrays of data types, such as long and short integers, floating-point numbers and pointers, as well as characters. The concept of an attribute grammar is extended to these context-free array grammars. This attribute grammar has been used to define a number of chunk-based and directory-based binary file formats. A parser generator has been used with some of these grammars to generate syntax checkers (recognizers) for validating binary file formats. Among the potential benefits of an attribute grammar-based approach to the specification and parsing of binary file formats is that attribute grammars support not only format validation but also the generation of error messages during format validation, the validation of semantic constraints, attribute value extraction (characterization), the generation of viewers or players for file formats, and conversion to current or standard file formats. The significance of these results is that, with these extensions to core computer science concepts, traditional parser/compiler technologies can potentially be used as part of a general, cost-effective curation strategy for binary file formats. |
first_indexed | 2024-03-08T05:34:33Z |
format | Article |
id | doaj.art-03f6ec0e9f8443dcb68341b2f9bd1bb2 |
institution | Directory Open Access Journal |
issn | 1746-8256 |
language | English |
last_indexed | 2024-03-08T05:34:33Z |
publishDate | 2012-03-01 |
publisher | University of Edinburgh |
record_format | Article |
series | International Journal of Digital Curation |
title | Grammar-Based Specification and Parsing of Binary File Formats |
url | https://ijdc.net/index.php/ijdc/article/view/217 |
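
The description above speaks of chunk-based formats, recognizers, semantic constraints, error messages, and attribute value extraction. The following is a minimal, hand-written Python sketch of those ideas. It is not the paper's grammar notation, and it is not the parser generator the paper mentions. The format is hypothetical: the `DEMO` signature, the 4-byte chunk id, and the big-endian 32-bit length field are assumptions made for illustration only.

```python
import struct


class FormatError(ValueError):
    """Raised when the input does not match the hypothetical grammar."""


MAGIC = b"DEMO"                        # hypothetical signature, not from the paper
CHUNK_HEADER = struct.Struct(">4sI")   # assumed layout: 4-byte id + big-endian uint32 length


def parse_chunk(data: bytes, pos: int) -> tuple[int, tuple[str, int]]:
    """chunk -> id length payload, with the constraint len(payload) == length.

    Returns the offset just past the chunk and its (id, length) attributes.
    """
    if pos + CHUNK_HEADER.size > len(data):
        raise FormatError(f"offset {pos}: truncated chunk header")
    raw_id, length = CHUNK_HEADER.unpack_from(data, pos)
    body_start = pos + CHUNK_HEADER.size
    # Semantic constraint: the declared length must fit in the remaining bytes.
    if body_start + length > len(data):
        raise FormatError(
            f"offset {pos}: chunk {raw_id!r} declares {length} bytes, "
            f"only {len(data) - body_start} remain"
        )
    return body_start + length, (raw_id.decode("ascii", "replace"), length)


def parse_file(data: bytes) -> list[tuple[str, int]]:
    """file -> MAGIC chunk* ; returns the extracted attributes of every chunk."""
    if data[:4] != MAGIC:
        raise FormatError("offset 0: expected signature b'DEMO'")
    pos, chunks = 4, []
    while pos < len(data):
        pos, attrs = parse_chunk(data, pos)
        chunks.append(attrs)
    return chunks


# A well-formed sample: signature, a 2-byte HEAD chunk, an empty DATA chunk.
sample = MAGIC + CHUNK_HEADER.pack(b"HEAD", 2) + b"\x00\x01" + CHUNK_HEADER.pack(b"DATA", 0)
print(parse_file(sample))
```

In this sketch each function plays the role of one grammar production, the length check stands in for a semantic constraint, and the returned `(id, length)` pairs stand in for extracted attribute values. A generated parser would derive the same checks from a declarative grammar rather than from hand-written code.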