Revealing the Detailed Lineage of Script Outputs Using Hybrid Provenance
We illustrate how combining retrospective and prospective provenance can yield scientifically meaningful hybrid provenance representations of the computational histories of data produced during a script run. We use scripts from multiple disciplines (astrophysics, climate science, biodiversity data cur...
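Main Authors: | Qian Zhang, Yang Cao, Qiwen Wang, Duc Vu, Priyaa Thavasimani, Timothy McPhillips, Paolo Missier, Peter Slaughter, Christopher Jones, Matthew B. Jones, Bertram Ludäscher |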
---|---|
Format: | Article |
Language: | English |
Published: | University of Edinburgh, 2018-08-01 |
Series: | International Journal of Digital Curation |
Online Access: | http://www.ijdc.net/article/view/585 |
_version_ | 1818527780943429632 |
---|---|
author | Qian Zhang, Yang Cao, Qiwen Wang, Duc Vu, Priyaa Thavasimani, Timothy McPhillips, Paolo Missier, Peter Slaughter, Christopher Jones, Matthew B. Jones, Bertram Ludäscher |
author_facet | Qian Zhang, Yang Cao, Qiwen Wang, Duc Vu, Priyaa Thavasimani, Timothy McPhillips, Paolo Missier, Peter Slaughter, Christopher Jones, Matthew B. Jones, Bertram Ludäscher |
author_sort | Qian Zhang |
collection | DOAJ |
description | We illustrate how combining retrospective and prospective provenance can yield scientifically meaningful hybrid provenance representations of the computational histories of data produced during a script run. We use scripts from multiple disciplines (astrophysics, climate science, biodiversity data curation, and social network analysis), implemented in Python, R, and MATLAB, to highlight the usefulness of diverse forms of retrospective provenance when coupled with prospective provenance. Users provide prospective provenance, i.e., the conceptual workflows latent in scripts, via simple YesWorkflow annotations, embedded as script comments. Runtime observables can be linked to prospective provenance via relational views and queries. These observables could be found hidden in filenames or folder structures, be recorded in log files, or they can be automatically captured using tools such as noWorkflow or the DataONE RunManagers. The YesWorkflow toolkit, example scripts, and demonstration code are available via an open source repository. |
first_indexed | 2024-12-11T06:40:49Z |
format | Article |
id | doaj.art-89261fc52aa747f3a1ac87c9eeeab21f |
institution | Directory Open Access Journal |
issn | 1746-8256 |
language | English |
last_indexed | 2024-12-11T06:40:49Z |
publishDate | 2018-08-01 |
publisher | University of Edinburgh |
record_format | Article |
series | International Journal of Digital Curation |
spelling | doaj.art-89261fc52aa747f3a1ac87c9eeeab21f; 2022-12-22T01:17:15Z; eng; University of Edinburgh; International Journal of Digital Curation; 1746-8256; 2018-08-01; 12; 2; 10.2218/ijdc.v12i2.585; Revealing the Detailed Lineage of Script Outputs Using Hybrid Provenance; Qian Zhang (University of Illinois at Urbana-Champaign), Yang Cao (University of Illinois at Urbana-Champaign), Qiwen Wang (University of Illinois at Urbana-Champaign), Duc Vu (University of Illinois at Chicago), Priyaa Thavasimani (Newcastle University), Timothy McPhillips (University of Illinois at Urbana-Champaign), Paolo Missier (Newcastle University), Peter Slaughter (University of California, Santa Barbara), Christopher Jones (University of California, Santa Barbara), Matthew B. Jones (University of California, Santa Barbara), Bertram Ludäscher (University of Illinois at Urbana-Champaign); http://www.ijdc.net/article/view/585 |
title | Revealing the Detailed Lineage of Script Outputs Using Hybrid Provenance |
title_sort | revealing the detailed lineage of script outputs using hybrid provenance |
url | http://www.ijdc.net/article/view/585 |
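
The abstract describes prospective provenance declared through YesWorkflow annotations in script comments, and runtime observables (for example, values hidden in filenames) linked to it through relational views. The following is a minimal sketch of that idea, not the YesWorkflow toolkit itself: the annotation keywords (`@begin`, `@in`, `@out`, `@uri`, `@end`) follow YesWorkflow conventions, but the block name, data names, file paths, table names, and the toy parser are all hypothetical and chosen only for illustration.

```python
import re
import sqlite3

# A toy script fragment with YesWorkflow-style annotations in comments.
# Block names, data names, and paths are hypothetical.
ANNOTATED_SCRIPT = """
# @begin clean_records
# @in raw_records @uri file:data/raw/{site}.csv
# @out cleaned_records @uri file:data/clean/{site}_{year}.csv
# @end clean_records
"""

# Files hypothetically present after a run (the retrospective side).
FILES = ["data/clean/alpha_2016.csv", "data/clean/beta_2017.csv"]


def declared_outputs(script):
    """Return (data_name, uri_template) pairs from '@out ... @uri file:...' lines."""
    pattern = re.compile(r"#\s*@out\s+(\w+)\s+@uri\s+file:(\S+)")
    return pattern.findall(script)


def template_to_regex(template):
    """Turn 'a/{x}_{y}.csv' into a regex with one named group per variable."""
    parts = re.split(r"\{(\w+)\}", template)  # literals at even indices, names at odd
    out = ""
    for i, part in enumerate(parts):
        out += re.escape(part) if i % 2 == 0 else f"(?P<{part}>[^/_]+)"
    return re.compile(out + r"\Z")


def build_db(script, files):
    """Load declared outputs and filename observables, then join them in a view."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE declared_out (data_name TEXT, uri_template TEXT)")
    db.execute(
        "CREATE TABLE observed (uri_template TEXT, path TEXT, var TEXT, value TEXT)"
    )
    for name, template in declared_outputs(script):
        db.execute("INSERT INTO declared_out VALUES (?, ?)", (name, template))
        rx = template_to_regex(template)
        for path in files:
            m = rx.match(path)
            if m:
                for var, value in m.groupdict().items():
                    db.execute(
                        "INSERT INTO observed VALUES (?, ?, ?, ?)",
                        (template, path, var, value),
                    )
    # Hybrid view: each observed file is tied back to the declared data element.
    db.execute(
        """CREATE VIEW output_lineage AS
           SELECT d.data_name, o.path, o.var, o.value
           FROM declared_out d JOIN observed o
             ON o.uri_template = d.uri_template"""
    )
    return db


if __name__ == "__main__":
    db = build_db(ANNOTATED_SCRIPT, FILES)
    query = "SELECT * FROM output_lineage ORDER BY path, var"
    for row in db.execute(query):
        print(row)
```

Querying the `output_lineage` view answers questions such as "which site and year does this output file belong to, and which declared data element does it instantiate?" without re-running the script. The real toolkit and the capture tools named in the abstract record considerably more detail than this sketch.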