A collaborative semantic-based provenance management platform for reproducibility

Scientific data management plays a key role in the reproducibility of scientific results. To reproduce results, not only the results but also the data and steps of scientific experiments must be made findable, accessible, interoperable, and reusable. Tracking, managing, describing, and visualizing p...

Bibliographic Details
Main Authors: Sheeba Samuel, Birgitta König-Ries
Format: Article
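Language: English
Published: PeerJ Inc. 2022-03-01
Series: PeerJ Computer Science
Subjects: Provenance; Reproducibility; Research data management platform; Jupyter Notebooks; Scientific experiments; Ontology
Online Access: https://peerj.com/articles/cs-921.pdf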
author Sheeba Samuel
Birgitta König-Ries
collection DOAJ
description Scientific data management plays a key role in the reproducibility of scientific results. To reproduce results, not only the results but also the data and steps of scientific experiments must be made findable, accessible, interoperable, and reusable. Tracking, managing, describing, and visualizing provenance helps in the understandability, reproducibility, and reuse of experiments for the scientific community. Current systems lack a link between the data, steps, and results from the computational and non-computational processes of an experiment. Such a link, however, is vital for the reproducibility of results. We present a novel solution for the end-to-end provenance management of scientific experiments. We provide a framework, CAESAR (CollAborative Environment for Scientific Analysis with Reproducibility), which allows scientists to capture, manage, query and visualize the complete path of a scientific experiment consisting of computational and non-computational data and steps in an interoperable way. CAESAR integrates the REPRODUCE-ME provenance model, extended from existing semantic web standards, to represent the whole picture of an experiment describing the path it took from its design to its result. ProvBook, an extension for Jupyter Notebooks, is developed and integrated into CAESAR to support computational reproducibility. We have applied and evaluated our contributions to a set of scientific experiments in microscopy research projects.
first_indexed 2024-12-22T06:14:20Z
format Article
id doaj.art-d179d3c8a166453ea00edf4d67c035f9
institution Directory of Open Access Journals
issn 2376-5992
language English
last_indexed 2024-12-22T06:14:20Z
publishDate 2022-03-01
publisher PeerJ Inc.
record_format Article
series PeerJ Computer Science
spelling doaj.art-d179d3c8a166453ea00edf4d67c035f9; 2022-12-21T18:36:08Z; eng; PeerJ Inc.; PeerJ Computer Science; 2376-5992; 2022-03-01; 8; e921; 10.7717/peerj-cs.921; A collaborative semantic-based provenance management platform for reproducibility; Sheeba Samuel (Michael Stifel Center Jena, Jena, Germany); Birgitta König-Ries (Michael Stifel Center Jena, Jena, Germany); https://peerj.com/articles/cs-921.pdf; Provenance; Reproducibility; Research data management platform; Jupyter Notebooks; Scientific experiments; Ontology
title A collaborative semantic-based provenance management platform for reproducibility
topic Provenance
Reproducibility
Research data management platform
Jupyter Notebooks
Scientific experiments
Ontology
url https://peerj.com/articles/cs-921.pdf