Dynamic Data Citation Service—Subset Tool for Operational Data Management

In earth observation and climatological sciences, data and their data services grow on a daily basis in a large spatial extent due to the high coverage rate of satellite sensors, model calculations, but also by continuous meteorological in situ observations. In order to reuse such data, especially d...

Bibliographic Details
Main Authors:	Chris Schubert, Georg Seyerl, Katharina Sack
Format:	Article
Language:	English
Published:	MDPI AG 2019-08-01
Series:	Data
Subjects:	dynamic data citation; subset; data curation; persistent identifier; data provenance; metadata; versioning; query store; data sharing; FAIR principles
Online Access:	https://www.mdpi.com/2306-5729/4/3/115

_version_	1811184693278670848
author	Chris Schubert, Georg Seyerl, Katharina Sack
collection	DOAJ
description	In earth observation and climatological sciences, data and the associated data services grow daily and cover large spatial extents, driven by the high coverage rate of satellite sensors, by model calculations, and by continuous meteorological in situ observations. To reuse such data, especially data fragments and their data services, in a collaborative and reproducible manner that cites the original source, data analysts such as researchers or impact modelers need a way to identify the exact version, precise time information, parameters, and names of the dataset used. Citing data fragments as a subset of an entire dataset by hand would be complex and imprecise. Data in climate research are in most cases multidimensional, structured grid data that can change partially over time. Citing such evolving content requires the approach of “dynamic data citation”. The approach is based on associating queries with persistent identifiers. These queries contain the subsetting parameters, e.g., the spatial coordinates of the desired study area or a time frame with a start and end date. The parameters are automatically included in the metadata of the newly generated subset and thus record the data history, i.e., the data provenance, which has to be established in data repository ecosystems. The Research Data Alliance Data Citation Working Group (RDA Data Citation WG) summarized the scientific status quo and the state of the art of existing citation and data management concepts and developed a scalable dynamic data citation methodology for evolving data. The Data Centre at the Climate Change Centre Austria (CCCA) has implemented these recommendations and has offered an operational dynamic data citation service for climate scenario data since 2017. Aware that this topic depends heavily on bibliographic citation research that is still under discussion, the CCCA service on Dynamic Data Citation focused on issues specific to the climate domain, such as data characteristics, formats, software environment, and usage behavior. Beyond sharing the experience gained, current efforts address the scalability of the implementation, e.g., towards a potential Open Data Cube solution.
first_indexed	2024-04-11T13:17:47Z
format	Article
id	doaj.art-201d73a1add44340b6922d83bacff279
institution	Directory Open Access Journal
issn	2306-5729
language	English
last_indexed	2024-04-11T13:17:47Z
publishDate	2019-08-01
publisher	MDPI AG
record_format	Article
series	Data
spelling	doaj.art-201d73a1add44340b6922d83bacff279; 2022-12-22T04:22:20Z; eng; MDPI AG; Data; 2306-5729; 2019-08-01; volume 4, issue 3, article 115; doi:10.3390/data4030115; Dynamic Data Citation Service—Subset Tool for Operational Data Management; Chris Schubert (Data Centre—Climate Change Centre Austria, 1190 Vienna, Austria); Georg Seyerl (Data Centre—Climate Change Centre Austria, 1190 Vienna, Austria); Katharina Sack (WU—Vienna University of Economics and Business, 1020 Vienna, Austria); https://www.mdpi.com/2306-5729/4/3/115
title	Dynamic Data Citation Service—Subset Tool for Operational Data Management
topic	dynamic data citation; subset; data curation; persistent identifier; data provenance; metadata; versioning; query store; data sharing; FAIR principles
url	https://www.mdpi.com/2306-5729/4/3/115
work_keys_str_mv	AT chrisschubert dynamicdatacitationservicesubsettoolforoperationaldatamanagement AT georgseyerl dynamicdatacitationservicesubsettoolforoperationaldatamanagement AT katharinasack dynamicdatacitationservicesubsettoolforoperationaldatamanagement
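
The abstract describes dynamic data citation as associating subsetting queries with persistent identifiers and recording the query parameters as provenance metadata. The following is a minimal Python sketch of that idea only; it is not the CCCA implementation. The class name `QueryStore`, the `hdl:EXAMPLE` prefix, the dataset name, the parameter name `tas`, and the bounding box values are all hypothetical placeholders chosen for illustration.

```python
import hashlib
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone


def normalize_query(params: dict) -> str:
    """Serialize subsetting parameters in a canonical form (sorted keys)."""
    return json.dumps(params, sort_keys=True, separators=(",", ":"))


@dataclass
class QueryStore:
    """In-memory stand-in for a query store (hypothetical, for illustration)."""

    prefix: str = "hdl:EXAMPLE"  # placeholder identifier prefix, not a real one
    records: dict = field(default_factory=dict)

    def cite_subset(self, dataset_id: str, dataset_version: str, params: dict) -> dict:
        canonical = normalize_query(params)
        # Hash dataset, version, and query together so that the same query
        # against a changed dataset version receives a different identifier.
        digest = hashlib.sha256(
            f"{dataset_id}|{dataset_version}|{canonical}".encode("utf-8")
        ).hexdigest()
        pid = f"{self.prefix}/{digest[:16]}"
        if pid not in self.records:
            # Provenance record: the query parameters travel with the subset.
            self.records[pid] = {
                "pid": pid,
                "dataset_id": dataset_id,
                "dataset_version": dataset_version,
                "query": json.loads(canonical),
                "executed_at": datetime.now(timezone.utc).isoformat(),
            }
        return self.records[pid]


store = QueryStore()
query = {
    "parameter": "tas",               # illustrative variable name
    "bbox": [9.5, 46.3, 17.2, 49.0],  # illustrative lon/lat bounding box
    "time_start": "2021-01-01",
    "time_end": "2050-12-31",
}
first = store.cite_subset("example-scenario", "v1", query)
# Same parameters in a different key order resolve to the same record.
again = store.cite_subset("example-scenario", "v1", dict(reversed(list(query.items()))))
# The same query against a newer dataset version gets a new identifier.
updated = store.cite_subset("example-scenario", "v2", query)
print(first["pid"] == again["pid"], first["pid"] == updated["pid"])
```

In this sketch the identifier is derived from a hash so that repeated, equivalent queries map to one citation record. An operational service would instead register identifiers with a resolver and persist the records in a database.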

Similar Items