Open science and Big Data in South Africa


Bibliographic Details
Main Author: Tony Hey
Format: Article
Language: English
Published: Frontiers Media S.A. 2022-11-01
Series: Frontiers in Research Metrics and Analytics
Online Access: https://www.frontiersin.org/articles/10.3389/frma.2022.982435/full
Collection: DOAJ
Description: With the Square Kilometer Array (SKA) project and the new Multi-Purpose Reactor (MPR) soon coming on-line, South Africa and other collaborating countries in Africa will need to make the management, analysis, publication, and curation of “Big Scientific Data” a priority. In addition, the recent draft Open Science policy from the South African Department of Science and Innovation (DSI) requires both Open Access to scholarly publications and research outputs, and an Open Data policy that facilitates equal opportunity of access to research data. The policy also endorses the deposit, discovery and dissemination of data and metadata in a manner consistent with the FAIR principles – making data Findable, Accessible, Interoperable and Re-usable (FAIR). The challenge to achieve Open Science in Africa starts with open access for research publications and the provision of persistent links to the supporting data. With the deluge of research data expected from the new experimental facilities in South Africa, the problem of how to make such data FAIR takes center stage. One promising approach to make such scientific datasets more “Findable” and “Interoperable” is to rely on the Dataset representation of the Schema.org vocabulary which has been endorsed by all the major search engines. The approach adds some semantic markup to Web pages and makes scientific datasets more “Findable” by search engines. This paper does not address all aspects of the Open Science agenda but instead is focused on the management and analysis challenges of the “Big Scientific Data” that will be produced by the SKA project. The paper summarizes the role of the SKA Regional Centers (SRCs) and then discusses the goal of ensuring reproducibility for the SKA data products. Experiments at the new MPR neutron source will also have to conform to the DSI's Open Science policy. The Open Science and FAIR data practices used at the ISIS Neutron source at the Rutherford Appleton Laboratory in the UK are then briefly described. The paper concludes with some remarks about the important role of interdisciplinary teams of research software engineers, data engineers and research librarians in research data management.
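The Schema.org approach mentioned in the description can be illustrated with a minimal sketch. The abstract does not specify how the markup is produced, so everything below is an assumption for illustration: the helper names `dataset_jsonld` and `as_script_tag`, the example dataset, its URL, and its publisher are all hypothetical. Only the Schema.org terms themselves (`Dataset`, `name`, `description`, `url`, `keywords`, `creator`, `license`) are taken from the public vocabulary.

```python
import json


def dataset_jsonld(name, description, url, keywords, creator_name, license_url):
    """Build a Schema.org Dataset description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "keywords": list(keywords),
        "creator": {"@type": "Organization", "name": creator_name},
        "license": license_url,
    }


def as_script_tag(record):
    """Serialize the record as a JSON-LD script element for a landing page."""
    body = json.dumps(record, indent=2, ensure_ascii=False)
    # Escape "</" so the JSON cannot close the surrounding script element early.
    body = body.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical dataset: every value below is a placeholder, not a real record.
example = dataset_jsonld(
    name="Example radio survey image cube",
    description="Placeholder description of a hypothetical survey data product.",
    url="https://example.org/datasets/survey-cube-001",
    keywords=["radio astronomy", "FAIR data"],
    creator_name="Example Regional Centre",
    license_url="https://creativecommons.org/licenses/by/4.0/",
)

print(as_script_tag(example))
```

Embedding the printed element in a dataset's landing page is one way to expose the metadata to crawlers. A persistent identifier and download links could be added through further Schema.org properties such as `identifier` and `distribution`.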
Institution: Directory of Open Access Journals
ISSN: 2504-0537
Citation: Tony Hey. Open science and Big Data in South Africa. Frontiers in Research Metrics and Analytics, vol. 7, article 982435. Frontiers Media S.A., 2022-11-01. ISSN 2504-0537. doi:10.3389/frma.2022.982435. https://www.frontiersin.org/articles/10.3389/frma.2022.982435/full
Subjects: Open Science; SKA project; FAIR data; neutron data; research data management (RDM)