What Should be Taught in an Academic Program of Data Sciences?

The new academic discipline of Data Sciences (DS) has developed in recent years mainly because of the need to make decisions based on huge amounts of data (Big Data). In parallel, there has been great progress in the development of technologies that make it possible to identify patterns, to filter big...

Bibliographic Details
Main Author: Niv Ahituv
Format: Article
Language: English
Published: Bulgarian Academy of Sciences, Institute of Mathematics and Informatics 2019-09-01
Series: Digital Presentation and Preservation of Cultural and Scientific Heritage
Subjects: Data Sciences; Big Data; Data Mining; Academic Program in Data Sciences; Data Analyst
Online Access: https://dipp.math.bas.bg/dipp/article/view/174
author Niv Ahituv
collection DOAJ
description The new academic discipline of Data Sciences (DS) has developed in recent years mainly because of the need to make decisions based on huge amounts of data (Big Data). In parallel, there has been great progress in the development of technologies that make it possible to identify patterns, filter big data, and give relevant meaning to information, thanks to machine learning and sophisticated inference techniques. The profession of Data Scientist (or Data Analyst) has come into high demand in recent years. It is required in the business sector, where data is the “oxygen” for business survival; it is needed in the governmental sector in order to improve services to citizens; and it is imperative in the scientific world, where large data repositories collected in various disciplines have to be integrated, mined, and analyzed in order to enable interdisciplinary research. The purpose of this paper is to demonstrate how the scientific discipline of Data Sciences fits into academic programs intended to prepare data analysts for the business, public, government, and academic sectors. The article first delineates the Data Cycle, which portrays the transformation of data and their derivatives along the route from generation to decision making. The cycle includes the following stages: problem definition → identifying pertinent data sources → data collection and storing (including cleansing and backup) → data integration → data mining → processing and analysis → visualization → learning and decision-making → feedback for future cycles. Within this cycle there may be sub-cycles, in which a number of stages are repeated. The data cycle is generic: it may vary slightly under different circumstances, but there is little difference between the scientific cycle and all the other cycles. Each stage within the cycle requires different tools, namely hardware and software technologies that support the stage, and the article classifies these tools. The final part of the article suggests a typology for academic DS programs. It outlines an academic program to be offered to those wishing to practice the Data Analyst profession, and it sketches an introductory course that should be mandatory for all students campus-wide.
first_indexed 2024-04-13T17:32:17Z
format Article
id doaj.art-7bf0b1cdbf974e879300761dd44774a4
institution Directory Open Access Journal
issn 1314-4006
2535-0366
language English
last_indexed 2024-04-13T17:32:17Z
publishDate 2019-09-01
publisher Bulgarian Academy of Sciences, Institute of Mathematics and Informatics
record_format Article
series Digital Presentation and Preservation of Cultural and Scientific Heritage
spelling doaj.art-7bf0b1cdbf974e879300761dd44774a4
2022-12-22T02:37:31Z
eng
Bulgarian Academy of Sciences, Institute of Mathematics and Informatics
Digital Presentation and Preservation of Cultural and Scientific Heritage
1314-4006
2535-0366
2019-09-01
9
10.55630/dipp.2019.9.4
What Should be Taught in an Academic Program of Data Sciences?
Niv Ahituv
Coller School of Management, Tel Aviv University, Tel Aviv, Israel
https://dipp.math.bas.bg/dipp/article/view/174
Data Sciences
Big Data
Data Mining
Academic Program in Data Sciences
Data Analyst
title What Should be Taught in an Academic Program of Data Sciences?
topic Data Sciences
Big Data
Data Mining
Academic Program in Data Sciences
Data Analyst
url https://dipp.math.bas.bg/dipp/article/view/174