SC2EGSet: StarCraft II Esport Replay and Game-state Dataset

Bibliographic Details
Main Authors: Andrzej Białecki, Natalia Jakubowska, Paweł Dobrowolski, Piotr Białecki, Leszek Krupiński, Andrzej Szczap, Robert Białecki, Jan Gajewski
Format: Article
Language: English
Published: Nature Portfolio 2023-09-01
Series: Scientific Data
Online Access: https://doi.org/10.1038/s41597-023-02510-7
Abstract: As a relatively new form of sport, esports offers unparalleled data availability. Our work aims to open esports to a broader scientific community by supplying raw and pre-processed files from StarCraft II esports tournaments. These files can be used in statistical and machine learning modeling tasks and compared to laboratory-based measurements. Additionally, we open-sourced and published all the custom tools that were developed in the process of creating our dataset. These tools include PyTorch and PyTorch Lightning API abstractions to load and model the data. Our dataset contains replays from major and premiere StarCraft II tournaments since 2016. We processed 55 “replaypacks” that contained 17930 files with game-state information. Our dataset is one of the few large publicly available sources of StarCraft II data upon its publication. Analysis of the extracted data holds promise for further Artificial Intelligence (AI), Machine Learning (ML), psychological, Human-Computer Interaction (HCI), and sports-related studies in a variety of supervised and self-supervised tasks.
ISSN: 2052-4463
Author Affiliations:
Andrzej Białecki: Warsaw University of Technology, Electronics and Information Technology
Natalia Jakubowska: SWPS University, Neurocognitive Research Center
Paweł Dobrowolski: Polish Academy of Sciences, Institute of Psychology
Piotr Białecki: no affiliation listed
Leszek Krupiński: no affiliation listed
Andrzej Szczap: Adam Mickiewicz University in Poznań, Mathematics and Computer Science
Robert Białecki: Józef Piłsudski University of Physical Education in Warsaw, Physical Education
Jan Gajewski: Józef Piłsudski University of Physical Education in Warsaw, Physical Education