Supplementary materials for "ProppLearner: Deeply Annotating a Corpus of Russian Folktales to Enable the Machine Learning of a Russian Formalist Theory"

This archive contains the supplementary material for the journal article "ProppLearner: Deeply Annotating a Corpus of Russian Folktales to Enable the Machine Learning of a Russian Formalist Theory", published in the Journal of Digital Scholarship in the Humanities (DSH), ca. 2016.The archi...

Full description

Bibliographic Details
Main Author: Finlayson, Mark Alan
Other Authors: Patrick Winston
Published: 2015
Online Access:http://hdl.handle.net/1721.1/100054
_version_ 1826199747446177792
author Finlayson, Mark Alan
author2 Patrick Winston
author_facet Patrick Winston
Finlayson, Mark Alan
author_sort Finlayson, Mark Alan
collection MIT
description This archive contains the supplementary material for the journal article "ProppLearner: Deeply Annotating a Corpus of Russian Folktales to Enable the Machine Learning of a Russian Formalist Theory", published in the Journal of Digital Scholarship in the Humanities (DSH), ca. 2016.The archive contains several different types of files. First, it contains the annotation guides that were used to train the annotators. The guides are numbered to match the team numbers in Table 6. Included here are not only detailed guides for some layers, as produced by the original developers of the specification, but also our synopsis guides for each layer, which were used as a reference and further training material for the annotators. Also of interest are the general annotator and adjudicator training guides, which outline the general procedures followed by the teams when conducting annotation. Those who are organizing their own annotation projects may find this material useful.Second, the archive contains a comprehensive manifest, in Excel spreadsheet format, listing the word counts, sources, types, and titles (in both Russian and English) of all the texts that are part of the corpus. Finally, the archive contains the actual corpus data files, in Story Workbench format, an XML-encoded stand-off annotation scheme. The scheme is described in the file format specification file, also included in the archive. These files can be parsed with the aid of any normal XML reading software, or can be loaded and edited easily with the Story Workbench annotation tool, also freely available.
first_indexed 2024-09-23T11:25:29Z
id mit-1721.1/100054
institution Massachusetts Institute of Technology
last_indexed 2024-09-23T11:25:29Z
publishDate 2015
record_format dspace
spelling mit-1721.1/1000542019-04-08T07:36:30Z Supplementary materials for "ProppLearner: Deeply Annotating a Corpus of Russian Folktales to Enable the Machine Learning of a Russian Formalist Theory" Finlayson, Mark Alan Patrick Winston Genesis Patrick Winston Genesis This archive contains the supplementary material for the journal article "ProppLearner: Deeply Annotating a Corpus of Russian Folktales to Enable the Machine Learning of a Russian Formalist Theory", published in the Journal of Digital Scholarship in the Humanities (DSH), ca. 2016.The archive contains several different types of files. First, it contains the annotation guides that were used to train the annotators. The guides are numbered to match the team numbers in Table 6. Included here are not only detailed guides for some layers, as produced by the original developers of the specification, but also our synopsis guides for each layer, which were used as a reference and further training material for the annotators. Also of interest are the general annotator and adjudicator training guides, which outline the general procedures followed by the teams when conducting annotation. Those who are organizing their own annotation projects may find this material useful.Second, the archive contains a comprehensive manifest, in Excel spreadsheet format, listing the word counts, sources, types, and titles (in both Russian and English) of all the texts that are part of the corpus. Finally, the archive contains the actual corpus data files, in Story Workbench format, an XML-encoded stand-off annotation scheme. The scheme is described in the file format specification file, also included in the archive. These files can be parsed with the aid of any normal XML reading software, or can be loaded and edited easily with the Story Workbench annotation tool, also freely available. 2015-12-03T16:30:05Z 2015-12-03T16:30:05Z 2015-12-02 2015-12-03T16:30:05Z http://hdl.handle.net/1721.1/100054 Creative Commons Attribution 4.0 International http://creativecommons.org/licenses/by/4.0/ 8341 KiB application/zip
spellingShingle Finlayson, Mark Alan
Supplementary materials for "ProppLearner: Deeply Annotating a Corpus of Russian Folktales to Enable the Machine Learning of a Russian Formalist Theory"
title Supplementary materials for "ProppLearner: Deeply Annotating a Corpus of Russian Folktales to Enable the Machine Learning of a Russian Formalist Theory"
title_full Supplementary materials for "ProppLearner: Deeply Annotating a Corpus of Russian Folktales to Enable the Machine Learning of a Russian Formalist Theory"
title_fullStr Supplementary materials for "ProppLearner: Deeply Annotating a Corpus of Russian Folktales to Enable the Machine Learning of a Russian Formalist Theory"
title_full_unstemmed Supplementary materials for "ProppLearner: Deeply Annotating a Corpus of Russian Folktales to Enable the Machine Learning of a Russian Formalist Theory"
title_short Supplementary materials for "ProppLearner: Deeply Annotating a Corpus of Russian Folktales to Enable the Machine Learning of a Russian Formalist Theory"
title_sort supplementary materials for propplearner deeply annotating a corpus of russian folktales to enable the machine learning of a russian formalist theory
url http://hdl.handle.net/1721.1/100054
work_keys_str_mv AT finlaysonmarkalan supplementarymaterialsforpropplearnerdeeplyannotatingacorpusofrussianfolktalestoenablethemachinelearningofarussianformalisttheory