“Big Data” of the Digital Archive: A Dialogue with a Raster Manuscript

The article reflects current trends in work with the digital heritage of Russian literature and examines the formation of virtual archives as a gradual accumulation of the “big data” of scientific research, i.e., an unrecognized array of raster documents containing tens of thousan...

Bibliographic Details
Main Author: Lyubov V. Khachaturian
Format: Article
Language: English
Published: A.M. Gorky Institute of World Literature of the Russian Academy of Sciences 2023-06-01
Series: Studia Litterarum
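Subjects: ego-documentary heritage; archival materials; digital archive; Russian literature of the 20th century; handwritten heritage; big data; data mining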
Online Access: https://studlit.ru/images/2023-8-2/16_Khachaturian_334-349.pdf
_version_ 1797809589380448256
author Lyubov V. Khachaturian
author_facet Lyubov V. Khachaturian
author_sort Lyubov V. Khachaturian
collection DOAJ
description The article reflects current trends in work with the digital heritage of Russian literature and examines the formation of virtual archives as a gradual accumulation of the “big data” of scientific research, i.e., an unrecognized array of raster documents containing tens of thousands of images. The research analyzes the specifics of scientific work in the field of ego-documentary heritage that arose at the turn of the 20th and 21st centuries (a corpus of diary entries, workbooks, notebooks, and correspondence), the principles of publication, and modern standards for digitizing archival heritage. Studying and working with the three most promising virtual resources on the history of Russian literature from the mid-19th to the first half of the 20th century makes it possible to formulate specific tasks and methods for visualizing a large corpus of raster images of archival documents, as well as previously untapped possibilities for search automation. Much attention is paid to the transition from the graphical elements of the raster image of a manuscript to semantic ones, which allow data mining techniques to be applied to an unrecognized data array.
first_indexed 2024-03-13T06:56:02Z
format Article
id doaj.art-924b6b54efc54a36b9ad49019b747e62
institution Directory of Open Access Journals
issn 2500-4247
2541-8564
language English
last_indexed 2024-03-13T06:56:02Z
publishDate 2023-06-01
publisher A.M. Gorky Institute of World Literature of the Russian Academy of Sciences
record_format Article
series Studia Litterarum
spelling doaj.art-924b6b54efc54a36b9ad49019b747e62
2023-06-07T11:35:49Z
eng
A.M. Gorky Institute of World Literature of the Russian Academy of Sciences
Studia Litterarum
2500-4247
2541-8564
2023-06-01
8
2
334
349
10.22455/2500-4247-2023-8-2-334-349
“Big Data” of the Digital Archive: A Dialogue with a Raster Manuscript
Lyubov V. Khachaturian
0
https://orcid.org/0000-0002-2689-5186
National Research University Higher School of Economics, Moscow, Russia
The article reflects current trends in work with the digital heritage of Russian literature and examines the formation of virtual archives as a gradual accumulation of the “big data” of scientific research, i.e., an unrecognized array of raster documents containing tens of thousands of images. The research analyzes the specifics of scientific work in the field of ego-documentary heritage that arose at the turn of the 20th and 21st centuries (a corpus of diary entries, workbooks, notebooks, and correspondence), the principles of publication, and modern standards for digitizing archival heritage. Studying and working with the three most promising virtual resources on the history of Russian literature from the mid-19th to the first half of the 20th century makes it possible to formulate specific tasks and methods for visualizing a large corpus of raster images of archival documents, as well as previously untapped possibilities for search automation. Much attention is paid to the transition from the graphical elements of the raster image of a manuscript to semantic ones, which allow data mining techniques to be applied to an unrecognized data array.
https://studlit.ru/images/2023-8-2/16_Khachaturian_334-349.pdf
ego-documentary heritage
archival materials
digital archive
Russian literature of the 20th century
handwritten heritage
big data
data mining
spellingShingle Lyubov V. Khachaturian
“Big Data” of the Digital Archive: A Dialogue with a Raster Manuscript
Studia Litterarum
ego-documentary heritage
archival materials
digital archive
Russian literature of the 20th century
handwritten heritage
big data
data mining
title “Big Data” of the Digital Archive: A Dialogue with a Raster Manuscript
title_full “Big Data” of the Digital Archive: A Dialogue with a Raster Manuscript
title_fullStr “Big Data” of the Digital Archive: A Dialogue with a Raster Manuscript
title_full_unstemmed “Big Data” of the Digital Archive: A Dialogue with a Raster Manuscript
title_short “Big Data” of the Digital Archive: A Dialogue with a Raster Manuscript
title_sort big data of the digital archive a dialogue with a raster manuscript
topic ego-documentary heritage
archival materials
digital archive
Russian literature of the 20th century
handwritten heritage
big data
data mining
url https://studlit.ru/images/2023-8-2/16_Khachaturian_334-349.pdf
work_keys_str_mv AT lyubovvkhachaturian bigdataofthedigitalarchiveadialoguewitharastermanuscript