Digital Scholarly Journals Are Poorly Preserved: A Study of 7 Million Articles
Introduction: Digital preservation underpins the persistence of scholarly links and citations through the digital object identifier (DOI) system. We do not currently know, at scale, the extent to which articles assigned a DOI are adequately preserved. Methods: We construct a database of pre...
Main Author: | Martin Paul Eve |
---|---|
Format: | Article |
Language: | English |
Published: | Iowa State University Digital Press, 2024-01-01 |
Series: | Journal of Librarianship and Scholarly Communication |
Subjects: | digital preservation; persistent identifiers; scholarly communications |
Online Access: | https://www.iastatedigitalpress.com/jlsc/article/id/16288/ |
_version_ | 1797222148309254144 |
---|---|
author | Martin Paul Eve |
collection | DOAJ |
description | Introduction: Digital preservation underpins the persistence of scholarly links and citations through the digital object identifier (DOI) system. We do not currently know, at scale, the extent to which articles assigned a DOI are adequately preserved. Methods: We construct a database of preservation information from original archival sources and then examine the preservation statuses of 7,438,037 DOIs in a random sample. Results: Of the 7,438,037 works examined, there were 5.9 million copies spread over the archives used in this work. Furthermore, a total of 4,342,368 of the works that we studied (58.38%) were present in at least one archive. However, this left 2,056,492 works in our sample (27.64%) that are seemingly unpreserved. The remaining 13.98% of works in the sample were excluded either for being too recent (published in the current year), not being journal articles, or having insufficient date metadata for us to identify the source. Discussion: Our study is limited by design in several ways. Among these are the facts that it uses only a subset of archives, it only tracks articles with DOIs, and it does not account for institutional repository coverage. Nonetheless, as an initial attempt to gauge the landscape, our results will still be of interest to libraries, publishers, and researchers. Conclusion: This work reveals an alarming preservation deficit. Only 0.96% of Crossref members (n = 204) can be confirmed to digitally preserve over 75% of their content in three or more of the archives that we studied. (Note that when, in this article, we write “preserved,” we mean “that we were able to confirm as preserved,” as per the specified limitations of this study.) A slightly larger proportion, i.e., 8.5% (n = 1,797), preserved over 50% of their content in two or more archives. However, many members, i.e., 57.7% (n = 12,257), only met the threshold of having 25% of their material in a single archive. Most worryingly, 32.9% (n = 6,982) of Crossref members seem not to have any adequate digital preservation in place, which is against the recommendations of the Digital Preservation Coalition. |
first_indexed | 2024-04-24T13:16:43Z |
format | Article |
id | doaj.art-f9e13f0653e342dfa824635547ebff9e |
institution | Directory of Open Access Journals |
issn | 2162-3309 |
language | English |
last_indexed | 2024-04-24T13:16:43Z |
publishDate | 2024-01-01 |
publisher | Iowa State University Digital Press |
record_format | Article |
series | Journal of Librarianship and Scholarly Communication |
spelling | doaj.art-f9e13f0653e342dfa824635547ebff9e; 2024-04-04T17:35:55Z; eng; Iowa State University Digital Press; Journal of Librarianship and Scholarly Communication; 2162-3309; 2024-01-01; 12; 1; 10.31274/jlsc.16288; Digital Scholarly Journals Are Poorly Preserved: A Study of 7 Million Articles; Martin Paul Eve (Crossref and Birkbeck, University of London); https://www.iastatedigitalpress.com/jlsc/article/id/16288/; digital preservation; persistent identifiers; scholarly communications |
spellingShingle | Martin Paul Eve; Digital Scholarly Journals Are Poorly Preserved: A Study of 7 Million Articles; Journal of Librarianship and Scholarly Communication; digital preservation; persistent identifiers; scholarly communications |
title | Digital Scholarly Journals Are Poorly Preserved: A Study of 7 Million Articles |
topic | digital preservation; persistent identifiers; scholarly communications |
url | https://www.iastatedigitalpress.com/jlsc/article/id/16288/ |
work_keys_str_mv | AT martinpauleve digitalscholarlyjournalsarepoorlypreservedastudyof7millionarticles |
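
The thresholds quoted in the description can be read as a cascading classification of Crossref members. The following is a minimal Python sketch of that reading, not the study's own code. It assumes a hypothetical per-member summary, `MemberCoverage`, holding the share of a member's works confirmed in at least one, two, and three of the studied archives. The class, function, and label names are illustrative, and the treatment of exact boundary values (strictly over 75% and 50%, at least 25%) is an assumption based on the wording of the abstract.

```python
from dataclasses import dataclass


@dataclass
class MemberCoverage:
    """Hypothetical summary for one Crossref member (not from the study).

    Each field is the fraction (0.0 to 1.0) of the member's works that
    could be confirmed in at least that many of the studied archives.
    """
    in_one_or_more: float
    in_two_or_more: float
    in_three_or_more: float


def classify(member: MemberCoverage) -> str:
    """Return the strictest threshold from the abstract that the member meets."""
    # Over 75% of content in three or more archives.
    if member.in_three_or_more > 0.75:
        return "over 75% in three or more archives"
    # Over 50% of content in two or more archives.
    if member.in_two_or_more > 0.50:
        return "over 50% in two or more archives"
    # 25% of content in a single archive (boundary handling is assumed).
    if member.in_one_or_more >= 0.25:
        return "25% in a single archive"
    # None of the thresholds could be confirmed.
    return "no adequate preservation confirmed"


if __name__ == "__main__":
    example = MemberCoverage(
        in_one_or_more=0.60, in_two_or_more=0.55, in_three_or_more=0.10
    )
    print(classify(example))  # prints "over 50% in two or more archives"
```

Checking the strictest threshold first means each member lands in exactly one group, which matches how the abstract reports four non-overlapping shares of members.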