HathiTrust Ingest of Locally Managed Content: A Case Study from the University of Illinois at Urbana-Champaign

In March 2013, the University of Illinois at Urbana-Champaign Library adopted a policy to more closely integrate the HathiTrust Digital Library into its own infrastructure for digital collections. Specifically, the Library decided that the HathiTrust Digital Library would serve as a trusted reposit...

Bibliographic Details
Main Authors: Kyle R. Rimkus, Kirk M. Hess
Format: Article
Language: English
Published: Code4Lib 2014-07-01
Series: Code4Lib Journal
Online Access: http://journal.code4lib.org/articles/9703
collection DOAJ
description In March 2013, the University of Illinois at Urbana-Champaign Library adopted a policy to integrate the HathiTrust Digital Library more closely into its own infrastructure for digital collections. Specifically, the Library decided that the HathiTrust Digital Library would serve as a trusted repository for many of the Library’s digitized book collections, a strategy that favors relying on HathiTrust over locally managed access solutions whenever this is feasible. This article details the thinking behind this policy, as well as the challenges of its implementation, focusing primarily on technical solutions for “remediating” hundreds of thousands of image files to bring them in line with HathiTrust’s strict specifications for deposit. This involved implementing HTFeed, a Perl 5 application developed at the University of Michigan for packaging content for ingest into HathiTrust, and its many helper applications (JHOVE to detect metadata problems, ExifTool to detect metadata issues and repair missing image metadata, and Kakadu to create JPEG 2000 files), as well as a file format conversion process using ImageMagick. Today, Illinois has over 1600 locally managed volumes queued for ingest, and has submitted over 2300 publicly available titles to the HathiTrust Digital Library.
id doaj.art-3111a8738f9a47f1beea5b2b9402fc27
institution Directory Open Access Journal
issn 1940-5758