HathiTrust Ingest of Locally Managed Content: A Case Study from the University of Illinois at Urbana-Champaign

In March 2013, the University of Illinois at Urbana-Champaign Library adopted a policy to more closely integrate the HathiTrust Digital Library into its own infrastructure for digital collections. Specifically, the Library decided that the HathiTrust Digital Library would serve as a trusted reposit...

Full description

Bibliographic Details
Main Authors: Kyle R. Rimkus, Kirk M. Hess
Format: Article
Language:English
Published: Code4Lib 2014-07-01
Series:Code4Lib Journal
Online Access:http://journal.code4lib.org/articles/9703
Description
Summary:In March 2013, the University of Illinois at Urbana-Champaign Library adopted a policy to more closely integrate the HathiTrust Digital Library into its own infrastructure for digital collections. Specifically, the Library decided that the HathiTrust Digital Library would serve as a trusted repository for many of the library’s digitized book collections, a strategy that favors relying on HathiTrust over locally managed access solutions whenever this is feasible. This article details the thinking behind this policy, as well as the challenges of its implementation, focusing primarily on technical solutions for “remediating” hundreds of thousands of image files to bring them in line with HathiTrust’s strict specifications for deposit. This involved implementing HTFeed, a Perl 5 application developed at the University of Michigan for packaging content for ingest into Hathi Trust, and its many helper applications (JHOVE to detect metadata problems, Exiftool to detect metadata issues and repair missing image metadata, and Kakadu to create JP2000 files), as well as a file format conversion process using ImageMagick. Today, Illinois has over 1600 locally managed volumes queued for ingest, and has submitted over 2300 publicly available titles to the HathiTrust Digital Library.
ISSN:1940-5758