The story of JHOVE2 is a rather sad one, but I need to include it in this series. As the name suggests, it was supposed to be the next generation of JHOVE. Stephen Abrams, the creator of JHOVE (I only implemented the code), was still at Harvard, and so was I. I would have enjoyed working on it, getting things right that the first version got wrong. However, Stephen accepted a position with the California Digital Library (CDL), and that put an end to Harvard’s participation in the project. I thought about applying for a position in California but decided I didn’t want to move west. I was on the advisory board but didn’t really do much, and I had no involvement in the programming. I’m not saying I could have written JHOVE2 better, just explaining my relationship to the project.
The institutions that did work on it were CDL, Portico, and Stanford University. There were two problems with the project. The big one was insufficient funding; the money ran out before JHOVE2 could boast a set of modules comparable to JHOVE’s. A secondary problem was usability. JHOVE2 is complex and difficult to work with. I think if I’d been working on the project, I could have helped to mitigate this. I did, after all, add a GUI to JHOVE when Stephen wasn’t looking.
JHOVE has some problems that needed fixing. It quits its analysis on the first error. It’s unforgiving on identification; a TIFF file with a validation error simply isn’t a TIFF file, as far as it’s concerned. Its architecture doesn’t readily accommodate multi-file documents. It deals with embedded formats only on a special-case basis (e.g., Exif metadata in non-TIFF files). Its profile identification is an afterthought. JHOVE2 provided better ways to deal with these issues. The developers wrote it from scratch, and it didn’t aim for any kind of compatibility with JHOVE.
Getting JHOVE2 to build
There’s a private beta, which should soon be public, of a digital preservation area on Stack Exchange. I took advantage of my invitation to ask about something that had stalled me a while ago when I tried to download and build JHOVE2. A quick reply told me that the needed change is simple, just one line in the pom.xml file. I can’t link to my question and the answer on Stack Exchange, since a login is required to view them, but it turns out this issue had already been brought up in a JHOVE2 ticket. The discussion indicates some confusion about whether the issue has been fixed in the main JHOVE2 repository, but Sheila Morrissey has a fork on Bitbucket with the fix.
The fix is to change the URL for “JBoss Repository” in pom.xml to the following:
<url>https://repository.jboss.org/nexus/content/repositories/thirdparty-releases/</url>
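For orientation, a repository entry in a Maven pom.xml generally has the shape sketched here. Treat it as an illustration only: the id is a placeholder, not copied from the JHOVE2 pom.xml, and the real entry may carry additional elements. Only the name and the URL come from the fix described in this post.

```xml
<!-- Sketch of a Maven repository entry; it sits inside <repositories> in pom.xml -->
<repository>
  <!-- hypothetical id: keep whatever id the existing entry already uses -->
  <id>jboss-thirdparty</id>
  <name>JBoss Repository</name>
  <!-- the one line that changes -->
  <url>https://repository.jboss.org/nexus/content/repositories/thirdparty-releases/</url>
</repository>
```

In practice it should be enough to find the existing entry named “JBoss Repository” and replace only its url element, leaving everything else as it is.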
Kevin Clarke, who provided the answer, recommends building with the following command line to avoid error messages in the tests:
mvn -DskipTests=true install