Tag Archives: JHOVE

Secrets of building JHOVE2

The current beta of JHOVE2 is rather tricky to build. With some help from Marisa Strong, I’ve managed to do it. Here’s a guide which may be helpful.

1. Download JHOVE2. If you have Mercurial, follow the instructions. Otherwise use the “Get Source” menu item to get the .gz file.

2. Get a current version of Maven if you don’t have one.

3. If you got the gzip file, expand it and the tarball which it contains. This will create a main directory.

4. cd main. The first recommendation is to run mvn compile, but this apparently requires an environment which isn’t released yet, so instead do:

mvn assembly:assembly -DskipTests

5. cd into the target directory. This will have the file jhove2-2.0.0.zip. Unzip this in place.

6. The directory jhove2-2.0.0 was just created. cd into it. This contains the script jhove2.sh. Run this from the command line with no arguments, and you’ll get a usage message if everything worked correctly.
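
Steps 3 to 6 can be strung together in a small shell function. This is only a sketch: it assumes the source is already unpacked into main, that mvn and unzip are on the PATH, and that the zip file is named as in step 5. The names run and build_jhove2 and the DRY_RUN switch are made up for illustration and are not part of JHOVE2.

```shell
#!/bin/sh
# Sketch of steps 3-6. Assumes the tarball has already been expanded
# and that mvn and unzip are installed.

# Print each command; execute it only when DRY_RUN is not set to 1.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

# Hypothetical helper: build the distribution and run the driver script.
build_jhove2() {
    src_dir="${1:-main}"          # directory created by expanding the tarball
    zip_name="jhove2-2.0.0.zip"   # file name as reported in step 5

    run cd "$src_dir" || return 1
    run mvn assembly:assembly -DskipTests || return 1
    run cd target || return 1
    run unzip "$zip_name" || return 1
    run cd "${zip_name%.zip}" || return 1   # strips .zip: jhove2-2.0.0
    run ./jhove2.sh                         # no arguments: prints usage
}
```

Calling DRY_RUN=1 build_jhove2 prints the commands without running them, which is a cheap way to check the sequence before committing to a full Maven build.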

Once it runs, the user guide (PDF) is helpful for actually using JHOVE2.

JHOVE2 goes to beta

The JHOVE2 team has announced a beta release:

This beta code release supports all the major technical objectives of the project, including a more sophisticated, modular architecture; signature-based file identification; policy-based assessment of objects; recursive characterization of objects comprising aggregate files and files arbitrarily nested in containers; and extensive configuration and reporting options. The release also continues to fill out the roster of supported formats, with modules for ICC color profiles, SGML, Shapefile, TIFF, UTF-8, WAVE, and XML.

The source code page provides the source as a Mercurial repository, or as a single download. The gzip download expands into a file called main-14e8a6102f63, and it isn’t at all obvious what to do with it. Making it executable with chmod and running it doesn’t work. I’ve asked what this is supposed to be; I’ll update this post when I get a response.

Update: That’s a tarball. Adding the .tar extension and using tar -xvf works nicely.
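
As it happens, tar does not look at the file name, so the rename is optional. Here is a small sketch of a helper that checks whether an extension-less file really is a tar archive before unpacking it. The name unpack_tarball is hypothetical; the only assumption is a tar command that supports -t and -x.

```shell
#!/bin/sh
# Hypothetical helper: unpack a file if (and only if) tar recognizes it.
unpack_tarball() {
    archive="$1"
    # "tar -tf" lists the contents and fails if the file is not a tar archive.
    if tar -tf "$archive" >/dev/null 2>&1; then
        tar -xvf "$archive"
    else
        echo "not a tar archive: $archive" >&2
        return 1
    fi
}
```

For the download described here, that would be unpack_tarball main-14e8a6102f63.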

JHOVE2 tutorial at iPRES 2010

Percy Willett has announced:

The JHOVE2 project team is holding a full day tutorial on the use of JHOVE2 on September 19, 2010, in conjunction with the iPRES 2010 conference in Vienna, Austria.

The main topics covered during the tutorial will be:

  • The role of characterization in digital curation and preservation workflows.
  • An overview of the JHOVE2 project: requirements, methodology, and deliverables.
  • Demonstration of the JHOVE2 application.
  • Architectural review of the JHOVE2 framework and Java APIs.
  • Integration of JHOVE2 technology into existing or planned systems, services, and workflows.
  • Third-party development of conformant JHOVE2 modules.
  • Building and sustaining the JHOVE2 user community.

This tutorial is an updated and expanded version of the workshop presented at iPRES 2009 in San Francisco. This tutorial will closely follow the production release of JHOVE2 and will incorporate significant new material arising from the second year of project work.
 
The targeted audience for the tutorial includes digital curation, preservation, and repository managers, analysts, tool users and developers, and other practitioners and technologists whose work is dependent on an understanding of the format and pertinent characteristics of digital assets.

For more information on JHOVE2, see the project wiki at: http://jhove2.org

For more information on iPRES 2010, and to register for the workshop and conference, see the conference website: http://www.ifs.tuwien.ac.at/dp/ipres2010/

JHOVE2 poll

There is a poll online for letting the developers of JHOVE2 know what plans you have for it. It just takes a couple of minutes to fill out and doesn’t even require JavaScript.

New JHOVE2 alpha release v. 0.6.0

Forwarded from Stephen Abrams:

A new alpha release of JHOVE2 is now available for download and evaluation (v. 0.6.0, 2010-03-17). Distribution packages (in zip and tar.gz form) are available on the JHOVE2 public wiki at https://confluence.ucop.edu/display/JHOVE2Info/Jhove2-0.6.0+Download.

The new JHOVE2 architecture reflected in this prototype is described in the architectural overview.

The distribution package contains two driver scripts in the JHOVE2 home directory: a DOS shell script (jhove2.bat) for Windows and a Bourne shell script (jhove2.sh) for Unix/Linux. Please see the download page for instructions on any modifications that need to be made to these scripts to run in your environment.

You can verify the installation with the command (for Unix):


      % ./jhove2.sh test.xml -o test.xml.out

This command should produce results similar to this.

The prototype supports the following features:

  • Format identification, validation, feature extraction, and message digest.
  • Appropriate recursive processing of directories, file sets, clumps, and container files (see the architectural overview for the definition of file sets and clumps).
  • High performance buffered I/O using the Java NIO package.
  • Integration with DROID for file identification.
  • Message digesting for the following algorithms: Adler-32, CRC-32, MD2, MD5, SHA-1, SHA-256, SHA-384, SHA-512.
  • Results formatted as text (name/value pairs), JSON, and XML.
  • Use of the Spring Framework v2.5.6.
  • Inversion-of-Control (IOC) container for flexible application and module configuration using dependency injection.
  • Complete modules:

Please be aware of the following limitations and caveats:

  • JHOVE2 requires a 1.6 JRE.
  • This prototype is being made available to provide an early look at the new JHOVE2 architecture and APIs. While the full processing model is demonstrated, there is limited format support at this time.
  • The aggregate-level identification module (i.e. the “aggrefier” module) has been configured by the Spring configuration files in this distribution to recognize a Shapefile formed by the files with the extensions “.shp”, “.shx”, and “.dbf”. The Shapefile module itself, however, is minimally functional.
  • There is no assessment module available for review at this time.

The project team is now working on additional format modules. These will be added to the public distribution as they become available.

Utility scripts are also included in the JHOVE2 installation directory to support Windows (.bat) and Unix/Linux (.sh):

  • jhove2_doc – JHOVE2 Reportable documentation utility.
  • jhove2_upfg – JHOVE2 utility to generate an editable Java properties file for units of measure settings for Reportable features that have a Numeric type.
  • jhove2_dpfg – JHOVE2 utility to generate an editable Java properties file for Displayer settings for Reportable features.

Please see the download page for instructions on running these scripts in your environment.

We would very much like to receive your feedback on the new code. While the current state of the code is the product of much internal review and refactoring, your evaluations and suggestions, based on a wide diversity of experience and needs, will be welcome as we continue to move forward with our work.

Please direct your comments and suggestions to the “JHOVE2-TechTalk-L” mailing list for community discussion.

Thank you,

Stephen Abrams / California Digital Library
Tom Cramer / Stanford University
Sheila Morrissey / Portico
On behalf of the JHOVE2 project team

JHOVE 1.5 — oops!

Argh! I always forget something in a JHOVE build, and carefully checking all the nitpicking things just means I forget the important ones.

The JHOVE 1.5 which I uploaded to SourceForge a few days ago had all the right sources, release notes, checksums, etc. … but it didn’t have up-to-date JAR files, which kind of defeats the whole point!!

This is now fixed. If you’ve already downloaded it, please download it again. Check your download against the corresponding MD5 file to be sure.
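
The check itself can be done with standard tools. This is a minimal sketch, assuming the MD5 file holds the hex digest as its first whitespace-separated field (the usual md5sum layout, which may not match the SourceForge file exactly) and that either md5sum or the BSD md5 command is installed. The name verify_md5 is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: compare a file's MD5 digest with the one stored
# in a companion file (default: <file>.md5).
verify_md5() {
    file="$1"
    md5_file="${2:-$1.md5}"

    # Assumption: the digest is the first field on the first line.
    expected=$(awk '{print tolower($1); exit}' "$md5_file")

    if command -v md5sum >/dev/null 2>&1; then
        actual=$(md5sum "$file" | awk '{print $1}')   # GNU coreutils
    else
        actual=$(md5 -q "$file")                      # BSD / macOS
    fi

    # Exit status 0 when the digests match, non-zero otherwise.
    [ "$actual" = "$expected" ]
}
```

Run it with the downloaded file as the first argument and, if the MD5 file is named differently, its path as the second; a zero exit status means the digests match.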

A happy holiday-of-your-choice to all!

JHOVE 1.5

JHOVE 1.5 is now out, and so far no one’s complained of anything missing. If you notice any problems, please comment.

Thanks to Thomas Ledoux, JHOVE now has an option to output TextMD metadata. There are minor bug fixes for PDF and UTF-8. Full details are in the release notes.

JHOVE2 at iPres

Unfortunately, I wasn’t in California for the post-iPres workshop on JHOVE2, but there is some information online. The JHOVE2 project presentations page includes a short and a long version of the slides. An early version of the code has been made available for testing and progress continues.

Catching up

Here are a few of the news items I mentioned recently on the old blog, for your convenience:

  • A workshop on JHOVE2 will be held after the conclusion of iPres 2009 in San Francisco, on October 7, 2009. This will include, for the first time, a presentation of the prototype code.
  • JPEG XR, formerly known as Microsoft HD Photo, is now an international standard, as reported in a JPEG press release.
  • JHOVE 1.4 is now available on SourceForge. The main change is that PDF/A compliance is more accurately identified than before, and is based on the final standard rather than a draft.