Tag Archives: preservation

UDFR job openings

I’ve been informed that there are two new contract openings at the Universal Digital Format Registry (UDFR), for a project developer and a project architect. I’d be tempted myself if it didn’t mean moving to California.

The California Digital Library should have the job announcements online shortly, though they aren’t posted as I write this.

Preservation week

The American Library Association has announced Preservation Week, May 9-15, 2010. Announced events so far concentrate mostly on preservation of physical materials, but I’m hoping digital preservation has a prominent role as well.

iPres 2009 proceedings available

The proceedings from iPres 2009 are now available online. Of particular interest in the area of file formats is “MIXED: Repository of Durable File Format Conversion.”

Thanks to Digitization 101 for the link.

New JHOVE2 alpha release v. 0.6.0

Forwarded from Stephen Abrams:

A new alpha release of JHOVE2 is now available for download and evaluation (v. 0.6.0, 2010-03-17). Distribution packages (in zip and tar.gz form) are available on the JHOVE2 public wiki at https://confluence.ucop.edu/display/JHOVE2Info/Jhove2-0.6.0+Download.

The new JHOVE2 architecture reflected in this prototype is described in the architectural overview.

The distribution package contains two driver scripts in the JHOVE2 home directory: a DOS shell script (jhove2.bat) for Windows and a Bourne shell script (jhove2.sh) for Unix/Linux. Please see the download page for instructions on any modifications that need to be made to these scripts to run in your environment.

You can verify the installation with the command (for Unix):


      % ./jhove2.sh test.xml -o test.xml.out

This command should produce results similar to this.

The prototype supports the following features:

  • Format identification, validation, feature extraction, and message digest.
  • Appropriate recursive processing of directories, file sets, clumps, and container files (see the architectural overview for the definition of file sets and clumps).
  • High performance buffered I/O using the Java NIO package.
  • Integration with DROID for file identification.
  • Message digesting for the following algorithms: Adler-32, CRC-32, MD2, MD5, SHA-1, SHA-256, SHA-384, SHA-512.
  • Results formatted as text (name/value pairs), JSON, and XML.
  • Use of the Spring Framework v2.5.6.
  • Inversion-of-Control (IOC) container for flexible application and module configuration using dependency injection.
  • Complete modules:

Please be aware of the following limitations and caveats:

  • JHOVE2 requires a 1.6 JRE.
  • This prototype is being made available to provide an early look at the new JHOVE2 architecture and APIs. While the full processing model is demonstrated, there is limited format support at this time.
  • The aggregate-level identification module (i.e. the “aggrefier” module) has been configured by the Spring configuration files in this distribution to recognize a Shapefile formed by the files with the extensions “.shp”, “.shx”, and “.dbf”. The Shapefile module itself, however, is minimally functional.
  • There is no assessment module available for review at this time.
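(An aside of my own, not part of the forwarded announcement: the aggregate-level identification described above amounts to noticing that several files in a directory together form one object. Here is a minimal Python sketch of that idea for the Shapefile case, assuming simple grouping by base name. The function name and the grouping rule are my own illustration, not JHOVE2's actual code, which is Java and configured through Spring.)

```python
from collections import defaultdict
from pathlib import PurePath

# Extensions that the announcement says together form a Shapefile.
SHAPEFILE_EXTENSIONS = {".shp", ".shx", ".dbf"}


def find_shapefile_groups(filenames):
    """Group file names by base name and keep complete Shapefile sets.

    Hypothetical helper for illustration only. Returns a dict mapping
    each base name to its member files, sorted by extension.
    """
    by_stem = defaultdict(dict)
    for name in filenames:
        path = PurePath(name)
        ext = path.suffix.lower()
        if ext in SHAPEFILE_EXTENSIONS:
            # The key is the path with its extension removed.
            by_stem[str(path.with_suffix(""))][ext] = name

    # Keep only the groups where all three required files are present.
    return {
        stem: [members[ext] for ext in sorted(members)]
        for stem, members in by_stem.items()
        if set(members) == SHAPEFILE_EXTENSIONS
    }


if __name__ == "__main__":
    files = ["roads.shp", "roads.shx", "roads.dbf", "rivers.shp", "notes.txt"]
    print(find_shapefile_groups(files))
```

In this sketch, "rivers.shp" is left out because its companion ".shx" and ".dbf" files are missing. A real identifier would presumably also inspect the file contents rather than rely on extensions alone.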

The project team is now working on additional format modules. These will be added to the public distribution as they become available.

Utility scripts are also included in the JHOVE2 installation directory to support Windows (.bat) and Unix/Linux (.sh):

  • jhove2_doc – JHOVE2 Reportable documentation utility.
  • jhove2_upfg – JHOVE2 utility to generate editable Java properties file for units of measure settings for Reportable features that have a Numeric type.
  • jhove2_dpfg – JHOVE2 utility to generate editable Java properties file for Displayer settings for Reportable features.

Please see the download page for instructions on running these scripts in your environment.

We would very much like to receive your feedback on the new code. While the current state of the code is the product of much internal review and refactoring, your evaluations and suggestions, based on a wide diversity of experience and needs, will be welcome as we continue to move forward with our work.

Please direct your comments and suggestions to the “JHOVE2-TechTalk-L” mailing list for community discussion.

Thank you,

Stephen Abrams / California Digital Library
Tom Cramer / Stanford University
Sheila Morrissey / Portico
On behalf of the JHOVE2 project team

PDF/A Seminar in Washington

A seminar on PDF/A will be held in Washington, DC, on March 26. The registration fee is $125. PDF/A is a restricted subset of PDF designed to promote long-term data viability for the purpose of preservation.

The press release contains a bizarre statement:

“At this time, the use of PDF/A is not mandatory in the United States,” said Betsy Fanning, Director, Standards and Member Services, AIIM, “however, that is changing.” “We are learning of draft legislation that is being debated that will make the use of PDF/A mandatory for preserving electronic documents.”

Congress has neither the right nor the technical competence to order us to use particular file formats. Hopefully this was an out-of-context quote about the government’s own use of PDF/A, though even there legislation requiring a specific subset of a specific format would be very strange.

iPRES 2010 call for papers

iPRES 2010 (September 19-24, Vienna) has issued a call for papers. Submissions are due by May 5, and final versions by July 11.

FITS user guide

There’s now a user guide online for Harvard University Libraries’ File Information Tool Set (FITS). FITS extracts technical metadata using several different tools, including JHOVE, Exiftool, NLNZ Metadata Extractor, DROID, FFIdent, and File Utility.

Does anyone reading this know if FFIdent is still alive somewhere on the Web? A web search for it turns up nothing useful, and the number 1 hit is the FITS site itself.

PASIG in Boston

I’ll be at the Sun PASIG (Preservation and Archives SIG) at Northeastern University tomorrow.

HUL announces new deputy director

Robert Darnton has announced the appointment of Helen Shenton as deputy director of the Harvard University Libraries. She comes from the British Library and has a strong background in digital preservation. I’m particularly intrigued that she “masterminded the creation of the high-density, low-oxygen robotic depository of the BL at Boston Spa” (the other Boston).

ECA 2010

By way of Digitization 101: ECA 2010, the 8th European Conference on Digital Archiving, will be held in Geneva, Switzerland, on April 28-30, 2010. The announcement is in German; here’s a quick translation.

From April 28 through 30, 2010, the European Conference on Digital Archiving will take place in Geneva. This stands in the tradition of European archiving conferences of the last decade. With the accent on the digital, and archiving as a function rather than the archive as an institution, the conference will set new priorities. The future will be digital; we will maintain the analog tradition; the archive of the future must have a safe refuge for the analog and digital trails of the past. That is our responsibility.

 
We are sure that you can expect an attractive and rich conference program.

I know German, but not natively, so I offer my apologies for any clumsiness and mixed metaphors.