The Open Preservation Foundation has just announced JHOVE 1.14. The numbering is a bit odd. Version 1.12 never made it to release, and they seem to have skipped 1.13 entirely.
This release includes three new modules: the PNG module, which I wrote on a weekend whim, and the GZIP and WARC modules, adapted from JHOVE2. The UTF-8 module now supports Unicode 7.0.
The release isn’t showing up yet on the OPF website, but I expect that will happen momentarily.
It’s nice to see that the code I started working on more than a decade ago is still alive and useful. Congratulations and thanks to Carl Wilson, who’s now its principal maintainer!
The Unified Digital Format Registry (UDFR), created and maintained by the California Digital Library, will shut down on April 15, 2016. I don’t know whether the whole site will go away or just the ability to query the registry.
Information Standards Quarterly has an article on UDFR by Andrea Goethals. The source code repository is on GitHub.
The predecessor project, GDFR, never got to publicly usable status. The site gdfr.info still responds to pings, but apparently not to HTTP requests.
I’m quoting its description here so that it’s saved in at least one place if the site goes away completely:
The UDFR is a reliable, publicly accessible, and sustainable knowledge base of file format representation information for use by the digital preservation community.
A format is a set of semantic and syntactic rules governing the mapping between abstract information and its representation in digital form. While many worthwhile and necessary preservation activities can be performed on a digital asset without knowledge of its format, that is, merely as a sequence of bits, any higher-level preservation of the underlying information content must be performed in the context of the asset’s format.
The UDFR seeks to “unify” the function and holdings of two existing registries, PRONOM and GDFR (the Global Digital Format Registry), in an open source, semantically enabled, and community supported platform.
The UDFR was developed by the University of California Curation Center (UC3) at the California Digital Library (CDL), funded by the Library of Congress as part of its National Digital Information Infrastructure Preservation Program (NDIIPP). The service is implemented on top of the OntoWiki semantic wiki and Virtuoso triple store.
Almost all the published books on digital preservation are academic writing for a very limited audience. My own Files that Last wasn’t intended for a tiny audience but ended up that way. The chances look better for Abby Smith Rumsey’s upcoming When We Are No More: How Digital Memory Is Shaping Our Future.
What would you say about data storage with a lifetime of billions of years? I’d say that extraordinary claims require extraordinary support. The University of Southampton’s Optoelectronics Research Centre says it has developed digital storage that will last for 13.8 billion years at 190 °C, or at least that’s how it came out in the report. Peter Kazansky says “we have created the first document which will likely survive the human race.” (And the death of the Sun?)
Preservica has introduced a catchy campaign warning people against letting their files become “eX-files.” It includes a downloadable PDF, “Safeguarding your Vital Long-term Electronics Records.” (You have to provide your email address to download it, and you may have to turn off ad blocking temporarily to see the form where you do that.)
Today I came across a video from the Library of Congress on “Why digital preservation is important for you.” Anyone following its advice will certainly have a better chance of keeping their files alive and organized for a long time. The only question is: Who’s going to follow that advice?
The British Library’s Digital Preservation Team has issued a report on WAV Format Preservation Assessment. It cites the broad adoption of WAV and its extension BWF (Broadcast Wave Format) as a positive for preservation purposes and offers only a few cautions. I’m flattered by the recommendation, “Wherever possible and appropriate to the workflow, submitted content should be validated using JHOVE.”
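To make the WAV/BWF relationship concrete, here is a minimal sketch (not what JHOVE does, and no substitute for validating with it) that walks the top-level chunks of a RIFF/WAVE stream and reports whether a "bext" chunk, the one that Broadcast Wave Format adds, is present. The function names are my own illustrative choices, and the sketch assumes an ordinary RIFF file, not the RF64 variant used for very large files.

```python
import io
import struct


def list_riff_chunks(stream):
    """Return the top-level chunk IDs of a RIFF/WAVE stream.

    Raises ValueError if the stream does not start with a RIFF/WAVE header.
    """
    header = stream.read(12)
    if len(header) < 12 or header[0:4] != b"RIFF" or header[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")

    chunk_ids = []
    while True:
        chunk_header = stream.read(8)
        if len(chunk_header) < 8:
            break  # end of stream (or a truncated trailing header)
        # Each chunk is a 4-byte ID followed by a little-endian 32-bit size.
        chunk_id, size = struct.unpack("<4sI", chunk_header)
        chunk_ids.append(chunk_id.decode("ascii", errors="replace"))
        # Chunk data is padded to an even number of bytes.
        stream.seek(size + (size & 1), io.SEEK_CUR)
    return chunk_ids


def is_broadcast_wave(stream):
    """Return True if the WAVE stream carries a 'bext' chunk."""
    return "bext" in list_riff_chunks(stream)
```

For a file on disk, open it in binary mode and pass the file object, for example `with open(path, "rb") as f: is_broadcast_wave(f)`. A check like this only looks at chunk structure; it says nothing about whether the contents of each chunk are well formed.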
The nineties saw huge changes in personal computing, as operating systems became more complex, Internet connections became common, and the World Wide Web appeared. This meant a lot of instability as formats came and went.
This past weekend I discovered a CD-ROM in my closet with the production files for a small-run songbook, The Pegasus Winners (optimistically called “Volume 1”), that I produced in 1994. The good news is that the CD is still readable. The bad news is that I can’t read most of the files. The not-so-bad news is that I could probably recover them with moderate effort.
My second Udemy course, Personal Digital Preservation, is now available! The regular price for enrolling is $16, but for readers of this blog (and anyone else you want to tell!) it’s just $10 with the coupon code DATALITH10. That code is good through the end of February.
My next video course on Udemy will be (assuming Udemy approves it, which I expect it will) “Personal Digital Preservation: Keeping Your Files Safe and Usable.” Unlike my previous course on File Format Identification Tools, this one will be aimed at a broad audience: anyone who has a lot of files and wants to keep them usable for years to come. I’ll be covering three main areas: avoiding file loss, recovering files, and keeping files usable and understandable. The price will be $16, which will include about an hour of lectures as well as reference PDF files, but I’ll post a coupon code here to get it for less.
There’s still work to be done, including the approval process. The course will appear as soon as it’s approved, so I can’t tell you an exact date, but I’m targeting January 12.