Monthly Archives: February 2016


A few days ago, I started writing a PNG module for JHOVE, partly to keep my Java skills up, partly to help me understand the PNG format. After a while I noticed that code for a PNG module already exists and has for a long time. I must have added it to SourceForge. According to a note in the code, Gian Uberto Lauri at Engineering Ingegneria Informatica S.p.A. created it in 2006. A good amount of work clearly went into it, but it won’t compile. It’s located in a non-source-code directory (extramodules/it/eng/jhove/module/png/), so I had to copy it to src/java to try it out.
Continue reading


The UK National Archives quietly released DROID 6.2 this month. I noticed only because of some mentions on Twitter. The file dates indicate the update was released on February 16. Here’s the new portion of the changelog:
Continue reading

3D PDF and PDF/E

It may come as a surprise to most people, but you can represent three-dimensional objects in PDF, in spite of its strictly two-dimensional imaging model. There are two formats for doing it: the older U3D and the more modern PRC. What makes them possible is PDF’s annotation feature, which allows capabilities to be added to PDF, and the Acrobat 3D API. Full support of these features requires implementation of at least PDF 1.7 Extension Level 1, or to put it in application terms, Acrobat 8.1.

The PDF/E standard for engineering documents, aka ISO 24517, includes U3D but not PRC. A PDF/E-2 standard is currently in development and is expected to include PRC. PDF/E, like the other slashes of PDF, is a subset of the PDF standard (version 1.6), so obviously it’s possible to do 3D work without reference to it. It’s intended for cases where long-term retention or archiving is important. This suggests some affinity with PDF/A, which is specifically aimed at archive-quality documents, and the PDF Association, which is heavily involved in PDF/A, has recently started a PDF/E Competence Center. Oddly, the competence center says that PDF/E-1 “does not address 3D,” though other sources say PDF/E does reference U3D. Perhaps this is a matter of what really constitutes “addressing” 3D as opposed to just acknowledging it.


Solving file mysteries with ExifTool

Here’s a new YouTube video of mine illustrating some ExifTool techniques for figuring out why files behave strangely. It also serves as a teaser for my new course on ExifTool.
Continue reading

Upcoming book on digital preservation

Almost all the published books on digital preservation are academic writing for a very limited audience. My own Files that Last wasn’t intended for a tiny audience but ended up that way. The chances look better for Abby Smith Rumsey’s upcoming When We Are No More: How Digital Memory Is Shaping Our Future.
Continue reading

A new video course on ExifTool

My latest Udemy course, Managing Metadata with ExifTool, is now live! The list price is $36, but with the link here it’s just $12 through the end of February. OK, to tell the truth, Udemy’s payment structure practically mandates setting high prices and then discounting heavily, but this introductory rate is the best one you’ll see for a while.

As far as I can tell, this is the only publicly available tutorial on ExifTool that covers the topic so thoroughly. Here’s the outline:
Continue reading

Billion-year storage?

What would you say about data storage with a lifetime of billions of years? I’d say that extraordinary claims require extraordinary support. The University of Southampton’s Optoelectronics Research Centre says it’s developed digital storage that will last for 13.8 billion years at 190 °C, or at least that’s how it came out in the report. Peter Kazansky says “we have created the first document which will likely survive the human race.” (And the death of the Sun?)
Continue reading

Religious authoritarianism vs. emoji

This post may be illegal in Indonesia. It includes the code point sequence U+1F468 U+200D U+2764 U+FE0F U+200D U+1F48B U+200D U+1F468, which renders as the emoji 👨‍❤️‍💋‍👨 or “man kissing man.” According to a Time article, the Indonesian Ministry of Communication and Informatics is “asking” Facebook to block the use of “gay” emoji. Failure to comply could mean the Negative Content Management Panel (George Orwell would have been impressed!) will block Facebook in Indonesia.

Emoji have generated several controversies already, but this is the first I’ve heard of a government censoring code points. It’s couched in terms of “sensitivity,” “respect,” and protecting children.
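
For anyone curious how that one visible emoji breaks down into eight code points, here is a minimal Python sketch. The names CODE_POINTS, build_sequence and describe are made up for illustration; the code relies only on the standard unicodedata module, and whether the result displays as a single glyph depends on your fonts.

```python
import unicodedata

# The code points listed in the post, in order.
CODE_POINTS = [0x1F468, 0x200D, 0x2764, 0xFE0F, 0x200D, 0x1F48B, 0x200D, 0x1F468]


def build_sequence(code_points):
    """Join individual code points into a single string."""
    return "".join(chr(cp) for cp in code_points)


def describe(text):
    """Return a (U+XXXX, Unicode character name) pair for each code point."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


emoji = build_sequence(CODE_POINTS)

# Print one line per code point: its label and its official name.
for label, name in describe(emoji):
    print(label, name)
```

The three U+200D entries are zero width joiners, which ask the renderer to fuse the neighboring characters into one glyph. Software that wanted to block the emoji would have to match the whole sequence, since each of the visible pieces is an ordinary character in its own right.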

The PDF search problem

An article from the PDF Association points out the pitfalls in searching PDF documents. Even if a document has actual text in it, rather than being a scanned image, it might not hold the text in the natural character ordering. PDF is a format for rendering a document’s visible appearance, and it isn’t so good at holding semantic content. Chunks of text can be stored out of sequence as long as they render in the right place.
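
To make the out-of-sequence point concrete, here is a toy Python sketch. The Chunk type, the coordinates and the sample text are all invented for illustration, and the sort is a deliberately naive heuristic, not how any particular PDF library extracts text.

```python
from typing import NamedTuple


class Chunk(NamedTuple):
    x: float   # left edge of the chunk, in points
    y: float   # baseline; larger values are higher on the page
    text: str


# Hypothetical chunks, listed in the order they are stored in the file.
stored = [
    Chunk(72, 700, "pitfalls in"),
    Chunk(72, 720, "An article"),
    Chunk(140, 700, "searching PDFs."),
    Chunk(140, 720, "points out the"),
]


def storage_order_text(chunks):
    """Concatenate chunks exactly as stored, as a naive extractor might."""
    return " ".join(c.text for c in chunks)


def reading_order_text(chunks):
    """Sort top to bottom, then left to right, before concatenating."""
    ordered = sorted(chunks, key=lambda c: (-c.y, c.x))
    return " ".join(c.text for c in ordered)


print(storage_order_text(stored))
print(reading_order_text(stored))
```

Both orderings would render identically on the page, but a search for “points out the pitfalls” only matches the second string. Real documents add columns, rotated text and hyphenation, so position sorting alone is far from a complete fix.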
Continue reading

Preservica’s “eX-Files”

Preservica has introduced a catchy campaign warning people against letting their files become “eX-files.” It includes a downloadable PDF, “Safeguarding your Vital Long-term Electronics Records.” (You have to provide your email address to download it, and you may have to turn off ad blocking temporarily to see the form where you do that.)
Continue reading