Tag Archives: software

What are “positives” in format validation?

Articles about JHOVE, such as Good GIF Hunting, grab my attention for obvious reasons. This article talks about false positive and false negative results, and it got me thinking: What constitutes a “positive” result in file format validation? There are two ways to look at it:

  1. The default assumption is that the file is of a certain format, perhaps based on its extension, MIME type, or other metadata. The software sets out to see if it violates the format’s requirements. In that case, a positive result is that the file doesn’t conform to the requirements.
  2. The default assumption is that the file is just a collection of bytes. The software matches it against one or more sets of criteria. A positive result is that the file matches one of them.
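
The two views above can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions, not how JHOVE works: the function names are hypothetical, and the only "requirement" checked is the leading signature bytes, which is far weaker than real validation. The point is that the same file produces a "positive" under one view exactly when it produces a "negative" under the other.

```python
# Hypothetical sketch: names are illustrative and are not JHOVE's API.
# Only the leading signature ("magic") bytes are checked; a real validator
# examines the whole file structure.
SIGNATURES = {
    "GIF": (b"GIF87a", b"GIF89a"),
    "PNG": (b"\x89PNG\r\n\x1a\n",),
    "TIFF": (b"II*\x00", b"MM\x00*"),
}


def violates_format(data, claimed):
    """View 1: assume the file is `claimed`.

    A positive result (True) means a violation was found.
    """
    return not data.startswith(SIGNATURES[claimed])


def identify(data):
    """View 2: assume the file is just bytes.

    A positive result is any format whose criteria the bytes match.
    """
    return [name for name, magics in SIGNATURES.items()
            if data.startswith(magics)]


sample = b"GIF89a" + bytes(10)          # a GIF signature followed by padding
print(violates_format(sample, "GIF"))   # view 1, claimed format is GIF
print(violates_format(sample, "PNG"))   # view 1, claimed format is PNG
print(identify(sample))                 # view 2, no claimed format
```

Under the first view, a "false positive" is a conforming file reported as broken; under the second, it is a file reported as matching a format it doesn't actually conform to.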


Libtiff 4.0.9 released

Libtiff 4.0.9 has been released. According to the email announcing it:

A great many security improvements have been implemented by Even Rouault.

Much thanks to OSS Fuzz, team OWL337, Roger Leigh, and of course Even Rouault.

Obligatory reminder: Don’t download from libtiff dot org. It’s many years out of date.

JHOVE webinar

An Open Preservation Foundation webinar, “Putting JHOVE to the acid test: A PDF test-set for well-formedness validation in JHOVE,” will be held on November 21, 10 AM GMT (that’s 11 AM in Central Europe and a ludicrous 5 AM or earlier in the US).

Popular Science on format conversion

Popular Science has an article, “How to convert any file to any format.” The title overreaches, but the article actually isn’t too bad. It’s aimed at the ordinary user, not the file format specialist, so it wouldn’t be appropriate to complain too much that it has more breadth than depth.

It starts by recommending using the application that created the file, and that’s certainly good advice. Even when formats are open standards, an app knows more about how it creates its own files than anyone else does. Its files might have bits of application-specific information.

JHOVE online hack day

My venture into the Techno-Liberty blog didn’t work so well. In fact, I’m getting more views on this blog, in spite of not having posted in months, than I got on my best days on the other blog. So … I’m back.

JHOVE is still doing well too, thanks to excellent work by Carl Wilson and others at the Open Preservation Foundation. There will be an online hack day for JHOVE on April 27. The aim is to find ways to improve JHOVE by improving error reporting, collecting example files, and documenting the preservation impact of JHOVE validation issues. (I think that last one means “Why does McGath’s PDF module suck?” :)

The time listed is 8 AM-8 PM. I asked what time zone that is, and was told it means any and all, from New Zealand the long way around to Hawaii.

Last time I said I’d drop in and didn’t really manage to. This time I won’t make promises, but I’ll try to be around in some form. If nothing else, people can ask me questions about JHOVE in the comments.

A Libtiff mirror

Libtiff is still offline at remotesensing.org, but there’s a mirror of the source available on GitHub. I held off on mentioning it in this blog till Bob Friesenhahn confirmed it’s reliable.

Libtiff goes offline

The Libtiff library, which has been a reference implementation of TIFF for many years, has disappeared from the Internet. It was located at remotesensing.org, a domain whose owner apparently was willing to host it without having any close connection to the project. The domain fell into someone else’s hands, and the content changed completely, breaking all links to Libtiff material. Malice doesn’t seem to be involved; the original owner of remotesensing.org just walked away from the domain or forgot to renew it. Who owns it now is unknown, since it’s registered under a privacy shield.

Originally Libtiff was hosted on libtiff.org, but that fell into the hands of a domain owner with no interest in the project. I don’t know why. It still holds Libtiff code, but it’s many years out of date.

As I’m writing this, people on the Libtiff list are trying to figure out exactly what happened. There’s talk of trying to get libtiff.org back, though that may or may not be possible.

For the moment, there’s no primary source for Libtiff on the Web. I’ll hopefully be able to post more information later.


File format analysis tools for archivists

My article on “File Format Analysis Tools for Archivists” is up on LWN.net.

JHOVE 1.14

The Open Preservation Foundation has just announced JHOVE 1.14. The numbering is a bit odd. Version 1.12 never made it to release, and they seem to have skipped 1.13 entirely.

This includes three new modules: the PNG module, which I wrote on a weekend whim, and GZIP and WARC modules adapted from JHOVE2. The UTF-8 module now supports Unicode 7.0.

The release isn’t showing up yet on the OPF website, but I expect that will happen momentarily.

It’s nice to see that the code which I started working on over a decade ago is still alive and useful. Congratulations and thanks to Carl Wilson, who’s now its principal maintainer!

Designing to the demo is a mistake

A lot of software design clearly aims not at providing the best experience to the user, but at providing the most impressive demo. Apple does this all the time, or at least that’s the only explanation I can think of for their design decisions. Getting people to applaud in amazement doesn’t win loyal customers if the product is terrible in everyday use, though.

My current Garmin car GPS device is a good example of this. To enter an address, you enter first the street number, then the street, and finally the locality and state together. This sounds very natural, much better than my old device where you started with the state and worked down to the street number. The trouble is that when you use the new device, you find that auto-completion is useless.