Tag Archives: software

JHOVE online hack day

My venture into the Techno-Liberty blog didn’t work so well. In fact, I’m getting more views on this blog, in spite of not having posted in months, than I got on my best days on the other blog. So … I’m back.

JHOVE is still doing well too, thanks to excellent work by Carl Wilson and others at the Open Preservation Foundation. There will be an online hack day for JHOVE on April 27. The aim is to find ways to improve JHOVE by improving error reporting, collecting example files, and documenting the preservation impact of JHOVE validation issues. (I think that last one means “Why does McGath’s PDF module suck?” :)

The time listed is 8 AM-8 PM. I asked what time zone that is, and was told it means any and all, from New Zealand the long way around to Hawaii.

Last time I said I’d drop in and didn’t really manage to. This time I won’t make promises, but I’ll try to be around in some form. If nothing else, people can ask me questions about JHOVE in the comments.

A Libtiff mirror

Libtiff is still offline at remotesensing.org, but there’s a mirror of the source available on GitHub. I held off on mentioning it in this blog till Bob Friesenhahn confirmed it’s reliable.

Libtiff goes offline

The Libtiff library, which has been a reference implementation of TIFF for many years, has disappeared from the Internet. It was located at remotesensing.org, a domain whose owner apparently was willing to host it without having any close connection to the project. The domain fell into someone else’s hands, and the content changed completely, breaking all links to Libtiff material. Malice doesn’t seem to be involved; the original owner of remotesensing.org just walked away from the domain or forgot to renew it. Who owns it now is unknown, since it’s registered under a privacy shield.

Originally Libtiff was hosted on libtiff.org, but that fell into the hands of a domain owner with no interest in the project. I don’t know why. It still holds Libtiff code, but it’s many years out of date.

As I’m writing this, people on the Libtiff list are trying to figure out exactly what happened. There’s talk of trying to get libtiff.org back, though that may or may not be possible.

For the moment, there’s no primary source for Libtiff on the Web. I’ll hopefully be able to post more information later.


File format analysis tools for archivists

My article on “File Format Analysis Tools for Archivists” is up on LWN.net.

JHOVE 1.14

The Open Preservation Foundation has just announced JHOVE 1.14. The numbering is a bit odd. Version 1.12 never made it to release, and they seem to have skipped 1.13 entirely.

This includes three new modules: the PNG module, which I wrote on a weekend whim, and GZIP and WARC modules adapted from JHOVE2. The UTF-8 module now supports Unicode 7.0.

The release isn’t showing up yet on the OPF website, but I expect that will happen momentarily.

It’s nice to see that the code which I started working on over a decade ago is still alive and useful. Congratulations and thanks to Carl Wilson, who’s now its principal maintainer!

Designing to the demo is a mistake

A lot of software design clearly aims not at providing the best experience to the user, but at providing the most impressive demo. Apple does this all the time, or at least that’s the only explanation I can think of for their design decisions. Getting people to applaud in amazement doesn’t get loyal customers if the product is terrible in everyday use, though.

My current Garmin car GPS device is a good example of this. To enter an address, you enter first the street number, then the street, and finally the locality and state together. This sounds very natural, much better than my old device where you started with the state and worked down to the street number. The trouble is that when you use the new device, you find that auto-completion is useless.

Update on JHOVE

I’ve received an email reply from Becky McGuiness at the Open Preservation Foundation to my query about JHOVE’s status. She says that VeraPDF has been taking all the development resources, as I suspected, but that work on JHOVE (in particular, fixing the expired installer) will resume soon.

Update: Here’s a response from Carl Wilson at OPF on the status of JHOVE. It says that the next version will jump from 1.12 to 1.14 (triskaidekaphobia?) and will include several new modules, including my PNG module.

I’ll second Carl’s call for institutions to become OPF supporters. As someone on Twitter said recently, open source software is “free, as in kittens.” It costs money to maintain it. Occasionally people support free software for the sheer love of it, but developers do need to earn a living.

Update 2: OPF reports that the JHOVE installer has been fixed.

Want FLAC on your Mac? Try Vox

iTunes is horrible and keeps getting worse. The current version has come down with dyslexia; it can’t even play my files in order. On top of that, it supports a poor range of file formats, knowing nothing about popular open formats like FLAC and Ogg Vorbis. QuickTime Player has a saner user interface but the same format limitations. If you want to play music in those formats, you need to look for other software. I’ve just grabbed Vox for OS X, and it handles those files nicely.

It’s not an iTunes replacement, even if all you want to do is play music that’s stored on your computer. You can import your iTunes library, but you can’t view the contents of your playlists (which it calls “collections”) or select items from them. What it does let you do, though, is play FLAC, AAC, ALAC (Apple Lossless), Ogg, MP3, and APE files.

Is JHOVE dead in the water again?

See this post for important updates.

In December, JHOVE 1.12 was very close to a release. Since then, next to nothing has happened. The installer for the beta version expired, and there’s been an update for that. A couple of pull requests have been merged. Otherwise, nothing.

I think what’s happened is that the Open Preservation Foundation’s very limited resources were pulled onto VeraPDF. That’s certainly a worthwhile endeavor, but it irks me that I handed support of JHOVE over to OPF only to see the ball dropped. I did some work on a PNG module a month ago and submitted a pull request; nothing’s happened since then.

I wouldn’t mind picking JHOVE up again, but I’m going to be blunt about this: I’m done with working on it for free. If institutions that want JHOVE to be maintained really care about it, they should put up some money, whether it’s to OPF, to me, or to someone else. Open source software isn’t something that magically happens because people love to work without pay.

The Java file format API graveyard

If you look for Java libraries to support specific file formats, you’ll soon come upon the gloomy graveyard of Java APIs. Sun and Oracle have a history of devising nice packages for reading and writing different kinds of files, only to abandon their maintenance. You can still find pages for them, and it takes a close look to figure out that they aren’t supported any more.

Java Advanced Imaging (JAI) was nice in its time. It still has a page on Oracle’s website, but the latest “what’s new” item is dated 2007. The page brags about customer success stories as if it were still usable code. I’ve tried working with it. It’s out of sync with the current com.sun classes, and I got only limited use out of it. In its time it was a good way to read and write image files.

Java Media Framework (JMF) runs on a 166 MHz Pentium or 160 MHz PowerPC. The downloaded jars are dated May 1, 2003. It had a nice list of supported formats.

If you’re working with audio files, javax.sound looks more encouraging. Its API is listed with Java 8. The class javax.sound.sampled.AudioSystem supports reading and writing of audio files. I can’t find a list of the supported formats.
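
Lacking a published list, one option is to ask the runtime. The sketch below is only an illustration, assuming a standard JDK with the java.desktop module available; the class name AudioTypes and the method writableTypes are made up for this example. It calls AudioSystem.getAudioFileTypes(), which reports the file types the installed providers can write. That isn’t necessarily the full set of types that can be read, and the result depends on the JVM and any service providers on the classpath.

```java
import java.util.ArrayList;
import java.util.List;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioSystem;

public class AudioTypes {
    /** Returns the audio file types this JVM can write, formatted as "NAME (.ext)". */
    static List<String> writableTypes() {
        List<String> result = new ArrayList<>();
        // getAudioFileTypes() covers writing support only; reading support
        // is provided by separate readers and may differ.
        for (AudioFileFormat.Type type : AudioSystem.getAudioFileTypes()) {
            result.add(type + " (." + type.getExtension() + ")");
        }
        return result;
    }

    public static void main(String[] args) {
        writableTypes().forEach(System.out::println);
    }
}
```

To check whether a particular file can be read, the usual approach is to pass it to AudioSystem.getAudioFileFormat and see whether it throws UnsupportedAudioFileException.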

Java does reliably support some formats. Its handling of text encodings is versatile, and java.util.zip handles ZIP and GZIP.
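
As a quick illustration of the java.util.zip side, here’s a minimal sketch of a GZIP round trip done entirely in memory. It assumes Java 11 or later (for readAllBytes and String.repeat); the class and method names are just for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipRoundTrip {
    /** Compresses the UTF-8 bytes of the text into GZIP format. */
    static byte[] compress(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the GZIP stream writes the trailer, so read the buffer afterwards.
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Decompresses GZIP data and decodes the result as UTF-8 text. */
    static String decompress(byte[] data) {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(data))) {
            return new String(gzip.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String original = "JHOVE validates file formats. ".repeat(20);
        byte[] packed = compress(original);
        System.out.println(original.length() + " chars -> " + packed.length + " bytes");
        System.out.println("round trip ok: " + original.equals(decompress(packed)));
    }
}
```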

Third-party code can come to the rescue. For reading and writing PDF, Apache PDFBox looks like the best bet. You can use Apache Tika with lots of formats, if you just need to extract metadata. Another alternative is to use ImageMagick, but it runs natively rather than under the JVM, so you have to invoke it with exec calls. im4java and JMagick can save some of the tedium. There are open source Java libraries for reading and writing specific file formats. Some may be good, some not.
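
For the ImageMagick route, the exec call might look something like the sketch below. It assumes ImageMagick 7 is installed with its magick executable on the PATH (older installations use convert instead); the class name, method names, and file names are hypothetical. Using ProcessBuilder with a list of arguments avoids shell quoting problems with odd file names.

```java
import java.io.IOException;
import java.util.List;

public class MagickConvert {
    /** Builds the argument list for a simple format conversion. */
    static List<String> command(String inputPath, String outputPath) {
        // Assumption: ImageMagick 7, where "magick in out" converts based on
        // the output file's extension.
        return List.of("magick", inputPath, outputPath);
    }

    /** Runs the conversion and returns the process's exit code. */
    static int convert(String inputPath, String outputPath)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command(inputPath, outputPath))
                .inheritIO() // pass ImageMagick's messages through to our console
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: MagickConvert <input> <output>");
            return;
        }
        System.out.println("exit code: " + convert(args[0], args[1]));
    }
}
```

A nonzero exit code means the conversion failed; this is the sort of tedium that im4java and JMagick are meant to hide.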

If you need to deal with the guts of file formats in Java, you’ll usually have to find some good third-party code or start writing your own.