See this post for important updates.
In December, JHOVE 12.0 was very close to a release. Since then, next to nothing has happened. The installer for the beta version expired, and there’s been an update for that. A couple of pull requests have been merged. Otherwise — nothing.
I think what’s happened is that the Open Preservation Foundation’s very limited resources were pulled onto VeraPDF. That’s certainly a worthwhile endeavor, but it irks me that I handed support of JHOVE over to OPF only to see the ball dropped. I did some work on a PNG module a month ago and submitted a pull request; nothing’s happened since then.
I wouldn’t mind picking JHOVE up again, but I’m going to be blunt about this: I’m done with working on it for free. If institutions that want JHOVE to be maintained really care about it, they should put up some money, whether it’s to OPF, to me, or to someone else. Open source software isn’t something that magically happens because people love to work without pay.
There’s now a JHOVE PNG module on my GitHub site. The relevant new classes are com.mcgath.jhove.module.PngModule and everything in the package com.mcgath.jhove.module.png. I could have continued from Lauri’s code as I mentioned in my previous post, but I like a more factored approach, so I continued with my own code, which has a separate class for each chunk type. Take a look at the top-level file FORKNOTES for what I’ve been doing.
It does a pretty decent job of validating files and extracting metadata now, but some chunk types are still ignored, and there are some design decisions on the extracted metadata that I’m not sure about yet. Also, JHOVE modules usually have a lot of metadata about themselves, and that’s not complete yet. If anyone wants to play with it, keeping in mind that it’s not stable code yet, please do and submit issue reports for bugs and suggestions.
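To give a rough idea of what “a separate class for each chunk type” can look like, here is a minimal sketch. It is not the module’s actual code: the names PngChunkSketch, Chunk, IhdrChunk, and GenericChunk are invented for illustration, only IHDR gets its own class, and CRC values are read but not verified. The only things taken as given are the PNG file layout itself (an eight-byte signature followed by chunks of length, type, data, and CRC).

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PngChunkSketch {
    /** The eight-byte signature that starts every PNG file. */
    static final byte[] SIGNATURE =
        {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};

    /** Hypothetical base class; each chunk type gets its own subclass. */
    abstract static class Chunk {
        final String type;
        final byte[] data;
        Chunk(String type, byte[] data) { this.type = type; this.data = data; }
        abstract String describe();
    }

    /** IHDR: the first eight data bytes are width and height, big-endian. */
    static class IhdrChunk extends Chunk {
        IhdrChunk(byte[] data) { super("IHDR", data); }
        @Override String describe() {
            if (data.length < 8) return "IHDR (too short)";
            ByteBuffer b = ByteBuffer.wrap(data); // big-endian by default
            return "IHDR " + b.getInt(0) + "x" + b.getInt(4);
        }
    }

    /** Fallback for chunk types this sketch does not interpret. */
    static class GenericChunk extends Chunk {
        GenericChunk(String type, byte[] data) { super(type, data); }
        @Override String describe() {
            return type + " (" + data.length + " bytes, not interpreted)";
        }
    }

    /** Picks the class for a chunk type; new types would be added here. */
    static Chunk makeChunk(String type, byte[] data) {
        switch (type) {
            case "IHDR": return new IhdrChunk(data);
            default:     return new GenericChunk(type, data);
        }
    }

    /** Checks the signature, then reads chunks until the input runs out. */
    static List<Chunk> readChunks(byte[] file) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(file));
        byte[] sig = new byte[8];
        in.readFully(sig);
        if (!Arrays.equals(sig, SIGNATURE)) {
            throw new IOException("Not a PNG file: bad signature");
        }
        List<Chunk> chunks = new ArrayList<>();
        while (in.available() > 0) {
            int length = in.readInt();
            if (length < 0) throw new IOException("Invalid chunk length");
            byte[] typeBytes = new byte[4];
            in.readFully(typeBytes);
            byte[] data = new byte[length];
            in.readFully(data);
            in.readInt(); // CRC: read and ignored in this sketch
            String type = new String(typeBytes, StandardCharsets.US_ASCII);
            chunks.add(makeChunk(type, data));
        }
        return chunks;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java PngChunkSketch file.png");
            return;
        }
        for (Chunk c : readChunks(Files.readAllBytes(Paths.get(args[0])))) {
            System.out.println(c.describe());
        }
    }
}
```

The point of the factoring is that supporting another chunk type means adding one small class and one case in makeChunk, without touching the reading loop. A real validator would also verify each CRC and enforce chunk ordering rules, both of which this sketch leaves out.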
A few days ago, I started writing a PNG module for JHOVE, partly to keep my Java skills up, partly to help me understand the PNG format. After a while I noticed there already is code for a PNG module and has been for a long time. I must have added it to SourceForge. According to a note in the code, Gian Uberto Lauri at Engineering Ingegneria Informatica S.p.a. created it in 2006. A good amount of work clearly went into it, but it won’t compile. It’s located in a non-source code directory (extramodules/it/eng/jhove/module/png/PngModule.java), so I had to copy it to src/java to try it out.
The British Library’s Digital Preservation Team has issued a report on WAV Format Preservation Assessment. It cites the broad adoption of WAV and its extension BWF (Broadcast Wave Format) as a positive for preservation purposes and offers only a few cautions. I’m flattered by the recommendation, “Wherever possible and appropriate to the workflow, submitted content should be validated using JHOVE.”
Due to a misunderstanding of mine, there wasn’t a free preview lecture with my course on file format identification tools, even though the promotion video said there was. I’ve rectified that, and the introductory lecture is now available for viewing …
Udemy does strange things with course pricing. They’ve put my course on file format identification tools on sale for $10, through January 11. This is the same price as my introductory coupon code, which expired last night.
The only problem is that when you sign up on Udemy without a coupon code, I only get to keep half the money. If you use a coupon code, I keep almost all of it. So in self-defense, I am declaring a price war against myself. With the coupon code PRICEWAR01, you can get the course for just nine dollars! If you use the code, I keep more money at that price than at Udemy’s $10 price, so please use the code.
The code expires January 11, the same day as Udemy’s sale.
Several people have already signed up for my Udemy course on file, ExifTool, DROID, JHOVE, and Tika. It looks as if most of them have taken advantage of the discount code INTRO1 to get it at just $10 and are planning to take it later on. This makes complete sense, since the code is good just till the end of this year. If you’re taking the course, feel free to start a discussion or ask questions; I’ll answer them to the best of my ability. If you’re a specialist in one of these tools and would like to see how I’m teaching it, I’ll offer you a free pass if your credentials are good.
My new video course on Udemy, How to Tell a File’s Format: Five Open Source Tools, is now live! This course introduces file, DROID, ExifTool, JHOVE, and Apache Tika, explaining how to install them and use them for format identification. Since I wrote most of the code for JHOVE, the course has some special tips on how to get the most out of it. For each tool, you get instructions on downloading and installing it and a screen capture demo. I’ll be available to help out with any questions.
The standard price for the course is $28, but you can enroll through December 31 for the introductory rate of not $27, not $26, but just $10 US! Use the coupon code INTRO1 to get this rate.
If you’re a currently active developer on any of the tools I mentioned, get in touch with me before December 31 and I’ll get you a free pass in exchange for your feedback.
A new video on my YouTube channel offers a seven-minute introduction to JHOVE. This is a teaser for my upcoming video course on file format identification tools, as well as a public test of the techniques I’ve been developing. It’s a screen capture video, and I cover the GUI version, even if it’s not as widely used, because it lets me focus on the concepts, and because it’s silly to teach a command line application in a video.
The SourceForge repository for JHOVE (which is, by the way, obsolete; here’s the active repository) includes three short reviews which give it five stars and make very generic and identical comments. They’re dated on three successive days. Those are clear signs of sock-puppet accounts.
I can understand why people post glowing but fake reviews to their own project sites, but really, I’m not responsible for these, and I was the only person working on JHOVE at the time, so I can’t imagine who else had an incentive to promote it. Checking on one of these accounts, “rusik1978,” I find similar reviews on many other SourceForge projects. If they linked back to something it would make sense, but they don’t.
I’ve learned from this that sock puppet reviews don’t necessarily prove that the project owner is faking praise. Maybe that’s the point, to make it harder to identify the actual paid reviews?