My article on “File Format Analysis Tools for Archivists” is up on LWN.net.
The UK National Archives quietly released DROID 6.2 this month. I noticed only because of some mentions on Twitter. The file dates indicate the update was released on February 16. Here’s the new portion of the changelog:
Due to a misunderstanding of mine, there wasn’t a free preview lecture with my course on file format identification tools, even though the promotion video said there was. I’ve rectified that, and the introductory lecture is now available for viewing …
Udemy does strange things with course pricing. They’ve put my course on file format identification tools on sale for $10, through January 11. This is the same price as my introductory coupon code, which expired last night.
The only problem is that when you sign up on Udemy without a coupon code, I only get to keep half the money. If you use a coupon code, I keep almost all of it. So in self-defense, I am declaring a price war against myself. With the coupon code PRICEWAR01, you can get the course for just nine dollars! If you use the code, I keep more money at that price than at Udemy’s $10 price, so please use the code.
The code expires January 11, the same day as Udemy’s sale.
Several people have already signed up for my Udemy course on file, ExifTool, DROID, JHOVE, and Tika. It looks as if most of them have taken advantage of the discount code INTRO1 to get it at just $10 and are planning to take it later on. This makes complete sense, since the code is good just till the end of this year. If you’re taking the course, feel free to start a discussion or ask questions; I’ll answer them to the best of my ability. If you’re a specialist in one of these tools and would like to see how I’m teaching it, I’ll offer you a free pass if your credentials are good.
The PRONOM file format signature files were updated on December 17. DROID users should make sure they have the latest files.
My new video course on Udemy, How to Tell a File’s Format: Five Open Source Tools, is now live! This course introduces file, DROID, ExifTool, JHOVE, and Apache Tika, explaining how to install them and use them for format identification. Since I wrote most of the code for JHOVE, the course has some special tips on how to get the most out of it. For each tool, you get instructions on downloading and installing it and a screen capture demo. I’ll be available to help out with any questions.
The standard price for the course is $28, but you can enroll through December 31 for the introductory rate of not $27, not $26, but just $10 US! Use the coupon code INTRO1 to get this rate.
If you’re a currently active developer on any of the tools I mentioned, get in touch with me before December 31 and I’ll get you a free pass in exchange for your feedback.
The last installment in this series looked at file, a simple command line tool available with Linux and Unix systems for determining file types. This one looks at DROID (Digital Record Object IDentification), a Java-based tool from the UK National Archives, focused on identifying and verifying files for the digital repositories of libraries and archives. It’s available as open source software under the New BSD License. Java 7 or 8 is needed for the current release (6.1.5). It relies on PRONOM, the National Archives’ registry of file format information.
Like file, DROID depends on files that describe distinctive data values for each format. It’s designed to process large batches of files and compiles reports in a much more useful way than file’s output. Reports can include total file counts and sizes by various criteria.
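To make the idea of "distinctive data values" concrete, here is a minimal Python sketch of signature matching. It is not how DROID is implemented and it doesn't read PRONOM signature files; the tiny SIGNATURES table and the identify function are made up for illustration, using a few well-known magic numbers.

```python
# Toy signature table: format name -> (byte offset, magic bytes).
# These are a few well-known magic numbers, NOT data taken from PRONOM.
SIGNATURES = {
    "PDF": (0, b"%PDF-"),
    "PNG": (0, b"\x89PNG\r\n\x1a\n"),
    "GIF": (0, b"GIF8"),
    "ZIP": (0, b"PK\x03\x04"),
}


def identify(data):
    """Return the names of every format whose signature matches the bytes."""
    return [
        name
        for name, (offset, magic) in SIGNATURES.items()
        if data[offset:offset + len(magic)] == magic
    ]
```

Calling identify on the first few bytes of a file returns a list rather than a single answer, since nothing guarantees that exactly one signature will match. Real signature files are far richer than this, with variable offsets and patterns anchored at the end of the file.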
To install DROID, you have to download and expand the ZIP file for the latest version. On Windows, you run droid.bat; on sensible operating systems, run droid.sh. You may first have to make it executable:
chmod +x droid.sh
Running droid.sh with no arguments launches the GUI application. If there are any command line arguments, it runs as a command line tool. Its help option lists all the available options.
The first time you run it as a GUI application, it may ask if you want to download some signature file updates from PRONOM. Let it do that.
It’s also possible to use DROID as a Java library in another application. FITS, for example, does this. There isn’t much documentation to help you, but if you’re really determined to try, look at the FITS source code for an example.
DROID will report file types by extension if it can’t find a matching signature. This isn’t a very reliable way to identify a file, and you should examine any files matched only by extension to see what they really are and whether they’re broken. It may report more than one matching signature; this is very common with files that match more than one version of a format.
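If you export a DROID profile to CSV, a short script can pull out the rows that deserve a second look. The sketch below is only an illustration: it assumes the export has FILE_PATH, METHOD, and FORMAT_COUNT columns, which is what I'd expect from a DROID 6 CSV export, so check the header row of your own file first. The needs_review function is a hypothetical helper, not part of DROID.

```python
import csv
import io


def needs_review(csv_text):
    """Yield (path, reason) for rows that warrant manual inspection.

    Assumes column names FILE_PATH, METHOD and FORMAT_COUNT; adjust them
    to match the header row of your own export.
    """
    for row in csv.DictReader(io.StringIO(csv_text)):
        method = (row.get("METHOD") or "").strip().lower()
        count = (row.get("FORMAT_COUNT") or "").strip()
        path = row.get("FILE_PATH") or ""
        if method == "extension":
            # Matched by file extension only: no signature confirmed it.
            yield path, "extension only"
        elif count.isdigit() and int(count) > 1:
            # More than one signature matched, e.g. several format versions.
            yield path, "multiple matches"
```

To use it, read the exported file into a string (for example with open(path, newline="").read()) and loop over the results. Files flagged "extension only" are the ones most worth opening by hand.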
It isn’t possible to cover DROID in any depth in a blog post. The document Droid: How to use it and how to interpret your results is a useful guide to the software. It’s dated 2011, so some things may have changed.
It’s been a problem for a while that DROID 6 won’t run under Java 7. Matt Palmer has reported a simple fix for this, requiring only a change in pom.xml. Hopefully a release incorporating this change will appear soon.
According to a post on the DROID mailing list, DROID is not currently compatible with JRE 7. An issue with the Spring framework appears to be the cause. The next release of DROID should support Java 7.