Exif is a metadata standard used in several different graphics formats. ExifTool by Phil Harvey analyzes Exif data, as you’d expect, but it does a huge number of other things besides. It’s an open source library and command line tool for identifying and extracting metadata from image, movie, and audio files of close to 200 different formats, as well as editing the metadata. It’s written in Perl and is available under the Perl licensing terms. The information it provides is impressive; as I mentioned in an earlier entry on this blog, I used it to identify the difference in files that Honda’s MP3 player failed to handle.
It’s more specialized than the tools I’ve previously discussed. It doesn’t know what to do with a simple ASCII file, so you don’t want to run it to identify a batch of arbitrary files. But within its domain, ExifTool is very impressive. It digs deeper into files than the file command or DROID does, so it offers more confidence that a file is what it claims to be and isn’t badly broken. Still, experimentation with deliberately damaged files shows that ExifTool tries to do what it can with them; it’s not a validation tool.
If there’s specific information you need beyond the file type, ExifTool can be very useful. If you’re just interested in a couple of values, you can filter the output with grep. For instance, let’s say you know that a directory contains JPEG files, but you want to know what color space they use. This command will do it:
exiftool -S DirectoryOfJPEGs | grep "FileName:\|ColorSpace:"
Developers can use the ExifTool Perl library to extract information on files. From a non-Perl application, though, it may be easier to invoke the command-line application, which can return metadata in a variety of formats. This is the approach that FITS uses.
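As a rough illustration of calling the command-line tool from another language, here is a minimal Python sketch. It assumes exiftool is installed and on the PATH, and it relies on ExifTool's -json option, which prints one JSON object per file with a SourceFile key. The function names (build_command, parse_output, read_metadata) are my own inventions for this example, and this is not a description of how FITS does it internally.

```python
import json
import subprocess
from typing import Dict, List, Sequence


def build_command(paths: Sequence[str], tags: Sequence[str] = ()) -> List[str]:
    """Build an exiftool argument list that asks for JSON output.

    Each entry in `tags` becomes a -TagName option, which limits the
    output to those tags (for example -ColorSpace).
    """
    cmd = ["exiftool", "-json"]
    cmd += [f"-{tag}" for tag in tags]
    cmd += list(paths)
    return cmd


def parse_output(json_text: str) -> Dict[str, dict]:
    """Turn exiftool's JSON text into a dict keyed by source file path."""
    if not json_text.strip():
        return {}  # exiftool printed nothing, e.g. no readable files
    records = json.loads(json_text)
    return {record["SourceFile"]: record for record in records}


def read_metadata(paths: Sequence[str], tags: Sequence[str] = ()) -> Dict[str, dict]:
    """Run exiftool (assumed to be on the PATH) and return parsed metadata."""
    result = subprocess.run(
        build_command(paths, tags),
        capture_output=True,
        text=True,
    )
    return parse_output(result.stdout)
```

With this in place, read_metadata(["DirectoryOfJPEGs"], ["ColorSpace"]) would be the rough equivalent of the grep example above: one entry per file, each holding the source path and, where the tag is present, its color space.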
Next: JHOVE. To read this series from the beginning, start here.