In a previous post, I asked for input on what sort of video metadata FITS should produce, and I’ve gotten a number of responses. Here I’d like to dig in some more and hopefully get more feedback.
We’re talking here primarily about technical metadata. Following the Harvard Library definition, “Technical metadata focuses on how a digital object was created, its format, format-specific technical characteristics, storage and location, etc. Accurate technical metadata helps a repository deliver digital content appropriately to users and to manage digital objects over time and keep them usable.” Other kinds of metadata are important, but the job of FITS is primarily to report what kind of file a digital object is: its format, size, compression method, encoding, etc. Some other kinds of metadata, such as copyright information (rights metadata), are reported, but technical metadata is primary.
I’m interested in a model, not a specific schema or representation. FITS pulls information from many different tools; the question I’m looking into is what properties to report, and with what vocabulary.
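To make the "what properties, with what vocabulary" question concrete, here is a minimal sketch of mapping tool-specific property names onto one shared set of names. It is not how FITS is implemented. The tool names (`tool_a`, `tool_b`), their keys, and the common names (`frameRate`, `bitDepth`, `compression`) are all hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical mapping from (tool, tool-specific key) to a common property name.
# Both the tool names and the keys are invented for illustration.
VOCABULARY = {
    ("tool_a", "FrameRate"): "frameRate",
    ("tool_a", "BitDepth"): "bitDepth",
    ("tool_b", "video_frame_rate"): "frameRate",
    ("tool_b", "codec"): "compression",
}


def normalize(reports):
    """Merge per-tool reports into {property: {value: [tools reporting it]}}."""
    merged = defaultdict(lambda: defaultdict(list))
    for tool, props in reports.items():
        for key, value in props.items():
            name = VOCABULARY.get((tool, key))
            if name is None:
                continue  # outside the model; simply dropped in this sketch
            merged[name][str(value)].append(tool)
    return {name: dict(values) for name, values in merged.items()}


def conflicts(merged):
    """Return the properties for which tools reported differing values."""
    return sorted(name for name, values in merged.items() if len(values) > 1)
```

The point of the sketch is that the model is the set of names on the right-hand side of the mapping. Once those are fixed, disagreements between tools show up as a property with more than one distinct value.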
For still image metadata, we can go to Exif as a model. For audio, we have the AES standards. In the video realm, things are less settled so far. MPEG-7 offers a standard, but it focuses on the characteristics of the sounds and images, not of the files. It’s technical metadata, but at a level of abstraction which is less relevant to file characterization software.
XMP Dynamic Media looks more useful. It covers low-level technical properties such as frame rate, pixel depth, and compression. The high-level properties for music (no flat keys and only a limited choice of time signatures) are poorly thought out, but they aren’t relevant here.
Some people have recommended MediaInfo’s output as a model to follow. I’m not very impressed; its output isn’t particularly consistent. Archivematica’s list of significant characteristics of video files is more useful, offering some properties which XMP doesn’t cover.
I’ve put together a spreadsheet of properties which represent my current thinking. Suggestions and comments are welcome.