With the next SPRUCE Hackathon coming up, I’m thinking of possible ways to improve JHOVE that I might present there. The home page says, “This hackathon will therefore focus on unifying our community’s approach to characterisation by coordinating existing toolsets and improving their capabilities.” So aside from the general goal of improving JHOVE, coordination is a key point.
I’d posted earlier on some possible enhancements. These are all still possibilities. The focus on coordination brings up other things that could be done. In general, the API hasn’t been given as much thought as the command line interface, and it could be improved without a huge amount of effort. Here are a few thoughts:
- The API currently requires creating an output stream, such as an XML or text file. It should be possible to call JHOVE and get back an in-memory object. The RepInfo object already serves this purpose; it’s mostly a matter of writing a new method that returns it instead of writing a stream.
- The caller has the choice of running one module or all the modules in the configuration file and can’t change their order. It might improve efficiency if the caller could specify a list indicating the modules to try and the order in which they should be applied. For instance, a caller might use DROID to get the signature and use this information to pick the module that JHOVE should run first.
- There’s currently no provision for selecting which output items to generate, except for a few ad hoc options. Would a way to do this, eliminating items that are unwanted, be helpful?
- Would any additional output handlers, such as JSON, be useful?
I’d welcome any thoughts on which of these, or what other changes, would help JHOVE to coordinate with other applications.