The Java file format API graveyard

If you look for Java libraries to support specific file formats, you’ll soon come upon the gloomy graveyard of Java APIs. Sun and Oracle have a history of devising nice packages for reading and writing different kinds of files, only to abandon their maintenance. You can still find pages for them, and it takes a close look to figure out that they aren’t supported any more.

Java Advanced Imaging (JAI) was nice in its time. It still has a page on Oracle’s website, but the latest “what’s new” item is dated 2007. The page brags about customer success stories as if it were still usable code. I’ve tried working with it. It’s out of sync with the current com.sun classes, and I got only limited use out of it. In its time it was a good way to read and write image files.

Java Media Framework (JMF) still lists a 166 MHz Pentium or 160 MHz PowerPC as its system requirement. The downloaded jars are dated May 1, 2003. It had a nice list of supported formats.

If you’re working with audio files, javax.sound looks more encouraging. Its API is documented as part of Java 8. The class javax.sound.sampled.AudioSystem supports reading and writing of audio files. I can’t find a list of the supported formats.
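Failing a published list, the runtime can at least be asked what it will write. The following is a minimal sketch, assuming a standard JDK that includes the javax.sound.sampled providers; the class name AudioFormatLister is made up for illustration. AudioSystem.getAudioFileTypes() reports only the file types that can be written, and the result depends on the JVM and any installed service providers, so treat the output as a hint rather than a specification.

```java
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioSystem;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of any library.
public class AudioFormatLister {

    // Collects a short description of every audio file type
    // that the installed providers say they can write.
    static List<String> writableTypes() {
        List<String> descriptions = new ArrayList<>();
        for (AudioFileFormat.Type type : AudioSystem.getAudioFileTypes()) {
            descriptions.add(type + " (." + type.getExtension() + ")");
        }
        return descriptions;
    }

    public static void main(String[] args) {
        for (String description : writableTypes()) {
            System.out.println(description);
        }
    }
}
```

There is no matching call that lists the readable types, so for reading the practical test is still to hand a file to AudioSystem.getAudioInputStream and see whether it throws UnsupportedAudioFileException.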

Java does reliably support some formats. Its handling of text encodings is versatile, and java.util.zip handles ZIP and GZIP.
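As a small illustration of both points, here is a sketch that encodes a string as UTF-8, compresses it with java.util.zip, and reads it back. The class and method names (GzipRoundTrip, compress, decompress) are invented for this example; only the JDK classes are real.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical example class: GZIP round trip of UTF-8 text in memory.
public class GzipRoundTrip {

    // Encodes the text as UTF-8 and returns the GZIP-compressed bytes.
    static byte[] compress(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed at this point, so the GZIP trailer has been written.
        return buffer.toByteArray();
    }

    // Decompresses GZIP bytes and decodes the result as UTF-8.
    static String decompress(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(data))) {
            byte[] chunk = new byte[4096];
            int read;
            while ((read = gzip.read(chunk)) != -1) {
                out.write(chunk, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "na\u00efve caf\u00e9";
        byte[] packed = compress(original);
        System.out.println(original.equals(decompress(packed)));
    }
}
```

Naming the charset explicitly on both sides keeps the round trip from depending on the platform default encoding.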

Third-party code can come to the rescue. For reading and writing PDF, Apache PDFBox looks like the best bet. You can use Apache Tika with lots of formats, if you just need to extract metadata. Another alternative is to use ImageMagick, but it runs natively rather than under the JVM, so you have to invoke it with exec calls. im4java and JMagick can save some of the tedium. There are open source Java libraries for reading and writing specific file formats. Some may be good, some not.
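For a sense of what the exec route involves, here is a rough sketch using ProcessBuilder. It assumes ImageMagick 7 is installed and its magick command is on the PATH (older installations use convert instead); the class and method names are hypothetical, and real code would want timeouts and better error reporting.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical wrapper around the ImageMagick command line.
public class ImageMagickExec {

    // Builds the argument list for a plain format conversion.
    // Assumption: the "magick" executable is on the PATH.
    static List<String> convertCommand(String input, String output) {
        return Arrays.asList("magick", input, output);
    }

    // Runs the command, passes its output through to this process's
    // console, and returns the exit status (0 conventionally means success).
    static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .inheritIO()
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: ImageMagickExec <input> <output>");
            return;
        }
        int status = run(convertCommand(args[0], args[1]));
        System.out.println("ImageMagick exited with status " + status);
    }
}
```

Passing the arguments as a list rather than one command string avoids quoting problems with file names that contain spaces.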

If you need to deal with the guts of file formats in Java, you’ll usually have to find some good third-party code or start writing your own.

One response to “The Java file format API graveyard”

  1. As far as I know, the ‘official’ image read/write API for Java is now ImageIO: https://docs.oracle.com/javase/7/docs/api/javax/imageio/package-summary.html

    Also, there is a decent Apache project on image format handling that could be a good starting point: https://commons.apache.org/proper/commons-imaging/formatsupport.html

    Having said that, Apache Commons Imaging is not terribly active either. I think most of the attention from developers of open source image reading/writing/manipulation tools goes to things like ImageMagick and the libraries it depends on (libtiff, etc.).