The disappearing format blues

Old formats sometimes fade into obscurity and can no longer be supported, even if they come from a big company like Microsoft. Chris Rusbridge has noted that Microsoft’s Open Specifications page only goes as far back as Office 97, and that PowerPoint 4.0 files can’t be opened with today’s Microsoft Office. Tony Hey, vice president of Microsoft Research Connections, has replied. The response was encouraging, particularly in suggesting that Microsoft might “participate in a ‘crowd source’ project working with archivists to create a public spec of these old file formats.”

There’s usually some kind of software around that can read old formats. In this case a search doesn’t turn up a lot; there’s something called PowerPressed, which will wrap old PowerPoint files in a .exe application. It looks as if it should run on current Windows systems, but all I know is what that page says.

The situation shows the risk of using a format that isn’t publicly documented. Today this is less of a problem than it once was. I think it’s been shown that publishing format specs doesn’t lead to cannibalization of sales by competing software; the company that created the spec is in a position to produce the best implementation. The description of PDF is fully public, and Adobe still dominates the market for PDF software. Publishing the spec has just made the pie bigger. There’s still quite a lot of software that uses unpublished proprietary specs, though, and it’s risky to count on the files they produce staying readable in the long term.

One response to “The disappearing format blues”

  1. There is an ongoing attempt to describe some Microsoft formats in XML from which source code can be generated:

    http://gitorious.org/msoscheme

    The current iteration generates Java and C++. The programming-language part of the code is very small compared to the part that describes the file format.