Tag Archives: compression

Zip bombs: Blown up out of proportion?

A Vice.com article has brought fresh publicity to an old trick. The so-called “Zip bomb” is a Zip file with a fantastically high compression ratio. Researcher David Fifield created a 46-megabyte file that expands into 4.5 petabytes. That’s a compression ratio of about 100 million to one. Fifield’s own article provides a lot more technical information.

The article says such files are “so deeply compressed that they’re effectively malware.” That strikes me as a bit of an exaggeration. “Nuisanceware” seems more accurate, if there’s such a word. However, they could be used in a denial-of-service attack. Expanding one could exhaust disk space or memory and crash a server or browser, and the work of removing the expanded files could cause some downtime. A Zip bomb might also be a setup for another attack, tying up system resources and distracting administrators.
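One cheap precaution is to look at the sizes an archive declares before extracting anything. The following is only a sketch using Python's standard zipfile module; the function names and both thresholds (`max_ratio`, `max_total_bytes`) are my own illustrative choices, not recommendations from either article. The declared sizes come from the archive's own metadata, which a hostile file can falsify, so a real defense would also cap the number of bytes actually written during extraction.

```python
import io
import os
import zipfile


def declared_sizes(zip_bytes):
    """Sum the compressed and uncompressed sizes the archive declares for its entries."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        entries = archive.infolist()
        compressed = sum(entry.compress_size for entry in entries)
        uncompressed = sum(entry.file_size for entry in entries)
    return compressed, uncompressed


def looks_like_bomb(zip_bytes, max_ratio=100, max_total_bytes=10**9):
    """Flag archives whose declared expansion exceeds either (hypothetical) limit."""
    compressed, uncompressed = declared_sizes(zip_bytes)
    ratio = uncompressed / max(compressed, 1)  # avoid dividing by zero
    return ratio > max_ratio or uncompressed > max_total_bytes


def make_zip(payload):
    """Build a one-entry DEFLATE-compressed Zip file in memory, for demonstration."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("data.bin", payload)
    return buffer.getvalue()


if __name__ == "__main__":
    suspicious = make_zip(b"\0" * 5_000_000)   # highly redundant payload
    ordinary = make_zip(os.urandom(100_000))   # essentially incompressible payload
    print(looks_like_bomb(suspicious), looks_like_bomb(ordinary))
```

Nothing is extracted here; the check reads only the archive's directory entries, so it costs almost nothing even on a very large file.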

Are uncompressed files better for preservation?

How big a concern is physical degradation of files, aka “bit rot,” to digital preservation? Should archives eschew data compression in order to minimize the effect of lost bits? In most of my experience, no one’s raised that as a major concern, but some contributors to the TI/A initiative consider it important enough to affect their recommendations.
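The worry is easy to demonstrate on a small scale. This sketch uses Python's zlib module as a stand-in for whatever compression an archive might use; the sample data and helper names are made up for illustration. It flips one bit in an uncompressed copy and one bit in a compressed copy of the same data, then compares the damage.

```python
import zlib


def flip_bit(data, bit_index):
    """Return a copy of data with a single bit inverted."""
    damaged = bytearray(data)
    damaged[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(damaged)


def decompress_or_none(data):
    """Decompress a zlib stream, returning None if it is rejected as corrupt."""
    try:
        return zlib.decompress(data)
    except zlib.error:
        return None


def differing_bytes(a, b):
    """Count positions where two equal-length byte strings differ."""
    return sum(x != y for x, y in zip(a, b))


# Made-up sample data: 2000 short text records.
original = "".join(
    "record %d: value %d\n" % (i, (i * i) % 977) for i in range(2000)
).encode("ascii")

# One flipped bit in the uncompressed copy damages exactly one byte.
damaged_plain = flip_bit(original, 4000)

# One flipped bit in the middle of the compressed copy.
compressed = zlib.compress(original)
damaged_compressed = flip_bit(compressed, 8 * (len(compressed) // 2))
recovered = decompress_or_none(damaged_compressed)

if __name__ == "__main__":
    print("uncompressed copy: bytes damaged =", differing_bytes(original, damaged_plain))
    print("compressed copy: recovered intact =", recovered == original)
```

With the uncompressed copy, the loss stays confined to one byte. With the compressed copy, the decoder typically either rejects the stream outright or produces output that fails its checksum, so the damage is no longer local.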

ZIP standardization

The ZIP format is widely used, both by itself and as the container for other common formats such as ODF, yet it’s never been formally standardized. Caroline Arms of the Library of Congress has informed the JHOVE2 list that there’s a new study group under ISO/IEC JTC1 SC34 WG1, which is looking into the standardization of ZIP. There is a wiki for this study as well as a mailing list archive.

Membership in the group requires going through the appropriate national standards group.

What is the status of ZIP?

Is the ZIP format in the public domain? Partly? Completely? Not at all? See an interesting discussion by Rick Jelliffe.

What ever happened to .SIT?

With the increasing use of ZIP compression on the Macintosh, the StuffIt or .SIT format has fallen into relative obscurity. But not only is it still around, its publishers claim it’s “the ultimate in compression.” Five to ten years ago, lots of computer products were promoted as “the ultimate.” But when the next revision is the new “ultimate,” and so is the one after that, the claim starts to look ridiculous, and most advertisers have dropped it.

StuffIt’s compression is, according to most studies, about as good as competing technologies. It has no claim on being “the ultimate.” Its ad in the MacConnection catalogue says that “Stuffit Deluxe(R) 2009 can compress files up to 98% of their original size.” This is a nicely ambiguous claim; does it mean that the compressed file is reduced by 98%, or that it’s 98% of its original size? The latter isn’t hard to achieve at all, and hardly worth bragging about. But it’s extremely rare that StuffIt, or any other compression, can reduce a typical file to 2% of its original size. Perhaps a file consisting entirely of 1 bits would get a 98% reduction, but that’s seldom useful.
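The two readings of “98%” are easy to tell apart with a little arithmetic. The sketch below is illustrative only: the helper names are my own, and it uses Python's zlib (DEFLATE) rather than StuffIt's own algorithms, so it says nothing about what StuffIt itself achieves. It compresses a megabyte of all-1 bits and reports both figures.

```python
import zlib


def percent_of_original(original_size, compressed_size):
    """Compressed size expressed as a percentage of the original size."""
    return 100.0 * compressed_size / original_size


def percent_reduction(original_size, compressed_size):
    """Percentage by which compression shrank the file."""
    return 100.0 - percent_of_original(original_size, compressed_size)


# One megabyte in which every bit is 1.
all_ones = b"\xff" * 1_000_000
packed = zlib.compress(all_ones, 9)

if __name__ == "__main__":
    print("compressed size: %d bytes" % len(packed))
    print("percent of original: %.2f%%" % percent_of_original(len(all_ones), len(packed)))
    print("percent reduction: %.2f%%" % percent_reduction(len(all_ones), len(packed)))
```

A file reduced by 98% ends up at 2% of its original size, while a file at 98% of its original size has been reduced by only 2%. The all-1s file lands well past the first mark, which is exactly why it makes a poor benchmark.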

StuffIt once had the advantage of recognizing the two-fork file format of the Classic Mac OS. But now that virtually everyone has gone to OS X, where files rarely depend on a resource fork, it’s just one more compression format.