How big a concern is physical degradation of files, aka “bit rot,” to digital preservation? Should archives eschew data compression in order to minimize the effect of lost bits? In most of my experience, no one’s raised that as a major concern, but some contributors to the TI/A initiative consider it important enough to affect their recommendations.
Some people in the TI/A discussion argue against accepting compressed files as archival-quality TIFF because of their greater susceptibility to bit rot. In an uncompressed file that isn’t tiny, most of the data will be pixels, and flipping a bit will most likely just change a single pixel. Flipping a bit in a compressed data stream can throw the decompressor off from that point on, so that a large part of the image is damaged, or the application may crash. The argument is that a slightly damaged file is better than a seriously damaged one.
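To make the claim concrete, here is a minimal Python sketch that flips one bit in a raw buffer and one bit in a zlib stream. It is an illustration under assumptions, not a TIFF reader: the gradient buffer is a hypothetical stand-in for pixel data, and the helper names are made up for this example. It uses the zlib format because that is the usual container for TIFF's ZIP (Deflate) compression.

```python
import zlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    damaged = bytearray(data)
    damaged[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(damaged)


def damaged_byte_count(original: bytes, recovered: bytes) -> int:
    """Count differing bytes, treating missing or extra bytes as damaged."""
    differing = sum(a != b for a, b in zip(original, recovered))
    return differing + abs(len(original) - len(recovered))


def flip_in_uncompressed(pixels: bytes, bit_index: int) -> int:
    """Flip one bit in the raw buffer and report how many bytes changed."""
    return damaged_byte_count(pixels, flip_bit(pixels, bit_index))


def flip_in_compressed(pixels: bytes, bit_index: int):
    """Flip one bit in the zlib stream, then try to decompress it.

    Returns the damaged byte count, or None if the decoder rejects the stream.
    """
    stream = zlib.compress(pixels)
    try:
        recovered = zlib.decompress(flip_bit(stream, bit_index))
    except zlib.error:
        return None
    return damaged_byte_count(pixels, recovered)


# Hypothetical stand-in for pixel data: a 64 x 64 8-bit gradient.
pixels = bytes((x + y) % 256 for y in range(64) for x in range(64))
stream = zlib.compress(pixels)

print(flip_in_uncompressed(pixels, 1000))               # one byte differs
print(flip_in_compressed(pixels, len(stream) * 8 - 1))  # stream rejected
```

The first call reports a single damaged byte. The second flips a bit in the stream's trailing Adler-32 checksum, so `zlib.decompress` raises an error and nothing is recovered. Note that this checksum is what turns damage into an outright rejection here; LZW data carries no such check, so a decoder may instead return wrong pixels.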
This argument looks weak to me. First, it implies that the archive will trust damaged files to some extent. An uncompressed file with bit damage may just have a bad pixel, but the damage could be in the file header, the tags, or the ICC profile, seriously damaging the file or making it unusable. Second, the risk of bit damage to an uncompressed file is greater, simply because it’s bigger and has more bits that can go bad. At the same time, it takes up more storage space, so the archive can’t do as much backing up on a given budget. Lossless compression (LZW or ZIP) often reduces a file to less than half its original size, which means that the compressed file and a backup copy of it can be stored in the same amount of space as one uncompressed file.
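The size argument can be put into rough numbers. The sketch below is back-of-the-envelope only: the per-bit error probability and the file size are hypothetical, the 2:1 compression ratio is assumed, and bit flips and the two copies are treated as failing independently, which real storage does not guarantee.

```python
import math


def p_damaged(size_bytes: int, p_bit: float) -> float:
    """Probability that at least one bit flips, assuming independent flips."""
    bits = size_bytes * 8
    # 1 - (1 - p_bit) ** bits, computed in a way that stays accurate for tiny p_bit.
    return -math.expm1(bits * math.log1p(-p_bit))


P_BIT = 1e-12                    # hypothetical per-bit error probability
UNCOMPRESSED = 100_000_000       # hypothetical 100 MB uncompressed TIFF
COMPRESSED = UNCOMPRESSED // 2   # assumed 2:1 lossless compression

one_uncompressed = p_damaged(UNCOMPRESSED, P_BIT)
one_compressed = p_damaged(COMPRESSED, P_BIT)
# Two compressed copies fit in the space of one uncompressed file;
# the image is lost only if both copies are damaged.
both_compressed_copies = one_compressed ** 2

print(f"one uncompressed file damaged: {one_uncompressed:.3e}")
print(f"one compressed file damaged:   {one_compressed:.3e}")
print(f"both compressed copies damaged: {both_compressed_copies:.3e}")
```

Under these assumptions the compressed file is about half as likely to be hit as the uncompressed one, and the chance of losing both compressed copies is far smaller than either single-file figure, for the same storage budget.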
Not all compression is equal. Disallowing lossy compression in archival TIFF files may make sense for other reasons, and TIFF’s original JPEG compression scheme is deprecated. But insisting on uncompressed files to improve their ability to withstand bit rot strikes me as a foolish precaution.