While playing with OpenOffice in my research for Files that Last, I came across a preservation risk. I copied an image from a website and pasted it into a text document, then looked at the resulting XML. The image data wasn’t anywhere in content.xml or anywhere else in the overall ZIP document. Instead, I found this:
The source for the image is on the Web. This means that if the URL stops working, the document loses the image. That’s a poor plan for long-term storage.
The way to avoid this is to use Edit > Paste special and paste the image as a bitmap. It can be a pain to remember to do this. You may be able to catch images that are pasted by reference, since there can be a brief delay while just a box with the URL is displayed before the image comes up.
Sneaky little preservation hazards like this (and the earlier one mentioned with Adobe Illustrator files) are the kind of thing you’ll find when Files that Last comes out.