Category Archives: News

FUIF: Yet another image format?

A tweet led me to a pair of articles about a new file format called FUIF. That stands for “Free Universal Image Format.” Jon Sneyers describes it in a series of articles which so far include a Part 1 and Part 2.

It’s “responsive by design”; a single image file can be truncated at various offsets to produce different resolutions. Sneyers says FUIF meets JPEG’s criteria for a new format that provides “efficient coding of images with text and graphics” and “very low file size image coding.”

Introductory JHOVE workshop, January 25, 2019

JHOVE is still alive and active! The Open Preservation Foundation is holding a workshop on “Getting Started with JHOVE” on January 25, 2019 in The Hague, Netherlands. The announcement says, “This workshop is aimed at beginners, or anyone who is new to JHOVE.”

OPF members get priority for registration.

The digital preservation song challenge!

Should there be songs about digital preservation? This is just a special case of the question, “Should there be songs about X?” For nearly all X, the answer is “Yes, and there probably are!” (Even — perhaps especially — if there shouldn’t be, there are.)

Someone in the Australasian preservation community asked if AusPreserves needed a theme song. The first responses were existing popular songs, but then people started getting more creative. This led to the Digital Preservation Song Challenge!

One response was the Beyoncé parody, “All the Corrupt Files” (“Put a checksum on it”). I think it’s the first song ever to mention JHOVE!

Naturally, I already have my own song on digital preservation, called Files that Last. I wrote it to promote my book of the same title, but it stands (or falls) by itself.

If it’s worth doing, it’s worth singing about, and that certainly applies to digital preservation!

Data Transfer Project: New models for interoperability

In spite of improved file standardization, interoperability of data is often a challenge. Say you’ve got a collection of pictures on Photobucket and you want to move them to a different site. You’ve got a lot of manual work ahead. It would be great if there were a tool to do it all for you. The Data Transfer Project aims at making that possible. Some big names are behind it: Facebook, Google, Microsoft, and Twitter. The basic approach is straightforward:

The DTP is powered by an ecosystem of adapters (Adapters) that convert a range of proprietary formats into a small number of canonical formats (Data Models) useful for transferring data. This allows data transfer between any two providers using the provider’s existing authorization mechanism, and allows each provider to maintain control over the security of their service.
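The adapter idea in that quote can be illustrated with a toy sketch. Everything here is hypothetical: the `Photo` class stands in for a canonical Data Model, and the two provider record layouts are invented for illustration. None of it is the Data Transfer Project's actual code or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Photo:
    """Hypothetical canonical data model shared by all adapters."""
    title: str
    url: str


def import_from_provider_a(record: dict) -> Photo:
    # Invented layout for provider A: {"name": ..., "link": ...}
    return Photo(title=record["name"], url=record["link"])


def export_to_provider_b(photo: Photo) -> dict:
    # Invented layout for provider B: {"caption": ..., "source_url": ...}
    return {"caption": photo.title, "source_url": photo.url}


def transfer(records: list) -> list:
    """Move records from A to B by way of the canonical model."""
    return [export_to_provider_b(import_from_provider_a(r)) for r in records]
```

The point of the canonical model is that each provider writes one importer and one exporter, rather than a converter for every other provider.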


DNA as data storage

What’s the oldest data format in the world? It’s not any of the ones that computer engineers developed in the 20th century, or even ones that telegraph engineers created in the 19th. Far older than those — by billions of years — is the DNA nucleotide sequence. We can think of it as a simple base-4 encoding of arbitrary length.

According to the usual, somewhat simplified, description, a DNA molecule is a double helix, with its backbone made of phosphates and sugars, and four types of nucleotides forming the sequence. They are adenine, guanine, thymine, and cytosine, or A, G, T, and C for short. They’re always found in pairs connecting the two strands of the helix. Adenine and thymine connect together, as do guanine and cytosine.

DNA for data encoding

That’s as deep as I care to go, since biochemistry is far from my areas of expertise. What DNA does is fantastically complicated; change a few bits and you can get a human or a chimpanzee. But as a data model, it’s fantastically simple.
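To make the base-4 idea concrete, here is a minimal sketch that packs each byte into four nucleotide letters, two bits per letter. The assignment of bit pairs to letters (A=00, C=01, G=10, T=11) is an arbitrary assumption for illustration; real DNA storage schemes add error correction and avoid problematic sequences, which this ignores.

```python
# Arbitrary illustrative mapping: index in this string = 2-bit value.
BASES = "ACGT"


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four letters, most significant bit pair first."""
    letters = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            letters.append(BASES[(byte >> shift) & 0b11])
    return "".join(letters)


def dna_to_bytes(sequence: str) -> bytes:
    """Decode a letter sequence produced by bytes_to_dna."""
    if len(sequence) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(sequence), 4):
        byte = 0
        for letter in sequence[i:i + 4]:
            # str.index raises ValueError for anything other than A, C, G, T.
            byte = (byte << 2) | BASES.index(letter)
        out.append(byte)
    return bytes(out)
```

With this mapping, the single byte `b"A"` (binary 01000001) becomes `CAAC`, and decoding `CAAC` gives the byte back.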

When sloppy redaction fails, resort to censorship

The Broward County School Board used an idiot’s form of “redaction” on a PDF before sending it to the media. The Sun Sentinel removed the blackout layer from the file and found newsworthy information on shooter Nikolas Cruz. They published it. Judge Elizabeth Scherer flew into a rage. She decreed, “From now on if I have to specifically write word for word exactly what you are and are not permitted to print… then I’ll do that.”

That’s called prior restraint, or censorship.

USD and USDZ format for 3D models

Pixar’s USD format allows representation of dynamic 3D scenes. It lets designers create large numbers of objects that fit together into a scene. People on a team can work independently of each other, each designing certain parts. The project is on GitHub.

USD’s design solves the problem of having to work with one monolithic file (as Pixar did for Toy Story), but sometimes a monolithic file is useful. At WWDC 2018, Apple and Pixar announced a new wrapper for USD, called USDZ. It’s a Zip archive with some special rules. iOS 12 will support it.
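Because a USDZ file is a Zip archive, ordinary Zip tooling can look inside one. The sketch below uses Python's standard `zipfile` module to list the entries of a package held in memory. The `list_package` helper is hypothetical, and the check that entries are stored uncompressed reflects my understanding of one of the "special rules"; treat it as an assumption rather than a statement of the specification.

```python
import io
import zipfile


def list_package(data: bytes) -> list:
    """Return (name, size, stored_uncompressed) for each entry in a Zip package."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return [
            (info.filename, info.file_size, info.compress_type == zipfile.ZIP_STORED)
            for info in zf.infolist()
        ]
```

To inspect a file on disk, read it as bytes first and pass those bytes to `list_package`.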

The “Zip Slip” vulnerability

Sometimes my reaction to a story is “Wait, are they saying someone was that dumb? … No one could be that dumb! … Oh, gods, they were that dumb!” Naked Security’s account of the Zip Slip vulnerability is just such a story.

The article starts with a fair warning that the vulnerability is “so simple you’ll need to put a cushion on your desk before you read any further (in case of involuntary headdesk injury).” It explains that because of the coding mistake called “Zip Slip,” “attackers can create Zip archives that use path traversal to overwrite important files on affected systems, either destroying them or replacing them with malicious alternatives.” This is where I started to suspect.

The vulnerability isn’t in the Zip format as such, but in bad coding found in some of the zillion ad hoc pieces of software written to unpack Zip files. Have you figured it out yet? I’ll put the cut here to give you a chance to think…
Continue reading
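The quoted description points at path traversal: an entry named something like `../../etc/passwd` escapes the extraction directory if an unpacker joins it to the destination without checking. Below is a minimal sketch of the check a hand-written unpacker needs before writing each entry. The function name `resolve_entry` is hypothetical, and this is one way to do the check, not a description of how any particular library fixed it.

```python
import os


def resolve_entry(dest_dir: str, entry_name: str) -> str:
    """Return the absolute path for an archive entry, or raise if it escapes dest_dir."""
    root = os.path.realpath(dest_dir)
    # realpath collapses ".." segments and resolves symlinks; an absolute
    # entry_name makes os.path.join discard root entirely.
    target = os.path.realpath(os.path.join(root, entry_name))
    if os.path.commonpath([root, target]) != root:
        raise ValueError(f"archive entry escapes destination: {entry_name!r}")
    return target
```

An unpacker would call this for every entry name and write only to the returned path, rejecting the whole archive on the first `ValueError`.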

Flash in the Library of Congress’s online archives

Everybody recognizes that Adobe Flash is on the way out. It takes effort to convert existing websites, though, and some sites aren’t maintained, so it won’t disappear from the Web in the next few decades.

For minor or abandoned sites, that doesn’t matter so much, but even the Library of Congress has the issue. Its National Jukebox currently requires a browser with Flash enabled to be useful. Turning on Flash for reliable sites such as the Library of Congress should be safe, at least as long as those sites don’t include third-party ads from dubious sources. Not everyone has that option, though. If you’re using iOS, you’re stuck.

I came across the National Jukebox while doing research for my book project Yesterday’s Songs Transformed, and it’s frustrating that I can’t currently use it without taking steps which I’d rather avoid. The good news is that this is a temporary situation and work is already underway to eliminate the Flash dependency. David Sager of the National Jukebox Team replied to my email inquiry:
Continue reading

File corruption and political corruption

When people who don’t understand file formats manipulate files in order to cover their tracks, they generally fail miserably. Slate magazine gives an entertaining case in point from the Trump scandals. The article says:

There are two types of people in this world: those who know how to convert PDFs into Word documents and those who are indicted for money laundering. Former Trump campaign chairman Paul Manafort is the second kind of person.

The PDF Association chimes in with additional technical details.