The (information) machine stops

The “Digital Dark Age” discussion has started up again on Twitter, and again I find myself in the minority position. It really is possible to have Twitter discussions on complex topics and say something intelligent, but it isn’t easy. More than 140 characters at a time are needed, and it’s been a while since I last wrote about the subject at length, so let’s get back to it. The last post that I wrote on this was “Dataliths vs. the Digital Dark Age”, and I hope you’ll read that before continuing here, since I don’t want to just repeat myself.

Maybe the question needs to be turned around. Let’s not ask what could trigger a Digital Dark Age, but what conditions are necessary and sufficient for the really long-term preservation of information. What will minimize the risk of widespread loss of today’s history, literature, and daily news?

Today we can store more information than ever before, but its durability has gone down. File formats become obsolete. So do storage devices, and they fail physically. Anything we put on a disk today will almost certainly be unusable by 2050. The year 3016 just seems unimaginably far. Yet we still have records today from 1016, 16, and even 984 B.C.E. How can our records of today last a thousand years?

The usual answer is continual curation. Every five or ten years, information needs to be migrated to new storage and perhaps updated to a new format. This assumes that institutions will operate without interruption over an indefinite period. In its most naive form, it’s expressed in Andy Ihnatko’s remark that “Amazon will be around forever.” More sophisticated variants suggest that multiple institutions around the world will maintain the task of preservation, each using redundant and geographically distributed information — and that at least some of these will be around forever.

Both versions rely on an unbroken chain of human activity to keep information alive. Stewardship will unfailingly pass from generation to generation. To suppose it could fail is “alarmism,” a.k.a. paying attention to history.

We have as much information from the ancient world as we do because of its ability to survive neglect. Only a tiny fraction has survived. Works by Homer, Aristotle, Biblical authors, and many others are irretrievably gone, and much of what we do have is the result of lucky accidents, with documents placed where they wouldn’t deteriorate. Even when monks conscientiously kept documents safe in their monasteries, all they had to do was protect them from damage, not actively update them. Today’s leading forms of digital storage simply can’t survive that degree of neglect.

In When We Are No More, Abby Smith Rumsey writes:

The old paradigm of memory was to transfer the contents of our minds onto a stable, long-lasting object and then preserve the object. If we could preserve the object, we could preserve our knowledge. This does not work anymore. We cannot simply transfer the content of our minds to a machine that encodes it all into binary script, copy the script onto a tape or disk or thumb drive (let alone a floppy disk), stick that on the shelf, and expect that fifty years from now, we can open that file and behold the contents of our minds intact. Chances are that file will not be readable in five years, and certainly far less if we do not check periodically to see that it has not been corrupted or that the data need to be migrated to fresher software. The new paradigm of memory is more like growing a garden. Everything that we entrust to digital code needs regular tending, refreshing, and periodic migration to make sure that it is still alive, whether we intend to use it in a year, a hundred years, or maybe never.

The metaphor discloses the weakness of the solution. A garden is a very fragile thing. Weeds will overrun it if it’s neglected for a year or two. A new owner or a highway authority might pave it over. How many of the gardens of a hundred years ago still exist?

People assume that things will always be the way they are today, maybe with some gradual improvement or decline, but nothing that will seriously disrupt the way we and future generations live. That’s not a safe assumption. Relatively free countries can become repressive police states that destroy information. Trusted institutions can disappear. Societies can go through periods where reliable food and shelter are the most that many people can hope for. It would be nice to believe that generations of digital curators will stay aloof through all of this, finding ways to maintain their charge, but “alarmist” as it may be to say it, the succession is bound to fail someday. The monks of the Middle Ages had their Vikings; today Daesh is destroying artifacts where it can; and we don’t know what our descendants might face.

At the same time, we have to remember that people and information have survived all of these catastrophes. I’ve seen the word “apocalypse” a lot in the narrative against these concerns, suggesting that if such things happen, they’ll be so final that it’s useless to think beyond them. It’s probably not worth thinking about how to preserve information in the event that the whole world is pushed back into the Stone Age. The vagaries of time, widespread failures from which people eventually recover, are more deserving of consideration.

According to legend, the Library of Alexandria was destroyed in one big fire. This isn’t true. A series of events over many hundreds of years brought it down, and we don’t even know exactly when it ceased to exist. Thinking of a one-time apocalypse may be more comforting in an odd way; you can’t do anything about it, so you don’t have to do anything about it.

Let’s go back to the question I formulated. If an uninterrupted succession of custodians isn’t the best way to keep history alive, what is? The answer must be something that’s resilient in the face of interruptions. It isn’t necessary or even possible to guarantee that most information will survive; what’s important is to avoid reliance on fragile protection. Gardens are fragile, but plant life is much more durable than any garden, as long as there are seeds that can grow somewhere. The keys are durability and decentralization.

If anyone can make a digital record at low expense, store it away for a long time, and reasonably expect it to be useful when it’s retrieved, and if a lot of people do it, that solves the problem. The first part is easy; costs of digital storage are ridiculously low. The hard parts are avoiding physical degradation, hardware obsolescence, and format obsolescence. Physical durability isn’t out of reach: devices like the M-disc hold up impressively well, and Hitachi’s quartz glass data storage sounds promising, although it appears nothing has happened on it since 2014. The way to address obsolescence is with designs simple enough that they can be reconstructed. If the bits can be identified, ASCII characters and, with luck, English should remain recognizable as long as civilization retains some level of continuity. They can then be used to describe more complex structures. One example of how people might do this is presented in “The Cuneiform Tablets of 2015,” by Long Tien Nguyen and Alan Kay. It focuses on emulation, which is one of several possibilities.
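To make the "simple enough to be reconstructed" idea concrete, here is a minimal sketch in Python of a self-describing record: a plain-English ASCII header that explains how to read and check the payload that follows. The format, the field names (PAYLOAD-BYTES, PAYLOAD-CHECKSUM), and the checksum rule are all my own assumptions for illustration. They are not a standard, and they are not the approach taken in the Nguyen and Kay paper.

```python
# Hypothetical self-describing record (illustrative only, not a standard).
# The header is English text in ASCII and explains how to read the payload.
HEADER = (
    "SELF-DESCRIBING RECORD (illustrative format, not a standard)\n"
    "This file is plain text. Each character is stored in one 8-bit byte\n"
    "using the ASCII code. The payload starts after the line of four dashes.\n"
    "PAYLOAD-BYTES is the number of bytes in the payload.\n"
    "PAYLOAD-CHECKSUM is the sum of all payload byte values, modulo 65536.\n"
    "PAYLOAD-BYTES: {length}\n"
    "PAYLOAD-CHECKSUM: {checksum}\n"
    "----\n"
)


def checksum(payload: bytes) -> int:
    """Deliberately simple check: describable in one English sentence."""
    return sum(payload) % 65536


def write_record(text: str) -> bytes:
    """Build a record; raises UnicodeEncodeError if text is not pure ASCII."""
    payload = text.encode("ascii")
    header = HEADER.format(length=len(payload), checksum=checksum(payload))
    return header.encode("ascii") + payload


def read_record(record: bytes) -> str:
    """Parse a record and verify the payload against the header fields."""
    # The first line of four dashes ends the header, so dashes inside
    # the payload cannot be mistaken for the separator.
    header, marker, payload = record.partition(b"----\n")
    if not marker:
        raise ValueError("no header/payload separator found")
    fields = {}
    for line in header.decode("ascii").splitlines():
        key, sep, value = line.partition(": ")
        if sep:  # only "KEY: value" lines carry fields
            fields[key] = value
    if int(fields["PAYLOAD-BYTES"]) != len(payload):
        raise ValueError("payload length does not match header")
    if int(fields["PAYLOAD-CHECKSUM"]) != checksum(payload):
        raise ValueError("payload checksum does not match header")
    return payload.decode("ascii")


if __name__ == "__main__":
    record = write_record("Greetings from 2016.")
    print(record.decode("ascii"))
    print(read_record(record))
```

The point of the design is that someone who can recover the bytes and read English could rebuild the reader from the header alone, without this program. A stronger checksum would catch more kinds of damage, but it would also be harder to describe in a sentence.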

Having dataliths in lots of hands is important. If only big, nationally supported institutions retain our records, it’s too easy to shut them down or bring pressure on them to erase unwanted parts of history. We need archives in many hands, with their own quirks and biases, so no one approach controls what’s retained.
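A rough way to see why many hands matter is to do the arithmetic. The sketch below, in Python, assumes each copy survives independently with the same probability. That is a strong simplification, since one catastrophe can take out many copies at once, and the numbers in the example are invented purely for illustration.

```python
def chance_any_copy_survives(p_single: float, copies: int) -> float:
    """Probability that at least one of `copies` copies survives,
    ASSUMING each survives independently with probability p_single."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if copies < 0:
        raise ValueError("copies must not be negative")
    # All copies are lost with probability (1 - p)^n; take the complement.
    return 1.0 - (1.0 - p_single) ** copies


if __name__ == "__main__":
    # Made-up numbers: each datalith has only a 1% chance of lasting.
    for n in (1, 10, 100, 1000):
        print(n, chance_any_copy_survives(0.01, n))
```

Even when any single copy is very likely to be lost, the chance that every copy is lost shrinks quickly as the number of independently held copies grows. Real losses are correlated, so the true odds are worse than this, which is one more argument for spreading copies across different places and kinds of keepers.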

The problem is solvable. The mistake is thinking that an indefinite chain of short-term solutions can add up to a long-term solution.

