Preserving Yahoo Groups

Yahoo is sending out alerts on the transformation of Yahoo Groups into a list server. The spin is ridiculous. The changes “better align with user habits,” and “we are making adjustments to ultimately serve you better.” It’s as if users had been protesting against the existence of public groups and Web-hosted discussions and Yahoo were complying with the demand.

Yahoo, in case you haven’t been keeping track (I hadn’t), now belongs to Verizon. It makes economic decisions, and one was that running public Yahoo Groups was no longer worth the cost and effort. This is the result of changing user preferences, as well as stupid policy decisions over the years that drove people away. The attempts to correct those blunders may be part of the current problem.

Nefertiti, now available as a 3D scan

[Image: Bust of Nefertiti, from 3D scan, Egyptian Museum of Berlin]

One of my favorite areas in Berlin is the Museum Island. It includes the Egyptian Museum, which is part of the Neues Museum. Among its most famous possessions is a bust of Nefertiti which dates from about 1340 BCE. The museum has an entire room dedicated to Nefertiti.

More relevant to this blog, it has made a detailed 3D scan of the bust. The museum belongs to the Prussian Cultural Heritage Foundation, which is funded by the federal government and the 16 state governments. Supposedly it has an obligation to make its information public, but for reasons that aren’t clear, it held tight to that scan for a long time. It’s now available as a free download, ten years after it was made, thanks to the persistent efforts of Cosmo Wenman. He tells the story on Reason.com.

Aside

I have removed all my profiles on Stack Exchange/Stack Overflow because of the way it has treated its people.

Finale and macOS

I’m not entirely sure where the right place to put this is. It’s a file format issue in part, since if people can’t keep using Finale after a macOS upgrade, they need to salvage all the files they’ve created in its proprietary format.

The email which I got from MakeMusic, dated October 18, was alarming:

Finale v25.5 is not compatible with macOS 10.15 Catalina and will not be updated to support Catalina. It is our recommendation that users of Finale v25.5 not upgrade to macOS Catalina.

Identifying files by programming language

Most of today’s programming languages look vaguely similar. They’re derived from the C syntax, with similar ways of expressing assignments, arithmetic, conditionals, nested expressions, and groups of statements. If the files have their original extension and it’s accurate, format identification software should be able to classify them correctly.

The software should do some basic checks to make sure it wasn’t handed a binary file with a false extension, which could be dangerous. A code file should be a text file, regardless of the language. (This isn’t strictly true, but non-text languages like Piet and Velato are just obscure for the sake of obscurity.) The UK National Archives recognizes XML and JSON (which is a subset of JavaScript) but doesn’t talk about programming languages as file formats. Exiftool identifies lots of formats but makes no attempt to discern programming languages.
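As a rough illustration of that kind of basic check, here is a minimal Python sketch. It is only a heuristic under stated assumptions, not how any particular identification tool works: the function name `looks_like_text` and the 8 KB sample size are my own choices, and it assumes source files are UTF-8 (or plain ASCII), so it would wrongly reject UTF-16 text.

```python
import codecs


def looks_like_text(path, sample_size=8192):
    """Heuristic check: does the start of the file look like UTF-8 text?

    Hypothetical helper for illustration. It rejects files whose first
    `sample_size` bytes contain a NUL byte or are not valid UTF-8.
    """
    with open(path, "rb") as f:
        sample = f.read(sample_size)

    # NUL bytes are common in binaries and rare in source code.
    if b"\x00" in sample:
        return False

    # An incremental decoder with final=False tolerates a multi-byte
    # character cut off at the end of the sample, but still raises on
    # bytes that can never be valid UTF-8.
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return False
    return True
```

A file that passes this test is not necessarily safe or correctly labeled; the check only filters out the obvious case of a binary wearing a source-code extension.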

The Apostles of GIF

[Image: Biblical picture, Paul and Barnabas]

This is as much an excuse to plug one of my favorite satirical websites, the Babylon Bee, as anything else. They’ve got a mock-historical article claiming that the apostles Paul and Barnabas parted ways over the pronunciation of GIF.

It’s hard to tell reality from satire these days, so I should say again that the Babylon Bee is strictly satirical. I think it’s funnier than the Onion.

To me it’s clear that the Shakers got it right when they went with the hard G. You know the song: “‘Tis a GIF to be simple, ’tis a GIF to be free…”

Fileformat.com

In my recent searches, I came across Fileformat.com, which presents itself as a guide for developers. There’s no information on the site about who’s running it, though most or all of the articles on the wiki are credited to Farooq Sheikh. The site looks worth following. The main sections of it are:

  • A wiki on file formats. It isn’t as thorough as the Archive Team wiki, but it has some good technical information on the most popular formats.
  • A news section, which consists of links to articles on other sites, including some of mine. Not all of them are strictly news, but they’re all relevant to people with a specialty in file formats. It has an RSS feed, though it isn’t advertised. There aren’t a lot of RSS feeds on file formats (besides the feed for this blog, of course), so it could be worth bookmarking in your reader.

I’ve added a link to the site in my sidebar.

Zip bombs: Blown up out of proportion?

A Vice.com article has brought fresh publicity to an old trick. The so-called “Zip bomb” is a Zip file with a fantastically high compression ratio. Researcher David Fifield created a 46-megabyte file that expands into 4.5 petabytes. That’s a compression ratio of about 100 million. Fifield’s own article provides a lot more technical information.

The article says such files are “so deeply compressed that they’re effectively malware.” That strikes me as a bit of an exaggeration. “Nuisanceware” seems more accurate, if there’s such a word. However, they could be used in a denial of service attack. They could crash a server or browser, and the work of removing the expanded files could cause some downtime. A Zip bomb might be a setup for another attack, tying up system resources and distracting administrators.
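One way software can protect itself is to look at how much data an archive claims to contain before extracting anything. The Python sketch below uses the standard `zipfile` module to total the uncompressed sizes declared in the archive’s directory and compares that with the archive’s size on disk. The function names and the thresholds (a ratio of 100 and a 1 GiB cap) are arbitrary assumptions for illustration, not recommended values.

```python
import os
import zipfile


def declared_expansion(path):
    """Return (declared_bytes, archive_bytes, ratio) for a Zip file.

    declared_bytes is the sum of the uncompressed sizes listed in the
    archive's directory. Nothing is extracted.
    """
    archive_bytes = os.path.getsize(path)
    with zipfile.ZipFile(path) as zf:
        declared_bytes = sum(info.file_size for info in zf.infolist())
    ratio = declared_bytes / archive_bytes if archive_bytes else 0.0
    return declared_bytes, archive_bytes, ratio


def looks_like_zip_bomb(path, max_ratio=100.0, max_bytes=1 << 30):
    """Flag archives that declare an implausibly large expansion.

    The default thresholds are illustrative assumptions only.
    """
    declared_bytes, _, ratio = declared_expansion(path)
    return ratio > max_ratio or declared_bytes > max_bytes
```

Comparing against the size of the whole archive, rather than adding up each entry’s compressed size, avoids being misled when entries share the same compressed data. The declared sizes can still be falsified, so a careful extractor would also cap the number of bytes it actually writes.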

The tape obsolescence problem

An ABC News Australia article calls attention to the problem of archives on magnetic tape. Author James Elton clearly knows something about digital preservation issues, as the article goes beyond the usual generalities and hand-wringing.

Tapes, on the other hand, can only be read by format-specific machines.

And dozens of formats of magnetic tape were created through the last century — one-inch, two-inch, various versions of Betamax.


Web archiving and languages

Web archiving is difficult. Few sites consist entirely of static, self-contained content. Most use JavaScript, often from external sites. Responsive pages are designed to look different in different environments. An archive needs to make a snapshot that reflects a page’s appearance at a given point in time, but what exactly does that mean? Should an archive pick an appearance for one reasonable set of parameters, or should it try to keep the page’s dynamic nature? Will the fact that it’s an archive rather than an interactive browser affect what the server gives it?