Monthly Archives: April 2016

Data storage meets biotech

With Microsoft’s entry into the field, the use of DNA for data storage is an increasingly serious area of research. DNA is effectively a base-4 data medium, it’s extremely compact, and it contains its own copying mechanism.
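
To make "base-4" concrete, here is a minimal sketch of how bytes could map onto nucleotides at two bits per base. The A/C/G/T ordering and the function names are my own assumptions for illustration, not the scheme used by Microsoft or anyone else; real encodings reportedly add error correction and avoid long runs of the same base, which this toy version ignores.

```python
# Hypothetical mapping: each nucleotide carries two bits (A=00, C=01, G=10, T=11).
# This ordering is an assumption for illustration, not a published scheme.
BASES = "ACGT"


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four bases, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def dna_to_bytes(strand: str) -> bytes:
    """Decode a strand produced by bytes_to_dna back into bytes."""
    if len(strand) % 4:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        value = 0
        for base in strand[i:i + 4]:
            # Shift in two bits per base; index() raises ValueError on a bad letter.
            value = (value << 2) | BASES.index(base)
        out.append(value)
    return bytes(out)
```

With this mapping one byte takes four bases, so `bytes_to_dna(b"DNA")` yields a 12-base strand and `dna_to_bytes` turns it back into `b"DNA"`.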

DNA has actually been used to store data; in 2012 researchers at Harvard wrote a book into a DNA molecule and read it back. It’s still much more expensive than competing technologies, though; a recent estimate says it costs $12,000 to write a megabyte and $200 to read it back. The article didn’t specify the scale; surely the cost per megabyte would go down rapidly with the amount of data stored in one molecule.

Don’t expect a disk drive in a molecule. DNA isn’t a random-access medium. Rather, it would be used to archive a huge amount of information and later read it back in bulk. A wild idea would be to store information in a human ovum so it would be passed through generations, making it literal ancestral memory. Now there’s real Mad File Format Science for you!

Have ebook readers surrendered to DRM?

Late in the first decade of the 21st century, solid opposition to DRM in music made the publishers back down. The arguments were that Digital Rights Management is ineffective against determined crackers, diminishes the value of purchases, and punishes the honest. However, readers have largely caved in to DRM in ebooks, even though the arguments are equally valid there, as confirmed by experience. Some publishers don’t use DRM, and some authors don’t let their publishers use it, but a large proportion of e-books are restricted by encryption. I can’t find any figures on the proportion of sales, but DRM is the default with many major distributors, such as Amazon, and many readers just don’t seem to care.

People are beginning to notice, though, that they don’t own a book under DRM; they only license it as long as the vendor supports it. The Sony Reader is dead. Barnes and Noble’s Nook is dying by slow stages. Many smaller publishers are issuing their books DRM-free; it’s mostly the biggest publishers that restrict access. My own e-books, Files that Last and Tomorrow’s Songs Today, are unencumbered by DRM.

Removing DRM isn’t hard. You can find lots of pages with information on how to do it. I don’t know which ones work safely, since I don’t buy ebooks with DRM in the first place.

If Simon and Schuster had its way, it could sue me for huge amounts of money for posting that link, but a federal judge ruled against it. So I’m legal in providing you that link. I hope.

I also hope that eventually the big publishers will figure out that they’re only hurting their readers and losing business by restricting users’ ability to save and convert the books they download.

Designing to the demo is a mistake

A lot of software design clearly aims not at providing the best experience to the user, but at providing the most impressive demo. Apple does this all the time, or at least that’s the only explanation I can think of for their design decisions. Getting people to applaud in amazement doesn’t get loyal customers if the product is terrible in everyday use, though.

My current Garmin car GPS device is a good example of this. To enter an address, you enter first the street number, then the street, and finally the locality and state together. This sounds very natural, much better than my old device where you started with the state and worked down to the street number. The trouble is that when you use the new device, you find that auto-completion is useless.

More what you’d call guidelines than actual rules

Do pirate sites have rules? Apparently so, according to Beta News. It tells us that sites like Pirate Bay have “fairly strict rules dictating capturing, formatting and naming releases” and “astoundingly lengthy standards documents covering standard and high definition releases of TV shows.” These rules “mandate” a switch from MP4 to the open Matroska (MKV) format as of April 10, so they’re stricter than the Pirates of the Caribbean.

I have no love for pirate sites. They play up their reputation for making stuff from big, evil, litigious companies available, but they’ll grab anything they can get their hands on, including music by small, independent artists who are having a hard enough time making a living. A couple of sites have even grabbed my filk recordings, which have no market beyond a couple of hundred people. But I’m amused that pirates have their own strict rules, and a move anywhere toward open formats can’t be a bad thing.

Update on JHOVE

I’ve received an email reply from Becky McGuiness at Open Preservation Foundation to my query about JHOVE’s status. She says that VeraPDF has been taking all the development resources, as I suspected, but that work on JHOVE (in particular, fixing the expired installer) will resume soon.

Update: Here’s a response from Carl Wilson at OPF on the status of JHOVE. It says that the next version will jump from 1.12 to 1.14 (triskaidekaphobia?) and will include several new modules, including my PNG module.

I’ll second Carl’s call for institutions to become OPF supporters. As someone on Twitter said recently, open source software is “free, as in kittens.” It costs money to maintain it. Occasionally people support free software for the sheer love of it, but developers do need to earn a living.

Update 2: OPF reports that the JHOVE installer has been fixed.

The end of UDFR

The Unified Digital Format Registry (UDFR), created and maintained by the California Digital Library, will shut down on April 15, 2016. I don’t know whether the whole site will go away or just the ability to query the registry.

Information Standards Quarterly has an article on UDFR by Andrea Goethals. The source code repository is on GitHub.

The predecessor project, GDFR, never got to publicly usable status. The site still responds to pings, but apparently not to HTTP requests.

Quoting its description here, so it’s saved in at least one place if the site completely goes away:

The UDFR is a reliable, publicly accessible, and sustainable knowledge base of file format representation information for use by the digital preservation community.

A format is a set of semantic and syntactic rules governing the mapping between abstract information and its representation in digital form. While many worthwhile and necessary preservation activities can be performed on a digital asset without knowledge of its format, that is, merely as a sequence of bits, any higher-level preservation of the underlying information content must be performed in the context of the asset’s format.

The UDFR seeks to “unify” the function and holdings of two existing registries, PRONOM and GDFR (the Global Digital Format Registry), in an open source, semantically enabled, and community supported platform.

The UDFR was developed by the University of California Curation Center (UC3) at the California Digital Library (CDL), funded by the Library of Congress as part of its National Digital Information Infrastructure Preservation Program (NDIIPP). The service is implemented on top of the OntoWiki semantic wiki and Virtuoso triple store.

Want FLAC on your Mac? Try Vox

iTunes is horrible and keeps getting worse. The current version has come down with dyslexia; it can’t even play my files in order. On top of that, it supports a poor range of file formats, knowing nothing about popular open formats like FLAC and Ogg Vorbis. QuickTime Player has a saner user interface but the same format limitations. If you want to play music in those formats, you need to look for other software. I’ve just grabbed Vox for OS X, and it handles those files nicely.

It’s not an iTunes replacement, even if all you want to do is play music that’s stored on your computer. You can import your iTunes library, but you can’t view the contents of your playlists (which it calls “collections”) or select items from them. What it does let you do, though, is play FLAC, AAC, ALAC (Apple Lossless), Ogg, MP3, and APE files.

Is JHOVE dead in the water again?

See this post for important updates.

In December, JHOVE 1.12 was very close to a release. Since then, next to nothing has happened. The installer for the beta version expired, and there’s been an update for that. A couple of pull requests have been merged. Otherwise, nothing.

I think what’s happened is that the Open Preservation Foundation’s very limited resources were pulled onto VeraPDF. That’s certainly a worthwhile endeavor, but it irks me that I handed support of JHOVE over to OPF only to see the ball dropped. I did some work on a PNG module a month ago and submitted a pull request; nothing’s happened since then.

I wouldn’t mind picking JHOVE up again, but I’m going to be blunt about this: I’m done with working on it for free. If institutions that want JHOVE to be maintained really care about it, they should put up some money, whether it’s to OPF, to me, or to someone else. Open source software isn’t something that magically happens because people love to work without pay.

When do the MP3 patents expire?

Why exactly is MP3 still popular? It’s not as efficient as more recent compression methods, and it’s encumbered by patents. People keep using what’s familiar. In a few years, it may become patent-free.

A Tunequest piece from 2007 lists several expiration dates that are still in the future: