Category Archives: News

Recreating Clarke’s “The Sentinel” in real life

Lunar Mission One, a private nonprofit organization, is trying to recreate Arthur C. Clarke’s “The Sentinel” (the inspiration for the movie 2001) in real life. They hope to send a digital archive to the moon in 2024 and bury it there. As long as whatever is stored there can withstand intense cold, it should last a very long time.

The plan calls for two archives. One would contain items privately provided by people paying to have their data stored on the moon; the other would be a history of humanity. CEO David Iron (no relation to Tony Stark) raises the question of how living beings of the future will find it and says, “We need a permanent sign that will last for a billion years. … We need to invert the normal logic of searching for extra-terrestrial intelligence by transmitting; they can come to us.”

Floppies aren’t dead

Today’s exciting news on Twitter is that one or more of the Department of Defense systems used to coordinate ICBMs and nuclear bombers still use 8-inch floppy disks. A spokesperson for the DoD explained, “It still works.” The computer is an IBM Series/1 that dates from the seventies.

Tim Berners-Lee on “trackable” ebooks

Ebooks of the future, says Tim Berners-Lee, should be permanent, seamless, linked, and trackable. That’s three good ideas and one very bad one.

Speaking at BookExpo America, he offered these as the four attributes of the ebooks of the future. They’ll achieve permanence through encoding in HTML5, which is what EPUB basically is. Any ebook that’s available only in a proprietary format with DRM is doomed to extinction. Pinning hopes on Amazon’s eternal existence and support of its present formats is foolish. Seamlessness, the ability to transition through different platforms and content types, follows from using HTML5. This is reasonable and not very controversial.

JHOVE 1.14

The Open Preservation Foundation has just announced JHOVE 1.14. The numbering is a bit odd. Version 1.12 never made it to release, and they seem to have skipped 1.13 entirely.

This includes three new modules: the PNG module, which I wrote on a weekend whim, and GZIP and WARC modules adapted from JHOVE2. The UTF-8 module now supports Unicode 7.0.

The release isn’t showing up yet on the OPF website, but I expect that will happen momentarily.

It’s nice to see that the code which I started working on over a decade ago is still alive and useful. Congratulations and thanks to Carl Wilson, who’s now its principal maintainer!

More what you’d call guidelines than actual rules

Do pirate sites have rules? Apparently so, according to Beta News. It tells us that sites like Pirate Bay have “fairly strict rules dictating capturing, formatting and naming releases” and “astoundingly lengthy standards documents covering standard and high definition releases of TV shows.” These rules “mandate” a switch from MP4 to the open Matroska (MKV) format as of April 10, so they’re stricter than the Pirates of the Caribbean.

I have no love for pirate sites. They play up their reputation for making stuff from big, evil, litigious companies available, but they’ll grab anything they can get their hands on, including music by small, independent artists who are having a hard enough time making a living. A couple of sites have even grabbed my filk recordings, which have no market beyond a couple of hundred people. But I’m amused that pirates have their own strict rules, and a move anywhere toward open formats can’t be a bad thing.

Update on JHOVE

I’ve received an email reply from Becky McGuiness at Open Preservation Foundation to my query about JHOVE’s status. She says that VeraPDF has been taking all the development resources, as I suspected, but that work on JHOVE (in particular, fixing the expired installer) will resume soon.

Update: Here’s a response from Carl Wilson at OPF on the status of JHOVE. It says that the next version will jump from 1.12 to 1.14 (triskaidekaphobia?) and will include several new modules, including my PNG module.

I’ll second Carl’s call for institutions to become OPF supporters. As someone on Twitter said recently, open source software is “free, as in kittens.” It costs money to maintain it. Occasionally people support free software for the sheer love of it, but developers do need to earn a living.

Update 2: OPF reports that the JHOVE installer has been fixed.

The end of UDFR

The Unified Digital Format Registry (UDFR), created and maintained by the California Digital Library, will shut down on April 15, 2016. I don’t know whether the whole site will go away or just the ability to query the registry.

Information Standards Quarterly has an article on UDFR by Andrea Goethals. The source code repository is on GitHub.

The predecessor project, GDFR, never got to publicly usable status. The site gdfr.info still responds to pings, but apparently not to HTTP requests.

Quoting its description here, so it’s saved in at least one place if the site completely goes away:

The UDFR is a reliable, publicly accessible, and sustainable knowledge base of file format representation information for use by the digital preservation community.

A format is a set of semantic and syntactic rules governing the mapping between abstract information and its representation in digital form. While many worthwhile and necessary preservation activities can be performed on a digital asset without knowledge of its format, that is, merely as a sequence of bits, any higher-level preservation of the underlying information content must be performed in the context of the asset’s format.

The UDFR seeks to “unify” the function and holdings of two existing registries, PRONOM and GDFR (the Global Digital Format Registry), in an open source, semantically enabled, and community supported platform.

The UDFR was developed by the University of California Curation Center (UC3) at the California Digital Library (CDL), funded by the Library of Congress as part of its National Digital Information Infrastructure Preservation Program (NDIIPP). The service is implemented on top of the OntoWiki semantic wiki and Virtuoso triple store.

HTML5 and DRM

If anything causes more controversy than DRM (digital rights management), it’s joining DRM with an open standard. The World Wide Web Consortium’s Encrypted Media Extensions Working Draft is generating controversy in plenty.

Cory Doctorow has declared: “The World Wide Web Consortium’s decision to make DRM part of HTML5 doesn’t just endanger security researchers, it also endangers the next version of all the video products and services we rely on today: from cable TV to iTunes to Netflix.”

Security risk in “target=_blank”

I’ve often used “target=_blank” in my posts so that people can click on a link without leaving the original page. So do many people. This turns out to be a seriously risky practice, though. When you open a window with an anchor tag specifying “target=_blank”, you give the target window control of the original window’s location object! The new page can reach it through window.opener, which means it can send the original window to a different address, possibly redirecting it to a phishing page.

We could also call this a security hole in the HTML DOM, or perhaps in the whole idea of allowing JavaScript in Web pages. I use NoScript with Firefox so that unfamiliar pages won’t run JavaScript, preventing them from exploiting this hole. I can’t expect everybody reading this blog to do that, though. To protect against exploits, I’d need to add “rel=noopener” for some browsers and “rel=noreferrer” for others. That would require custom JavaScript, which wordpress.com won’t let me do, and would be a lot of work just to modify link behavior. Starting with this post, I’m not using “target=_blank” in my links. The sites I’ve linked to in the past are reputable, as far as I know, so the risk from existing links should be minimal. At least I hope so; supposedly trustworthy websites allow advertisers to include unvetted JavaScript, allowing malware attacks.
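If you can run a script over your own pages, checking for the problem is straightforward. Here is a minimal Python sketch that lists links opening a new window without either rel token. The class and function names are made up for illustration, the sample HTML is invented, and the sketch assumes that either “noopener” or “noreferrer” is enough to keep the new page away from the opener.

```python
from html.parser import HTMLParser


class BlankTargetAuditor(HTMLParser):
    """Collects hrefs of <a target="_blank"> links that lack rel protection."""

    def __init__(self):
        super().__init__()
        self.risky_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        # HTMLParser lowercases attribute names; valueless attributes come back as None.
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("target", "").lower() != "_blank":
            return
        rel_tokens = set(attr_map.get("rel", "").lower().split())
        # Assumption: either token is enough to deny the new page window.opener.
        if not rel_tokens & {"noopener", "noreferrer"}:
            self.risky_links.append(attr_map.get("href", ""))


def find_risky_links(html_text):
    """Return the href of every unprotected target=_blank link in html_text."""
    auditor = BlankTargetAuditor()
    auditor.feed(html_text)
    auditor.close()
    return auditor.risky_links


# Invented sample markup: one unprotected link, one protected, one ordinary.
sample = (
    '<p><a href="https://example.com/a" target="_blank">unprotected</a> '
    '<a href="https://example.com/b" target="_blank" rel="noopener noreferrer">protected</a> '
    '<a href="https://example.com/c">same window</a></p>'
)
print(find_risky_links(sample))  # prints ['https://example.com/a']
```

This only reports the links; fixing them still means editing each anchor tag or dropping “target=_blank” altogether.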

Update on my Udemy courses

Udemy has made some serious changes to its pricing rules. This will result in some price changes in my courses, starting on April 4.

In one respect, this is a good thing. Currently, any course participating in Udemy’s marketing programs is periodically subject to huge discounts on zero notice. A $300 course might suddenly be offered for $10. If students enroll in the course through the marketing program, the instructor may get as little as 25% of that. On the other hand, if students enroll using my coupon codes, I get to keep 97% of the money. It’s not hard to see how this can put instructors in a price war against themselves. I want to sell courses through coupons so that Udemy doesn’t gobble up most of the money you pay, but this encourages instructors to set a high price and then discount it heavily so students will use the coupons.
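To put rough numbers on that, here is a small Python sketch of the arithmetic. The 25% and 97% shares are the figures quoted above; the function name and the $14 coupon price are just illustrative, and real payouts may involve fees this ignores.

```python
from decimal import Decimal

MARKETING_SHARE = Decimal("0.25")  # "as little as 25%" on marketing-program sales
COUPON_SHARE = Decimal("0.97")     # share kept on instructor-coupon sales


def instructor_take(sale_price, share):
    """Return the instructor's cut of one sale, rounded to cents."""
    return (Decimal(str(sale_price)) * share).quantize(Decimal("0.01"))


# A $300 course dumped to $10 through the marketing program:
print(instructor_take(10, MARKETING_SHARE))  # prints 2.50
# A hypothetical $14 sale through the instructor's own coupon:
print(instructor_take(14, COUPON_SHARE))     # prints 13.58
```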

This wasn’t making anybody happy, so Udemy has changed its policies, promising not to discount courses by more than 50%. But this comes with a new set of price restrictions on the courses. All prices have to be between $20 and $50 and — I don’t know why — be a multiple of $5. We can’t give discounts of more than 50% with our own coupons. If a coupon violates this limit, we can’t change it; it will just expire on April 4.

This means I’ll be making the following changes in my prices:

  • Managing metadata with ExifTool: The list price will drop from $36 to $30.
  • Personal digital preservation: The list price will go up from $16 to $20.
  • How to tell a file’s format: Five open source tools: The list price will go down from $28 to $25.

If you’re here, the list prices are irrelevant, since you’ll be buying using the coupon code unless you like spending more and letting me have less. But there are also changes in the coupons. Until April 4, you’ll be able to enroll in the ExifTool course with the code EXIF14 for $14.00. Starting April 4, you’ll have to use the code EXIF15 with a price of $15.00.

The introductory offer for Personal Digital Preservation expired at the end of February. The new code PRESERVE lets you enroll for $11. This won’t change.

The coupon code TOOLKIT for How to Tell a File’s Format: Five Open Source Tools continues to get you a $20 price.
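The new rules are simple enough to check mechanically. This is a rough Python sketch of my reading of them, not anything Udemy publishes: the function names are invented, prices are whole dollars, and it assumes the 50% limit is measured against the new list price.

```python
MIN_PRICE = 20   # lowest allowed list price, in dollars
MAX_PRICE = 50   # highest allowed list price, in dollars
PRICE_STEP = 5   # list prices must be a multiple of this


def list_price_allowed(price):
    """True if a whole-dollar list price fits the $20-$50, multiple-of-$5 rule."""
    return MIN_PRICE <= price <= MAX_PRICE and price % PRICE_STEP == 0


def coupon_allowed(list_price, coupon_price):
    """True if the coupon discounts the list price by no more than 50%."""
    # A discount of at most 50% means the coupon price is at least half the list price.
    return coupon_price * 2 >= list_price


print(list_price_allowed(36))   # prints False (old ExifTool list price)
print(list_price_allowed(30))   # prints True  (new ExifTool list price)
print(coupon_allowed(30, 14))   # prints False (EXIF14 against the new $30 price)
print(coupon_allowed(30, 15))   # prints True  (EXIF15)
print(coupon_allowed(20, 11))   # prints True  (PRESERVE)
print(coupon_allowed(25, 20))   # prints True  (TOOLKIT)
```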

The biggest annoyance is that I like to give students a really deep discount for a course that builds on another one (e.g., on the ExifTool course for those who’ve taken the file identification tools course), and I’ll be limited in what I can do there.

By way of compensation, I’m offering a special rate on Personal Digital Preservation till April 4: Just $8 with the coupon code MARCHAIR! After April 4, you won’t be able to get that low a price for any paid Udemy course.

Hopefully this will all work out well. I’m looking into adding another course, though it’s too soon to give specifics.