HTML5 Encrypted Media Extensions

The Encrypted Media Extensions draft from the W3C is drawing controversy. DRM on the Web has traditionally been implemented by the service provider, where the content delivery service has full control. But what’s streamed can be captured, and software to do it is readily available, even if using it may violate the DMCA.

An article on Ars Technica reports that Ian Hickson of Google criticized the proposal as both unethical and technically inadequate. Mark Watson, one of the authors of the draft, suggested that strong copy protection can be obtained by building it into hardware, which would mean that only some computers could receive the protected content. Hickson’s email is posted here; unfortunately, it doesn’t expand on what he thinks the problems are.

The draft is intended to accommodate “a wide range of media containers and codecs”; the question is which one or ones will be widely used in practice, and how they’ll be made available, particularly in connection with open-source browsers.

This is a potential area for browser fragmentation.

DROID and JRE 7

According to a post on the DROID mailing list, DROID is not currently compatible with JRE 7. An issue with the Spring framework appears to be the cause. The next release of DROID should support Java 7.

Making sense of MPEG audio formats

Lately I’ve been trying to clarify in my mind exactly how certain common MPEG-related audio formats are defined. I think I’ve got this right, but if anyone can offer corrections, they’d be appreciated.

MP3

This is the common name for MPEG-1 and MPEG-2 Audio Layer 3, which defines an encoding method. The difference between the MPEG-1 and MPEG-2 variants is mainly in the sampling rates: the MPEG-2 variant adds lower ones (16, 22.05, and 24 kHz). It’s audio-only. An “MP3 file” is normally a raw MP3 stream without a container.
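To make the MPEG-1/MPEG-2 distinction concrete, here is a minimal Python sketch that reads the first bytes of a Layer 3 frame header and reports the variant and sampling rate. The function name and table layout are my own for illustration; the sketch ignores the unofficial "MPEG-2.5" extension and does not validate the rest of the header.

```python
# Sampling rates in Hz, indexed by the 2-bit sampling-rate field of the header.
SAMPLE_RATES = {
    "MPEG-1": (44100, 48000, 32000),
    "MPEG-2": (22050, 24000, 16000),
}
# 2-bit version field; other values (reserved, "MPEG-2.5") are ignored here.
VERSION_BITS = {0b11: "MPEG-1", 0b10: "MPEG-2"}
LAYER_III = 0b01  # value of the 2-bit layer field for Layer 3


def parse_layer3_header(header: bytes):
    """Return (version, sample_rate) for a Layer 3 frame header, else None."""
    if len(header) < 4:
        return None
    b0, b1, b2 = header[0], header[1], header[2]
    # The header starts with an 11-bit sync word of all ones.
    if b0 != 0xFF or (b1 & 0xE0) != 0xE0:
        return None
    version = VERSION_BITS.get((b1 >> 3) & 0x03)
    layer = (b1 >> 1) & 0x03
    rate_index = (b2 >> 2) & 0x03
    if version is None or layer != LAYER_III or rate_index == 3:
        return None
    return version, SAMPLE_RATES[version][rate_index]


if __name__ == "__main__":
    print(parse_layer3_header(bytes([0xFF, 0xFB, 0x90, 0x00])))
    print(parse_layer3_header(bytes([0xFF, 0xF3, 0x90, 0x00])))
```

A real file may begin with an ID3 tag or other data before the first frame, so a practical tool would have to scan for the sync word rather than assume it sits at offset 0.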

MP4

MP4 means MPEG-4 Part 14. This is a container format which can hold audio, video, or both. It doesn’t specify the encoding method. In principle you could have an MP3 stream in an MP4 file. The preferred extension for MP4 containers is .mp4, but many others are used to denote specific encodings within MP4 containers.
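Since MP4 is a container, the first thing a reader sees is container structure rather than audio data. An MP4 file is a sequence of "boxes," each starting with a 4-byte big-endian size and a 4-byte type, and the file normally opens with an "ftyp" box naming a major brand. The following Python sketch reads that box; the function name is hypothetical, and it deliberately rejects the special size values (0 and 1) rather than handling them.

```python
import struct


def read_ftyp(data: bytes):
    """Return (major_brand, compatible_brands) if data starts with an
    'ftyp' box, else None."""
    if len(data) < 16:
        return None
    size, box_type = struct.unpack(">I4s", data[:8])
    # Sizes 0 and 1 have special meanings in the box format; this sketch
    # treats them, and any size running past the data, as unsupported.
    if box_type != b"ftyp" or size < 16 or size > len(data):
        return None
    major = data[8:12].decode("ascii", "replace")
    # Bytes 12-15 hold the minor version; compatible brands follow,
    # four bytes each, up to the end of the box.
    brands = [
        data[i:i + 4].decode("ascii", "replace")
        for i in range(16, size - 3, 4)
    ]
    return major, brands


if __name__ == "__main__":
    sample = b"\x00\x00\x00\x18ftypM4A \x00\x00\x00\x00isommp42"
    print(read_ftyp(sample))
```

The brand says nothing definite about the encoding inside; to learn that, a tool has to walk further into the box tree and look at the sample descriptions.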

AAC

AAC stands for Advanced Audio Coding, which was originally defined in MPEG-2 Part 7. MPEG-4 Part 3 defines some extensions of it. That’s the encoding; several different containers may be called “AAC files” if they hold an AAC stream. A raw AAC stream file is possible but not common. MP4 is the most common container, so “MP4 audio” and “AAC” are often treated as if they were synonyms. HE-AAC, also known as aacPlus, is an MPEG-4 audio profile. HE-AAC decoders can decode AAC, but not vice versa.
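The distinctions among these three cases can be roughly sketched as a signature check on the first few bytes of a file. This Python sketch assumes that a raw AAC stream uses ADTS framing (a 12-bit sync word with the layer field set to 0), which is the usual form but not the only possible one; the function name and return labels are made up for illustration, and a match on so few bytes is only a hint, not proof.

```python
def sniff_audio(data: bytes) -> str:
    """Guess the kind of file from its leading bytes.

    Returns one of "mp4", "id3", "adts-aac", "mp3", or "unknown".
    """
    # MP4 container: the first box is normally 'ftyp' (type at bytes 4-7).
    if len(data) >= 8 and data[4:8] == b"ftyp":
        return "mp4"
    # An ID3v2 tag commonly precedes a raw MP3 stream.
    if data[:3] == b"ID3":
        return "id3"
    if len(data) >= 2 and data[0] == 0xFF:
        b1 = data[1]
        layer = (b1 >> 1) & 0x03
        # ADTS: 12-bit sync word, layer field always 0.
        if (b1 & 0xF0) == 0xF0 and layer == 0:
            return "adts-aac"
        # MPEG audio: 11-bit sync word, layer field 1 means Layer 3.
        if (b1 & 0xE0) == 0xE0 and layer == 1:
            return "mp3"
    return "unknown"


if __name__ == "__main__":
    print(sniff_audio(b"\xFF\xF1\x50\x80"))
    print(sniff_audio(b"\xFF\xFB\x90\x00"))
    print(sniff_audio(b"\x00\x00\x00\x18ftypM4A "))
```

Note that "mp4" here identifies only the container; as described above, it says nothing about whether the stream inside is AAC, MP3, or something else.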

Apache ODF toolkit

The Apache Software Foundation has made its first release of the ODF Toolkit. This version is called 0.5-incubating, so I imagine it still has rough edges. Officially, “incubating” means that “the project has yet to be fully endorsed by the ASF.”

This could be useful to software that validates or extracts metadata from Open Document Format files. It includes ODFDOM 0.8.7, which has been around for about a year. Anyone want to write a module for JHOVE or JHOVE2?
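For a sense of what metadata extraction from an ODF file involves, here is a minimal Python sketch. It does not use the ODF Toolkit (which is a Java library); it relies only on the fact that an ODF document is a ZIP package containing a "mimetype" entry and a "meta.xml" file. The function name and the choice of fields are my own assumptions for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used in ODF meta.xml
DC = "{http://purl.org/dc/elements/1.1/}"
META = "{urn:oasis:names:tc:opendocument:xmlns:meta:1.0}"
OFFICE = "{urn:oasis:names:tc:opendocument:xmlns:office:1.0}"


def read_odf_metadata(path_or_file):
    """Return a dict with the package MIME type and a few metadata fields."""
    with zipfile.ZipFile(path_or_file) as pkg:
        mimetype = pkg.read("mimetype").decode("ascii").strip()
        root = ET.fromstring(pkg.read("meta.xml"))
    fields = {"mimetype": mimetype}
    # The root is office:document-meta; the fields live under office:meta.
    meta = root.find(OFFICE + "meta")
    if meta is not None:
        wanted = (
            ("title", DC + "title"),
            ("creator", DC + "creator"),
            ("creation_date", META + "creation-date"),
        )
        for key, tag in wanted:
            element = meta.find(tag)
            if element is not None and element.text:
                fields[key] = element.text
    return fields
```

Calling read_odf_metadata("report.odt") on a text document (the file name is hypothetical) would return the MIME type plus whichever of the three fields are present. A real JHOVE module would need to do much more, including validation against the ODF schema.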

Possible malware

My filter has been catching spam comments promoting seoplugins dot org (I don’t want WordPress turning that into a link). A web search discloses that their spam has slipped past the filters at quite a number of sites.

Software promoted by spam comments is almost never legitimate and is often malicious. SEO is “search engine optimization” and is a favorite field for unsavory characters preying on site owners’ desperation for more hits. I suggest giving these people a wide berth.

Concerns with Apple’s iBooks Author

Apple’s iBooks textbooks for iPad stake out a position against openness in e-book publishing.

The format of the books is not a standard EPub format. The only tool that can create this format is Apple’s iBooks Author, and the only application that can view it is iBooks. An article on Ars Technica reports that it uses “ePub 2 along with certain HTML5 and JavaScript-based extensions that Apple uses to enable multimedia and interactive features. Those interactive features will only work with Apple’s iBooks app, not with other e-reader software or hardware, because only Apple supports those extensions.”

A post on Glazblog (the author says he’s “Co-chairman of the W3C CSS Working Group”; it would be nice if he gave his name) gives technical details. It uses XML namespaces that aren’t publicly documented, a nonstandard MIME type, and a private CSS extension.

This means you can’t view the books on anything but iOS. If Apple ever drops support for the format, it’s obsolete and impossible to support.

On top of this, the EULA for iBooks Author restricts sale of books created with it to the Apple Store. You can give away your books by any channel you like, but if you sell them, you must use the Apple Store. This means that if Apple doesn’t accept your book for publication, you can’t sell it in that format. (Except maybe in France, as Glazblog amusingly notes.) This is like having a compiler that lets you create software which you may sell only through Petitmol, or a video application that forbids you from selling your movies through anyone but FooTube. I can’t think of a precedent for this.

Authors normally would like to be able to take a book to a different publisher if their previous one loses interest. With books created with iBooks Author, you can’t do that, for both technical and legal reasons. The format isn’t under DRM, though, and the exclusivity applies to the format, not the content. As far as I can tell, you should be able to extract most of the content and republish it in a different format.

Apple’s restrictions make iBooks textbooks unsuitable for assignment to classes, unless the school is willing to give every student an iPad. Those who use other devices would be left out in the cold.

Apart from the restrictions, does Apple’s new format offer anything exciting? My own reaction, from briefly looking at a few sample books on a co-worker’s iPad, is that the interactive graphics are attention-getting, but the most important form of “interactivity” with a textbook is trying things out on your own — playing with the equations, writing sentences in the language, whatever. The best accessory for that is still a pencil and paper.

Correcting Harvard Library rumors

In spite of rumors that have shown up in the #hlth feed on Twitter, no one at the Harvard Library was laid off yesterday, let alone “everybody.” We were told, however, that there will be cutbacks.

We were told that we should all fill out “employee profiles” online to aid in determining what future career we’d have, if any, at Harvard. An official pronouncement quoted in Library Journal has denied that we will all have to “reapply” for our positions, but many of us find the distinction subtle even if it’s technically true.

Take a look at this post for a good summary.

Further update: Here’s a transcript of yesterday’s presentation at Harvard. There is one significant discrepancy between the transcript and what I and others recall: Helen Shenton did not say at the 9 AM meeting that the deadline for employee profiles was February 29. The deadline was initially earlier — mid-February, I think — and was changed to February 29 by the end of the meeting, following numerous expressions of concern from the audience. (She may have said February 29 at the later meetings.)

PDF/A post on FTL

Today on Files That Last I have a post on “PDF/A for the long haul.” It’s directed at the end user or administrator, not at the formats geek or preservation specialist, but might be useful to link to when you’re explaining what PDF/A is good for.

Article on formats and protocols

Here’s an interesting and thoughtful article on “textuality” in formats and protocols.

Thanks to Andy Jackson’s Twitter feed.

IPRES proceedings

The IPRES proceedings for 2011 are now available.

IPRES 2012 will be in Toronto, making it the most convenient one for Americans in years. It will be September 30 to October 5 (which is when I was planning to be in Germany … just can’t win).